
Earthquakes: Reverse-Geocoded Files Posted to frackingdata.info/downloads

As I promised earlier, I’ve downloaded earthquakes from NCEDC’s web site (1898 to date), reverse-geocoded them via GeoNames and k-d trees (thereby obtaining their country, state, county, and city/village values), archived the resulting files via 7-Zip, and uploaded both the CSV and SQLite datasets to frackingdata.info/downloads.

I have authored a program in Python 3 that reverse-geocodes (via GeoNames and K-D Trees) the lat/longs into their respective countries, states, counties, and cities/villages.  I will post a link to the open-source project shortly once I’ve vetted its license and repository.  The program processes nearly 3 million rows in approximately 240 seconds.
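
The k-d tree lookup described above can be sketched roughly as follows. This is not the program itself, only a minimal, standard-library illustration of the idea: the place list is a tiny hypothetical sample with approximate coordinates (a real run would load a GeoNames extract), and the names `PLACES`, `build`, `nearest`, and `reverse_geocode` are assumptions made for this example.

```python
import math

# Hypothetical sample of place records: (name, latitude, longitude).
# Coordinates are approximate; a real run would load a GeoNames extract.
PLACES = [
    ("Oklahoma City, Oklahoma, US", 35.4676, -97.5164),
    ("Prague, Oklahoma, US", 35.4867, -96.6850),
    ("Berkeley, California, US", 37.8716, -122.2727),
    ("Anchorage, Alaska, US", 61.2181, -149.9003),
]

def to_xyz(lat, lon):
    """Convert degrees to a point on the unit sphere, so that straight-line
    distance ranks places in the same order as great-circle distance."""
    phi, lam = math.radians(lat), math.radians(lon)
    return (math.cos(phi) * math.cos(lam),
            math.cos(phi) * math.sin(lam),
            math.sin(phi))

def build(points, depth=0):
    """Build a k-d tree from (xyz, payload) pairs as nested tuples."""
    if not points:
        return None
    axis = depth % 3
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid],
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))

def nearest(node, target, depth=0, best=None):
    """Return (squared_distance, payload) for the point closest to target."""
    if node is None:
        return best
    (xyz, payload), left, right = node
    dist2 = sum((a - b) ** 2 for a, b in zip(xyz, target))
    if best is None or dist2 < best[0]:
        best = (dist2, payload)
    axis = depth % 3
    diff = target[axis] - xyz[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Search the far side only if the splitting plane is closer than the best hit.
    if diff ** 2 < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best

def reverse_geocode(tree, lat, lon):
    """Return the name of the place nearest to the given lat/long."""
    return nearest(tree, to_xyz(lat, lon))[1]

TREE = build([(to_xyz(lat, lon), name) for name, lat, lon in PLACES])
print(reverse_geocode(TREE, 35.5, -96.7))
```

Building the tree once and querying it per row is what keeps the per-earthquake cost low; a linear scan of every place for every earthquake would scale far worse across millions of rows.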


Status Update 2016-05-02: FrackingData_FracFocusRegistry 2016-04 Files Uploaded

As of 02 May 2016, various files (e.g. SQLite, CSV, and PgSQL) derived from FracFocus.org’s March 2016 FracFocusRegistry have been downloaded, extracted, transformed, loaded, archived, and uploaded to the frackingdata.info/downloads site, and their respective links have been posted to FrackingData’s FracFocus Data Page.

The substantial delay between the last posting of the transformed FracFocusRegistry download in early March and this one in May was mostly due to FracFocus NOT posting anything until 26 April 2016.  This tardiness on FracFocus’s part is becoming a pattern.

Once again, the significant point this time is that the download of the files from the FracFocus.org website and their subsequent extraction, transformation, loading, archiving, and export to CSV, SQLite, and PostgreSQL files were performed by a Windows batch script without human intervention. This automated method shaved hours from the process. In addition, the batch script now uses WinSCP to automatically upload the files to the http://frackingdata.info/downloads page.

When this Windows batch file is sufficiently stable, and I’ve soft-coded the data-cleansing views into the script itself, I’ll post a link to it in the Source Code section of this blog. Soft-coding of the data-cleansing views is the last hurdle to publishing this script.

Khepry Quixote 2016-05-02

Breaking the “Fracking Wall”

This post describes why I’ve resolved to break the “fracking wall” surrounding the data sources of oil well locations, fracking chemical disclosures, and earthquake sources.

BACKGROUND

I am a software/database/systems developer/designer/analyst with over thirty (30) years of IT experience in a variety of domains: petrochemical plant applications, tax appraisal, county-level governmental agencies, law enforcement applications, point-of-sale systems, data warehousing and analysis, insurance, near-realtime aircraft/vessel dispatch and tracking, mapping applications, search engines, desktop and web applications, and health care extraction, transformation, loading (ETL) and analysis. In short, there’s not a lot I haven’t done over my career.

SELF-EDUCATION

One of my continuing challenges is to self-educate on emerging languages, databases, and software on a frequent basis. This I do as a “night job” a few nights each week, every week, every month of every year. Having enjoyed applications involving mapping the most, in the Spring of 2012 I decided on a course of self-education with a variety of mapping packages, but covering a single domain with free information: earthquakes. I chose this domain for no other reason than that the data was freely available and of modest size, the sources of publicly-available data being just a few million records.

EARTHQUAKES

And so I merrily went about my self-education on various mapping packages using the free source of earthquake data, enjoying positive results and pretty graphics along the way. Then, quite to my surprise, “swarms” of earthquakes began to materialize on the various maps I was creating. Interestingly enough, some of those swarms were in Oklahoma, a state of the union in which I had the privilege of living from 1979 through 1982. What struck me as interesting was that I didn’t recall that many earthquakes, in fact relatively few, during those three years I lived in that state. Needless to say, my interest was piqued.

SWARMS EMERGE

So, I began to plot out the earthquakes for Oklahoma on a wider scale, on a year-by-year basis, and I could discern that there were “swarms” of earthquakes materializing in places where there had been very few in the preceding decades. As I have a B.S. in Zoology, a scientific bent to my mind, and an absolute passion for the discernment of emerging patterns, mental alarm bells went off that I was seeing an emerging pattern that might have a more anthropogenic than natural origin. Casually, as this was a “night job,” I began searching the Internet for possible causes and ran across the hypothesis that hydro-fracturing (a.k.a. “fracking”) operations, specifically the injection of water and chemicals into “fracking” and underground “disposal” wells, were causing the emerging swarms of earthquakes.

EARTHQUAKE SWARMS vs. OIL WELL LOCATIONS

At a magnitude of 5.6, the largest earthquake ever recorded in Oklahoma up to that time struck on November 5, 2011, being preceded by 4.7 through 5.0 foreshocks earlier in the day. It was this earthquake and its foreshocks that really raised my interest, as I was mapping not only the locations of the earthquakes on the map but also their intensity via color, the more reddish the stronger the quake. To me, where there’s smoke there’s fire, and that being said I resolved to start mapping out well locations as well. It was my quest for well locations on a state-by-state basis that turned self-education into an avocation of sorts, and introduced me to the “fracking wall.”

HITTING THE “FRACKING WALL”

In an effort to obtain the locations of oil wells, I contacted various state agencies of the State of Oklahoma with virtually no results. It wasn’t that the data wasn’t available; it was that what data was available was not easily downloaded and, most importantly, did not contain the latitudes and longitudes of the wells. In other words, I could roll the cigar between my fingers but I could neither light nor puff upon it. I was told by one state official, emphatically, that such location data was not available. Agency-by-agency, I wrote and/or called the appropriate personnel, and although most of the employees were polite, they were also equally unhelpful. It took me several months to find out where the data sets containing oil well location data had been posted. There was one, I repeat one, mention of a link to Oklahoma’s oil well location datasets in an obscure forum in a backwater of the Internet. This was the clue I needed, and finally I was able to plot the oil well locations against the occurrence of earthquakes and confirm that “where there’s smoke, there’s fire.” The refusal of the State of Oklahoma to point me to the location(s) of oil well location data was my first experience with the “fracking wall,” and it wouldn’t be my last, although the “fracking wall” would be manifested by different states and agencies in different ways.

BREAKING THE “FRACKING WALL”

Because of the State of Oklahoma’s behavior and lack of cooperation, I resolved to break the “fracking wall” for both myself and all others needing access to the same type of data. In an effort to collect all of the oil well location and fracking chemical disclosure hyperlinks in one place, as well as offer curated datasets of the aforementioned data, I created the frackingdata.org website with curated FracFocus data extracts, chemical toxicities and their datasets, state-by-state sources of well location data, and the source code used to extract the datasets into more usable forms. In short, I created frackingdata.org to be a one-stop shop for anyone wishing to conduct analysis of fracking-related data.

FUTURE INITIATIVES

+ Link earthquake data to oil well locations in a manner convenient to anyone wishing to analyze such data (In progress)

+ Automate the download, extraction, transformation, and loading of FracFocus.org data into datasets more suitable for use by analysts or citizen-scientists (Done)

+ Transform the FracFocus.org GUID keys into more user-friendly integer keys that also reduce storage by over 25% (Done)

+ Push the curated data sets to ODATA repositories, e.g. Google Fusion Tables, so that analysts can more easily access the data via packages like Tableau, SAS, or R. (In progress)

+ Push more of the source code used to do this extraction, transformation, and loading to repositories like GitHub so that all may share in its presence and perhaps even contribute to its maintenance. (Partially done)
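
To illustrate the GUID-to-integer item in the list above, here is a minimal sketch of one way such a key substitution could work. It is not the transformation used for the posted datasets; the column name `well_guid` and the sample rows are hypothetical, and the actual storage savings depend on the database and schema.

```python
import uuid

def build_key_map(guids):
    """Assign a sequential integer to each distinct GUID, in first-seen order."""
    key_map = {}
    for guid in guids:
        normalized = str(uuid.UUID(guid))  # canonical lowercase, hyphenated form
        if normalized not in key_map:
            key_map[normalized] = len(key_map) + 1
    return key_map

def rekey(rows, key_map, column):
    """Return copies of rows with the GUID in `column` replaced by its integer key."""
    return [{**row, column: key_map[str(uuid.UUID(row[column]))]} for row in rows]

# Hypothetical rows; the first two carry the same GUID in different spellings.
rows = [
    {"well_guid": "3F2504E0-4F89-11D3-9A0C-0305E82C3301", "state": "Oklahoma"},
    {"well_guid": "{3f2504e0-4f89-11d3-9a0c-0305e82c3301}", "state": "Oklahoma"},
    {"well_guid": "9a7b330a-a736-4f5c-8b7d-1f2e3c4d5e6f", "state": "Texas"},
]
key_map = build_key_map(row["well_guid"] for row in rows)
rekeyed = rekey(rows, key_map, "well_guid")
print([row["well_guid"] for row in rekeyed])
```

Normalizing each GUID before the lookup means differently formatted copies of the same key collapse to one integer, and keeping the map around allows the original GUIDs to be recovered later.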

As I have a “day job,” progress is painful but the results are worth it.

Khepry Quixote
11 March 2016

Fracking Hell: Oklahoma, Earthquakes, Injection Wells, and Data Accessibility

Attempting to mash the earthquake and underground data into a cohesive user-interface has proved to be, to put it mildly, daunting.  It was much easier to find sources of earthquake data than it was to find any source of well data with any fields relevant to my needs.

The earthquake data was relatively easy to come by; for example, I found the following sources:

I downloaded the entire earthquake dataset from the Advanced National Seismic System (ANSS) beginning in 1898 through the present day and imported the data into an Apache Lucene index.  In short order I had a searchable earthquake index lacking but a few location-centric fields:

  • The country in which the earthquake occurred.
  • The state in which the earthquake occurred.
  • The county in which the earthquake occurred.

In order to associate the above needed fields with the earthquake data, I downloaded two ESRI-formatted shapefiles from the National Atlas:

And one shapefile from Mapping Hacks:

I then wrote a Java program that would read each earthquake record and link it to its associated country, state, and county, as available from each respective shapefile. To do this I used the GeoTools Java library (GeoTools-8.0-M3-bin.zip) from GeoTools.org.
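
The core of linking a point to a boundary is a point-in-polygon test. The sketch below is not the Java/GeoTools program; it swaps in a plain ray-casting test in Python to show the idea, and the two rectangular "regions" are hypothetical stand-ins for real shapefile boundaries (which have many more vertices, holes, and multiple parts).

```python
def point_in_ring(lon, lat, ring):
    """Ray casting: count the ring edges crossed by a ray heading east from the point.
    An odd number of crossings means the point lies inside the ring."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Consider only edges that straddle the point's latitude.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside

# Hypothetical, heavily simplified boundaries given as (lon, lat) vertices.
REGIONS = {
    "Box A": [(-100.0, 34.0), (-95.0, 34.0), (-95.0, 37.0), (-100.0, 37.0)],
    "Box B": [(-95.0, 34.0), (-90.0, 34.0), (-90.0, 37.0), (-95.0, 37.0)],
}

def locate(lon, lat, regions=REGIONS):
    """Return the name of the first region containing the point, or None."""
    for name, ring in regions.items():
        if point_in_ring(lon, lat, ring):
            return name
    return None

print(locate(-96.7, 35.5))
```

A production version would first filter candidates by bounding box or a spatial index before running the exact test, since checking every polygon for every earthquake is slow.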

Well data was much more difficult to come by, especially with any fields relevant to my needs, for example:

  • The type of well, for example “oil”, “gas”, “inj” (for injection) was available as data, just not available as a field upon which one could query.  In other words, I could not query for just underground injection wells (“inj”).
  • The date each well became active, let alone its filing date, was not available via the web interface.
  • The location of each well, in latitude and longitude, was not available either.
  • Given the lack of the above information, I didn’t even concern myself with the lack of well depth information.

As an exercise in personal fortitude, I downloaded the wells for each county in the State of Oklahoma from the Oklahoma Corporation Commission’s Well Data System into one Excel spreadsheet per county. I then wrote a Java program that read the well data within each county’s Excel spreadsheet and posted it to an Apache Lucene index. I then zipped the Apache Lucene index and pushed it to a web site so that it could be queried and viewed using Apache Solr’s VelocityResponseWriter browser interface. The results of this effort can be viewed and queried here.

So, in concluding this post, I find the earthquake data adequate for my present needs but the well data lacking any useful date or location information to allow me to associate the earthquakes to the wells by either location or time.  As I am a persistent researcher, my next post will detail my further attempts at locating and downloading well data.

Bill of Rights for Fracking Information

  1. That all of the data and its documentation:
    1. Should be:
      1. In a machine-readable form.
      2. Suitable for aggregation.
      3. Downloadable in a compact form (e.g. ZIP, 7z).
    2. Should be suitable for its nominal purposes of research and reporting by
      1. Reporters
      2. Data Analysts
      3. Citizen Scientists
      4. Regulators
    3. Should be released
      1. In a frequent and timely manner.
      2. With “delta” datasets available, with “delta” being differences between the current and previous releases.
        1. The “delta” datasets should contain the following machine-readable “images”:
          1. “Previous” image.
          2. “Current” image.
          3. “Changed” image with only the values that are different being reported.
    4. Should NOT reside:
      1. Behind a pay-wall.
      2. Behind a registration-wall.
    5. Should be accessible:
      1. Interactively.
      2. ReST-fully via an API.
    6. Should be curated in a manner consistent with:
      1. The norms of professional, responsible data-warehousing.
        1. For example, the elimination of extraneous TAB, LINEFEED, or DIACRITIC characters that should NOT appear within a column.
        2. The resolution of disparate geodetic datums (e.g. NAD27, NAD83) into a single datum (WGS84) suitable for mapping via geographic information systems or platforms such as Google Maps.
      2. The needs of others to reliably export the data to alternative formats (e.g. CSV, XML, JSON).
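
As an illustration of the “delta” datasets called for above, the following sketch compares two releases and emits the “previous,” “current,” and “changed” images for each record that differs. It is only one possible reading of that requirement; the record ids and the field names `operator` and `depth_ft` are hypothetical.

```python
def build_delta(previous, current):
    """Compare two releases keyed by record id and return one delta entry
    for every record that was added, removed, or modified."""
    delta = {}
    for key in sorted(previous.keys() | current.keys()):
        before = previous.get(key)
        after = current.get(key)
        if before == after:
            continue  # unchanged records are left out of the delta
        old_fields = before or {}
        new_fields = after or {}
        # The "changed" image holds only the fields whose values differ.
        changed = {field: new_fields.get(field)
                   for field in old_fields.keys() | new_fields.keys()
                   if old_fields.get(field) != new_fields.get(field)}
        delta[key] = {"previous": before, "current": after, "changed": changed}
    return delta

# Hypothetical releases: record 1 is modified, 2 is removed, 3 is added.
previous_release = {
    1: {"operator": "Acme", "depth_ft": 5000},
    2: {"operator": "Beta", "depth_ft": 7200},
}
current_release = {
    1: {"operator": "Acme", "depth_ft": 5100},
    3: {"operator": "Gamma", "depth_ft": 6400},
}
delta = build_delta(previous_release, current_release)
print(delta[1]["changed"])
```

A removed record appears with a “current” image of `None`, and an added record with a “previous” image of `None`, so a consumer can replay the delta against the prior release without downloading the full dataset again.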