I have written a Python 3.6 program that extracts the CSV files from FracFocus.org’s new compressed CSV archive and combines them into a single CSV file, ordered by the date-time at which FracFocus generated them.
The GitHub project for this can be found at: https://github.com/Frackalyzer/PyZip2Src2tgt.
I will be pushing the resulting combined CSV file up to the FrackingData.org site shortly, but it’ll likely be tomorrow before that happens.
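For readers who want a feel for the approach, here is a minimal sketch of the extract-and-combine step. It is not the code from the GitHub project: the function name `combine_csvs` is hypothetical, and it assumes that every CSV in the archive shares the same header row, that the files are UTF-8 encoded, and that the timestamp stored in the ZIP directory reflects when each file was generated.

```python
import csv
import io
import zipfile


def combine_csvs(archive, out_stream):
    """Append every CSV member of `archive` to `out_stream`, oldest first.

    `archive` is a path or a binary file-like object. Members are ordered by
    the timestamp stored in the ZIP directory (file name breaks ties), and
    the header row is written once, taken from the first non-empty member.
    Returns the member names in the order they were combined.
    """
    with zipfile.ZipFile(archive) as zf:
        members = [m for m in zf.infolist()
                   if m.filename.lower().endswith(".csv")]
        members.sort(key=lambda m: (m.date_time, m.filename))
        writer = csv.writer(out_stream, lineterminator="\n")
        header_written = False
        for member in members:
            with zf.open(member) as raw:
                # utf-8-sig is an assumption; it also strips a leading BOM.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                reader = csv.reader(text)
                header = next(reader, None)
                if header is None:
                    continue  # empty file, nothing to copy
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                writer.writerows(reader)
    return [m.filename for m in members]
```

Sorting on the stored timestamp rather than on the file name sidesteps any dependence on how the files happen to be numbered.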
I have just noticed that FracFocus is now outputting the FracFocusRegistry data in the more import-friendly CSV format. That being said, it would be nice if FracFocus padded the file numbers with a leading zero where needed, for example:
- FracFocusRegistry_1.csv would be better named as FracFocusRegistry_01.csv
This would allow any file-combining programs to combine the files in the order in which they were generated, as right now the file reading order is as follows:
and so on.
Giving every single-digit file number a leading zero (e.g. _1.csv becoming _01.csv) would make the default file input order match the numeric order of the files.
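Until the files are renamed, a file-combining program can work around the problem with a "natural" sort key. The sketch below is one common way to do it; the helper name `natural_key` and the sample file numbers are illustrative assumptions, not taken from the archive itself.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so that 2 sorts before 10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


# Illustrative file names only.
names = [f"FracFocusRegistry_{n}.csv" for n in (1, 10, 11, 2, 3)]

print(sorted(names))                   # plain string order: 1, 10, 11, 2, 3
print(sorted(names, key=natural_key))  # numeric order: 1, 2, 3, 10, 11
```

Because `re.split` with a capturing group always alternates text and digit chunks, the keys of any two names compare text with text and integers with integers.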
Please note that I’m now working on a program that will combine all of these CSVs into one file, in the order in which they were likely generated. In addition, it will perform some data cleansing, as some rows have embedded returns and other strange characters within their columns.
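As a rough idea of what that cleansing could look like, the sketch below strips control characters from each field and collapses embedded returns into single spaces. The function names and the sample row are made up for illustration; the actual cleansing rules in the finished program may well differ.

```python
import csv
import io
import re

# Control characters other than tab, LF and CR (those three are whitespace
# and are handled by the collapse step below).
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
_WHITESPACE = re.compile(r"\s+")


def clean_field(value):
    """Drop control characters, then collapse whitespace runs to one space."""
    value = _CONTROL.sub("", value)
    return _WHITESPACE.sub(" ", value).strip()


def clean_rows(rows):
    """Yield each CSV row with every field cleaned."""
    for row in rows:
        yield [clean_field(field) for field in row]


# Made-up sample: a quoted field with an embedded return and a stray
# control character (\x1f).
raw = 'Supplier,Purpose\r\n"Acme\r\nChemical","Friction\x1f reducer"\r\n'
reader = csv.reader(io.StringIO(raw, newline=""))
cleaned = list(clean_rows(reader))
print(cleaned)
```

Letting the `csv` module parse the file first matters here: a return inside a quoted field is part of the value, not the end of the row, so cleaning has to happen per field rather than per line.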
As I promised earlier, I’ve downloaded earthquakes from NCEDC’s web site (1898 to date), reverse-geocoded them via GeoNames and K-D Trees (thereby obtaining their country, state, county, and city/village values), archived the resulting files via 7-ZIP and uploaded both the CSV and SQLite datasets to:
- FrackingData Reverse-Geocoded Earthquakes from 1898-01-01 to date (updated on an approximately monthly basis).
I have authored a program in Python 3 that reverse-geocodes (via GeoNames and K-D Trees) the lat/longs into their respective countries, states, counties, and cities/villages. This is the link to the open-source Python 3 reverse-geocoder project. The program processes nearly 3 million rows in approximately 240 seconds.
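To illustrate the K-D tree idea, here is a small, self-contained sketch. It is not the code from the project: the function names are hypothetical, the three places and their approximate coordinates are stand-ins for the GeoNames data, and a run over millions of rows would realistically use a compiled tree (for example SciPy's `cKDTree`) rather than this pure-Python one.

```python
import math


def to_xyz(lat, lon):
    """Project degrees onto the unit sphere so straight-line distance
    ranks points in the same order as great-circle distance."""
    lat_r, lon_r = math.radians(lat), math.radians(lon)
    return (math.cos(lat_r) * math.cos(lon_r),
            math.cos(lat_r) * math.sin(lon_r),
            math.sin(lat_r))


def build(points, depth=0):
    """Build a k-d tree from (xyz, label) pairs as nested tuples."""
    if not points:
        return None
    axis = depth % 3
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid],
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))


def nearest(node, target, depth=0, best=None):
    """Return (squared_distance, label) of the point closest to target."""
    if node is None:
        return best
    (xyz, label), left, right = node
    dist = sum((a - b) ** 2 for a, b in zip(xyz, target))
    if best is None or dist < best[0]:
        best = (dist, label)
    axis = depth % 3
    diff = target[axis] - xyz[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Search the far side only if the splitting plane is closer than the best hit.
    if diff ** 2 < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best


# Approximate coordinates, for illustration only.
places = [
    ("Denver, Colorado, US", 39.7392, -104.9903),
    ("Little Rock, Arkansas, US", 34.7465, -92.2896),
    ("Oklahoma City, Oklahoma, US", 35.4676, -97.5164),
]
tree = build([(to_xyz(lat, lon), name) for name, lat, lon in places])
_, label = nearest(tree, to_xyz(40.0, -105.0))
print(label)
```

Converting to 3-D coordinates first avoids the distortion that comes from treating raw latitude and longitude as flat x/y values, which becomes noticeable away from the equator and near the date line.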
19 May 2017
As of 18 May 2017, various files (e.g. SQLite, CSV) derived from FracFocus.org’s May 2017 FracFocusRegistry have been downloaded, extracted, transformed, loaded, archived, and their respective links posted to FrackingData’s FracFocus Data Page.
This time, FracFocus posted their SQL Server backup on 15 May 2017, about 4 weeks later than its previous posting of 17 Apr 2017.
Once again, the significant point this time is that the download of the files from the FracFocus.org website and their subsequent extract, transform, load, archiving, and export to CSV, SQLite, and PostgreSQL files were performed by a Windows batch script without human intervention. This automated method shaved hours from the extract, transform, load, archive, and export process.
When this Windows batch file is sufficiently stable, and I’ve soft-coded the data-cleansing views into the script itself, I’ll post a link to it in the Source Code section of this blog. Soft-coding of the data-cleansing views is the last hurdle to publishing this script.
Khepry Quixote 2017-05-18
Presently, some states have a downloadable Address Points file (e.g. Arkansas), while others don’t. An Address Points dataset, coupled with a Well Location database, would allow spatial queries such as “Fetch those addresses within X feet of any well’s location.” This would be a useful feature in the auditing of flowlines that may pass near any residence’s location, allowing an agency, auditor, journalist, or non-governmental organization (NGO) to verify that audits of the flowlines have been done on a periodic basis as specified by law or regulation, provided that the audit files are made public as well.
As this is a “night-job” task for me, it’s going to take a while, and it will progress only as my “day-job” allows.
The Well Data Sources page has been updated to reflect Arkansas hyperlinks that changed at some time in the past.
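The “within X feet” query can be sketched with a great-circle (haversine) distance check. Everything below is an illustrative assumption: the function names, the record layout of `(id, latitude, longitude)`, and the sample IDs and coordinates are hypothetical, and the nested loop is a brute-force comparison that a real statewide dataset would replace with a spatial index.

```python
import math

# Mean Earth radius in metres converted to feet (1 ft = 0.3048 m exactly).
EARTH_RADIUS_FEET = 6_371_008.8 / 0.3048


def feet_between(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two lat/long points, in feet."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_FEET * math.asin(math.sqrt(a))


def addresses_near_wells(addresses, wells, max_feet):
    """Yield (address_id, well_id, feet) for every pair within max_feet."""
    for addr_id, a_lat, a_lon in addresses:
        for well_id, w_lat, w_lon in wells:
            feet = feet_between(a_lat, a_lon, w_lat, w_lon)
            if feet <= max_feet:
                yield addr_id, well_id, round(feet)


# Hypothetical sample records: (id, latitude, longitude).
wells = [("W-1", 40.0000, -105.0000)]
addresses = [("A-1", 40.0005, -105.0000), ("A-2", 40.0100, -105.0000)]
hits = list(addresses_near_wells(addresses, wells, max_feet=200))
print(hits)
```

In this sample only the first address falls inside the 200-foot radius; the second is roughly two-thirds of a mile from the well.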
Because of the recent events in Colorado due to an uncapped flowline feeding gas from a well 178 feet away into the basement of a house and the subsequent explosion resulting in the deaths of two and the injury of one, there’s renewed interest in the locations of oil wells in the State of Colorado, especially those that might be near residential buildings (or any buildings for that matter).
To this end, I’ve extracted a shapefile from the COGCC’s GIS downloads site, tweaked it a bit to add an explanation column for the status of the well, and then exported the resulting dataset to both a CSV and a SQLite file. These CSV and SQLite files were then compressed into a 7-ZIP archive suitable for download and extraction.
The 7-ZIP archive can be found at the following URL:
You can use: