Get the peer-reviewed paper and data for free at PLOS ONE.

In the summer of 2016, researchers at Dr. Emily Bernhardt’s lab at Duke University’s Department of Biology approached me with a question: could I derive a geospatial dataset depicting the annual locations of surface coal mining in the Central Appalachian region of the United States? Leading a team composed of staff from Duke, SkyTruth, Appalachian Voices, and Google, I published the resulting novel dataset in July 2018. The open-access data show surface coal mining’s extent in each year from 1985 through 2015; my collaborators at SkyTruth have continued to update this dataset annually. Since publication, over 30 academic papers have cited our work. To explore the data yourself, view this interactive web map on SkyTruth’s website.
What’s so special about this work? Surface coal mines are big and easy to spot by looking at satellite imagery. While a previous dataset had mapped surface mining, it only depicted mines at a decadal interval from 1975 through 2005 and was not peer-reviewed. My dataset addressed these limitations by creating an annual dataset and providing open, peer-reviewed methods—and by offering open-source code so that the data may be updated over time. The annual interval dramatically helps researchers, governments, non-profits, and the public who want to correlate historical environmental measurements or human health indices with exact mine locations.
What did we find? While our project’s main goal was to create the open dataset and methods, we did run some summary statistics. From 1985 through 2015, we found that approximately 2,900 km² (1,100 mi²) of land had been converted into a surface mine at some point. When incorporating mine data from the previously published dataset, that figure was approximately 5,900 km² (2,300 mi²), about 1.2x as large as the state of Delaware, or 97% the size of Everglades National Park. We also showed that each metric ton of coal that has ever been produced by surface mining in this region is associated with 12 m² (130 ft²) of mined land. While surface mining continues today, our data demonstrate that the rate of new mining has fallen since approximately 2008.
How did we do it? In short: Google Earth Engine (GEE). GEE is a free tool from Google for conducting landscape-scale remote sensing analyses on Google’s immense computing infrastructure, and we used it to analyze over three decades of publicly available satellite imagery and find likely mine locations. We used a simple binary classification: Appalachia’s natural landscape is dense forest (other than water bodies, cities, and roads, which we excluded from our analysis), so we labeled locations with no forest cover as likely mines. In more technical language: we wrote multiple scripts using GEE’s JavaScript API to a) collect and aggregate Landsat images for the entire region on an annual basis; b) determine on a pixel-by-pixel basis the “greenest” pixel per year to create annual cloud-free, leaf-on composite images; c) calculate the normalized difference vegetation index (NDVI) of each pixel in each greenest composite image; and d) algorithmically set local NDVI thresholds below which a given pixel would be classified as a likely mine for that year. As a great example of its processing power, GEE took our input Landsat dataset of well over 4 terabytes of images and produced the 31-year output dataset in a matter of minutes, after we did the hard work of writing the processing scripts!
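For readers who want to see the idea behind steps b) through d) in miniature, here is a rough sketch in plain Python. It is not our GEE code: the reflectance values are made up, the function names are hypothetical, and the single fixed threshold of 0.5 is an assumption standing in for the locally computed thresholds described above. It treats the “greenest” pixel as the observation with the highest NDVI across a year’s scenes.

```python
# Illustrative sketch only: synthetic values, hypothetical names, not the project's GEE scripts.

def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - red) / (NIR + red)."""
    total = nir + red
    return (nir - red) / total if total else 0.0

def greenest_composite(scenes):
    """For each pixel, keep the highest NDVI observed across one year's scenes."""
    n_pixels = len(scenes[0])
    return [max(ndvi(*scene[i]) for scene in scenes) for i in range(n_pixels)]

def classify_likely_mine(composite, threshold):
    """Flag pixels whose greenest-pixel NDVI falls below the threshold."""
    return [value < threshold for value in composite]

# Three scenes over the same four pixels; each pixel is (NIR, red) reflectance.
# Equal NIR and red values mimic a cloud-contaminated observation (NDVI of 0).
scenes = [
    [(0.50, 0.05), (0.30, 0.25), (0.20, 0.18), (0.45, 0.06)],
    [(0.55, 0.04), (0.28, 0.24), (0.40, 0.35), (0.10, 0.10)],
    [(0.20, 0.20), (0.26, 0.22), (0.22, 0.20), (0.50, 0.05)],
]

composite = greenest_composite(scenes)
# Assumed fixed threshold; the real workflow set thresholds locally.
mask = classify_likely_mine(composite, threshold=0.5)
print(mask)  # [False, True, True, False]
```

In this toy example the first and last pixels stay forested in at least one clear scene, so the composite ignores their cloudy observations, while the two middle pixels never look green and are flagged as likely mine.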
Media coverage of the paper includes Duke Today, UPI, Allegheny Front, Smithsonian Magazine, Yale Environment 360, Inside Climate News, and Think Progress, among others.
Last updated February 20, 2021.