oungk/IndyStarHazmat

IndyStar Hazmat Investigation

As a Data Reporter on the IndyStar's Investigations team, I led the data-driven reporting process for the IndyStar's investigation of hazardous material transportation accidents in the Midwest.

I gathered data sources, asked research questions of the data, conducted exploratory data analysis and data integrity checks, merged the data with other relevant datasets, and conceptualized and created data visualizations. I fielded data-related questions from my collaborators and also engaged in shoe-leather reporting and interviewing.

This repo includes the final Rmd file that generates the statistics found in the final published project and also generates data that is correctly shaped for visualizations.

Methodology and Challenges

I studied a decade's worth of data from the Pipeline and Hazardous Materials Safety Administration (PHMSA) about the movement of hazardous chemicals across Indiana, Illinois, Kentucky, Michigan and Ohio. I also used the Bureau of Economic Analysis GDP Chained Price Index, the National Center for Health Statistics Urban-Rural Classification Scheme and other urban-rural classifications.
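A chained price index is typically used to express dollar amounts from different years in constant dollars. The sketch below shows that general technique in Python; it is not the code from this repo's Rmd file. The index values, the `PRICE_INDEX` table, and the `to_constant_dollars` helper are hypothetical and exist only for illustration.

```python
# Hypothetical index values for illustration only; these are NOT published figures.
PRICE_INDEX = {2013: 100.0, 2018: 110.0, 2023: 125.0}


def to_constant_dollars(nominal, year, base_year, index=PRICE_INDEX):
    """Convert a nominal dollar amount from `year` into `base_year` dollars.

    Scales the amount by the ratio of the base-year index to the
    index of the year in which the amount was recorded.
    """
    return nominal * index[base_year] / index[year]


# Hypothetical incident records with reported damages in nominal dollars.
incidents = [
    {"year": 2013, "damages": 1000.0},
    {"year": 2018, "damages": 2200.0},
]

# Express every damage figure in 2023 dollars so years are comparable.
adjusted = [
    to_constant_dollars(row["damages"], row["year"], base_year=2023)
    for row in incidents
]
```

Adjusting before summing or ranking keeps older incidents from looking smaller than recent ones simply because of inflation.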

Read about the full methodology behind this project here.

  • Challenge 1 | The Scope of the Project: Many of the preliminary decisions I had to make involved defining the scope of the investigation. I learned to balance the competing priorities of feasibility, deadline considerations, comprehensiveness, and relevance to readers. For example, I worked with my editor and spoke to PHMSA about the relevance of pipeline accidents to this project and ultimately decided that the data collection methodology and structure of pipeline accidents did not fit with the other modes of transportation.

  • Challenge 2 | Data Inconsistencies: I found and documented instances of missing and incorrect data through quantitative and qualitative measures. Based on my contextual research and communication with PHMSA, I identified a major error in PHMSA's data concerning damages in the East Palestine train derailment and ensured the information that the IndyStar and collaborators reported was accurate.

  • Challenge 3 | Demographic Information: Based on conversations with sources, I considered it incredibly important to study how communities of different demographics are affected by hazardous material transportation accidents. I tested Census Bureau race and ethnicity data against the PHMSA dataset, but found that the results were inconclusive due to data quality. I also tackled the question of how best to categorize urban versus rural communities. I spoke to subject experts about different categorizations of urban and rural and tested three different categorizations to ensure the results were similar.
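One way to check that results hold across several urban-rural categorizations is to compute the same summary statistic under each scheme and compare the spread. The following is a minimal Python sketch of that idea, not the analysis in this repo's Rmd file. The scheme names, the sample records, and the tolerance are all hypothetical.

```python
from collections import Counter

# Hypothetical incidents, each labeled under three made-up classification schemes.
incidents = [
    {"scheme_a": "urban", "scheme_b": "urban", "scheme_c": "urban"},
    {"scheme_a": "rural", "scheme_b": "rural", "scheme_c": "urban"},
    {"scheme_a": "rural", "scheme_b": "rural", "scheme_c": "rural"},
    {"scheme_a": "urban", "scheme_b": "urban", "scheme_c": "urban"},
]


def rural_share(rows, scheme):
    """Return the fraction of incidents labeled 'rural' under one scheme."""
    counts = Counter(row[scheme] for row in rows)
    return counts["rural"] / len(rows)


def shares_agree(rows, schemes, tolerance):
    """Return True if the rural share differs by at most `tolerance` across schemes."""
    shares = [rural_share(rows, scheme) for scheme in schemes]
    return max(shares) - min(shares) <= tolerance


schemes = ["scheme_a", "scheme_b", "scheme_c"]
shares = {scheme: rural_share(incidents, scheme) for scheme in schemes}
```

If the shares diverge beyond the chosen tolerance, the finding depends on the categorization and is worth flagging before publication.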

Explore The Reporting and Visualizations
