Environmental Protection Agency


M-13-13 Milestone 9 - November 30th 2015

OMB Review Complete: OMB has completed the agency review for this milestone. Agencies should contact their OMB desk officer if anything looks incorrect.

Leading Indicators

These indicators are reviewed by the Office of Management and Budget

Review Status complete
Reviewer Rebecca Williams
Last Updated February 9, 2016, 8:24 pm EST by Justin Grimes

Assessment Summary

The Environmental Protection Agency should be applauded for their increase in licensing documentation (though 'Other License Provided' should be updated with specific licensing information), having 100% valid data.json, and successfully pointing to Data.gov for a human-readable list of all datasets at [agency].gov/data.

The Environmental Protection Agency needs to address the following issues in their Enterprise Data Inventory:
* Provides a public EDI file, but fails to provide it in the correct format or to provide it as a dataset. (A minor point: the metadata format is incorrect, listing it as HTML on Data.gov: http://catalog.data.gov/dataset/u-s-environmental-protection-agencys-enterprise-data-inventory)
* Fails to organize data assets into Collections
* Fails to document all non-public and restricted datasets, redactions, and restrictive licenses, and to provide an explanation for non-disclosure (in the rights field)
* Of note: only 1 Bureau and 5 Programs are cataloged, and cataloging of non-public and restricted public datasets is much slower than at other agencies.

Inventory Composition

Public Dataset Status

Dataset Link Quality

Status Indicator Automated Metrics
Overall Progress this Milestone
Inventory Updated this Quarter
3184 Number of Datasets
2344 Number of APIs
Schedule Delivered
1 Bureaus represented
5 Programs represented
3134 Number of public datasets
11 Number of restricted public datasets
42 Number of non-public datasets
Inventory > Public listing
26% Percentage growth in records since last quarter
6,820 Spot Check - datasets listed by search engine
Agency provides a public Enterprise Data Inventory on Data.gov
100% License specified
Status Indicator Automated Metrics
Overall Progress this Milestone
3184 Number of Datasets
Number of Collections
2531 Number of Public Datasets with File Downloads
2344 Number of APIs
4843 Total number of access and download links
Quality Check: Links are sufficiently working
1154 Quality Check: Accessible links
3454 Quality Check: Redirected links
6 Quality Check: Error links
106 Quality Check: Broken links
26% Percentage growth in records since last quarter
100% Valid Metadata
/data exists
/data.json
Harvested by data.gov
2203 Views on data.gov for the quarter
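As a rough illustration of how the link counts above relate to one another, the sketch below expresses each quality-check category as a share of the total number of access and download links. The counts are copied from this report; the function name `link_shares` is hypothetical, and because the dashboard's threshold for "sufficiently working" is not stated here, none is assumed.

```python
# Counts copied from the Quality Check rows of this report.
TOTAL_LINKS = 4843
link_counts = {
    "accessible": 1154,
    "redirected": 3454,
    "error": 6,
    "broken": 106,
}


def link_shares(counts, total):
    """Return each category's share of `total` as a percentage (one decimal)."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = link_shares(link_counts, TOTAL_LINKS)

# Links that appear in the total but in none of the four listed categories.
unclassified = TOTAL_LINKS - sum(link_counts.values())

for name, pct in shares.items():
    print(f"{name:>10}: {pct}%")
print(f"links not in any listed category: {unclassified}")
```

On these figures, redirected links make up the large majority of the total, and the four categories do not add up to the reported total, so some links fall outside the listed categories.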
Status Indicator Automated Metrics
Overall Progress this Milestone
Description of feedback mechanism delivered
Data release is prioritized through public engagement
Feedback loop is closed, 2 way communication
See below Link to or description of Feedback Mechanism
Status Indicator Automated Metrics
Overall Progress this Milestone
Data Publication Process Delivered
Information that should not be made public is documented with agency's OGC
Status Indicator Automated Metrics
Overall Progress this Milestone
greene.ana@epa.gov Open Data Primary Point of Contact
POCs identified for required responsibilities
Status Indicator Automated Metrics
Overall Progress this Milestone
Identified 5 data improvements for this quarter
See below Primary Uses
Agency data is used in various third-party analyses, especially those using Web-based and mobile applications. Much of EPA’s inventory and source data is used to identify regulated facilities or other sources that are near users’ property, schools, hospitals, or any other area of interest. It is also used to examine and understand pollution sources and their impacts on the environment. For datasets like the FRS Power Plant Map Service, which combines data from several agencies, users can identify which power plants are burning natural gas, coal, or petroleum, or which are solar, wind, geothermal, or nuclear.
Value or impact of data
See below Primary data discovery channels
The Agency has a wide variety of mechanisms through which users learn about EPA data. These include:
* Environmental Dataset Gateway (EDG) website
* MyEnvironment
* Various EPA Program System websites and applications, including AirNOW, Waters, Enforcement and Compliance History Online (ECHO), EJScreen, and Clean Ups in My Community
* Envirofacts website
* EnviroMapper
* Facility Registry Service (FRS) website
* National Geospatial Program website
* EPA GeoPlatform websites, press releases, and listserv
* EPA GeoPlatform training classes and webinars
* EPA GeoPlatform exhibit booths at various conferences
* EPA GeoPlatform (GIO) weekly newsletter
* Various EPA GeoPlatform working groups
* EPA GeoPlatform blog entries
* Error correction tool, located on numerous websites, which allows the public to submit corrected information about specific data points

EPA prize competitions tap the ingenuity of people outside the agency, across the U.S. and the world, using EPA data to provide solutions to important issues. EPA outlines the specifications and criteria for a problem, and the public can submit ideas and proposals for successful solutions. Information on recent challenges can be found at: http://www2.epa.gov/innovation/examples-epa-prize-competitions

Two EPA Challenges are currently open:
1. Nutrient Sensor Challenge Enters Phase 2! The Nutrient Sensor Challenge is launching the next phase of the effort to develop affordable, accurate, and reliable nutrient sensors. Submissions for final verification testing of nitrogen and phosphorus sensors are now being accepted; applications are due December 18, 2015.
2. Visualize Your Water Challenge Coming in January 2016! A mapping challenge for high school students in the Great Lakes basin and Chesapeake Bay watersheds, in collaboration with the Department of Education, the U.S. Geological Survey, the Great Lakes Observing System, and ESRI.

EPA also regularly participates in non-agency sponsored hackathon events where users can learn more about our data and use our data to build applications. Over the summer, EPA collaborated with the Water Innovation Project, which hosted the 2015 Water/Energy Nexus Hackathon. The Hackathon bridged a gap between technology, water, and energy, giving students, professionals, and technology enthusiasts the opportunity to showcase their talents and innovation over a 36-hour period. The hackathon focused on leveraging data to help us better understand the dynamics of the water/energy nexus, using the latest software available. Some of the goals of the 2015 Water/Energy Nexus Hackathon were to:
* Engage a variety of water industry stakeholders in a collaborative learning environment with one another
* Introduce individuals with a technology background to water issues and engage them in understanding the issues through competitive problem solving
* Provide a high level of awareness within the industry to showcase the value of innovation and collaboration
* Develop software/hardware that may be commercially viable to encourage entrepreneurial endeavors
See below User suggestions on improving data usability
Developers have suggested improving the documentation of APIs and datasets and making the available APIs and datasets easier to browse and search. We are working to address these comments by developing a Shared Service Catalog that features improved search and browse capabilities for finding APIs. In addition, users have asked EPA to make it easier to integrate the EPA GeoPlatform system (including the EPA FRS Power Plant Map service) with various other EPA projects, and to provide a user-friendly interface for visualizing the EPA data hosted within the EPA GeoPlatform.
See below User suggestions on additional data releases
Users have asked for improved formats for Air Quality System data, Chemical Data Reporting data, and Facility Registry Service data (released in JSON or RDF format); improved access to pesticide data in the form of a Web service; and access to Drinking Water intake locations. Some EPA data services have been published on EPA systems that are not easy to find or use. Users have asked EPA to make these additional data services easier to find and use from a user-friendly system like the EPA GeoPlatform.
Digital Analytics Program on /data

Automated Metrics

These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter, at which point they become a historical snapshot.

Expected Data.json URL http://www.epa.gov/data.json (From USA.gov Directory)
Resolved Data.json URL https://edg.epa.gov/data.json
Number of Redirects 3 redirects
HTTP Status 200
Content Type application/json
Valid JSON Valid
Datasets with Valid Metadata 100% (3184 of 3184)
Valid Schema Valid
Datasets 3184
Number of Collections 0
Datasets with Distribution URLs 79.5% (2531 of 3184)
Datasets with Download URLs 78.5% (2498 of 3184)
Total Distribution URLs 4842
Total Download URLs 2498
Total APIs 2344
Public Datasets 3132
Restricted Public Datasets 11
Non-public Datasets 41
Bureaus Represented 1
Programs Represented 5
License Specified 100% (3184 of 3184)
Datasets with Redactions 0.0% (0 of 3184)
Redactions without explanation (rights field) 0.0% (0 of 3184)
File Size 6.57MB
Last modified Wednesday, 25-Nov-2015 02:11:26 EST
Last crawl Monday, 30-Nov-2015 23:11:33 EST
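The data.json figures above can be approximated from the catalog file itself. The following is a minimal sketch, not the dashboard's actual crawler: it assumes the catalog follows the Project Open Data v1.1 layout (a top-level `dataset` array whose entries carry `accessLevel`, `license`, `bureauCode`, `programCode`, and a `distribution` list with `downloadURL` or `accessURL`). It also assumes that `accessURL` entries are what the report counts as APIs, which is consistent with the totals here (2498 download URLs + 2344 APIs = 4842 distribution URLs) but is not confirmed by the source. The function name `catalog_metrics` and the sample records are hypothetical.

```python
from collections import Counter


def catalog_metrics(catalog):
    """Summarize a data.json-style catalog (assumed Project Open Data v1.1 layout)."""
    datasets = catalog.get("dataset", [])
    total = len(datasets)

    def pct(n):
        return round(100 * n / total, 1) if total else 0.0

    access_levels = Counter(d.get("accessLevel", "unknown") for d in datasets)
    download_urls = access_urls = 0
    with_distribution = with_download = 0
    bureaus, programs = set(), set()

    for d in datasets:
        dists = d.get("distribution") or []
        n_download = sum(1 for x in dists if x.get("downloadURL"))
        n_access = sum(1 for x in dists if x.get("accessURL"))
        download_urls += n_download
        access_urls += n_access
        with_distribution += bool(n_download or n_access)
        with_download += bool(n_download)
        bureaus.update(d.get("bureauCode") or [])
        programs.update(d.get("programCode") or [])

    licensed = sum(1 for d in datasets if d.get("license"))
    return {
        "datasets": total,
        "public": access_levels["public"],
        "restricted_public": access_levels["restricted public"],
        "non_public": access_levels["non-public"],
        "pct_with_distribution": pct(with_distribution),
        "pct_with_download": pct(with_download),
        "download_urls": download_urls,
        "access_urls": access_urls,  # assumption: counted as APIs
        "distribution_urls": download_urls + access_urls,
        "bureaus": len(bureaus),
        "programs": len(programs),
        "pct_license": pct(licensed),
    }


# Hypothetical four-record catalog, for illustration only.
sample = {"dataset": [
    {"accessLevel": "public", "license": "https://example.org/license",
     "bureauCode": ["020:00"], "programCode": ["020:001"],
     "distribution": [{"downloadURL": "https://example.org/a.csv"}]},
    {"accessLevel": "public", "license": "https://example.org/license",
     "bureauCode": ["020:00"], "programCode": ["020:002"],
     "distribution": [{"accessURL": "https://example.org/api/b"},
                      {"downloadURL": "https://example.org/b.json"}]},
    {"accessLevel": "restricted public", "license": "https://example.org/license",
     "bureauCode": ["020:00"], "programCode": ["020:001"]},
    {"accessLevel": "non-public", "license": "https://example.org/license",
     "bureauCode": ["020:00"], "programCode": ["020:001"]},
]}

metrics = catalog_metrics(sample)
for key, value in metrics.items():
    print(f"{key}: {value}")
```

To run this against a real catalog, the file could be loaded with `json.load` and passed to `catalog_metrics`; the percentages are computed against the total number of datasets, as in the rows above (for example, 2531 of 3184 is 79.5%).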
/data page
Expected /data URL http://www.epa.gov/data (From USA.gov Directory)
Resolved /data URL http://developer.epa.gov/category/data/
Redirects 3 redirects
HTTP Status 200
Content Type text/html; charset=UTF-8
Last crawl Monday, 30-Nov-2015 23:10:48 EST
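The expected and resolved URLs above differ because the crawler follows redirects before recording the status and content type. The sketch below shows one way such a chain could be followed and counted. It is an illustration under stated assumptions, not the dashboard's code: `resolve` and `fake_fetch` are hypothetical names, the caller supplies a `fetch(url)` function that returns `(status, headers)` without following redirects, and the intermediate hops in the demo table are invented placeholders because the report lists only the first and last URL.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve(url, fetch, max_redirects=10):
    """Follow redirects by hand and report where the URL finally lands.

    `fetch(url)` must return (status, headers) for a single request.
    Headers are read with exact-case keys here; a real client would need
    case-insensitive lookup.
    """
    redirects = 0
    while True:
        status, headers = fetch(url)
        location = headers.get("Location")
        if status in REDIRECT_CODES and location and redirects < max_redirects:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
            redirects += 1
            continue
        return {
            "resolved_url": url,
            "redirects": redirects,
            "status": status,
            "content_type": headers.get("Content-Type"),
        }


# Demo responses: first and last URLs are from this report; the two
# hop1.example entries are hypothetical stand-ins for the unknown hops.
FAKE_RESPONSES = {
    "http://www.epa.gov/data": (301, {"Location": "http://hop1.example/data"}),
    "http://hop1.example/data": (302, {"Location": "/step2"}),
    "http://hop1.example/step2": (
        301, {"Location": "http://developer.epa.gov/category/data/"}),
    "http://developer.epa.gov/category/data/": (
        200, {"Content-Type": "text/html; charset=UTF-8"}),
}


def fake_fetch(url):
    return FAKE_RESPONSES[url]


result = resolve("http://www.epa.gov/data", fake_fetch)
print(result)
```

Passing the fetch function in as a parameter keeps the redirect logic testable without network access; a live version could wrap any HTTP client that is configured not to follow redirects automatically.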
Expected /digitalstrategy.json URL http://www.epa.gov/digitalstrategy.json (From USA.gov Directory)
Resolved /digitalstrategy.json URL http://www2.epa.gov/sites/production/files/2015-05/digitalstrategy.json
Redirects 2 redirects
HTTP Status 200
Content Type text/plain
Valid JSON Invalid
Last modified Friday, 29-May-2015 18:15:34 EDT
Last crawl Monday, 30-Nov-2015 23:10:48 EST
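The report flags /digitalstrategy.json as invalid JSON but does not say where the file fails to parse. A minimal sketch of how a parse error could be located with Python's standard `json` module follows. The helper name `check_json` and both sample strings are hypothetical; the actual contents of the EPA file are not shown in this report.

```python
import json


def check_json(text):
    """Report whether `text` parses as JSON and, if not, where it fails."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return {
            "valid": False,
            "line": err.lineno,
            "column": err.colno,
            "message": err.msg,
        }
    return {"valid": True}


# Hypothetical inputs: one well-formed document, one with a trailing comma.
good = check_json('{"agency": "EPA", "items": []}')
bad = check_json('{"agency": "EPA",}')

print(good)
print(bad)
```

The line and column in the result point at the first character the parser could not accept, which is usually close to the actual mistake.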