Environmental Protection Agency

http://www.epa.gov/

Milestone 15 - May 31st 2017

OMB Review Complete: OMB has completed the agency review for this milestone. Agencies should contact their OMB desk officer if anything looks incorrect.

Leading Indicators

These indicators are reviewed by the Office of Management and Budget

Review Status complete
Reviewer Rebecca Williams
Last Updated December 19, 2017, 11:20 am EST by Rebecca Williams

Assessment Summary

Note: The link quality test did not function properly; results could not be assessed (last quarter, 25.4% of links were broken).

Inventory Composition

Public Dataset Status

Status Indicator Automated Metrics
Overall Progress this Milestone
Inventory Updated this Quarter
2445 Number of Datasets
1726 Number of APIs
1 Bureaus represented
100.0% Percentage of bureaus represented
19 Programs represented
Percentage of programs represented
2386 Number of public datasets
4 Number of restricted public datasets
55 Number of non-public datasets
15.6% Percentage growth in records since last quarter
To a great extent (50-75%) To what extent is your agency’s Enterprise Data Inventory (EDI) complete?
See below What steps have you taken to ensure your Enterprise Data Inventory is complete
In March 2015, the EPA CIO signed the EPA Enterprise Information Management Policy (EIMP), the EIMP Cataloguing Data Resources Procedure, and the EIMP Minimum Metadata Standards. The EIMP requires all EPA Organization officials, employees, and individuals, or non-EPA organizations, if applicable, to ensure information is: catalogued and/or labeled with metadata, including geographic references, as appropriate, in EPA and Federal-wide registries, repositories or other information systems.
The EIMP Cataloguing Data Resources Procedure states: EPA Environmental Data Gateway (EDG) was enhanced to create and maintain the enterprise data inventory as required by Open Data EO 1346. EPA datasets under the purview of the EIMP must be registered in the EDG to maintain the inventory.
The Agency internal metadata catalog, the Environmental Dataset Gateway (EDG), was established in 2006. The EDG Team has worked with data owners since its inception to catalog Agency datasets. In response to Project Open Data requirements, the EPA CIO issued an Agency-wide data call asking all EPA organizations to register their datasets in EDG. The EDG Team worked with the Agency Information Management Officers (IMOs), the EDG Stewardship Network, and other key data owners to ensure that as many Agency datasets as possible were identified for registration. In addition, the EPA registry for IT systems, READ, was reviewed to ensure that all possible data owners were contacted.
Since that time, the EDG team has established an ongoing relationship with the IMOs and has increased its network of stakeholders to ensure that any datasets not identified during the 2013 data call are registered in EDG. Quarterly meetings and training sessions are held with these groups to educate them on Open Data requirements and metadata best practices, as well as to encourage them to continue cataloguing their datasets.
Targeted outreach, based on new entries in READ, is conducted to ensure that all datasets are listed in the EDI. This includes working with Offices that have Confidential Business Information to ensure that we have a full registration of all data not shared with the public.
Tools are being planned to catalog datasets that have been posted on Drupal websites. Finally, EPA is also developing an API Strategy to help API-enable Agency datasets currently available only as downloads, as well as to bring to light additional datasets that can be made part of the EDI.
Agency provides a public Enterprise Data Inventory on Data.gov
Agency provided updated Enterprise Data Inventory to OMB
100.0% License specified
Number of datasets with redactions
0.0% Percent of datasets with redactions
Status Indicator Automated Metrics
Overall Progress this Milestone
2445 Number of Datasets
Number of Collections
2234 Number of datasets not contained in a collection
2140 Number of Public Datasets with File Downloads
1726 Number of APIs
1675 Number of public APIs
Number of restricted public APIs
51 Number of non-public APIs
4022 Total number of access and download links
Quality Check: Links are sufficiently working
Quality Check: Accessible links
Quality Check: Redirected links
Quality Check: Error links
Quality Check: Broken links
Quality Check: Percentage of download links in correct format as specified in metadata
Quality Check: Percentage of download links in HTML
Quality Check: Percentage of download links in PDF
Percentage growth in records since last quarter
100% Valid Metadata
/data exists
Provides datasets in human-readable form on /data
/data.json
Harvested by data.gov
2386 Number of public datasets
4 Number of restricted public datasets
55 Number of non-public datasets
Percent growth of public datasets
Percent growth of restricted public datasets
Percent growth of non-public datasets
Percent datasets licensed as U.S. Public Domain
Percent datasets licensed as Creative Commons Zero
Percent datasets with other licenses
Percent datasets with no license
Status Indicator Automated Metrics
Overall Progress this Milestone
Description of feedback mechanism delivered
Data release is prioritized through public engagement
Provided narrative evidence of data improvements based on public feedback this quarter
Feedback loop is closed, 2 way communication
See below Link to or description of Feedback Mechanism
https://developer.epa.gov/forums/forum/dataset-qa/
Provides valid contact point information for all datasets
Status Indicator Automated Metrics
Overall Progress this Milestone
Data Publication Process Delivered
Information that should not be made public is documented with agency's OGC
See below Describe the agency's data publication process
EPA has a number of policies and procedures concerning the publication of Agency data. The Enterprise Information Management Policy requires all EPA Organization officials, employees, and individuals or non-EPA organizations, if applicable, to ensure information is cataloged and/or labeled with metadata. This includes geographic references, as appropriate, in EPA and Federal-wide registries, repositories or other information systems. The EPA GeoPlatform Publishing Workflow Standard Operating Procedure and the EPA Environmental Dataset Gateway (EDG) Governance Structure and Standard Operating Procedure outline the details of EPA data publishing. EPA provides a range of tools and registry content (e.g. Reusable Component Services, Environmental Dataset Gateway, and Data Element Registry) through its System of Registries located at: www.epa.gov/sor. EPA is continuing efforts to document APIs through the development of an Agency-wide API Strategy. The proposed strategy is based on 18F’s API standards. The proposal encourages the use of api.data.gov’s API management platform. In addition, APIs produced by the EPA should be described using one of the common API definition formats (such as Swagger, API Blueprint, or RAML). The strategy is being finalized and an Agency-wide communication plan is being developed. This communication plan will include Standard Operating Procedures (SOPs) that require API developers to register dataset APIs in the EPA’s Environmental Dataset Gateway, which will allow these APIs to become part of the EPA's EDI/PDL. All other APIs will be registered in EPA's Reusable Component Services (RCS).
Status Indicator Automated Metrics
Overall Progress this Milestone
See below Open Data Primary Point of Contact
Ana Greene; Greene.Ana@epa.gov
POCs identified for required responsibilities
Chief Data Officer (if applicable)
Status Indicator Automated Metrics
Overall Progress this Milestone
Provided narrative evidence of open data impacts for this quarter
Digital Analytics Program on /data
Views on data.gov for this quarter
Percentage growth in views on data.gov for this quarter
Views on agency /data page for this quarter

Automated Metrics

These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter, at which point they become a historical snapshot.

data.json
Expected Data.json URL http://www.epa.gov/data.json (From USA.gov Directory)
Resolved Data.json URL https://edg.epa.gov/data.json
Number of Redirects 3 redirects
HTTP Status 200
Content Type application/json
Valid JSON Valid
Datasets with Valid Metadata 100%(2445 of 2445)
Valid Schema Valid
Datasets 2445
Number of Collections 19
Number of datasets not in a collection 2234
Datasets with Distribution URLs 87.5% (2140 of 2445)
Datasets with Download URLs 80.1% (1959 of 2445)
Total Distribution URLs 4022
Total Download URLs 2296
Total APIs 1726
Public APIs 1675
Restricted Public APIs 0
Non-public APIs 51
Public Datasets 2386
Restricted Public Datasets 4
Non-public Datasets 55
Bureaus Represented 1
Programs Represented 19
License Specified 100% (2445 of 2445)
Datasets with Redactions 0.0% (0 of 2445)
Redactions without explanation (rights field) 0.0% (0 of 2445)
File Size 7.38MB
Last modified Tuesday, 23-May-2017 11:33:53 EDT
Last crawl Wednesday, 31-May-2017 00:00:12 EDT
Analyze archive copies Analyze archive from 2017-05-31
Nearby Daily Crawls
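
As a rough illustration of how tallies like those listed under data.json could be derived from a catalog file, here is a minimal Python sketch. It is not the dashboard's actual crawler. It assumes the Project Open Data v1.1 layout (a top-level "dataset" array whose entries carry "accessLevel", "license", and "distribution" fields), and it assumes a distribution counts as an API when it has an "accessURL" and a "format" of "API". The sample catalog, the function names pct and summarize, and the example.org URLs are hypothetical.

```python
from collections import Counter

# Hypothetical three-dataset catalog in the Project Open Data v1.1 layout.
SAMPLE_CATALOG = {
    "dataset": [
        {
            "title": "Example A",
            "accessLevel": "public",
            "license": "https://example.org/license",
            "distribution": [
                {"downloadURL": "https://example.org/a.csv", "mediaType": "text/csv"},
                {"accessURL": "https://example.org/api/a", "format": "API"},
            ],
        },
        {
            "title": "Example B",
            "accessLevel": "restricted public",
            "license": "https://example.org/license",
            "distribution": [
                {"accessURL": "https://example.org/api/b", "format": "API"},
            ],
        },
        {"title": "Example C", "accessLevel": "non-public", "distribution": []},
    ]
}


def pct(part, total):
    """Format part/total as a one-decimal percentage string."""
    return f"{100 * part / total:.1f}%" if total else "n/a"


def summarize(catalog):
    """Tally dataset, distribution, and API counts for a catalog dict."""
    datasets = catalog.get("dataset", [])
    total = len(datasets)

    # Flatten every distribution entry across all datasets.
    dists = [dist for d in datasets for dist in (d.get("distribution") or [])]

    return {
        "datasets": total,
        "by_access_level": dict(
            Counter(d.get("accessLevel", "unknown") for d in datasets)
        ),
        "datasets_with_distribution": sum(
            1 for d in datasets if d.get("distribution")
        ),
        "datasets_with_download": sum(
            1
            for d in datasets
            if any("downloadURL" in dist for dist in (d.get("distribution") or []))
        ),
        "download_urls": sum(1 for dist in dists if "downloadURL" in dist),
        # Assumption: an API is a distribution with accessURL and format "API".
        "apis": sum(
            1
            for dist in dists
            if "accessURL" in dist and str(dist.get("format", "")).upper() == "API"
        ),
        "license_specified": pct(
            sum(1 for d in datasets if d.get("license")), total
        ),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_CATALOG))
    # The same helper reproduces percentages reported above from their raw counts.
    print(pct(2140, 2445), pct(1959, 2445))
```

The reported figures are also internally consistent under simple addition: 2386 public, 4 restricted public, and 55 non-public datasets sum to 2445, and 2296 download URLs plus 1726 APIs sum to the 4022 total distribution URLs.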