Environmental Protection Agency

http://www.epa.gov/

Milestone 3 - May 31st 2014

OMB Review In Progress: OMB is currently reviewing the agency for this milestone. This review status indicator will change once the review is complete.

Leading Indicators

These indicators are reviewed by the Office of Management and Budget

Review Status in-progress
Reviewer Amélie Koran
Last Updated September 12, 2014, 6:39 pm EDT by Amélie E. Koran

Assessment Summary

Inventory Composition

Public Dataset Status

Inventory is greater than public data listing; schedule risk for 11/30/14

Status Indicator Automated Metrics
Overall Progress this Milestone
Inventory Updated this Quarter
3286 Number of Datasets
Number of APIs
Schedule Delivered
1 Bureaus represented
1 Programs represented
3286 Number of public datasets
Number of restricted public datasets
Number of non-public datasets
Inventory > Public listing
Percentage growth in records since last quarter
Schedule Risk for Nov 30, 2014
Spot Check - datasets listed by search engine
Agency provides a public Enterprise Data Inventory on Data.gov
License specified
Status Indicator Automated Metrics
Overall Progress this Milestone
3441 Number of Datasets
Number of Collections
3441 Number of Public Datasets with File Downloads
Number of APIs
Total number of access and download links
Quality Check: Links are sufficiently working
Quality Check: Accessible links
Quality Check: Redirected links
Quality Check: Error links
Quality Check: Broken links
4.7 Percentage growth in records since last quarter
86.9 Valid Metadata
/data exists
/data.json
Harvested by data.gov
Views on data.gov for the quarter
Status Indicator Automated Metrics
Overall Progress this Milestone
Description of feedback mechanism delivered
Data release is prioritized through public engagement
Feedback loop is closed, 2 way communication
See below Link to or description of Feedback Mechanism
http://www.epa.gov/digitalstrategy
Status Indicator Automated Metrics
Overall Progress this Milestone
Data Publication Process Delivered
Information that should not be made public is documented with agency's OGC
Status Indicator Automated Metrics
Overall Progress this Milestone
Tim Crawford Open Data Primary Point of Contact
POCs identified for required responsibilities
Status Indicator Automated Metrics
Overall Progress this Milestone
Identified 5 data improvements for this quarter
Primary Uses
Value or impact of data
Primary data discovery channels
User suggestions on improving data usability
User suggestions on additional data releases
Digital Analytics Program on /data

Automated Metrics

These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter at which point they become a historical snapshot

data.json
Expected Data.json URL http://www.epa.gov/data.json (From USA.gov Directory)
Resolved Data.json URL http://www2.epa.gov/sites/production/files/2013-11/dcat_public_v2.json
Number of Redirects 1
HTTP Status 200
Content Type text/plain
Valid JSON Valid
Datasets with Valid Metadata 86.9% (2991 of 3441)
Valid Schema Invalid
For more complete and readable validation results, see the full schema validator results
Schema Errors There are validation errors on 450 records

Only showing errors from the first 10 records:

Errors on record 0:
accessLevelComment
  • must be at least 1 characters long
  • string value found, but a null is required
  • failed to match at least one schema
Errors on record 16:
webService
  • Invalid URL format
  • string value found, but a null is required
  • failed to match at least one schema
Errors on record 17:
webService
  • Invalid URL format
  • string value found, but a null is required
  • failed to match at least one schema
Errors on record 18:
webService
  • Invalid URL format
  • string value found, but a null is required
  • failed to match at least one schema
Errors on record 20:
webService
  • Invalid URL format
  • string value found, but a null is required
  • failed to match at least one schema
Errors on record 21:
webService
  • Invalid URL format
  • string value found, but a null is required
  • failed to match at least one schema
Errors on record 24:
accessURL
  • Invalid URL format
  • string value found, but a null is required
  • failed to match at least one schema
Errors on record 238:
temporal
  • does not match the regex pattern ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  • does not match the regex pattern ^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$
  • does not match the regex pattern ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  • does not match the regex pattern ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$
  • does not match the regex pattern ^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  • does not match the regex pattern ^R\d*\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?\/P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$
  • string value found, but a null is required
  • failed to match at least one schema
Errors on record 243:
accessURL
  • Invalid URL format
  • string value found, but a null is required
  • failed to match at least one schema
Errors on record 423:
temporal
  • does not match the regex pattern ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  • does not match the regex pattern ^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$
  • does not match the regex pattern ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  • does not match the regex pattern ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$
  • does not match the regex pattern ^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  • does not match the regex pattern ^R\d*\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?\/P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$
  • string value found, but a null is required
  • failed to match at least one schema
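
One plausible reading of the repeated error triples above: each listed field seems to be validated as either null or a well-formed string. A value that is a string but is empty or malformed then fails the string branch, fails the null branch, and so fails to match at least one schema. The Python sketch below illustrates that either/or logic under this assumption. The function names, the sample record, and the simplified URL test are hypothetical and are not the dashboard's actual validator.

```python
from urllib.parse import urlparse


def null_or_nonempty_string(value):
    """Pass when value is None (JSON null) or a string of at least 1 character."""
    return value is None or (isinstance(value, str) and len(value) >= 1)


def null_or_url(value):
    """Pass when value is None or looks like an absolute http(s) URL.

    Simplified stand-in for a URL format check (an assumption, not the real rule).
    """
    if value is None:
        return True
    if not isinstance(value, str) or any(ch.isspace() for ch in value):
        return False
    try:
        parts = urlparse(value)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Field names taken from the error listing; the pairing with checks is assumed.
CHECKS = {
    "accessLevelComment": null_or_nonempty_string,
    "webService": null_or_url,
    "accessURL": null_or_url,
}


def failing_fields(record):
    """Return the names of the fields in a record that fail their check."""
    return [name for name, check in CHECKS.items() if not check(record.get(name))]


# Hypothetical record: an empty comment and a URL without a scheme both fail,
# while a null accessURL passes.
sample = {"accessLevelComment": "", "webService": "www.example.gov/service", "accessURL": None}
print(failing_fields(sample))
```

Under this reading, replacing an empty string with null, or completing a partial URL, would clear the corresponding error.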
Datasets 3441
Datasets with Distribution URLs 100% (3441 of 3441)
Total Distribution URLs 10255
Public Datasets 3441
Restricted Public Datasets 0
Non-public Datasets 0
Bureaus Represented 1
Programs Represented 1
File Size 8.18MB
Last modified Thursday, 05-Jun-2014 08:16:55 EDT
Last crawl Friday, 20-Jun-2014 03:04:38 EDT
Analyze archive copies Analyze archive from 2014-05-31
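
The data.json figures above are consistent with one another, as a short arithmetic check shows. This minimal Python sketch uses only counts reported in this section; the variable names are illustrative, and the average of distribution URLs per dataset is a derived figure that the dashboard itself does not report.

```python
datasets = 3441             # "Datasets"
records_with_errors = 450   # "There are validation errors on 450 records"
distribution_urls = 10255   # "Total Distribution URLs"

# Records without schema errors, and their share of all datasets.
valid_records = datasets - records_with_errors
valid_pct = round(100 * valid_records / datasets, 1)

# Derived figure (not on the dashboard): average distribution URLs per dataset.
urls_per_dataset = round(distribution_urls / datasets, 2)

print(valid_records, valid_pct, urls_per_dataset)  # prints: 2991 86.9 2.98
```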
/data page
Expected /data URL http://www.epa.gov/data (From USA.gov Directory)
Resolved /data URL http://www.epa.gov/data/
Redirects 1
HTTP Status 200
Content Type text/html
Last modified Tuesday, 18-Mar-2014 15:25:37 EDT
/digitalstrategy.json
Expected /digitalstrategy.json URL http://www.epa.gov/digitalstrategy.json (From USA.gov Directory)
Resolved /digitalstrategy.json URL http://www.epa.gov/digitalstrategy.json
Redirects
HTTP Status 200
Content Type application/json
Valid JSON Valid
Last modified Wednesday, 18-Dec-2013 09:47:54 EST
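
The per-URL rows above (redirects, HTTP status, content type, valid JSON) can be approximated with a small summary function. The Python sketch below assumes the response has already been fetched and works only on its recorded properties. The function name, the dictionary keys, and the placeholder bodies are hypothetical; the dashboard's own crawler may check these differently.

```python
import json


def summarize_crawl(expected_url, resolved_url, redirects, status, content_type, body):
    """Summarize an already-fetched response for a JSON endpoint."""
    try:
        json.loads(body)
        valid_json = True
    except (ValueError, TypeError):
        valid_json = False
    media_type = content_type.split(";")[0].strip().lower()
    return {
        "redirected": redirects > 0 or expected_url != resolved_url,
        "http_ok": status == 200,
        "json_content_type": media_type == "application/json",
        "valid_json": valid_json,
    }


# Values from the data.json rows; the body is a placeholder, not the real file.
data_json = summarize_crawl(
    "http://www.epa.gov/data.json",
    "http://www2.epa.gov/sites/production/files/2013-11/dcat_public_v2.json",
    1, 200, "text/plain", "[]",
)

# Values from the /digitalstrategy.json rows; the blank Redirects value is read as 0.
strategy_json = summarize_crawl(
    "http://www.epa.gov/digitalstrategy.json",
    "http://www.epa.gov/digitalstrategy.json",
    0, 200, "application/json", "{}",
)

print(data_json)
print(strategy_json)
```

The summary makes one difference between the two endpoints visible: data.json parses as valid JSON but is served as text/plain, while digitalstrategy.json is served as application/json.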
Digital Strategy

Date specified: Friday, 31-May-2019 09:16:54 EDT

Date of digitalstrategy.json file: Wednesday, 18-Dec-2013 09:47:54 EST

1.2.4 Develop Data Inventory Schedule - Summary

Summarize the Inventory Schedule


In 2005, EPA developed a metadata catalog for Agency datasets known as the Environmental Dataset Gateway (EDG). In response to Project Open Data requirements, the EPA CIO issued an Agency-wide data call in fall 2013 asking all EPA organizations to register their datasets in the EDG. Since then, the Agency has been committed to increasing the percentage of EPA data cataloged in the EDG. The Office of Environmental Information works with the Agency's Information Management Officers on an ongoing basis, encouraging them to ensure that data within their program offices are registered in the EDG. It has also expanded its network of stakeholders so that any datasets not identified during the 2013 data call are registered in the EDG.

1.2.5 Develop Data Inventory Schedule - Milestones

Title: Provide reports to programs to identify metadata holes.
Description: EPA uses its registry of IT systems, READ, to identify datasets that may not be cataloged in the EDG. The EDG team works with the data stewardship network to identify and catalog the missing datasets.
Milestone Date: Ongoing
Description of how this milestone expands the Inventory: This milestone allows EPA to identify datasets that were missed in the original 2013 data call and catalog them in the Agency's Environmental Dataset Gateway, the metadata catalog used to create the Enterprise Data Inventory.
Description of how this milestone enriches the Inventory: This milestone brings new and important datasets into EPA's Enterprise Data Inventory.
Description of how this milestone opens the Inventory: This milestone provides additional datasets that are publicly available through both the EDG and data.gov.

1.2.6 Develop Customer Feedback Process

Describe the agency's process to engage with customers


EPA interacts with the public on its data in numerous ways, including public meetings and forums, feedback buttons on websites, webinars, mailboxes, FOIA Online, and help desks. Recently the Agency launched an online public data forum to communicate with the public about its data in a fully transparent manner: http://developer.epa.gov/forums/forum/dataset-qa/. This forum enables two-way feedback between the Agency and the public. It can be accessed through three webpages: the Environmental Dataset Gateway (EDG) webpage, the Developer Central webpage, and EPA's Digital Strategy webpage. The forum shows both the public user's question and the Agency's answer, and it categorizes questions so that specific topics are easier to find. Development work is underway to embed the forum into the Agency's metadata stylesheet. This would allow people to ask a question about a specific dataset directly from its metadata record and have that question routed to the metadata owner for response.
In addition to the forum, EPA has developed an error correction tool that allows the public to report errors they find in data, especially errors in the geographic locations of data points. The errors are routed through a system that returns feedback to the person who reported the error.

1.2.7 Develop Data Publication Process

Describe the agency's data publication process


EPA has a number of policies and procedures concerning the publication of Agency data. The Enterprise Information Management Policy requires all EPA Organization officials, employees, and individuals or non-EPA organizations, if applicable, to ensure information is cataloged and/or labeled with metadata. This includes geographic references, as appropriate, in EPA and Federal-wide registries, repositories, or other information systems. The EPA GeoPlatform Publishing Workflow Standard Operating Procedure and the EPA Environmental Dataset Gateway (EDG) Governance Structure and Standard Operating Procedure outline the details of EPA data publishing.
EPA provides a range of tools and registry content (e.g. Reusable Component Services, Environmental Dataset Gateway, and Data Element Registry) through its System of Registries, located at www.epa.gov/sor. EPA is continuing efforts to document APIs through the development of an Agency-wide API Strategy. The proposed strategy is based on 18F's API standards and encourages the use of api.data.gov's API management platform. In addition, APIs produced by EPA should be described using one of the common API definition formats (such as Swagger, API Blueprint, and RAML). The strategy is being finalized and an Agency-wide communication plan is being developed. This communication plan will include Standard Operating Procedures (SOPs) that require API developers to register dataset APIs in EPA's Environmental Dataset Gateway, which will allow these APIs to become part of EPA's Enterprise Data Inventory and Public Data Listing (EDI/PDL). All other APIs will be registered in EPA's Reusable Component Services (RCS).