Department of Commerce

M-13-13 Milestone 8 - August 31st 2015

OMB Review Complete: OMB has completed the agency review for this milestone. Agencies should contact their OMB desk officer if anything looks incorrect.

Leading Indicators

These indicators are reviewed by the Office of Management and Budget

Review Status complete
Reviewer Justin Grimes
Last Updated October 6, 2015, 10:16 am EDT by Justin Grimes

Assessment Summary

Some overall progress was made in EDI.

Steps to improve: 1) Increase the number of datasets in the EDI/PDL; note the 18% decrease from last quarter. 2) Identify POCs for all roles and responsibilities. 3) Take steps to improve use and impact reporting.

Inventory Composition

Public Dataset Status

Dataset Link Quality

Status Indicator Automated Metrics
Overall Progress this Milestone
Inventory Updated this Quarter
529 Number of Datasets
7 Number of APIs
Schedule Delivered
11 Bureaus represented
26 Programs represented
507 Number of public datasets
21 Number of restricted public datasets
1 Number of non-public datasets
Inventory > Public listing
7.5% Percentage growth in records since last quarter
10,100,000 Spot Check - datasets listed by search engine
Agency provides a public Enterprise Data Inventory on
98% License specified
Status Indicator Automated Metrics
Overall Progress this Milestone
400 Number of Datasets
Number of Collections
377 Number of Public Datasets with File Downloads
3 Number of APIs
389 Total number of access and download links
Quality Check: Links are sufficiently working
319 Quality Check: Accessible links
51 Quality Check: Redirected links
Quality Check: Error links
9 Quality Check: Broken links
-18% Percentage growth in records since last quarter
100% Valid Metadata
/data exists
/data.json
Harvested by
16482 Views on for the quarter
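The link quality rows above sort each access or download link into accessible, redirected, error, or broken. The report does not say how the crawler assigns those categories, so the following is only a minimal sketch under an assumed mapping from HTTP status codes; the names classify_link and tally_links are hypothetical and not part of the dashboard.

```python
from collections import Counter

CATEGORIES = ("accessible", "redirected", "error", "broken")


def classify_link(status):
    """Map an HTTP status code to a link quality category.

    Assumed mapping (not taken from the report): 2xx is accessible,
    3xx is redirected, 4xx is broken, and anything else, including
    no response at all (None), counts as an error.
    """
    if status is None:
        return "error"
    if 200 <= status < 300:
        return "accessible"
    if 300 <= status < 400:
        return "redirected"
    if 400 <= status < 500:
        return "broken"
    return "error"


def tally_links(statuses):
    """Count links per category, always returning all four categories."""
    counts = Counter(classify_link(s) for s in statuses)
    return {name: counts.get(name, 0) for name in CATEGORIES}


if __name__ == "__main__":
    # Illustrative status codes only, not data from this report.
    print(tally_links([200, 200, 301, 404, 500, None]))
```

Keeping the classification separate from any network code makes the assumed mapping easy to change once the crawler's actual rules are known.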
Status Indicator Automated Metrics
Overall Progress this Milestone
Description of feedback mechanism delivered
Data release is prioritized through public engagement
Feedback loop is closed, 2 way communication
See below Link to or description of Feedback Mechanism
Status Indicator Automated Metrics
Overall Progress this Milestone
Data Publication Process Delivered
Information that should not be made public is documented with agency's OGC
Status Indicator Automated Metrics
Overall Progress this Milestone Open Data Primary Point of Contact
POCs identified for required responsibilities
Status Indicator Automated Metrics
Overall Progress this Milestone
Identified 5 data improvements for this quarter
See below Primary Uses
Commerce Department data serves a very wide range of uses, from weather forecasting and environmental research, to economic analysis and development of trade opportunities, to demographic analysis and planning by state and local governments and private industry.
Value or impact of data
See below Primary data discovery channels
Commerce has engaged with organizations such as GovLab (Open Data Roundtable) and commercial groups such as the ESRI Users Conference to increase awareness of the data Commerce has to offer and to gather feedback from data users on what they need and how they would like to be able to access it.
See below User suggestions on improving data usability
• Create a centralized data catalogue: Organize catalogue of datasets and make them accessible in an easily findable way. Datasets should be catalogued in a common, machine-readable format.
• Continue to develop APIs, and provide an interface at each bureau to help add context for APIs.
• Establish common, open standards across Commerce for taxonomy, vocabulary, and APIs.
• Develop communication channels for direct collaboration with subject matter experts at Commerce, and feedback channels for the private sector to have effective input.
• Develop methods to track users: Establish ways to track who is using which datasets.
• Improve metadata for datasets throughout DOC. (One suggestion: Hold a “metadata-thon.”)
• Meet the needs of diverse data users – ranging from those who just want access to raw data, to those who want more developed information products and answers.
See below User suggestions on additional data releases
• Centralization of datasets: each bureau should have one place to identify all datasets
  o Catalogued in machine-readable formats
  o PTO: needs to develop better search functions, including for data in image form
  o Include context and documentation for each dataset
    ▪ BEA: improve technical documentation to better differentiate between raw and modeled data
    ▪ PTO: make more information available about the scope of patent rights, including expiration dates, or decisions by the agency and/or courts about patent claims
    ▪ PTO: Put out data with more context to make it usable by non-experts – e.g., trademark transaction data and trademark assignment.
  o Create a standardized registry that has standard metadata and can be queried with APIs.
  o Create a site map of bureaus/locations of datasets in lieu of a single site holding all datasets; easier to implement and update; and points of contact for each dataset
• Provide data visualization tools on website with examples for different datasets
• Provide for users’ ability to download large datasets
  o Census: increase download times
• Provide more transparency and notice around changes to data distribution systems.
  o NOAA suggests using current processes used by NWS as a model
• Specific issues for PTO
  o Move from paper-based systems to all-digital. Suggestion: Set up system for e-filing and promote it, initially voluntary but shifting to mandatory. The latter may require legislation and may be opposed by the patent bar. One possible model: IRS requiring e-filing of Form 990.
  o Provide APIs to enable third parties to build better interfaces for the existing legacy systems. The PAIR and PTAB data are most important here.
  o Improve search functionality to find patent data, which is now made available through different databases using different interfaces. Consider ways to make data searchable in image form.
  o Harmonize assignment databases for Patents and for Trademarks; clarify use of different address fields.
  o Make PACER accessible at lower/no cost to the user
Digital Analytics Program on /data

Automated Metrics

These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter, at which point they become a historical snapshot.

Expected Data.json URL (From Directory)
Resolved Data.json URL
Number of Redirects 1 redirect
HTTP Status 200
Content Type application/json
Valid JSON Valid
Detected Data.json Schema federal-v1.1
Datasets with Valid Metadata 100% (400 of 400)
Valid Schema Valid
Datasets 400
Datasets with Distribution URLs 97.0% (388 of 400)
Datasets with Download URLs 94.2% (377 of 400)
Total Distribution URLs 402
Total Download URLs 389
Total APIs 3
Public Datasets 397
Restricted Public Datasets 2
Non-public Datasets 1
Bureaus Represented 10
Programs Represented 23
License Specified 98.0% (392 of 400)
Datasets with Redactions 0.0% (0 of 400)
Redactions without explanation (rights field) 0.0% (0 of 400)
File Size 653.38KB
Last modified Tuesday, 25-Aug-2015 16:40:06 EDT
Last crawl Monday, 31-Aug-2015 00:12:19 EDT
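Counts such as those listed above can be derived from the data.json file itself. The following is a minimal sketch, assuming a catalog that follows the Project Open Data v1.1 metadata schema (a top-level "dataset" array whose entries carry accessLevel, license, bureauCode, programCode, and a distribution list with accessURL, downloadURL, and format). The function names are hypothetical, and the counting rules, in particular treating a distribution whose format is "API" as an API, are assumptions that may differ from what the dashboard crawler actually does.

```python
import json

ACCESS_LEVELS = ("public", "restricted public", "non-public")


def pct(part, whole):
    """Percentage rounded to one decimal place; 0.0 for an empty inventory."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def inventory_metrics(catalog):
    """Summarize a parsed data.json catalog (a dict with a 'dataset' list)."""
    datasets = catalog.get("dataset", [])
    m = {level: 0 for level in ACCESS_LEVELS}
    m.update(datasets=len(datasets), with_distribution_url=0,
             with_download_url=0, distribution_urls=0, download_urls=0,
             apis=0, license_specified=0)
    bureaus, programs = set(), set()

    for ds in datasets:
        if ds.get("accessLevel") in ACCESS_LEVELS:
            m[ds["accessLevel"]] += 1
        if ds.get("license"):
            m["license_specified"] += 1
        bureaus.update(ds.get("bureauCode") or [])
        programs.update(ds.get("programCode") or [])

        dists = ds.get("distribution") or []
        n_download = sum(1 for d in dists if d.get("downloadURL"))
        n_access = sum(1 for d in dists if d.get("accessURL"))
        m["download_urls"] += n_download
        m["distribution_urls"] += n_download + n_access
        m["with_download_url"] += 1 if n_download else 0
        m["with_distribution_url"] += 1 if (n_download or n_access) else 0
        # Assumption: an API is a distribution whose format is "API".
        m["apis"] += sum(1 for d in dists
                         if str(d.get("format", "")).upper() == "API")

    m["bureaus"] = len(bureaus)
    m["programs"] = len(programs)
    m["license_pct"] = pct(m["license_specified"], m["datasets"])
    return m


def load_catalog(path):
    """Read a local copy of data.json (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Run against a saved copy of the agency's data.json, inventory_metrics(load_catalog(path)) returns one dictionary that can be compared row by row with the figures reported here.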
/data page
Expected /data URL (From Directory)
Resolved /data URL
Redirects 3 redirects
HTTP Status 200
Content Type text/html; charset=utf-8
Last modified Monday, 31-Aug-2015 00:11:56 EDT
Last crawl Monday, 31-Aug-2015 00:12:13 EDT
Expected /digitalstrategy.json URL (From Directory)
Resolved /digitalstrategy.json URL
Redirects 1 redirect
HTTP Status 200
Content Type application/json
Valid JSON Valid
Last modified Thursday, 28-May-2015 10:50:02 EDT
Last crawl Monday, 31-Aug-2015 00:12:14 EDT
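The crawl rows above (redirects, HTTP status, content type, valid JSON) could be gathered for any of the expected URLs along the following lines. This is a minimal sketch using only the Python standard library, not the crawler that produced this report; CountingRedirectHandler, summarize, and crawl are hypothetical names, and the URL is left for the caller to supply.

```python
import json
import urllib.request


class CountingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Follows redirects as usual while counting how many were taken."""

    def __init__(self):
        self.count = 0

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.count += 1
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def summarize(status, content_type, body, redirects):
    """Build a crawl record from an already-fetched response body (bytes)."""
    try:
        json.loads(body)
        valid_json = True
    except ValueError:  # JSONDecodeError and decoding errors both land here
        valid_json = False
    return {
        "redirects": redirects,
        "http_status": status,
        "content_type": content_type,
        "valid_json": valid_json,
    }


def crawl(url, timeout=30):
    """Fetch url, following redirects, and return a crawl record."""
    handler = CountingRedirectHandler()
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as resp:
        return summarize(resp.status, resp.headers.get("Content-Type"),
                         resp.read(), handler.count)
```

For an HTML page such as /data the valid_json field will simply be False; only the two .json endpoints are expected to parse.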
Digital Strategy

Date specified: Thursday, 28-May-2015 10:50:02 EDT

Date of digitalstrategy.json file: Thursday, 28-May-2015 10:50:02 EDT

1.2.4 Develop Data Inventory Schedule - Summary

Summarize the Inventory Schedule

Given the volume and complexity of Commerce's data, we have worked with OMB and GSA to build an EDI that complies with OMB policy. However, we are still working to fully integrate our millions of data sets into a single data-management system.

1.2.5 Develop Data Inventory Schedule - Milestones

Title: Milestone 1 - Expand and Open EDI
Description: The Department of Commerce will initially focus on expanding and opening its Enterprise Data Inventory. To this end, the Open Government Senior Leaders, in collaboration with Agency CIOs, Data Stewards, and the Points of Contact, will continue to work with Commerce Office of Open Government and Privacy to identify, prioritize and submit additional public, non-public, and restricted-public datasets to the Department. In addition, the Open Government Senior Leaders and Points of Contact will continue the development of an Open Data process and guidance that expedite the publication of its datasets in the future. At the end of Q1, Commerce will submit an updated EDI to OMB. Over the second quarter, from February 28 to May 31, Commerce and its component bureaus will continue to input into its EDI additional datasets. The Department will also continue to collaborate with its Data Stewards to update its customer feedback and outreach efforts. At the end of Q2, Commerce will submit an updated EDI to OMB.
Milestone Date: February 28, 2014
Description of how this milestone expands the Inventory: The Department’s Enterprise Data Inventory (EDI), as defined by OMB guidance, is the same as the Public Data Listing (PDL) in that none of the public data assets in the Department’s data catalog have been redacted.
Description of how this milestone enriches the Inventory: As part of an ongoing effort, the Department is making every effort to conduct and publish a comprehensive inventory of our total data assets. However, due to the size, history and complexity of the Department of Commerce’s Bureaus, the current inventory available on represents less than 1 percent of the actual enterprise.
Description of how this milestone opens the Inventory:
Title: Milestone 2 - Enrich EDI and Public Datasets
Description: Commerce's data sets are now integrated into one .json file that is harvested by
Milestone Date: May 31, 2014
Description of how this milestone expands the Inventory:
Description of how this milestone enriches the Inventory:
Description of how this milestone opens the Inventory:

1.2.6 Develop Customer Feedback Process

Describe the agency's process to engage with customers

Since the last milestone report, the following actions were taken in order to improve the Department’s mechanisms for public feedback:
•	Added functions to our web page that include a “suggest a dataset” web form and also an “Engage” set of tools to connect via social media.
•	Went live with a public, Department-level Github account that is publishing public repos and responding to issue posts.
•	Started hosting “office hours” with the Office of the Chief Data Officer where the public will be able to present questions and comments.
•	Launched the Commerce Data Advisory Council that is, among other things, seeking to channel industry feedback for the entire Department.

1.2.7 Develop Data Publication Process

Describe the agency's data publication process