Department of Commerce

M-13-13 Milestone 6 - February 28th 2015

OMB Review Complete: OMB has completed the agency review for this milestone. Agencies should contact their OMB desk officer if anything looks incorrect.

Leading Indicators

These indicators are reviewed by the Office of Management and Budget.

Review Status complete
Reviewer Jamie Berryhill
Last Updated April 17, 2015, 2:55 pm EDT by Jamie Berryhill

Assessment Summary

EDI: Number of datasets fell from 3,918 to 445.

PDL: Commerce's PDL represents a very small percentage of its overall data inventory and is not representative of the agency's quantity of public data. Commerce should begin to harvest on and report its APIs in the PDL.

Use and Impact: Commerce uses IdeaScale for feedback, but it has received only two responses in the past year, neither of which relates to data.
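
The EDI drop noted in this summary (3,918 datasets down to 445) can be related to the "Percentage growth in records since last quarter" figure reported for the inventory. The sketch below assumes that figure is a plain relative change rounded to a whole percent; the function name is illustrative and not taken from the dashboard.

```python
def percent_change(previous: int, current: int) -> float:
    """Return the relative change from previous to current, in percent."""
    if previous == 0:
        raise ValueError("previous count must be non-zero")
    return (current - previous) / previous * 100


# Dataset counts quoted in the EDI assessment (previous quarter, this quarter).
edi_change = percent_change(3918, 445)

print(f"{edi_change:.1f}%")     # -88.6%
print(f"{round(edi_change)}%")  # -89%
```

Under that assumption the result rounds to the -89% shown in the automated metrics.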

[Charts: Inventory Composition, Public Dataset Status, Dataset Link Quality]

Status Indicator Automated Metrics
Overall Progress this Milestone
Inventory Updated this Quarter
445 Number of Datasets
20 Number of APIs
Schedule Delivered
7 Bureaus represented
15 Programs represented
424 Number of public datasets
19 Number of restricted public datasets
2 Number of non-public datasets
Inventory > Public listing
-89% Percentage growth in records since last quarter
Spot Check - datasets listed by search engine
Agency provides a public Enterprise Data Inventory on
License specified
Status Indicator Automated Metrics
Overall Progress this Milestone
313 Number of Datasets
Number of Collections
313 Number of Public Datasets with File Downloads
Number of APIs
344 Total number of access and download links
Quality Check: Links are sufficiently working
276 Quality Check: Accessible links
35 Quality Check: Redirected links
18 Quality Check: Error links
Quality Check: Broken links
-16% Percentage growth in records since last quarter
100% Valid Metadata
/data exists
/data.json
Harvested by
94,482 Views on for the quarter
Status Indicator Automated Metrics
Overall Progress this Milestone
Description of feedback mechanism delivered
Data release is prioritized through public engagement
Feedback loop is closed, 2 way communication
See below Link to or description of Feedback Mechanism
Status Indicator Automated Metrics
Overall Progress this Milestone
Data Publication Process Delivered
Information that should not be made public is documented with agency's OGC
Status Indicator Automated Metrics
Overall Progress this Milestone
See below Open Data Primary Point of Contact
Mike Kruger
POCs identified for required responsibilities

Best Practice: Department of Commerce has been highlighted for demonstrating a best practice on the Use & Impact indicator

Status Indicator Automated Metrics
Overall Progress this Milestone
Identified 5 data improvements for this quarter
See below Primary Uses
Commerce Department data is used for a wide range of purposes, from weather forecasting and environmental research, to economic analysis and the development of trade opportunities, to demographic analysis and planning by state and local governments as well as private industry.
See below Value or impact of data
The economic and social impact of Commerce data is immense. Commerce data inform decisions that help make government smarter, make businesses more competitive, and make citizens better informed about their own communities – with the potential to guide up to $3.3 trillion in investments in the United States each year. Decennial Census and American Community Survey data alone guide $400 billion in federal spending annually. A recent Commerce report makes another essential point: the cost of government data is small relative to its benefits. The federal government spends roughly 3 cents per person, per day, collecting and disseminating statistical data. In other words, when it comes to data, taxpayers get tremendous bang for their buck. In addition to guiding economic decisions, Commerce data help protect families and businesses during weather events and climate change. A recent Commerce report found a median valuation of weather forecasts per household of $286 per year, which suggests that the aggregate annual valuation of weather forecasts was about $31.5 billion. The sum of all federal spending on meteorological operations and research was $3.4 billion in the same year, and the private sector spent an additional $1.7 billion on weather forecasting, for a total of private and public spending of about $5.1 billion. In other words, the valuation people placed on the weather forecasts they consumed was 6.2 times as high as the total expenditure on producing forecasts. NOAA data is re-packaged and analyzed to produce 15 million weather products, such as air quality alerts; three-, five-, and ten-day extended weather forecasts; earthquake reports; and tornado and flash flood warnings.
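
The weather-forecast comparison in the paragraph above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the dollar figures quoted in the text; the variable names are illustrative.

```python
# All amounts are in billions of US dollars, as quoted in the text.
federal_spending = 3.4      # federal meteorological operations and research
private_spending = 1.7      # private-sector weather forecasting
aggregate_valuation = 31.5  # aggregate annual valuation of weather forecasts

total_spending = federal_spending + private_spending
ratio = aggregate_valuation / total_spending

print(f"Total spending: ${total_spending:.1f} billion")  # Total spending: $5.1 billion
print(f"Valuation / spending: {ratio:.1f}x")             # Valuation / spending: 6.2x
```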
See below Primary data discovery channels
Commerce has engaged with organizations such as GovLab (Open Data Roundtable) and commercial groups such as the ESRI Users Conference to increase awareness of the data Commerce has to offer and to gather feedback from data users on what they need and how they would like to be able to access it.
See below User suggestions on improving data usability
• Create a centralized data catalogue: Organize catalogue of datasets and make them accessible in an easily findable way. Datasets should be catalogued in a common, machine-readable format.
• Continue to develop APIs, and provide an interface at each bureau to help add context for APIs.
• Establish common, open standards across Commerce for taxonomy, vocabulary, and APIs.
• Develop communication channels for direct collaboration with subject matter experts at Commerce, and feedback channels for the private sector to have effective input.
• Develop methods to track users: Establish ways to track who is using which datasets.
• Improve metadata for datasets throughout DOC. (One suggestion: Hold a “metadata-thon.”)
• Meet the needs of diverse data users – ranging from those who just want access to raw data, to those who want more developed information products and answers.
See below User suggestions on additional data releases
• Centralization of datasets: each bureau should have one place to identify all datasets
    o Catalogued in machine-readable formats
    o PTO: needs to develop better search functions, including for data in image form
    o Include context and documentation for each dataset
        ▪ BEA: improve technical documentation to better differentiate between raw and modeled data
        ▪ PTO: make more information available about the scope of patent rights, including expiration dates, or decisions by the agency and/or courts about patent claims
        ▪ PTO: Put out data with more context to make it usable by non-experts – e.g., trademark transaction data and trademark assignment.
    o Create a standardized registry that has standard metadata and can be queried with APIs.
    o Create a site map of bureaus/locations of datasets in lieu of a single site holding all datasets; easier to implement and update; and points of contact for each dataset
• Provide data visualization tools on website with examples for different datasets
• Provide for users’ ability to download large datasets
    o Census: increase download times
• Provide more transparency and notice around changes to data distribution systems.
    o NOAA suggests using current processes used by NWS as a model
• Specific issues for PTO
    o Move from paper-based systems to all-digital. Suggestion: Set up system for e-filing and promote it, initially voluntary but shifting to mandatory. The latter may require legislation and may be opposed by the patent bar. One possible model: IRS requiring e-filing of Form 990.
    o Provide APIs to enable third parties to build better interfaces for the existing legacy systems. The PAIR and PTAB data are most important here.
    o Improve search functionality to find patent data, which is now made available through different databases using different interfaces. Consider ways to make data searchable in image form.
    o Harmonize assignment databases for Patents and for Trademarks; clarify use of different address fields.
    o Make PACER accessible at lower/no cost to the user
Digital Analytics Program on /data

Automated Metrics

These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter, at which point they become a historical snapshot.

Expected Data.json URL (From Directory)
Resolved Data.json URL
Number of Redirects
HTTP Status 200
Content Type text/plain
Valid JSON Valid
Detected Data.json Schema federal-v1.1
Datasets with Valid Metadata 100% (313 of 313)
Valid Schema Valid
Datasets 313
Datasets with Distribution URLs 0.0% (0 of 313)
Total Distribution URLs 0
Public Datasets 312
Restricted Public Datasets 1
Non-public Datasets 0
Bureaus Represented 7
Programs Represented 15
File Size 566.26KB
Last crawl Thursday, 26-Feb-2015 20:13:36 EST
Analyze archive copies Analyze archive from 2015-02-28
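
Several of the data.json figures above (dataset counts by access level, distribution URLs, bureaus and programs represented) can be approximated from the catalog file itself. The following is a minimal sketch, not the dashboard's actual crawler: it assumes the Project Open Data v1.1 field names (`dataset`, `accessLevel`, `distribution`, `accessURL`, `downloadURL`, `bureauCode`, `programCode`), and the embedded sample catalog, its URLs, and its codes are made up for illustration.

```python
import json
from collections import Counter

# Hypothetical three-dataset catalog; every value is a placeholder.
SAMPLE_CATALOG = json.loads("""
{
  "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
  "dataset": [
    {"title": "Example A", "accessLevel": "public",
     "bureauCode": ["006:48"], "programCode": ["006:001"],
     "distribution": [{"downloadURL": "https://example.gov/a.csv"}]},
    {"title": "Example B", "accessLevel": "public",
     "bureauCode": ["006:48"], "programCode": ["006:002"]},
    {"title": "Example C", "accessLevel": "restricted public",
     "bureauCode": ["006:07"], "programCode": ["006:002"],
     "distribution": [{"accessURL": "https://example.gov/c"},
                      {"downloadURL": "https://example.gov/c.zip"}]}
  ]
}
""")


def summarize_catalog(catalog: dict) -> dict:
    """Compute dashboard-style summary counts for a parsed data.json catalog."""
    datasets = catalog.get("dataset", [])
    levels = Counter(d.get("accessLevel", "unspecified") for d in datasets)

    # Count every accessURL or downloadURL present in each dataset's distributions.
    urls_per_dataset = [
        sum(1
            for dist in d.get("distribution", [])
            for key in ("accessURL", "downloadURL")
            if dist.get(key))
        for d in datasets
    ]

    return {
        "datasets": len(datasets),
        "public": levels["public"],
        "restricted_public": levels["restricted public"],
        "non_public": levels["non-public"],
        "datasets_with_distribution_urls": sum(1 for n in urls_per_dataset if n > 0),
        "total_distribution_urls": sum(urls_per_dataset),
        "bureaus": len({code for d in datasets for code in d.get("bureauCode", [])}),
        "programs": len({code for d in datasets for code in d.get("programCode", [])}),
    }


summary = summarize_catalog(SAMPLE_CATALOG)
for name, value in summary.items():
    print(f"{name}: {value}")
```

To run the same summary against a real catalog, load the file with `json.load` and pass the resulting dictionary to `summarize_catalog`. The dashboard may count distribution URLs differently, so results need not match its figures exactly.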
/data page
Expected /data URL (From Directory)
Resolved /data URL
HTTP Status 200
Content Type text/html; charset=utf-8
Last modified Thursday, 26-Feb-2015 20:13:36 EST
Last crawl Thursday, 26-Feb-2015 20:13:36 EST
Expected /digitalstrategy.json URL (From Directory)
Resolved /digitalstrategy.json URL
HTTP Status 200
Content Type text/plain
Valid JSON Valid
Last modified Wednesday, 23-Jul-2014 15:43:37 EDT
Last crawl Thursday, 26-Feb-2015 20:13:37 EST
Digital Strategy

Date specified: Thursday, 28-May-2015 10:50:02 EDT

Date of digitalstrategy.json file: Wednesday, 23-Jul-2014 15:43:37 EDT

1.2.4 Develop Data Inventory Schedule - Summary

Summarize the Inventory Schedule

Given the volume and complexity of Commerce's data, we have worked with OMB and GSA to build an EDI that complies with OMB policy. However, we are still working to fully integrate our millions of data sets into a single data-management system.

1.2.5 Develop Data Inventory Schedule - Milestones

Title: Milestone 1 - Expand and Open EDI
Description: The Department of Commerce will initially focus on expanding and opening its Enterprise Data Inventory. To this end, the Open Government Senior Leaders, in collaboration with Agency CIOs, Data Stewards, and the Points of Contact, will continue to work with Commerce Office of Open Government and Privacy to identify, prioritize and submit additional public, non-public, and restricted-public datasets to the Department. In addition, the Open Government Senior Leaders and Points of Contact will continue the development of an Open Data process and guidance that expedite the publication of its datasets in the future. At the end of Q1, Commerce will submit an updated EDI to OMB. Over the second quarter, from February 28 to May 31, Commerce and its component bureaus will continue to add datasets to its EDI. The Department will also continue to collaborate with its Data Stewards to update its customer feedback and outreach efforts. At the end of Q2, Commerce will submit an updated EDI to OMB.
Milestone Date: February 28, 2014
Description of how this milestone expands the Inventory: The Department’s Enterprise Data Inventory (EDI), as defined by OMB guidance, is the same as the Public Data Listing (PDL) in that none of the public data assets in the Department’s data catalog have been redacted.
Description of how this milestone enriches the Inventory: As part of an ongoing effort, the Department is making every effort to conduct and publish a comprehensive inventory of our total data assets. However, due to the size, history and complexity of the Department of Commerce’s Bureaus, the current inventory available on represents less than 1 percent of the actual enterprise.
Description of how this milestone opens the Inventory:
Title: Milestone 2 - Enrich EDI and Public Datasets
Description: Commerce's data sets are now integrated into one .json file that is harvested by
Milestone Date: May 31, 2014
Description of how this milestone expands the Inventory:
Description of how this milestone enriches the Inventory:
Description of how this milestone opens the Inventory:

1.2.6 Develop Customer Feedback Process

Describe the agency's process to engage with customers

Since the last milestone report, the following actions were taken in order to improve the Department’s mechanisms for public feedback:
•	Added functions to our web page that include a “suggest a dataset” web form and also an “Engage” set of tools to connect via social media.
•	Went live with a public, Department-level GitHub account that is publishing public repos and responding to issue posts.
•	Started hosting “office hours” with the Office of the Chief Data Officer where the public will be able to present questions and comments.
•	Launched the Commerce Data Advisory Council, which is, among other things, seeking to channel industry feedback for the entire Department.

1.2.7 Develop Data Publication Process

Describe the agency's data publication process