Department of Commerce

M-13-13 Milestone 9 - November 30th 2015

OMB Review Complete: OMB has completed the agency review for this milestone. Agencies should contact their OMB desk officer if anything looks incorrect.

Leading Indicators

These indicators are reviewed by the Office of Management and Budget

Review Status complete
Reviewer Justin Grimes
Last Updated January 11, 2016, 2:56 pm EST by Justin Grimes

Assessment Summary

Department of Commerce improvements include: a 1,033% increase in the total number of datasets from last quarter.

Department of Commerce needs to address the following issues: 1) "Fails to document any outstanding Licensing information"; 2) "Fails to organize data assets in to Collections"; 3) "Fails to provide a public EDI as a dataset in your data.json file."; 4) "Agency provided a person and contact information for some but not all of the required data points of contact."; 5) "Fails to document all non-public and restricted datasets, redactions, restrictive licenses and provides an explanation for non-disclosure (in the rights field)"; 6) "Fails to provide an up-to-date Data Publication Process on the /digitalstrategy page."

Inventory Composition

Public Dataset Status

Dataset Link Quality

Status Indicator Automated Metrics
Overall Progress this Milestone
Inventory Updated this Quarter
Number of Datasets
7 Number of APIs
Schedule Delivered
11 Bureaus represented
29 Programs represented
6510 Number of public datasets
1 Number of restricted public datasets
21 Number of non-public datasets
Inventory > Public listing
1135% Percentage growth in records since last quarter
Spot Check - datasets listed by search engine
Agency provides a public Enterprise Data Inventory on
11.0% License specified
Status Indicator Automated Metrics
Overall Progress this Milestone
4,530 Number of Datasets
Number of Collections
4,437 Number of Public Datasets with File Downloads
7 Number of APIs
31492 Total number of access and download links
Quality Check: Links are sufficiently working
16626 Quality Check: Accessible links
11113 Quality Check: Redirected links
3233 Quality Check: Error links
484 Quality Check: Broken links
1,033% Percentage growth in records since last quarter
100% Valid Metadata
/data exists
/data.json
Harvested by
115,202 Views on for the quarter
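The link-quality counts above can be put in proportion to the total with a little arithmetic. The following is a minimal sketch that uses only the figures reported in this table; the function name `link_shares` is hypothetical, and how the crawler assigns links to each category is not documented here.

```python
def link_shares(counts, total):
    """Return each category's share of all links, as a percentage to one decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Figures copied from the Quality Check rows of the table.
counts = {"accessible": 16626, "redirected": 11113, "error": 3233, "broken": 484}
total_links = 31492

shares = link_shares(counts, total_links)

# Links in the reported total that fall outside the four categories.
unclassified = total_links - sum(counts.values())

print(shares)
print("unclassified links:", unclassified)
```

The four categories sum to 31456, so 36 of the 31492 reported links are not accounted for by any of them; the source does not say why.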
Status Indicator Automated Metrics
Overall Progress this Milestone
Description of feedback mechanism delivered
Data release is prioritized through public engagement
Feedback loop is closed, 2 way communication
See below Link to or description of Feedback Mechanism -- Since the last milestone report, the following actions were taken to improve the Department’s mechanisms for public feedback:
• Added functions to our web page, including a “suggest a dataset” web form and an “Engage” set of tools to connect via social media.
• Went live with a public, Department-level GitHub account that publishes public repos and responds to issue posts.
• Started hosting “office hours” with the Office of the Chief Data Officer, where the public will be able to present questions and comments.
• Launched the Commerce Data Advisory Council, which is, among other things, seeking to channel industry feedback for the entire Department.
Status Indicator Automated Metrics
Overall Progress this Milestone
Data Publication Process Delivered
Information that should not be made public is documented with agency's OGC
Status Indicator Automated Metrics
Overall Progress this Milestone
Open Data Primary Point of Contact
POCs identified for required responsibilities
Status Indicator Automated Metrics
Overall Progress this Milestone
Identified 5 data improvements for this quarter
See below Primary Uses
Commerce Department data supports a wide range of uses, from weather forecasting, environmental research, economic analysis, and development of trade opportunities to demographic analysis and planning by state and local governments and private industry.
Value or impact of data
See below Primary data discovery channels
Commerce has engaged with organizations such as GovLab (Open Data Roundtable) and commercial groups such as the ESRI Users Conference to increase awareness of the data Commerce has to offer and to gather feedback from data users on what they need and how they would like to access it.
See below User suggestions on improving data usability
• Create a centralized data catalogue: Organize catalogue of datasets and make them accessible in an easily findable way. Datasets should be catalogued in a common, machine-readable format.
• Continue to develop APIs, and provide an interface at each bureau to help add context for APIs.
• Establish common, open standards across Commerce for taxonomy, vocabulary, and APIs.
• Develop communication channels for direct collaboration with subject matter experts at Commerce, and feedback channels for the private sector to have effective input.
• Develop methods to track users: Establish ways to track who is using which datasets.
• Improve metadata for datasets throughout DOC. (One suggestion: Hold a “metadata-thon.”)
• Meet the needs of diverse data users, ranging from those who just want access to raw data to those who want more developed information products and answers.
See below User suggestions on additional data releases
• Centralization of datasets: each bureau should have one place to identify all datasets
  o Catalogued in machine-readable formats
  o PTO: needs to develop better search functions, including for data in image form
  o Include context and documentation for each dataset
    ▪ BEA: improve technical documentation to better differentiate between raw and modeled data
    ▪ PTO: make more information available about the scope of patent rights, including expiration dates, or decisions by the agency and/or courts about patent claims
    ▪ PTO: Put out data with more context to make it usable by non-experts, e.g., trademark transaction data and trademark assignment.
  o Create a standardized registry that has standard metadata and can be queried with APIs.
  o Create a site map of bureaus/locations of datasets in lieu of a single site holding all datasets; easier to implement and update; and points of contact for each dataset
• Provide data visualization tools on website with examples for different datasets
• Provide for users’ ability to download large datasets
  o Census: increase download times
• Provide more transparency and notice around changes to data distribution systems.
  o NOAA suggests using current processes used by NWS as a model
• Specific issues for PTO
  o Move from paper-based systems to all-digital. Suggestion: Set up system for e-filing and promote it, initially voluntary but shifting to mandatory. The latter may require legislation and may be opposed by the patent bar. One possible model: IRS requiring e-filing of Form 990.
  o Provide APIs to enable third parties to build better interfaces for the existing legacy systems. The PAIR and PTAB data are most important here.
  o Improve search functionality to find patent data, which is now made available through different databases using different interfaces. Consider ways to make data searchable in image form.
  o Harmonize assignment databases for Patents and for Trademarks; clarify use of different address fields.
  o Make PACER accessible at lower/no cost to the user
Digital Analytics Program on /data

Automated Metrics

These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter, at which point they become a historical snapshot.

Expected Data.json URL (From Directory)
Resolved Data.json URL
Number of Redirects 2 redirects
HTTP Status 200
Content Type application/json
Valid JSON Valid
Datasets with Valid Metadata 100% (4530 of 4530)
Valid Schema Valid
Datasets 4530
Number of Collections 0
Datasets with Distribution URLs 97.9% (4437 of 4530)
Datasets with Download URLs 97.5% (4416 of 4530)
Total Distribution URLs 31492
Total Download URLs 31467
Total APIs 7
Public Datasets 4508
Restricted Public Datasets 21
Non-public Datasets 1
Bureaus Represented 11
Programs Represented 29
License Specified 11.0% (498 of 4530)
Datasets with Redactions 0.0% (0 of 4530)
Redactions without explanation (rights field) 0.0% (0 of 4530)
File Size 17.97MB
Last modified Monday, 09-Nov-2015 15:17:49 EST
Last crawl Monday, 30-Nov-2015 23:20:55 EST
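Several of the data.json figures above are simple tallies over the catalog's dataset records. The sketch below shows one way such counts could be reproduced from a parsed catalog, assuming it follows the Project Open Data metadata schema v1.1 (a top-level `dataset` array whose entries carry `accessLevel`, `license`, and `distribution` fields). The functions `pct` and `summarize`, the sample records, and the example.gov URLs are hypothetical, and the dashboard's exact counting rules may differ.

```python
from collections import Counter


def pct(part, whole):
    """Percentage rounded to one decimal place, as shown in the table."""
    return round(100 * part / whole, 1) if whole else 0.0


def summarize(catalog):
    """Tally a few inventory metrics from a parsed data.json catalog."""
    datasets = catalog.get("dataset", [])
    total = len(datasets)

    def has_url(dataset, keys):
        # True if any distribution entry carries one of the given URL fields.
        return any(
            dist.get(key)
            for dist in dataset.get("distribution", [])
            for key in keys
        )

    return {
        "datasets": total,
        "access_levels": Counter(d.get("accessLevel", "unknown") for d in datasets),
        "license_pct": pct(sum(1 for d in datasets if d.get("license")), total),
        "distribution_pct": pct(
            sum(1 for d in datasets if has_url(d, ("accessURL", "downloadURL"))), total
        ),
        "download_pct": pct(
            sum(1 for d in datasets if has_url(d, ("downloadURL",))), total
        ),
    }


# Hypothetical four-record catalog, for illustration only.
sample = {
    "dataset": [
        {"accessLevel": "public", "license": "https://example.gov/license",
         "distribution": [{"downloadURL": "https://example.gov/a.csv"}]},
        {"accessLevel": "restricted public",
         "distribution": [{"accessURL": "https://example.gov/api"}]},
        {"accessLevel": "non-public"},
        {"accessLevel": "public",
         "distribution": [{"downloadURL": "https://example.gov/b.csv"}]},
    ]
}

summary = summarize(sample)
print(summary)

# The same rounding reproduces the percentages reported in the table.
print(pct(498, 4530), pct(4437, 4530), pct(4416, 4530))
```

Applied to the reported counts, the rounding helper gives 11.0, 97.9, and 97.5, which matches the License Specified, Distribution URL, and Download URL rows.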
/data page
Expected /data URL (From Directory)
Resolved /data URL
Redirects 3 redirects
HTTP Status 200
Content Type text/html; charset=utf-8
Last modified Monday, 30-Nov-2015 22:33:48 EST
Last crawl Monday, 30-Nov-2015 23:19:09 EST
Expected /digitalstrategy.json URL (From Directory)
Resolved /digitalstrategy.json URL
Redirects 2 redirects
HTTP Status 200
Content Type application/json
Valid JSON Invalid
Last modified Thursday, 28-May-2015 10:50:02 EDT
Last crawl Monday, 30-Nov-2015 23:19:10 EST
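The /digitalstrategy.json file is reported above as invalid JSON, although the crawl output does not say where parsing failed. As a minimal sketch, a local check with Python's standard `json` module could report the position of the first parse error. The function name `check_json` and the sample strings are hypothetical; this is not the validator the dashboard uses.

```python
import json


def check_json(text):
    """Return (is_valid, message) for a JSON document held in a string."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno locate the first point where parsing failed.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, "ok"


# Hypothetical inputs: one well-formed document, one with a trailing comma.
valid_result = check_json('{"agency": "Department of Commerce"}')
invalid_result = check_json('{"agency": "Department of Commerce",}')

print(valid_result)
print(invalid_result)
```

Running the same check on a downloaded copy of the file would point to the first offending line and column, which is usually enough to find a stray comma or unescaped quote.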