National Science Foundation
Enterprise Data Inventory - Volume and composition over time
M-13-13 Milestone 10 - February 29th 2016
OMB Review Complete: OMB has completed the agency review for this milestone. Agencies should contact their OMB desk officer if anything looks incorrect.
Leading Indicators
These indicators are reviewed by the Office of Management and Budget.
| Review Status | complete |
|---|---|
| Reviewer | Justin Grimes |
| Last Updated | April 18, 2016, 1:35 pm EDT by Justin Grimes |
Assessment Summary
The agency provides some, but not all, datasets in human-readable form on its /data page. It fails to document outstanding licensing information in the EDI.
Charts (not reproduced): Inventory Composition, Public Dataset Status, Dataset Link Quality
| Status | Indicator | Automated Metrics |
|---|---|---|
| | Overall Progress this Milestone | |
| | Inventory Updated this Quarter | |
| 147 | Number of Datasets | |
| 1 | Number of APIs | |
| 1 | Bureaus represented | |
| 100% | Percentage of bureaus represented | |
| 2 | Programs represented | |
| 14.3% | Percentage of programs represented | |
| 144 | Number of public datasets | |
| 1 | Number of restricted public datasets | |
| 2 | Number of non-public datasets | |
| | Percentage growth in records since last quarter | |
| To a very great extent (>75%) | To what extent is your agency’s Enterprise Data Inventory (EDI) complete? | |
| See below | What steps have you taken to ensure your Enterprise Data Inventory is complete | |
| NSF appreciates the opportunity to review the quarterly action items to improve and enhance the Enterprise Data Inventory's completeness. To ensure EDI completeness each quarter, NSF conducts a review of agency data sources recently added to our webpages and coordinates with internal NSF POCs to determine if there are new datasets for potential inclusion in the updated EDI. During this process, we also look into ways in which we can improve the quality of our datasets. These data improvements include: reviewing existing datasets and related metadata and adding metadata that will improve access to content; ensuring newly identified agency data sources are added to the updated EDI; ensuring all appropriate data assets are grouped as collections; and including new datasets and/or updating metadata in the Enterprise Data Inventory to reflect the public feedback received. | | |
| | Agency provides a public Enterprise Data Inventory on Data.gov | |
| | Agency provided updated Enterprise Data Inventory to OMB | |
| 91.9% | License specified | Crawl details |
| | Number of datasets with redactions | |
| | Percent of datasets with redactions | |

| Status | Indicator | Automated Metrics |
|---|---|---|
| | Overall Progress this Milestone | |
| 148 | Number of Datasets | Crawl details |
| 1 | Number of Collections | Crawl details |
| 114 | Number of datasets not contained in a collection | Crawl details |
| 145 | Number of Public Datasets with File Downloads | Crawl details |
| 3 | Number of APIs | Crawl details |
| | Number of public APIs | Crawl details |
| | Number of restricted public APIs | Crawl details |
| | Number of non-public APIs | Crawl details |
| 151 | Total number of access and download links | Crawl details |
| | Quality Check: Links are sufficiently working | Crawl details |
| 143 | Quality Check: Accessible links | Crawl details |
| 1 | Quality Check: Redirected links | Crawl details |
| | Quality Check: Error links | Crawl details |
| 4 | Quality Check: Broken links | Crawl details |
| 75.5% | Quality Check: Percentage of download links in correct format as specified in metadata | Crawl details |
| 32.2% | Quality Check: Percentage of download links in HTML | Crawl details |
| 10.5% | Quality Check: Percentage of download links in PDF | Crawl details |
| | Percentage growth in records since last quarter | |
| 100% | Valid Metadata | Crawl details |
| | /data exists | Crawl details |
| | Provides datasets in human-readable form on /data | |
| | /data.json | Crawl details |
| | Harvested by data.gov | |
| 145 | Number of public datasets | Crawl details |
| 1 | Number of restricted public datasets | Crawl details |
| 2 | Number of non-public datasets | Crawl details |
| 1.4% | Percent growth of public datasets | |
| 0% | Percent growth of restricted public datasets | |
| 0% | Percent growth of non-public datasets | |
| | Percent datasets licensed as U.S. Public Domain | |
| | Percent datasets licensed as Creative Commons Zero | |
| | Percent datasets with other licenses | |
| | Percent datasets with no license | |

| Status | Indicator | Automated Metrics |
|---|---|---|
| | Overall Progress this Milestone | |
| | Description of feedback mechanism delivered | Crawl details |
| | Data release is prioritized through public engagement | |
| | Provided narrative evidence of data improvements based on public feedback this quarter | |
| | Feedback loop is closed, 2 way communication | |
| See below | Link to or description of Feedback Mechanism | |
| NSF has added Frequently Asked Questions to the Open Data at NSF webpage (http://www.nsf.gov/data/) to better direct individuals and organizations to resources and feedback mechanisms to improve our data usability. The FAQ also includes links to Data.gov's Help Desk (Open Data Stack Exchange) which is monitored continuously for customer feedback related to NSF's datasets and metadata quality. | | |
| | Provides valid contact point information for all datasets | |

| Status | Indicator | Automated Metrics |
|---|---|---|
| | Overall Progress this Milestone | |
| | Data Publication Process Delivered | Crawl details |
| | Information that should not be made public is documented with agency's OGC | |
| See below | Describe the agency's data publication process | |
| NSF utilizes a regular process to review newly identified data sets prior to their planned release, to validate that there are no concerns with respect to privacy, confidentiality, security, contractual restrictions, or other factors. This process is similar to the agency approval process for the publication of agency data in Data.gov and incorporates best practices from the agency’s Freedom of Information Act (FOIA) program to ensure the presumption of openness is being applied to all agency data release decisions. When a potential restriction to release is identified, agency points of contact for Open Data will work with the Office of General Counsel and other subject matter experts as appropriate to review the concerns and, if required, document the determined barrier to release. Some potential characteristics of agency data that could prevent public release include privacy considerations (e.g., personally identifiable information); confidentiality matters (e.g., predecisional or deliberative material); and contractual restrictions (e.g., contractor bidding information). Because of the nature of NSF’s mission, one common restriction for public release of data would likely be limitations in the full release of proposal-related data that may contain confidential, proprietary business information protected by FOIA Exemption 4. | | |

| Status | Indicator | Automated Metrics |
|---|---|---|
| | Overall Progress this Milestone | |
| aesmith@nsf.gov | Open Data Primary Point of Contact | |
| | POCs identified for required responsibilities | |
| | Chief Data Officer (if applicable) | |

Automated Metrics
These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter, at which point they become a historical snapshot.
data.json
| Expected Data.json URL | http://www.nsf.gov/data.json (From USA.gov Directory) |
|---|---|
| Resolved Data.json URL | http://www.nsf.gov/data.json |
| Number of Redirects | |
| HTTP Status | 200 |
| Content Type | text/plain |
| Valid JSON | Invalid (check with a JSON validator) |
| Detected Data.json Schema | federal-v1.1 |
| Datasets with Valid Metadata | 100% (148 of 148). The JSON file is invalid and can't be parsed without special processing. |
| Valid Schema | Valid |
| Datasets | 148 |
| Number of Collections | 1 |
| Number of datasets not in a collection | 114 |
| Datasets with Distribution URLs | 98.0% (145 of 148) |
| Datasets with Download URLs | 95.9% (142 of 148) |
| Total Distribution URLs | 151 |
| Total Download URLs | 147 |
| Total APIs | 3 |
| Public Datasets | 145 |
| Restricted Public Datasets | 1 |
| Non-public Datasets | 2 |
| Normally there would be a set of quality assurance fields here to verify that the download links included within the metadata are functioning properly, but the results of those tests are not currently available. | |
| Bureaus Represented | 1 |
| Programs Represented | 2 |
| License Specified | 91.9% (136 of 148) |
| Datasets with Redactions | 0.0% (0 of 148) |
| Redactions without explanation (rights field) | 0.0% (0 of 148) |
| File Size | 155.05KB |
| Last crawl | Monday, 29-Feb-2016 23:16:51 EST |
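
The counts and percentages in the table above can be approximated from a parsed data.json catalog. The following is a minimal sketch, not the dashboard's actual crawler logic (which this page does not show). It assumes the catalog follows the Project Open Data schema v1.1 field names (`dataset`, `accessLevel`, `license`, `distribution`, `downloadURL`); the function name `summarize_catalog` and the rule for what counts as "licensed" are hypothetical.

```python
from collections import Counter


def summarize_catalog(catalog):
    """Compute a few dashboard-style counts from a parsed data.json catalog.

    `catalog` is the dict obtained by parsing a data.json file.
    Field names are assumed to follow Project Open Data schema v1.1.
    """
    datasets = catalog.get("dataset") or []
    total = len(datasets)

    # accessLevel is expected to be "public", "restricted public", or "non-public".
    access = Counter(d.get("accessLevel", "unknown") for d in datasets)

    # Assumption: a dataset is "licensed" when its license field is a non-empty string.
    licensed = sum(1 for d in datasets if str(d.get("license") or "").strip())

    # Datasets with at least one distribution that carries a downloadURL.
    with_download = sum(
        1
        for d in datasets
        if any(dist.get("downloadURL") for dist in (d.get("distribution") or []))
    )

    def pct(n):
        # Percentage of all datasets, rounded to one decimal place.
        return round(100.0 * n / total, 1) if total else 0.0

    return {
        "datasets": total,
        "public": access["public"],
        "restricted_public": access["restricted public"],
        "non_public": access["non-public"],
        "license_specified_pct": pct(licensed),
        "with_download_url_pct": pct(with_download),
    }
```

Under the same rounding, 136 licensed datasets out of 148 gives 91.9, which agrees with the License Specified row.
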
| Expected /data URL | http://www.nsf.gov/data (From USA.gov Directory) |
|---|---|
| Resolved /data URL | http://www.nsf.gov/data/ |
| Redirects | 1 redirect |
| HTTP Status | 200 |
| Content Type | text/html;charset=ISO-8859-1 |
| Last crawl | Monday, 29-Feb-2016 23:16:42 EST |
| Expected /digitalstrategy.json URL | http://www.nsf.gov/digitalstrategy.json (From USA.gov Directory) |
|---|---|
| Resolved /digitalstrategy.json URL | http://www.nsf.gov/digitalstrategy.json |
| Redirects | |
| HTTP Status | 200 |
| Content Type | text/plain |
| Valid JSON | Invalid (check with a JSON validator) |
| Last crawl | Monday, 29-Feb-2016 23:16:42 EST |
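
Both JSON files above are reported as invalid, and the data.json table notes that the file "can't be parsed without special processing". The report does not say what that processing was. The sketch below shows one plausible approach under stated assumptions: try a strict parse first, then fall back to a lenient parse that strips a UTF-8 byte-order mark and tolerates raw control characters inside strings. The function name `parse_json_bytes` is hypothetical, and the two tolerated defects are guesses at common causes, not confirmed findings about the NSF files.

```python
import json


def parse_json_bytes(raw):
    """Parse JSON bytes, returning (data, strict_ok).

    strict_ok is True when the bytes are valid JSON as-is, and False when
    only the lenient fallback succeeded. Raises ValueError if both fail.
    """
    try:
        # Strict path: plain UTF-8, standard JSON rules.
        return json.loads(raw.decode("utf-8")), True
    except ValueError:
        # Covers both UnicodeDecodeError and json.JSONDecodeError.
        pass

    # Lenient path (assumed defects): "utf-8-sig" drops a leading BOM, and
    # strict=False lets the decoder accept control characters in strings.
    text = raw.decode("utf-8-sig", errors="replace")
    return json.loads(text, strict=False), False
```

A file that only parses on the lenient path would still be flagged as invalid by a strict validator, which is consistent with a report showing invalid JSON alongside fully validated dataset metadata.
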