General Services Administration
Enterprise Data Inventory - Volume and composition over time
M-13-13 Milestone 12 - August 31, 2016
OMB Review In Progress: OMB is currently reviewing the agency for this milestone. This review status indicator will change once the review is complete.
Leading Indicators
These indicators are reviewed by the Office of Management and Budget.
| Review Status | in-progress |
|---|---|
| Reviewer | Bryant Renaud |
| Last Updated | October 22, 2016, 5:28 pm EDT by Bryant Renaud |
Assessment Summary
EDI is Yellow: some datasets are missing licensing information.
Inventory Composition
Public Dataset Status
| Status | Indicator | Automated Metrics | ||
|---|---|---|---|---|
| Overall Progress this Milestone | ||||
| Inventory Updated this Quarter | ||||
| 209 | Number of Datasets | |||
| 22 | Number of APIs | |||
| 5 | Bureaus represented | |||
| Percentage of bureaus represented | ||||
| 18 | Programs represented | |||
| 78.3% | Percentage of programs represented | |||
| 193 | Number of public datasets | |||
| 1 | Number of restricted public datasets | |||
| 15 | Number of non-public datasets | |||
| 1.4% | Percentage growth in records since last quarter | |||
| To a very great extent (>75%) | To what extent is your agency’s Enterprise Data Inventory (EDI) complete? | |||
| See below | What steps have you taken to ensure your Enterprise Data Inventory is complete | |||
| As we prepare for the start of a new fiscal year (FY17), we are taking new steps to ensure our Enterprise Data Inventory (EDI) is complete and accurate. After working with the Working Data Management Group, Data Owners/Stewards, and the D2D team to clean our current datasets, we are now focused on dataset quality. To ensure accuracy, we want to move away from manually uploading data files and instead pull the data straight from the source. We will reach out again to the Data Owners/Stewards and notify them, where applicable, that their dataset does not adhere to the M-13-13 standards and policy and must be updated. We will focus on all datasets that our review identified as non-compliant and will take the necessary steps to remove them if they are not updated. Our goal for Open Data is 'Quality not Quantity': we are striving to publish datasets that provide current, meaningful data useful to the public and other federal agencies, to provide more APIs, and to pull data from the source. We continue to monitor our EDI to ensure that it is accurate and complete, and we will continue to add datasets and APIs. | ||||
| Agency provides a public Enterprise Data Inventory on Data.gov | ||||
| Agency provided updated Enterprise Data Inventory to OMB | ||||
| 100.0% | License specified | Crawl details | ||
| Number of datasets with redactions | ||||
| 0.0% | Percent of datasets with redactions | |||
| Status | Indicator | Automated Metrics |
|---|---|---|
| Overall Progress this Milestone | ||
| 209 | Number of Datasets | Crawl details |
| 9 | Number of Collections | Crawl details |
| 108 | Number of datasets not contained in a collection | Crawl details |
| 187 | Number of Public Datasets with File Downloads | Crawl details |
| 22 | Number of APIs | Crawl details |
| 21 | Number of public APIs | Crawl details |
| 1 | Number of restricted public APIs | Crawl details |
| Number of non-public APIs | Crawl details | |
| 198 | Total number of access and download links | Crawl details |
| Quality Check: Links are sufficiently working | Crawl details | |
| Quality Check: Accessible links | Crawl details | |
| 16 | Quality Check: Redirected links | Crawl details |
| 1 | Quality Check: Error links | Crawl details |
| 9 | Quality Check: Broken links | Crawl details |
| 77.1% | Quality Check: Percentage of download links in correct format as specified in metadata | Crawl details |
| 5.3% | Quality Check: Percentage of download links in HTML | Crawl details |
| 4.1% | Quality Check: Percentage of download links in PDF | Crawl details |
| 1.4% | Percentage growth in records since last quarter | |
| 100% | Valid Metadata | Crawl details |
| /data exists | Crawl details | |
| Provides datasets in human-readable form on /data | ||
| /data.json | Crawl details | |
| Harvested by data.gov | ||
| 193 | Number of public datasets | Crawl details |
| 1 | Number of restricted public datasets | Crawl details |
| 15 | Number of non-public datasets | Crawl details |
| 1.6% | Percent growth of public datasets | |
| 0.0% | Percent growth of restricted public datasets | |
| 0.0% | Percent growth of non-public datasets | |
| Percent datasets licensed as U.S. Public Domain | ||
| Percent datasets licensed as Creative Commons Zero | ||
| Percent datasets with other licenses | ||
| Percent datasets with no license |
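The link quality checks in the table above (correct format, HTML, PDF) can be illustrated with a minimal sketch. It assumes the check compares the `mediaType` declared in each distribution's metadata against the `Content-Type` header returned by the download link; the dashboard does not state its exact method, and the function names `observed_media_type` and `format_summary` are hypothetical.

```python
def observed_media_type(content_type_header: str) -> str:
    """Strip parameters such as charset from a Content-Type header value."""
    return content_type_header.split(";", 1)[0].strip().lower()


def format_summary(links: list[tuple[str, str]]) -> dict[str, float]:
    """Summarize (declared mediaType, observed Content-Type) pairs as percentages.

    Assumption: a link is "in correct format" when the served media type
    equals the media type declared in the metadata.
    """
    total = len(links)

    def pct(count: int) -> float:
        # Guard against an empty list of links.
        return round(100.0 * count / total, 1) if total else 0.0

    pairs = [
        (declared.strip().lower(), observed_media_type(header))
        for declared, header in links
    ]
    return {
        "correct_format": pct(sum(1 for d, o in pairs if d == o)),
        "html": pct(sum(1 for _, o in pairs if o == "text/html")),
        "pdf": pct(sum(1 for _, o in pairs if o == "application/pdf")),
    }
```

A link declared as `text/csv` that actually serves `text/html;charset=UTF-8` would count toward the HTML share and against the correct-format share, which is the usual sign of a landing page published in place of a direct download.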
| Status | Indicator | Automated Metrics | ||
|---|---|---|---|---|
| Overall Progress this Milestone | ||||
| Description of feedback mechanism delivered | Crawl details | |||
| Data release is prioritized through public engagement | ||||
| Provided narrative evidence of data improvements based on public feedback this quarter | ||||
| Feedback loop is closed, 2 way communication | ||||
| See below | Link to or description of Feedback Mechanism | |||
| http://www.gsa.gov/portal/content/140871 | ||||
| Provides valid contact point information for all datasets | ||||
| Status | Indicator | Automated Metrics | ||
|---|---|---|---|---|
| Overall Progress this Milestone | ||||
| Data Publication Process Delivered | Crawl details | |||
| Information that should not be made public is documented with agency's OGC | ||||
| See below | Describe the agency's data publication process | |||
| Internal Clearance Processing: The EIDM team met with representatives of the Office of General Counsel (OGC), the Freedom of Information Access Office (FOIA) and the GSA Privacy Officer to develop an internal clearance process for GSA datasets prior to their release. We agreed that our goal is proactive disclosure of datasets, but we will ensure that the clearance process includes risk mitigation. The process that has been implemented includes: ● The Program Manager, as the data owner, will approve the dataset and metadata for public release and seek approval from their management. The Associate Administrator within the SSO will provide approval to the SSO Portfolio Data Manager (PDM). ● If the dataset is new and has never been publicly published, the PDM will provide the metadata and dataset to the Executive Secretariat office for entry into GSA’s internal correspondence routing system, known as IQ. ● The FOIA Officer will forward it to OGC through IQ. OGC will coordinate with the GSA Privacy Officer and Executive Secretariat. ● If OGC approves the dataset for release, OGC notifies the PDM and closes out the IQ/Executive Secretariat entry. ● The PDM will prepare the datasets for release and notify the SSO data owners and EIDM. ● If issues arise in the course of review, the FOIA office or OGC will route the concerns back through the IQ/Executive Secretariat system to the PDM for a response. Access Level Determination: Through consultation and coordination with the GSA FOIA office, OGC and Chief Privacy Officer, the decision was made to first use the FOIA exemptions as the basis for initial access level determination. These FOIA exemptions are consistent governmentwide, not solely a GSA policy. Additional privacy analysis is performed within the FOIA, OGC and Chief Privacy offices to ensure that PII is not disclosed through the release of a data asset, and that the “mosaic effect” will not create additional security and privacy concerns. The reviewers will document in the Enterprise Data Inventory the reasons for the restricted and private access level determinations. | ||||
| Status | Indicator | Automated Metrics |
|---|---|---|
| Overall Progress this Milestone | ||
| Open Data Primary Point of Contact | ||
| POCs identified for required responsibilities | ||
| Chief Data Officer (if applicable) |
Automated Metrics
These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter, at which point they become a historical snapshot.
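As a rough illustration of the kind of tallies reported in this section, the sketch below counts datasets by access level and measures distribution, download, and license coverage for a catalog in the Project Open Data v1.1 layout (`dataset`, `accessLevel`, `distribution`, `downloadURL`, `accessURL`, `license`). The function names `percent` and `summarize_catalog` are hypothetical, and this is not the dashboard's actual crawler code.

```python
from collections import Counter


def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1) if total else 0.0


def summarize_catalog(catalog: dict) -> dict:
    """Tally a parsed data.json catalog (Project Open Data v1.1 layout assumed)."""
    datasets = catalog.get("dataset") or []
    total = len(datasets)

    # accessLevel is expected to be "public", "restricted public", or "non-public".
    access_levels = Counter(d.get("accessLevel", "unknown") for d in datasets)

    def has_url(dataset: dict, key: str) -> bool:
        # True if any distribution entry carries a non-empty value for `key`.
        return any(dist.get(key) for dist in dataset.get("distribution") or [])

    with_distribution = sum(
        1 for d in datasets if has_url(d, "downloadURL") or has_url(d, "accessURL")
    )
    with_download = sum(1 for d in datasets if has_url(d, "downloadURL"))
    with_license = sum(1 for d in datasets if d.get("license"))

    return {
        "total": total,
        "access_levels": dict(access_levels),
        "with_distribution_pct": percent(with_distribution, total),
        "with_download_pct": percent(with_download, total),
        "license_pct": percent(with_license, total),
    }
```

With the figures reported for this milestone, `percent(187, 209)` gives 89.5 and `percent(143, 209)` gives 68.4, matching the distribution and download URL percentages listed under data.json.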
data.json
| Expected Data.json URL | http://www.gsa.gov/data.json (From USA.gov Directory) |
|---|---|
| Resolved Data.json URL | http://open.gsa.gov/data.json |
| Number of Redirects | 1 redirect |
| HTTP Status | 200 |
| Content Type | application/json |
| Valid JSON | Valid |
| Detected Data.json Schema | federal-v1.1 |
| Datasets with Valid Metadata | 100% (209 of 209) |
| Valid Schema | Valid |
| Datasets | 209 |
| Number of Collections | 9 |
| Number of datasets not in a collection | 108 |
| Datasets with Distribution URLs | 89.5% (187 of 209) |
| Datasets with Download URLs | 68.4% (143 of 209) |
| Total Distribution URLs | 198 |
| Total Download URLs | 154 |
| Total APIs | 22 |
| Public APIs | 21 |
| Restricted Public APIs | 1 |
| Non-public APIs | 0 |
| Public Datasets | 193 |
| Restricted Public Datasets | 1 |
| Non-public Datasets | 15 |
| Normally there would be a set of quality assurance fields here to verify that the download links included within the metadata are functioning properly, but the results of those tests are not currently available. | |
| Bureaus Represented | 5 |
| Programs Represented | 18 |
| License Specified | 100% (209 of 209) |
| Datasets with Redactions | 0.0% (0 of 209) |
| Redactions without explanation (rights field) | 0.0% (0 of 209) |
| File Size | 313.18KB |
| Last modified | Tuesday, 23-Aug-2016 11:10:00 EDT |
| Last crawl | Sunday, 28-Aug-2016 00:02:09 EDT |
| Analyze archive copies | Analyze archive from 2016-08-31 |
| Expected /data URL | http://www.gsa.gov/data (From USA.gov Directory) |
|---|---|
| Resolved /data URL | http://www.gsa.gov/portal/category/105839 |
| Redirects | 1 redirect |
| HTTP Status | 200 |
| Content Type | text/html;charset=UTF-8 |
| Last crawl | Sunday, 28-Aug-2016 00:02:06 EDT |
| Expected /digitalstrategy.json URL | http://www.gsa.gov/digitalstrategy.json (From USA.gov Directory) |
|---|---|
| Resolved /digitalstrategy.json URL | http://www.gsa.gov/digitalstrategy.json |
| Redirects | |
| HTTP Status | 200 |
| Content Type | application/json |
| Valid JSON | Invalid |
| Last modified | Tuesday, 10-Dec-2013 16:50:09 EST |
| Last crawl | Sunday, 28-Aug-2016 00:02:07 EDT |
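The /digitalstrategy.json result above shows that an HTTP 200 status and an `application/json` content type do not guarantee a parseable document. A minimal sketch of that distinction follows; the function names `is_valid_json` and `classify_response` and the result labels are hypothetical, and the checks are an assumption about how such a crawl might classify a response, not the dashboard's implementation.

```python
import json


def is_valid_json(body: str) -> bool:
    """Return True if the response body parses as JSON."""
    try:
        json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


def classify_response(status: int, content_type: str, body: str) -> str:
    """Classify a fetched file using status, Content-Type, and body (labels are illustrative)."""
    if status != 200:
        return "unreachable"
    if "json" not in content_type.lower():
        return "wrong content type"
    return "valid" if is_valid_json(body) else "invalid"
```

For instance, a body with a trailing comma such as `{"a": 1,}` served with status 200 and `application/json` is classified as `invalid`, which is the same combination reported for /digitalstrategy.json.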