Office of Personnel Management
Enterprise Data Inventory - Volume and composition over time
Milestone 13 - November 30th 2016
OMB Review In Progress: OMB is currently reviewing the agency for this milestone. This review status indicator will change once the review is complete.
Leading Indicators
These indicators are reviewed by the Office of Management and Budget
Review Status | in-progress |
---|---|
Reviewer | Bryant Renaud |
Last Updated | February 7, 2017, 12:41 pm EST by Bryant Renaud |
Assessment Summary
Public Engagement is Red: Agency is working to develop a better two-way feedback mechanism; this is a good roadmap, and once the mechanism is available, this metric can change to green.
Inventory Composition
Public Dataset Status
Dataset Link Quality
Status | Indicator | Automated Metrics | ||
---|---|---|---|---|
Overall Progress this Milestone | ||||
Inventory Updated this Quarter | ||||
702 | Number of Datasets | |||
6 | Number of APIs | |||
1 | Bureaus represented | |||
100.0% | Percentage of bureaus represented | |||
8 | Programs represented | |||
72.7% | Percentage of programs represented | |||
603 | Number of public datasets | |||
56 | Number of restricted public datasets | |||
43 | Number of non-public datasets | |||
0.0% | Percentage growth in records since last quarter | |||
To a very great extent (>75%) | To what extent is your agency’s Enterprise Data Inventory (EDI) complete? | |||
See below | What steps have you taken to ensure your Enterprise Data Inventory is complete | |||
OPM has worked with its program offices to identify, collect, and add appropriate data sets to the OPM EDI. This is an ongoing process, as OPM continues to update the EDI so that the data provides the most value to the public. | ||||
Agency provides a public Enterprise Data Inventory on Data.gov | ||||
Agency provided updated Enterprise Data Inventory to OMB | ||||
100.0% | License specified | Crawl details | ||
Number of datasets with redactions | ||||
0.0% | Percent of datasets with redactions |
Status | Indicator | Automated Metrics |
---|---|---|
Overall Progress this Milestone | ||
702 | Number of Datasets | Crawl details |
34 | Number of Collections | Crawl details |
186 | Number of datasets not contained in a collection | Crawl details |
497 | Number of Public Datasets with File Downloads | Crawl details |
6 | Number of APIs | Crawl details |
6 | Number of public APIs | Crawl details |
Number of restricted public APIs | Crawl details | |
Number of non-public APIs | Crawl details | |
788 | Total number of access and download links | Crawl details |
Quality Check: Links are sufficiently working | Crawl details | |
91 | Quality Check: Accessible links | Crawl details |
643 | Quality Check: Redirected links | Crawl details |
12 | Quality Check: Error links | Crawl details |
42 | Quality Check: Broken links | Crawl details |
81.3% | Quality Check: Percentage of download links in correct format as specified in metadata | Crawl details |
5.5% | Quality Check: Percentage of download links in HTML | Crawl details |
8.8% | Quality Check: Percentage of download links in PDF | Crawl details |
0.0% | Percentage growth in records since last quarter | |
100% | Valid Metadata | Crawl details |
/data exists | Crawl details | |
Provides datasets in human-readable form on /data | ||
/data.json | Crawl details | |
Harvested by data.gov | ||
603 | Number of public datasets | Crawl details |
56 | Number of restricted public datasets | Crawl details |
43 | Number of non-public datasets | Crawl details |
0.0% | Percent growth of public datasets | |
0.0% | Percent growth of restricted public datasets | |
0.0% | Percent growth of non-public datasets | |
Percent datasets licensed as U.S. Public Domain | ||
Percent datasets licensed as Creative Commons Zero | ||
Percent datasets with other licenses | ||
Percent datasets with no license |
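The four link Quality Check counts in the table above can be cross-checked against the reported total of access and download links. The sketch below is only an illustration using the figures copied from this page; the names `link_checks`, `TOTAL_LINKS`, and `share` are hypothetical and are not part of the dashboard's own tooling.

```python
# Link-check counts copied from the Quality Check rows above.
link_checks = {
    "accessible": 91,
    "redirected": 643,
    "error": 12,
    "broken": 42,
}
TOTAL_LINKS = 788  # "Total number of access and download links"


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# The four outcome buckets should account for every link that was checked.
checked = sum(link_checks.values())
shares = {name: share(n, TOTAL_LINKS) for name, n in link_checks.items()}
failing = link_checks["error"] + link_checks["broken"]

print("buckets add up to total:", checked == TOTAL_LINKS)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"error or broken: {share(failing, TOTAL_LINKS)}%")
```

On these figures the four buckets sum to 788, and most links resolve through a redirect rather than directly.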
Status | Indicator | Automated Metrics | ||
---|---|---|---|---|
Overall Progress this Milestone | ||||
Description of feedback mechanism delivered | Crawl details | |||
Data release is prioritized through public engagement | ||||
Provided narrative evidence of data improvements based on public feedback this quarter | ||||
Feedback loop is closed, 2 way communication | ||||
See below | Link to or description of Feedback Mechanism | |||
On both http://www.opm.gov/digitalstrategy and http://www.opm.gov/data, we have placed links to the open government topic area in our frequently asked questions system. Users can submit questions or requests for data using that system. OPM's open government staff redirects these requests to the proper program offices. The program office staff are responsible for responding to these requests. OPM's Data Governance Board considers remaining requests monthly and works with program offices to release that data as practicable. | ||||
Provides valid contact point information for all datasets |
Status | Indicator | Automated Metrics | ||
---|---|---|---|---|
Overall Progress this Milestone | ||||
Data Publication Process Delivered | Crawl details | |||
Information that should not be made public is documented with agency's OGC | ||||
See below | Describe the agency's data publication process | |||
The Data Governance Board (DGB) oversees the processes of inventorying and releasing data. The members of the DGB are technically competent and represent OPM's major data owners. We are currently focusing on inventorying our data and are temporarily accepting program offices' assessments of the public access level of their data assets. However, as we further open our data, when a program office deems that the data's access level should be public, the DGB will offer a determination as to whether the data could be harmful because of the mosaic effect (i.e., two or more independently harmless data sets could be compared or unified to inadvertently identify an individual or otherwise cause harm) or other potential negative consequence that was unforeseen by the program office. If the DGB does not find any such potential harms, it will recommend to OPM's Investment Review Board (IRB) that the data be released. The IRB, in turn, will make a recommendation to OPM's Office of the General Counsel, which will make the final determination. Any program office that labels its data restricted public or non-public is required to provide justification. Reasons for restricting some of the data but not all of it (restricted public) fall primarily into the category of personally identifiable information, or PII. Social Security numbers are the most obvious example, but information about law enforcement or homeland security personnel that could make them easy to target would also be off-limits. In these cases, we will release non-attributable or less granular data at the appropriate time. Reasons for restricting all of the data (non-public) will fall primarily into the category of security. For example, certain data about Continuity of Operations (COOP) are not releasable because they could compromise the agency's ability to operate securely in a national emergency. |
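The review chain described in the publication process (program office, then DGB, then IRB, then Office of the General Counsel) can be pictured as a short sequence of gates. The sketch below is a loose illustration, not OPM's actual system: the `ReleaseRequest` class, its field names, and the outcome strings are all hypothetical, and the handling of a negative IRB recommendation is an assumption, since the description does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReleaseRequest:
    """Hypothetical record of one data asset moving through review."""
    dataset: str
    proposed_access_level: str          # "public", "restricted public", "non-public"
    justification: Optional[str] = None  # required when the level is not "public"
    dgb_found_harm: bool = False         # e.g. mosaic effect identified by the DGB
    irb_recommends: bool = True
    ogc_approves: bool = True


def review(req: ReleaseRequest) -> str:
    """Walk a request through the gates in the order the process describes."""
    if req.proposed_access_level != "public":
        # Restricted public and non-public labels must be justified.
        if not req.justification:
            return "returned to program office: justification required"
        return f"held as {req.proposed_access_level}"
    if req.dgb_found_harm:
        return "stopped at DGB: potential harm identified"
    if not req.irb_recommends:
        # Assumption: a negative IRB recommendation halts the release.
        return "stopped at IRB: release not recommended"
    if not req.ogc_approves:
        return "stopped at OGC: release denied"
    return "released"


print(review(ReleaseRequest("example-a", "public")))
print(review(ReleaseRequest("example-b", "non-public")))
print(review(ReleaseRequest("example-c", "public", dgb_found_harm=True)))
```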
Automated Metrics
These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter, at which point they become a historical snapshot.
data.json
Expected Data.json URL | http://www.opm.gov/data.json (From USA.gov Directory) |
---|---|
Resolved Data.json URL | http://www.opm.gov/data.json |
Number of Redirects | |
HTTP Status | 200 |
Content Type | application/javascript |
Valid JSON | Valid |
Detected Data.json Schema | federal-v1.1 |
Datasets with Valid Metadata | 100% (702 of 702) |
Valid Schema | Valid |
Datasets | 702 |
Number of Collections | 34 |
Number of datasets not in a collection | 186 |
Datasets with Distribution URLs | 70.8% (497 of 702) |
Datasets with Download URLs | 69.7% (489 of 702) |
Total Distribution URLs | 788 |
Total Download URLs | 556 |
Total APIs | 6 |
Public APIs | 6 |
Restricted Public APIs | 0 |
Non-public APIs | 0 |
Public Datasets | 603 |
Restricted Public Datasets | 56 |
Non-public Datasets | 43 |
Normally there would be a set of quality assurance fields here to verify that the download links included within the metadata are functioning properly, but the results of those tests are not currently available. | |
Bureaus Represented | 1 |
Programs Represented | 8 |
License Specified | 100% (702 of 702) |
Datasets with Redactions | 0.0% (0 of 702) |
Redactions without explanation (rights field) | 0.0% (0 of 702) |
File Size | 1.41MB |
Last modified | Friday, 11-Sep-2015 11:53:48 EDT |
Last crawl | Wednesday, 30-Nov-2016 23:03:50 EST |
Analyze archive copies | Analyze archive from 2016-11-30 |
Nearby Daily Crawls |
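Several of the data.json figures above are simple counts over the catalog: the three access levels sum to the dataset total (603 + 56 + 43 = 702), and the distribution percentages are ratios against that total (497 of 702 is 70.8%). The sketch below shows one way such counts could be recomputed from a parsed data.json file. It relies on the `dataset`, `accessLevel`, `distribution`, `downloadURL`, and `accessURL` fields of the v1.1 metadata schema; the function name, the sample catalog, and the rule for what counts as a "distribution URL" are assumptions, not the dashboard's actual logic.

```python
from collections import Counter


def summarize_catalog(catalog: dict) -> dict:
    """Recompute a few dashboard-style counts from a parsed data.json catalog."""
    datasets = catalog.get("dataset", [])
    total = len(datasets)

    # accessLevel is "public", "restricted public", or "non-public" in schema v1.1.
    access_levels = Counter(d.get("accessLevel", "unspecified") for d in datasets)

    with_distribution = 0
    with_download = 0
    for d in datasets:
        distributions = d.get("distribution") or []
        # Assumption: a dataset counts if any distribution carries either kind of URL.
        if any(x.get("downloadURL") or x.get("accessURL") for x in distributions):
            with_distribution += 1
        if any(x.get("downloadURL") for x in distributions):
            with_download += 1

    def pct(n: int) -> float:
        return round(100.0 * n / total, 1) if total else 0.0

    return {
        "datasets": total,
        "access_levels": dict(access_levels),
        "with_distribution_pct": pct(with_distribution),
        "with_download_pct": pct(with_download),
    }


# Hypothetical four-entry catalog, for illustration only.
sample = {
    "dataset": [
        {"identifier": "a", "accessLevel": "public",
         "distribution": [{"downloadURL": "https://example.gov/a.csv"}]},
        {"identifier": "b", "accessLevel": "public",
         "distribution": [{"accessURL": "https://example.gov/b"}]},
        {"identifier": "c", "accessLevel": "restricted public"},
        {"identifier": "d", "accessLevel": "non-public"},
    ]
}
summary = summarize_catalog(sample)
print(summary)
```

To run it against a real catalog, load the downloaded file with `json.load` and pass the resulting dictionary to `summarize_catalog`.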
Expected /data URL | http://www.opm.gov/data (From USA.gov Directory) |
---|---|
Resolved /data URL | http://www.opm.gov/data/ |
Redirects | 1 redirect |
HTTP Status | 200 |
Content Type | text/html; charset=utf-8 |
Last crawl | Wednesday, 30-Nov-2016 23:03:45 EST |
Expected /digitalstrategy.json URL | http://www.opm.gov/digitalstrategy.json (From USA.gov Directory) |
---|---|
Resolved /digitalstrategy.json URL | http://www.opm.gov/digitalstrategy.json |
Redirects | |
HTTP Status | 200 |
Content Type | application/javascript |
Valid JSON | Valid |
Last modified | Friday, 11-Sep-2015 11:53:48 EDT |
Last crawl | Wednesday, 30-Nov-2016 23:03:45 EST |
Date specified: Monday, 16-Mar-2015 00:00:00 EDT
Date of digitalstrategy.json file: Friday, 11-Sep-2015 11:53:48 EDT
1.2.4 Develop Data Inventory Schedule - Summary
Summarize the Inventory Schedule
Our open data leads continue to work with program offices to document data definitions and standards and publish releasable metadata. We have worked one-on-one with these offices to obtain the metadata and spread awareness about the importance of providing open data in machine-readable formats. Now that our inventory has begun to reach a good level of maturity, we are enriching and opening our inventory by providing more detailed metadata, better organizing our inventory, and providing more data to the public in machine-readable formats. One example of better organization was made possible with the late 2014 release of version 1.1 of the metadata schema: parent-child relationships. We are taking full advantage of this new feature. Since completing the initial inventory in November 2014 and moving to version 1.1 of the schema in February 2015, we have begun to turn our attention to providing machine-readable versions of data that we currently make available only in formats such as PDF. However, we continue to identify data assets that have fallen through the cracks and add them to the inventory. Besides maintaining the inventory, we will conduct qualitative analysis to develop categories among the data assets and modify our Metadata Repository (MDR) to incorporate these categories and accommodate the data assets.
1.2.5 Develop Data Inventory Schedule - Milestones
Title | Data asset identification |
---|---|
Description | Identification of data assets from throughout OPM |
Milestone Date | 2014-02-28 |
Description of how this milestone expands the Inventory | Increased the number of data assets in the inventory |
Description of how this milestone enriches the Inventory | Provided a broader picture of OPM's data |
Description of how this milestone opens the Inventory | Created a foundation for public release of data |
Title | Wider data asset identification |
---|---|
Description | Identification of data assets at a more granular level and from a wider range of programs |
Milestone Date | 2014-05-31 |
Description of how this milestone expands the Inventory | Increased the number of data assets in the inventory |
Description of how this milestone enriches the Inventory | Provided an even broader and deeper picture of OPM's data |
Description of how this milestone opens the Inventory | Built a stronger foundation for public release of data |
Title | Targeted asset identification |
---|---|
Description | Identification of missing data assets and targeting them for inclusion |
Milestone Date | 2013-08-30 |
Description of how this milestone expands the Inventory | Increased the number of data assets in the inventory |
Description of how this milestone enriches the Inventory | Continued to expand our understanding of OPM's data |
Description of how this milestone opens the Inventory | Helped identify data for immediate or future release |
Title | Complete inventory |
---|---|
Description | Release of the complete inventory |
Milestone Date | 2014-11-30 |
Description of how this milestone expands the Inventory | Brought stragglers into the fold |
Description of how this milestone enriches the Inventory | Provided a nearly full understanding of OPM's data |
Description of how this milestone opens the Inventory | Made as much of the full inventory available to the public as legally appropriate and practicable |
Title | Metadata Schema v. 1.1 |
---|---|
Description | Moved the inventory files to v. 1.1 of the metadata schema. In the process, designated some entries as parents and created hundreds of children, along with parent-child relationships |
Milestone Date | 2015-02-28 |
Description of how this milestone expands the Inventory | Provided individual entries for child datasets |
Description of how this milestone enriches the Inventory | Parent-child relationships show how one dataset connects to another. This change helps the public better understand OPM's data |
Description of how this milestone opens the Inventory | Provided more links to OPM's publicly available data, organized in a useful way |
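In version 1.1 of the metadata schema, a child dataset points to its parent through the `isPartOf` field, which holds the parent's `identifier`. The sketch below shows how parent-child relationships of this kind could be grouped into collections and how datasets outside any collection could be listed. The function name and the sample identifiers are hypothetical, and the dashboard's own counting rules for collections may differ.

```python
from collections import defaultdict


def group_collections(datasets):
    """Group child identifiers under the parent named in their isPartOf field."""
    children = defaultdict(list)
    for d in datasets:
        parent_id = d.get("isPartOf")
        if parent_id:
            children[parent_id].append(d["identifier"])

    # A dataset belongs to a collection if it is a parent or a child.
    in_a_collection = set(children)
    for child_ids in children.values():
        in_a_collection.update(child_ids)

    standalone = [d["identifier"] for d in datasets
                  if d["identifier"] not in in_a_collection]
    return dict(children), standalone


# Hypothetical entries, for illustration only.
datasets = [
    {"identifier": "OPM-EX-1"},
    {"identifier": "OPM-EX-1a", "isPartOf": "OPM-EX-1"},
    {"identifier": "OPM-EX-1b", "isPartOf": "OPM-EX-1"},
    {"identifier": "OPM-EX-2"},
]
collections, standalone = group_collections(datasets)
print(len(collections), "collection(s);", len(standalone), "dataset(s) not in a collection")
```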
Title | Deeper Look into Employee Services |
---|---|
Description | Focused on Employee Services data to fill in holes about areas such as employee relations, labor relations, veterans employment, and official time for unions representing federal employees |
Milestone Date | 2015-05-31 |
Description of how this milestone expands the Inventory | Adds entries regarding Employee Services |
Description of how this milestone enriches the Inventory | Provides a fuller understanding of OPM's government-wide policies |
Description of how this milestone opens the Inventory | Federal job series data are now available in a machine-readable format (CSV), making the data easier to work with |
1.2.6 Develop Customer Feedback Process
Describe the agency's process to engage with customers
On both http://www.opm.gov/digitalstrategy and http://www.opm.gov/data, we have placed links to the open government topic area in our frequently asked questions system. Users can submit questions or requests for data using that system. OPM's open government staff redirects these requests to the proper program offices. The program office staff are responsible for responding to these requests. OPM's Data Governance Board considers remaining requests monthly and works with program offices to release that data as practicable.
1.2.7 Develop Data Publication Process
Describe the agency's data publication process
The Data Governance Board (DGB) oversees the processes of inventorying and releasing data. The members of the DGB are technically competent and represent OPM's major data owners. We are currently focusing on inventorying our data and are temporarily accepting program offices' assessments of the public access level of their data assets. However, as we further open our data, when a program office deems that the data's access level should be public, the DGB will offer a determination as to whether the data could be harmful because of the mosaic effect (i.e., two or more independently harmless data sets could be compared or unified to inadvertently identify an individual or otherwise cause harm) or other potential negative consequence that was unforeseen by the program office. If the DGB does not find any such potential harms, it will recommend to OPM's Investment Review Board (IRB) that the data be released. The IRB, in turn, will make a recommendation to OPM's Office of the General Counsel, which will make the final determination. Any program office that labels its data restricted public or non-public is required to provide justification. Reasons for restricting some of the data but not all of it (restricted public) fall primarily into the category of personally identifiable information, or PII. Social Security numbers are the most obvious example, but information about law enforcement or homeland security personnel that could make them easy to target would also be off-limits. In these cases, we will release non-attributable or less granular data at the appropriate time. Reasons for restricting all of the data (non-public) will fall primarily into the category of security. For example, certain data about Continuity of Operations (COOP) are not releasable because they could compromise the agency's ability to operate securely in a national emergency.
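The mosaic effect mentioned above can be made concrete with a small linkage test: two releases that each look harmless are joined on the columns they share, and any row that matches exactly one person is effectively re-identified. The sketch below is a minimal illustration under stated assumptions. All records, column names, and the choice of quasi-identifiers are invented for the example and do not describe any OPM data set or the DGB's actual review method.

```python
from collections import Counter

# Assumed quasi-identifiers: columns that appear in both hypothetical releases.
QUASI_IDENTIFIERS = ("duty_station", "grade", "hire_year")


def key_of(row, keys=QUASI_IDENTIFIERS):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(row[k] for k in keys)


def link_unique_matches(release_a, release_b, keys=QUASI_IDENTIFIERS):
    """Link records whose quasi-identifier values are unique in both releases."""
    counts_a = Counter(key_of(r, keys) for r in release_a)
    counts_b = Counter(key_of(r, keys) for r in release_b)
    by_key_b = {key_of(r, keys): r for r in release_b}
    linked = []
    for row in release_a:
        k = key_of(row, keys)
        if counts_a[k] == 1 and counts_b[k] == 1:
            linked.append({**by_key_b[k], **row})
    return linked


# Hypothetical release A: salaries with no names.
salaries = [
    {"duty_station": "Boise", "grade": "GS-13", "hire_year": 2009, "salary": 98000},
    {"duty_station": "Denver", "grade": "GS-11", "hire_year": 2012, "salary": 71000},
    {"duty_station": "Denver", "grade": "GS-11", "hire_year": 2012, "salary": 73000},
]
# Hypothetical release B: a directory with names but no salaries.
directory = [
    {"name": "Employee A", "duty_station": "Boise", "grade": "GS-13", "hire_year": 2009},
    {"name": "Employee B", "duty_station": "Denver", "grade": "GS-11", "hire_year": 2012},
    {"name": "Employee C", "duty_station": "Denver", "grade": "GS-11", "hire_year": 2012},
]

linked = link_unique_matches(salaries, directory)
for record in linked:
    print(record["name"], "->", record["salary"])
```

In this toy data only the Boise record is unique in both releases, so only that person's salary is exposed; the two Denver records stay ambiguous. Releasing less granular data, as the process describes, works by making such unique combinations rarer.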