Department of Health and Human Services

http://www.hhs.gov/

Milestone 4 - August 31st 2014

OMB Review In Progress: OMB is currently reviewing the agency for this milestone. This review status indicator will change once the review is complete.

Leading Indicators

These indicators are reviewed by the Office of Management and Budget

Review Status in-progress
Reviewer Amelie Koran
Last Updated October 3, 2014, 10:08 am EDT by paulOMB

Assessment Summary

Inventory Composition

Public Dataset Status

Status Indicator Automated Metrics
Overall Progress this Milestone
Inventory Updated this Quarter
1068 Number of Datasets
Number of APIs
Schedule Delivered
10/13 Bureaus represented
11 Programs represented
1068 Number of public datasets
Number of restricted public datasets
Number of non-public datasets
Inventory > Public listing
-19.6 Percentage growth in records since last quarter
Schedule Risk for Nov 30, 2014
Spot Check - datasets listed by search engine
Agency provides a public Enterprise Data Inventory on Data.gov
License specified

Negative growth in PDL this quarter.

Status Indicator Automated Metrics
Overall Progress this Milestone
1068 Number of Datasets
Number of Collections
1042 Number of Public Datasets with File Downloads
Number of APIs
Total number of access and download links
Quality Check: Links are sufficiently working
Quality Check: Accessible links
Quality Check: Redirected links
Quality Check: Error links
Quality Check: Broken links
-29.1 Percentage growth in records since last quarter
96.7 Valid Metadata
/data exists
/data.json
Harvested by data.gov
Views on data.gov for the quarter

No feedback mechanism was found on the digital strategy page.

Status Indicator Automated Metrics
Overall Progress this Milestone
Description of feedback mechanism delivered
Data release is prioritized through public engagement
Feedback loop is closed, 2 way communication
See below Link to or description of Feedback Mechanism
www.healthdata.gov/ideas
Status Indicator Automated Metrics
Overall Progress this Milestone
Data Publication Process Delivered
Information that should not be made public is documented with agency's OGC
Status Indicator Automated Metrics
Overall Progress this Milestone
Damon Davis Open Data Primary Point of Contact
POCs identified for required responsibilities
Status Indicator Automated Metrics
Overall Progress this Milestone
Identified 5 data improvements for this quarter
Primary Uses
Value or impact of data
Primary data discovery channels
User suggestions on improving data usability
User suggestions on additional data releases
Digital Analytics Program on /data

Automated Metrics

These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter at which point they become a historical snapshot

data.json
Expected Data.json URL http://www.hhs.gov/data.json (From USA.gov Directory)
Resolved Data.json URL http://hub.healthdata.gov/data.json
Number of Redirects 2 redirects
HTTP Status 200
Content Type application/json
Valid JSON Valid
Datasets with Valid Metadata 96.7% (1033 of 1068)
Valid Schema Invalid
For more complete and readable validation results, see the full schema validator results
Schema Errors There are validation errors on 35 records

Only showing errors from the first 10 records:

Errors on record 734:
  • the property bureauCode is required
  • the property programCode is required
Errors on record 737:
  • the property bureauCode is required
  • the property programCode is required
Errors on record 777:
  • the property bureauCode is required
  • the property programCode is required
Errors on record 778:
  • the property bureauCode is required
  • the property programCode is required
Errors on record 793:
  • the property bureauCode is required
  • the property programCode is required
Errors on record 794:
  • the property bureauCode is required
  • the property programCode is required
Errors on record 795:
  • the property bureauCode is required
  • the property programCode is required
Errors on record 800:
  • the property bureauCode is required
  • the property programCode is required
Errors on record 802:
  • the property bureauCode is required
  • the property programCode is required
Errors on record 803:
  • the property bureauCode is required
  • the property programCode is required
Datasets 1068
Datasets with Distribution URLs 97.6% (1042 of 1068)
Total Distribution URLs 2044
Public Datasets 1068
Restricted Public Datasets 0
Non-public Datasets 0
Bureaus Represented 11
Programs Represented 11
File Size 1.76MB
Last modified Friday, 29-Aug-2014 22:08:54 EDT
Last crawl Saturday, 30-Aug-2014 00:00:49 EDT
Analyze archive copies Analyze archive from 2014-08-31
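
The validation figures above can be illustrated with a small check. This is a minimal sketch, not the dashboard's actual validator: it assumes data.json has already been parsed into a list of dataset records (dicts), it checks only the two properties named in the errors above (bureauCode and programCode), and it numbers records from zero, which may differ from the dashboard's numbering. The function names and sample records are hypothetical.

```python
REQUIRED = ("bureauCode", "programCode")  # only the two properties reported above


def missing_properties(record):
    """Return the required property names that are absent or empty in one record."""
    return [name for name in REQUIRED if not record.get(name)]


def summarize(records):
    """Return (errors keyed by record index, number valid, percent valid)."""
    errors = {}
    for index, record in enumerate(records):
        missing = missing_properties(record)
        if missing:
            errors[index] = missing
    valid = len(records) - len(errors)
    percent = round(100.0 * valid / len(records), 1) if records else 0.0
    return errors, valid, percent


# Hypothetical sample records, not taken from the HHS file.
sample = [
    {"title": "A", "bureauCode": ["009:00"], "programCode": ["009:000"]},
    {"title": "B"},
    {"title": "C", "bureauCode": ["009:00"]},
]
errors, valid, percent = summarize(sample)
for index, missing in errors.items():
    print(f"Errors on record {index}:")
    for name in missing:
        print(f"  the property {name} is required")
```

Applied to the counts reported above, the same rounding gives round(100.0 * 1033 / 1068, 1) = 96.7 for valid metadata and round(100.0 * 1042 / 1068, 1) = 97.6 for datasets with distribution URLs.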
/data page
Expected /data URL http://www.hhs.gov/data (From USA.gov Directory)
Resolved /data URL http://hub.healthdata.gov/pod/data-catalog
Redirects 2 redirects
HTTP Status 200
Content Type text/html; charset=utf-8
Last crawl Saturday, 30-Aug-2014 00:00:50 EDT
/digitalstrategy.json
Expected /digitalstrategy.json URL http://www.hhs.gov/digitalstrategy.json (From USA.gov Directory)
Resolved /digitalstrategy.json URL http://www.hhs.gov/digitalstrategy.json
Redirects
HTTP Status 200
Content Type text/plain
Valid JSON Valid
Last modified Friday, 29-Aug-2014 07:33:21 EDT
Last crawl Saturday, 30-Aug-2014 00:00:50 EDT
Digital Strategy

Date specified: Tuesday, 26-Aug-2014 15:05:31 EDT

Date of digitalstrategy.json file: Friday, 29-Aug-2014 07:33:21 EDT

1.2.4 Develop Data Inventory Schedule - Summary

Summarize the Inventory Schedule


Described in the HHS Health Data Initiative Strategy & Execution Plan. http://www.healthdata.gov/blog/health-data-initiative-strategy-execution-plan-released-and-ready-feedback#6xTGucIWabDtqRRe.99

1.2.5 Develop Data Inventory Schedule - Milestones

Title: Health Data Initiative Strategy & Execution Plan
Description: The HHS Health Data Initiative Strategy & Execution Plan is a date-driven, metrics-based living document that details the strategies and execution plans for the Department’s Health Data Initiative (HDI). The HDI plan describes the steps that HHS and other contributors will take to expand, enrich, and open the vast catalog of data resources in the department and across the health care and human services ecosystem. Read more at http://www.healthdata.gov/blog/health-data-initiative-strategy-execution-plan-released-and-ready-feedback#6xTGucIWabDtqRRe.99
Milestone Date: Ongoing
Description of how this milestone expands the Inventory: Described in the HHS Health Data Initiative Strategy & Execution Plan. http://www.healthdata.gov/blog/health-data-initiative-strategy-execution-plan-released-and-ready-feedback#6xTGucIWabDtqRRe.99
Description of how this milestone enriches the Inventory: Described in the HHS Health Data Initiative Strategy & Execution Plan. http://www.healthdata.gov/blog/health-data-initiative-strategy-execution-plan-released-and-ready-feedback#6xTGucIWabDtqRRe.99
Description of how this milestone opens the Inventory: Described in the HHS Health Data Initiative Strategy & Execution Plan. http://www.healthdata.gov/blog/health-data-initiative-strategy-execution-plan-released-and-ready-feedback#6xTGucIWabDtqRRe.99

1.2.6 Develop Customer Feedback Process

Describe the agency's process to engage with customers


The now three-year-old Health Data Initiative (HDI), the collective effort to release vast stores of data for innovation, has at its core a mission to help improve health, healthcare, and the delivery of human services by harnessing the power of data and fostering a culture of innovative uses of data in public and private sector institutions, communities, research groups and policy making arenas.  The HDI’s goal is to make health data openly available, disseminate the data broadly across the health and human services ecosystem, and continuously educate internal and external participants in the ecosystem about the value of the data.   

HealthData.gov (www.healthdata.gov) serves as the discovery resource for those data assets, as well as a platform for communications, commentary on, and feedback about the data to improve the public’s understanding of each data set. The platform helps new data users discover resources they may not otherwise know exist.  This site is a flexible platform that acts as a discovery resource for new and seasoned users across the healthcare ecosystem from researchers to tech/developers, and healthcare professionals to academia.  Any organization or individual is free to employ the data to solve problems in the transformation of our nation’s healthcare system through data driven innovations in areas like: research; technology development; healthcare delivery; academia; policy making; human services delivery.

Methods for Customer Feedback and Public Engagement

This document describes how HHS is identifying and engaging with key data customer groups like these to help expand the value of our health data assets and prioritize the release of new data. To assist that prioritization, HHS intends to capitalize on the quantity and quality of user demand it receives through various feedback channels, and to focus on identifying strategically relevant data assets (SRDA) tied directly to HHS’s articulated strategic goals. To ensure the customer feedback loops are meaningful and robust, HHS will regularly review feedback processes and refine them as opportunities and challenges present themselves.

Here are some of the ways the HHS HDI seeks opportunities for public engagement:
*HealthData.gov (www.healthdata.gov)
Through this catalog, data is available in multiple formats for maximum utilization by health care ecosystem participants. Both human-readable and machine-readable formats are accessible, and they are spawning and feeding key transformations across health care and the delivery of human services. HHS is working to make broader volumes of machine-readable data available.
The “Ideas” tab (www.healthdata.gov/ideas) on the site is designed to invite the public to provide feedback to HHS. An idea could be anything from a submission of data that you’d like to see cataloged on the platform, to ways you’d like to see the site improved, or suggestions for communications about data assets and their uses. These submissions are very informative for our data liberation strategy, so send in your great ideas! The section is divided into “Most Recent” submissions, which are ordered by the date the idea was posted, and “Most Popular” submissions, which are ranked by the number of public votes each idea has received. Each idea can be voted on by the public using a five (5) star rating system (one (1) is the lowest rating, five (5) is the highest).
The “Q & A” tab (www.healthdata.gov/questions-answers) offers users an opportunity to ask questions and receive answers from HDI staff about the data. HHS is working to associate a direct point of contact, by name and email address, with each data set listing in the catalog. This will allow direct interactions about the data with the experts who have cataloged it. The tab is similarly broken down by “Most Recent” and “Most Popular”.
Our Blog (www.healthdata.gov/blog) offers a robust source of information about the HDI’s activities including the availability of new data, some of the creative and innovative uses of health data, and the technological advancement of healthcare and human services delivery supported by data’s broadening availability.  
The HDI staff makes every attempt to address ideas, questions and answers, and blog responses in a timely fashion.  

*Health Datapalooza! (www.healthdatapalooza.org) – This perennial health data event is a favorite among entrepreneurs, innovators, policy makers, data geeks, researchers and more. The Health Data Initiative is widely represented during this event, which is put on by the Health Data Consortium (www.healthdataconsortium.org), a public-private partnership between government, non-profit, and private sector organizations working to foster the availability and innovative use of data to improve health and health care. HHS welcomes this opportunity to engage face to face with the many innovators that are using, or seeking to use, publicly available sources to support their work and initiatives.

*Social Media – More than ever before topical conversations are occurring through social media and the open health data movement is no exception.  
You can follow the Health Data Initiative on Twitter @HealthDataGov.  From this account you will see announcements about new opportunities, new blog posts, and information from others on Twitter that we think is important (which could also be something you post).   Be sure to follow us and remember if you have a question, just ask!

Join the health data community online using the Facebook page U.S. Department of Health and Human Services Innovations! https://www.facebook.com/pages/US-Department-of-Health-and-Human-Services-Innovations/443533355685557?_fb_noscript=1  Twitter is great, but sometimes what we have to say needs more than 140 characters.   On our page you will find highlights coming from the HDI, but more importantly, it is an avenue in which you can engage directly with us.

The HDI is also participating in communities on LinkedIn, bringing the information directly to you in established LinkedIn Groups that have been working in the areas we care about the most. Look for us in various communities on LinkedIn.
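
The two orderings described for the “Ideas” tab above ("Most Recent" by posting date, "Most Popular" by number of public votes, with each vote a one-to-five star rating) can be sketched in a few lines. This is only an illustration under assumptions: the Idea class, its field names, and the sample ideas are hypothetical and do not describe how HealthData.gov is implemented.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Idea:
    title: str
    posted: date
    ratings: list = field(default_factory=list)  # each vote is 1 (lowest) to 5 (highest)

    def rate(self, stars):
        """Record one public vote on the five-star scale."""
        if stars not in (1, 2, 3, 4, 5):
            raise ValueError("rating must be a whole number from 1 to 5")
        self.ratings.append(stars)


def most_recent(ideas):
    """Order ideas by the date they were posted, newest first."""
    return sorted(ideas, key=lambda idea: idea.posted, reverse=True)


def most_popular(ideas):
    """Rank ideas by the number of public votes received, most votes first."""
    return sorted(ideas, key=lambda idea: len(idea.ratings), reverse=True)


# Hypothetical sample ideas for illustration only.
ideas = [
    Idea("Catalog more Medicaid data", date(2014, 8, 1)),
    Idea("Improve site search", date(2014, 8, 20)),
]
ideas[0].rate(5)
ideas[0].rate(4)
ideas[1].rate(3)

print([idea.title for idea in most_recent(ideas)])
print([idea.title for idea in most_popular(ideas)])
```

Ranking by vote count follows the description above; a site could equally rank by average stars, which this sketch does not attempt.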

1.2.7 Develop Data Publication Process

Describe the agency's data publication process


The HealthData.gov (www.healthdata.gov) online platform is the central data access point and communications vehicle for the HHS Health Data Initiative (HDI), offering access to, dissemination of, and bi-directional communications about health data from HHS and other sources. The HDI is the collective effort to release vast stores of health data for innovation, with a mission of improving health, healthcare, and the delivery of human services by harnessing the power of data and fostering a culture of innovative uses of data in the public and private sectors. The goal is therefore for the platform to be a highly useful, reliable means of sharing datasets and fostering innovation. HealthData.gov will continue to be at the forefront of HDI’s efforts to create a discovery zone for HHS and other health data.
 
Supporting the promotion of data to the platform are Health Data Leads, liaisons representing each division across the Department who are contributors to the execution of the Health Data Initiative Strategy and Execution Plan (http://www.healthdata.gov/blog/health-data-initiative-strategy-execution-plan-released-and-ready-feedback).  Each lead relies on a cadre of colleagues within their division to proactively discover and catalog data resources from the various research projects, surveys, contracts or other mechanisms for data generation and curation.   This document describes the data promotion process and workflow that Health Data Leads and their teams execute at HHS.


Data Promotion Process

Here, it is assumed that each division has completed its internal process for identifying data to be cataloged and is ready to begin the process for data promotion. The first step in the workflow is therefore the completion of a metadata template for the catalog entry. The template requires data elements including:

1.	Title: Descriptive title for the data asset to be displayed on 
2.	HHS Group:  The Operating or Staff division within HHS responsible for the data in this catalogue entry.
3.	Description: A detailed description of the dataset or tool (e.g., an abstract) that helps a user to determine the nature and purpose of the data. 
4.	Privacy and Information Quality Certification – Attestations that the data submission meets the agency’s standards for privacy and information quality. 
5.	Author Information and Agency – Includes fields for: 
	a.	HHS sub agency
	b.	Agency program URL
	c.	Data series or tool URL
	d.	Subject Area for the Data: Administrative, Biomedical Research, Children's Health,  Epidemiology, Health Care Cost, Health Care Providers, Medicaid, Medicare, Other, Population Statistics, Quality Measurement, Safety, Treatments
	e.	Additional Subject Area: Searchable keywords help users discover your datasets from different perspectives, providing ways of identifying other similar datasets.  Includes terms that would be used by both technical and non-technical users.
	f.	Date Released: Date when the dataset was first made available to the public. (Not the date the data was entered into HealthData.gov as it could have already been published on the agency website prior to being cataloged)
	g.	Date Updated: Date of last change to dataset or tool. (Note that this could be the same as the date released if the data has not changed since first being published.)
	h.	Contact Information: Name and email address for the individual making the catalogue entry.
	i.	Data Collection Date and Frequency:  N/A, Annually, Daily, Monthly, Quarterly, Semi-Weekly
	j.	Data Coverage Period: Dates covered by the data
6.	Data Documentation:
	a.	Technical Documentation URL: URL for the technical documentation for this dataset. This may include descriptions of the study design, instrumentation, implementation, limitations, and appropriate use of the dataset or tool. 
	b.	Data Dictionary URL: URL to resource containing variable names, descriptions, standard vocabularies and taxonomies, units, multipliers, etc. if different from the technical documentation URL.
	c.	Data Collection Instrument URL: URL for resource containing a copy of, or detailed descriptions of, the data collection instrument, if different from the technical documentation.
	d.	 Dataset Use Requires a License Agreement: This is a required field to ensure that license agreements are not bypassed during the one-click download interface on the website. 
	e.	 Dataset License Agreement URL: URL to the license agreement page for the dataset or tool, if there is one.
7.	Resources
	a.	Access Point URL: If the data set is downloadable, enter the URL for instant access to the downloadable data file. This is the URL for access to the data set via a "one-click download".
	b.	Media Format - In some cases files are downloaded in a compressed file (e.g. zip). 
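
As an illustration of the template above, the sketch below models a subset of its elements and runs a few consistency checks. It is a hedged example, not the HealthData.gov data model: the CatalogEntry class, its field names, the problems function, and the sample values are hypothetical, and the checks themselves (listed values are enforced, the update date is not earlier than the release date) are assumptions drawn from the descriptions above.

```python
from dataclasses import dataclass
from datetime import date

# Values copied from the subject area and frequency lists in the template above.
SUBJECT_AREAS = {
    "Administrative", "Biomedical Research", "Children's Health", "Epidemiology",
    "Health Care Cost", "Health Care Providers", "Medicaid", "Medicare", "Other",
    "Population Statistics", "Quality Measurement", "Safety", "Treatments",
}
FREQUENCIES = {"N/A", "Annually", "Daily", "Monthly", "Quarterly", "Semi-Weekly"}


@dataclass
class CatalogEntry:
    title: str
    hhs_group: str
    description: str
    subject_area: str
    date_released: date
    date_updated: date
    collection_frequency: str
    requires_license: bool
    access_point_url: str = ""


def problems(entry):
    """Return a list of human-readable problems found in one entry."""
    found = []
    if not entry.title.strip():
        found.append("title is empty")
    if entry.subject_area not in SUBJECT_AREAS:
        found.append(f"unknown subject area: {entry.subject_area}")
    if entry.collection_frequency not in FREQUENCIES:
        found.append(f"unknown collection frequency: {entry.collection_frequency}")
    if entry.date_updated < entry.date_released:
        found.append("date updated is earlier than date released")
    return found


# Hypothetical entry for illustration only.
entry = CatalogEntry(
    title="Example Survey Data",
    hhs_group="Example Division",
    description="An example abstract.",
    subject_area="Medicare",
    date_released=date(2014, 1, 15),
    date_updated=date(2014, 1, 15),  # same as release date: unchanged since publication
    collection_frequency="Annually",
    requires_license=False,
)
print(problems(entry))
```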

Data Promotion Roles

The specific roles in the workflow are Author, Editor, and Approver. The responsibilities for each role are detailed below.

Author: 
Initiates the data catalog entry based on the current version of the metadata template.  Authors strive for adherence to Plain Language writing guidance to make the entry understandable for technical audiences and the general public.  When the data appears in the live online catalog it can be easily found using key words entered in the template during this step.  They also attest the submission meets agency privacy guidelines and information quality guidelines.  Authors may assign themselves as the public's point of contact (POC) for a data set, or insert the alternate contact information for a colleague who is the POC or the subject matter expert (SME) for the data or tool.  The author typically is not the Health Data Lead, but a colleague within the same division or a contractor supporting the project where the data originates.   Once completed, the catalogue entry's status is advanced to "Editor Review".

Editor: 
Editors, typically the Health Data Lead for their HHS division, are tasked with reviewing the initial catalog entry after the author has drafted it and advanced it to "Editor Review" status.  Their review consists of checking the metadata, confirming adherence to Plain Language guidelines, adding key words, and confirming the attestations for .... . An editor may ask the author to modify or adjust a catalog entry.  Once the editor has completed their review and modifications have been made, the catalogue entry’s status is advanced to "Awaiting Approval".   

Approver: 
The Approver performs the final check on all elements of the catalog entry before it is made public on the platform. Approvers verify compliance with administrative procedures and with internal protocols for data promotion. Once the Approver has completed their review the catalogue entry’s status is advanced to “Approved”. 

Once approved, the catalogue entry appears on HealthData.gov within minutes.  Updates to already cataloged data assets are processed through the workflow again, offering the Author an opportunity to alter the metadata to accurately reflect the update and allowing the Editor to re-validate privacy and plain language compliance before the entry is re-approved.
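
The role-based workflow above can be pictured as a small state machine. The following is a sketch under assumptions rather than a description of the HealthData.gov software: the statuses "Editor Review", "Awaiting Approval", and "Approved" come from the text, while the initial status name "Draft", the TRANSITIONS table, and the advance and reopen functions are hypothetical.

```python
# (current status, acting role) -> next status
TRANSITIONS = {
    ("Draft", "Author"): "Editor Review",          # "Draft" is an assumed name
    ("Editor Review", "Editor"): "Awaiting Approval",
    ("Awaiting Approval", "Approver"): "Approved",
}


def advance(status, role):
    """Return the next status, or raise if this role cannot act at this stage."""
    try:
        return TRANSITIONS[(status, role)]
    except KeyError:
        raise ValueError(f"{role} cannot advance an entry in status {status!r}") from None


def reopen(status):
    """Send an approved entry back through the workflow for an update."""
    if status != "Approved":
        raise ValueError("only approved entries can be reopened for an update")
    return "Draft"


status = "Draft"
for role in ("Author", "Editor", "Approver"):
    status = advance(status, role)
    print(f"{role} -> {status}")
```

Keeping the transitions in one table makes it easy to see that each status has exactly one role allowed to move an entry forward.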