De-identification Processes

2018-10-24

Protected Health Information (PHI)

The HIPAA Privacy Rule defines protected health information as “information, including demographic information, which relates to:

  • the individual’s past, present, or future physical or mental health or condition,
  • the provision of health care to the individual, or
  • the past, present, or future payment for the provision of health care to the individual,

and that identifies the individual or for which there is a reasonable basis to believe can be used to identify the individual. Protected health information includes many common identifiers (e.g., name, address, birth date, Social Security Number) when they can be associated with the health information listed above.”

Safe Harbor Method

The Data Core uses the Safe Harbor Method to de-identify data sets. Per the U.S. Department of Health and Human Services, the following 18 identifiers are PHI and must be removed for a data set to comply with Safe Harbor standards:

  • Names
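  • Geographic subdivisions smaller than a state (e.g., street address, city, county, ZIP code)
  • All elements of dates (except year) that are “directly related to an individual” (e.g., birth, admission, discharge, and death dates)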
  • Telephone numbers
  • Vehicle identifiers and serial numbers
  • Fax numbers
  • Device identifiers and serial numbers
  • Email addresses
  • Web Universal Resource Locators (URLs)
  • Social Security numbers
  • Internet Protocol (IP) addresses
  • Medical record numbers
  • Biometric identifiers (e.g., finger and voice prints)
  • Health plan beneficiary numbers
  • Full-face photographs and any comparable images
  • Account numbers
  • Any other unique identifying number, characteristic, or code
  • Certificate/license numbers
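
As a rough illustration of what removing these identifiers can look like for tabular data, the sketch below drops identifier fields from a single record. The column names and the `strip_safe_harbor_fields` helper are hypothetical and are not the Data Core's actual tooling; a real data set would need its own mapping from columns to the 18 identifier categories.

```python
# Hypothetical column names; a real data set will use its own schema.
DIRECT_IDENTIFIER_COLUMNS = {
    "name", "phone", "fax", "email", "ssn", "mrn", "health_plan_id",
    "account_number", "license_number", "vehicle_id", "device_id",
    "url", "ip_address", "biometric_id", "photo", "other_unique_id",
}
# Geographic subdivisions smaller than a state.
GEOGRAPHIC_COLUMNS = {"street", "city", "county", "zip"}
# Dates directly related to an individual (dropped whole here for simplicity).
DATE_COLUMNS = {"birth_date", "admission_date", "discharge_date", "death_date"}

COLUMNS_TO_DROP = DIRECT_IDENTIFIER_COLUMNS | GEOGRAPHIC_COLUMNS | DATE_COLUMNS


def strip_safe_harbor_fields(record):
    """Return a copy of `record` without any column listed in COLUMNS_TO_DROP."""
    return {
        column: value
        for column, value in record.items()
        if column not in COLUMNS_TO_DROP
    }


record = {
    "name": "Jane Doe",
    "zip": "46202",
    "birth_date": "1980-01-02",
    "state": "IN",
    "lab_glucose": 98,
}
cleaned = strip_safe_harbor_fields(record)
print(cleaned)
```

Dropping columns by name only covers structured fields. Identifiers embedded in free text, such as clinical notes, would need separate handling that this sketch does not attempt.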

Data Types

As an investigator, you may require PHI to answer your research questions. If you have completed the required documentation, the Data Core can provide you with the following data types:

  • Protected Health Information (Identified Data) – Includes any of the 18 elements of PHI
  • Limited Data Set – Excludes 16 of the 18 elements of PHI (e.g., a data set whose only identifiers are dates and geographic subdivisions broader than a patient’s street address, such as city or ZIP code, would qualify)
  • De-Identified Data – All 18 elements of PHI have been removed
    • At an aggregate level: Data that cannot be traced back to a specific patient (e.g., aggregate statistics).
    • At a patient-specific level: Data that is specific to a patient but contains no PHI (e.g., lab values).
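
The distinction between the three data types can be sketched as a check on which columns a data set contains. This is only an illustration under assumed, hypothetical column names and a hypothetical `classify_data_set` helper; it is not how the Data Core determines a data set's type, and it does not replace the required documentation.

```python
# Hypothetical column names. Dates and geography broader than a street
# address are the elements a Limited Data Set may retain.
LIMITED_SET_COLUMNS = {
    "birth_date", "admission_date", "discharge_date", "death_date",
    "city", "county", "zip",
}
# The remaining identifier categories, which only identified data may contain.
DIRECT_IDENTIFIER_COLUMNS = {
    "name", "street", "phone", "fax", "email", "ssn", "mrn",
    "health_plan_id", "account_number", "license_number", "vehicle_id",
    "device_id", "url", "ip_address", "biometric_id", "photo",
    "other_unique_id",
}


def classify_data_set(columns):
    """Classify a data set by the identifier columns it contains."""
    present = set(columns)
    if present & DIRECT_IDENTIFIER_COLUMNS:
        return "Protected Health Information (Identified Data)"
    if present & LIMITED_SET_COLUMNS:
        return "Limited Data Set"
    return "De-Identified Data"


print(classify_data_set(["name", "lab_glucose"]))
print(classify_data_set(["admission_date", "zip", "lab_glucose"]))
print(classify_data_set(["state", "lab_glucose"]))
```

Checking the most restrictive category first means a data set that mixes direct identifiers with dates is still reported as identified data.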
