Integrated Data Repository Cloud

Privacy Impact Assessment (PIA) published by CMS as an Operating Division of the U.S. Department of Health and Human Services

Date signed: 12/5/2023

PIA Information for Integrated Data Repository Cloud
PIA Questions / PIA Answers
OPDIV: CMS
PIA Unique Identifier: P-7352897-513931
Name: Integrated Data Repository Cloud
The subject of this PIA is which of the following? Major Application
Identify the Enterprise Performance Lifecycle Phase of the system. Operate
Is this a FISMA-Reportable system? Yes
Does the system include a Website or online application available to and for the use of the general public? No
Is this a new or existing system? Existing
Does the system have Security Authorization (SA)? Yes
Date of Security Authorization: 6/27/2023
Indicate the following reason(s) for updating this PIA. Choose from the following options. PIA Validation (PIA Refresh/Annual Review)
Describe in further detail any changes to the system that have occurred since the last PIA. The Snowflake application (a major component of the Integrated Data Repository Cloud (IDRC) system) now resides in Snowflake's AWS environment, which is FedRAMP approved and sponsored by CMS. The IDRC system has created additional interconnections with the following systems: Informatica, Transformed Medicaid Statistical Information System (T-MSIS), the Office of Inspector General (OIG), Enterprise Data Lake (EDL), Drug Data Processing System (DDPS), Center for Medicare and Medicaid Innovation APM Management System (CMMI-APM), Medicare Online Support System (MOSS), Medicare Advantage and Prescription Drug System (MARX), Conversion Medicare (CVM), and Risk Adjustment Data (RAD).
Describe the purpose of the system: The purpose of the Integrated Data Repository Cloud (IDRC) system is to provide data warehousing and support the data processing needs of CMS by providing a database warehouse for Medicare and Medicaid beneficiary information. IDRC allows CMS to execute various functions, including detection of fraud, waste, and abuse; prescription drug event (PDE) claims processing; and research in the development of new provider payment models. The system houses data such as Medicare and Medicaid claims, beneficiary data, provider data, plan data, and drug data. IDRC is designed to structure this data such that CMS and its partners may access the information from a single source.
Describe the type of information the system will collect, maintain (store), or share. (Subsequent questions will identify if this information is PII and ask about the specific data elements)

Medicare and Medicaid information is not directly collected by the IDRC system. The information housed within the IDRC system is acquired from other CMS systems, which are responsible for maintaining their own Privacy Impact Assessments (PIAs). These systems include Medicare Drug Data Processing System, Medicare Beneficiary Database, Medicare Advantage Prescription Drug System, Medicaid Statistical Information System, Retiree Drug Subsidy Program, Common Working File, National Claims History, Enrollment Database, Multi-Carrier Claims System, Fiscal Intermediary Shared System, Unique Physician/Provider Identification Number, and Medicare Supplier Identification File.

The data elements stored will include social security number, name, phone numbers, medical notes, date of birth, mailing address, e-mail address, medical record number, Health Insurance Claim Number (HICN), Unique Physician Identification Number (UPIN), race, sex, diagnosis codes and procedure codes. These data fields are all stored indefinitely in the IDRC.

Those who access the IDRC system include system administrators, developers, and some IDRC users who are direct contractors with CMS. IDRC users also include CMS government employees. The CMS Enterprise User Administration (EUA) system collects user credential information. User credentials are not maintained by the IDRC system. All system user notifications pertaining to disclosures of their PII (user credentials) are performed by the CMS EUA system. The EUA has its own PIA.

Provide an overview of the system and describe the information it will collect, maintain (store), or share, either permanently or temporarily.

The IDRC system is a vital element of the CMS data warehouse strategy. Data that is collected will support the efforts in providing cost-effective data warehousing and processing by delivering a centralized location for CMS systems to access stored data.

The stored data will be used to provide a central resource for data analysis supporting detection of Medicare and Medicaid fraud, waste, and abuse; PDE claims processing; and research in the development of new provider payment models. 

The data elements stored for Beneficiaries will include social security number, name, phone numbers, medical notes, date of birth, mailing address, e-mail address, medical record number, Health Insurance Claim Number (HICN), Unique Physician Identification Number (UPIN), race, sex, diagnosis codes and procedure codes.

Health claims records are retrieved daily using personal identifiers to provide data to support the various programs and efforts at CMS. Date of birth, race, elements of the mailing address such as state, diagnosis codes, procedure codes and UPINs are used to retrieve records.

The categories of individuals stored in the IDRC include public citizens and patients.

The retention time for all stored data in the IDRC is indefinite; data is never deleted.

Does the system collect, maintain, use or share PII? Yes
Indicate the type of PII that the system will collect or maintain.
  • Social Security Number
  • Name
  • E-Mail Address
  • Phone Numbers
  • Medical Notes
  • Date of Birth
  • Mailing Address
  • Medical Records Number
  • Other - HICB, UPN, HICN, UPIN, Race, Sex, Diagnosis Codes, Procedure Codes
Indicate the categories of individuals about whom PII is collected, maintained or shared.
  • Public Citizens
  • Patients
How many individuals' PII in the system? 1,000,000 or more
For what primary purpose is the PII used? The primary purpose of the PII stored in the IDRC is to provide cost-effective data warehousing and act as a central resource for data analysis supporting detection of Medicare and Medicaid fraud, waste, and abuse; PDE claims processing; and research in the development of new provider payment models. PII in the IDRC is not disclosed and/or shared.
Describe the secondary uses for which the PII will be used (e.g. testing, training or research): Not applicable.
Describe the function of the SSN. The function of the SSN in the IDRC system is to link claim records to individuals within the database.
Cite the legal authority to use the SSN. Section 10332 of the Patient Protection and Affordable Care Act (ACA)
Identify legal authorities governing information use and disclosure specific to the system and program. Authority for the collection of data maintained in this system is given under sections 226, 226A, 1811, 1818, 1818A, 1831, 1833(a)(1)(A), 1836, 1837, 1838, 1843, 1866, 1874A, 1875, 1876, 1881, and 1902(a)(6) of the Social Security Act (the Act). The following are the corresponding sections from Title 42 of the United States Code (U.S.C.): 426, 426–1, 1395c, 1395i–2, 1395i–2a, 1395j, 1395l(a)(1)(A), 1395o, 1395p, 1395q, 1395v, 1395cc, 1395kk–1, 1395ll, 1395mm, 1395rr, 1396a(a)(6), and section 101 of the Medicare Prescription Drug, Improvement, and Modernization Act of 2003 (MMA) (Pub. L. 108–173), which established the Medicare Part D program. 5 U.S.C. Section 301, Departmental regulations.
Are records on the system retrieved by one or more PII data elements? Yes
Identify the number and title of the Privacy Act System of Records (SORN) that is being used to cover the system or identify if a SORN is being developed. Medicare Integrated Data Repository (IDR), System No. 09–70–0571. Published December 13th, 2006.
Identify the sources of PII in the system: Directly from an individual about whom the information pertains: Other - IDRC is not the source system for its data. IDRC is a warehouse for managing the data.
Identify the sources of PII in the system: Government Sources
  • Within the OPDIV
  • Other HHS OPDIV
Identify the sources of PII in the system: Non-Government Sources: Members of the Public
Identify the OMB information collection approval number and expiration date: Not applicable
Is the PII shared with other organizations? Yes
Identify with whom the PII is shared or disclosed and for what purpose. Within HHS: The IDRC has an interconnection with the Office of Inspector General (OIG) to share data as the IDRC is a data warehouse. An ISA was established, approved, and signed for this interconnection and sharing of data.
Describe any agreements in place that authorize the information sharing or disclosure (e.g. Computer Matching Agreement, Memorandum of Understanding (MOU), or Information Sharing Agreement (ISA)). There is an ISA with the Office of Inspector General (OIG). There is an MOU established for the following systems: Fraud Prevention System (FPS), Medicare Payment System Environment (MPSE), Trusted Third Party (TTP), Accountable Care Organization-Operational System (ACO-OS), One Program Integrity (OnePI), Risk Adjustments Suite of Systems (RASS), Enterprise Data Lake (EDL), Center for Medicare and Medicaid Innovation (CMMI) AI Coding Intensity Adjustment Research Project, Conversion Medicare (CVM), and Risk Adjustment Data (RAD).
Describe the procedures for accounting for disclosures: The IDRC follows the CMS Breach Response Handbook, which defines actions that must be taken in response to a suspected breach of Personally Identifiable Information (PII) / Protected Health Information (PHI) at CMS to meet federal requirements for breach response.
Describe the process in place to notify individuals that their personal information will be collected. If no prior notice is given, explain the reason.

Beneficiary information is not collected by the IDRC system. The appropriate procedures to notify individuals are performed by the CMS system(s) that are responsible for the collection of the information.

The CMS EUA system collects user credential information. User credentials are not maintained by the IDRC system. All system user notifications pertaining to disclosures of their PII (user credentials) are performed by the CMS EUA system.

Is the submission of the PII by individuals voluntary or mandatory? Voluntary
Describe the method for individuals to opt-out of the collection or use of their PII. If there is no option to object to the information collection, provide a reason. Beneficiary information is not collected by the IDRC system. The appropriate procedures to allow individuals to opt-out of the collection or use of their PII are performed by the CMS system(s) that are responsible for the collection of the information.
Describe the process to notify and obtain consent from the individuals whose PII is in the system when major changes occur to the system (e.g., disclosure and/or data uses have changed since the notice at the time of original collection). Alternatively, describe why they cannot be notified or have their consent obtained. Beneficiary information is not collected by the IDRC system. The appropriate procedures to allow individuals to opt-out of the collection or use of their PII are performed by the CMS system(s) that are responsible for the collection of the information.
Describe the process in place to resolve an individual's concerns when they believe their PII has been inappropriately obtained, used, or disclosed, or that the PII is inaccurate. If no process exists, explain why not. Beneficiary information is not collected by the IDRC system. The appropriate procedures to allow individuals to opt-out of the collection or use of their PII are performed by the CMS system(s) that are responsible for the collection of the information.
Describe the process in place for periodic reviews of PII contained in the system to ensure the data's integrity, availability, accuracy and relevancy. If no processes are in place, explain why not. Beneficiary information is not collected by the IDRC system. The appropriate procedures to allow individuals to opt-out of the collection or use of their PII are performed by the CMS system(s) that are responsible for the collection of the information.
Identify who will have access to the PII in the system and the reason why they require access.
  • Users: Users in the IDRC environment are allowed access to the PII to perform data analytics to support detection of Medicare and Medicaid fraud, waste, and abuse; PDE claims processing; and research in the development of new provider payment models. Users can leverage data elements stored for Beneficiaries which include social security number, name, phone numbers, medical notes, date of birth, mailing address, e-mail address, medical record number, Health Insurance Claim Number (HICN), Unique Physician Identification Number (UPIN), race, sex, diagnosis codes and procedure codes.
  • Administrators: Administrators in the IDRC environment are allowed access to PII to assist in the management of the IDRC platform including data validation; data backup and restore; and data loading. Administrators can leverage data elements stored for Beneficiaries which include social security number, name, phone numbers, medical notes, date of birth, mailing address, e-mail address, medical record number, Health Insurance Claim Number (HICN), Unique Physician Identification Number (UPIN), race, sex, diagnosis codes and procedure codes.
  • Developers: Developers in the IDRC environment are allowed access to PII to support the development of system enhancements to the IDRC platform including updated data models and architecture; data loading techniques; and data validation. Developers can leverage data elements stored for Beneficiaries which include social security number, name, phone numbers, medical notes, date of birth, mailing address, e-mail address, medical record number, Health Insurance Claim Number (HICN), Unique Physician Identification Number (UPIN), race, sex, diagnosis codes and procedure codes.
  • Contractors: Users, Administrators, and Developers are all direct CMS contractors and are granted access to PII data as described above. Contractors can leverage data elements stored for Beneficiaries which include social security number, name, phone numbers, medical notes, date of birth, mailing address, e-mail address, medical record number, Health Insurance Claim Number (HICN), Unique Physician Identification Number (UPIN), race, sex, diagnosis codes and procedure codes.
Describe the procedures in place to determine which system users (administrators, developers, contractors, etc.) may access PII. Procedures in place to determine which system users may access PII include account management mechanisms for the IDRC, which are used to identify account types (i.e., individual, group, and system); establish conditions for group membership; and assign associated authorizations. Individuals with access to PII are granted access based on the assigned duty and intended use of the IDRC and PII in the IDRC. These individuals are added to the contractor DUA to ensure they have been approved to see this sensitive data.
Describe the methods in place to allow those with access to PII to only access the minimum amount of information necessary to perform their job. The least privilege standard is utilized for all access to the IDRC system. RBAC and Access Control Lists are used to filter access and control access based on an individual's assigned duties.
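The least-privilege approach described above (role-based access combined with DUA approval) can be illustrated with a minimal sketch. This is not the IDRC implementation: the role names mirror the access categories listed in this PIA, but the permission names, the `Account` class, and the `is_allowed` function are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission mapping. Role names follow the access
# categories in this PIA; permission names are illustrative assumptions.
ROLE_PERMISSIONS = {
    "user": {"read_claims"},
    "administrator": {"read_claims", "load_data", "backup_restore"},
    "developer": {"read_claims", "load_data", "modify_data_model"},
}


@dataclass
class Account:
    name: str
    roles: set = field(default_factory=set)
    on_dua: bool = False  # assumed flag: individual is listed on the DUA


def is_allowed(account: Account, permission: str) -> bool:
    """Deny by default; allow only when the account is on the DUA
    and at least one assigned role grants the requested permission."""
    if not account.on_dua:
        return False
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in account.roles
    )
```

For example, an account holding only the `user` role could read claims data but could not load data, and an account that is not on the DUA would be denied regardless of role.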
Identifying training and awareness provided to personnel (system owners, managers, operators, contractors and/or program managers) using the system to make them aware of their responsibilities for protecting the information being collected and maintained. The CMS Security Awareness Training and Privacy Act Training are required trainings that must be completed annually by all CMS employees and direct contractors with access to CMS systems. Additional trainings include RBT, Rules of Behavior, HIPAA Security/Privacy training, CMS Counterintelligence, and Insider Threat Awareness Training.
Describe training system users receive (above and beyond general security and privacy awareness training)

Training will be based on a user's role and will include:

CMS provides AWS Cloud Services training opportunities to CMS Federal Employees as well as Contract Support Teams. AWS trainings are not mandatory but are currently available several times a year. Training includes cloud architecture design, cloud security and cloud development training.

Databricks - Cloud Service for ETL. Databricks training is being offered several months out of the year for IDRC developers. The training is not mandatory. An overview of Databricks and the technologies that support ETL development in Databricks is available.

Snowflake - Cloud Data Warehouse. Snowflake training is offered for free and is available monthly for developers, analysts or management who want to become more familiar with the environment. These trainings range from overviews to more specific data management topics. The trainings are not required and are provided to the Snowflake community.

Do contracts include Federal Acquisition Regulation and other appropriate clauses ensuring adherence to privacy provisions and practices? Yes
Describe the process and guidelines in place with regard to the retention and destruction of PII. Cite specific records retention schedules.

Given the business use of the IDRC, PII data stored in the IDRC is not removed to ensure users of the data can adequately search and analyze data in support of the stated business needs of the IDRC system.

User credential information is collected by the CMS EUA system; it is not gathered, updated, or maintained by the IDRC. All notifications to users of the system related to disclosures of their PII (user credentials) are performed by the CMS EUA system.

IDRC follows HHS Records Management records retention schedules/guidelines and personnel attend annual training. 

NARA Records Schedule DAA-0440-2015-0009, dated July 17, 2017, sets the CMS retention cutoff at the end of the calendar year, with transfer to the National Archives for accessioning 5 years after cutoff. NARA retention is 75 years; CMS may reduce retention to 10 years but has flexibility to keep individual series longer as required by business needs.
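As a worked illustration of the schedule arithmetic above, the sketch below computes the cutoff, transfer, and retention dates for a single record. It assumes that both the 5-year transfer period and the retention period are counted from the calendar-year cutoff; the schedule text quoted here does not state this explicitly, and the function name `retention_milestones` is hypothetical.

```python
from datetime import date


def retention_milestones(record_date: date,
                         transfer_after_years: int = 5,
                         retention_years: int = 75) -> dict:
    """Compute schedule milestones for one record.

    Assumption: transfer and retention periods both start at the
    cutoff, which falls at the end of the record's calendar year.
    """
    cutoff = date(record_date.year, 12, 31)
    return {
        "cutoff": cutoff,
        "transfer": date(cutoff.year + transfer_after_years, 12, 31),
        "retention_end": date(cutoff.year + retention_years, 12, 31),
    }
```

Under these assumptions, a record dated in 2023 has a cutoff of December 31, 2023 and a transfer date of December 31, 2028. Passing `retention_years=10` models the reduced CMS retention period.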

Describe, briefly but with specificity, how the PII will be secured in the system using administrative, technical, and physical controls.

PII is secured by administrative controls which include only assigning necessary privileges to user accounts that access the IDRC environment, performing annual account reviews to ensure that those user accounts have the needed access and implementing automated actions to disable inactive user accounts.
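The automated disabling of inactive accounts mentioned above could look roughly like the following sketch. The 60-day threshold, the `accounts_to_disable` function, and the shape of the input data are assumptions made for illustration; the PIA does not specify the inactivity period or the mechanism used.

```python
from datetime import date, timedelta

# Assumed inactivity threshold; the PIA does not state the actual value.
INACTIVITY_LIMIT = timedelta(days=60)


def accounts_to_disable(last_logins: dict, today: date,
                        limit: timedelta = INACTIVITY_LIMIT) -> list:
    """Return account names whose last login is older than the limit.

    last_logins maps an account name to the date of its last login.
    """
    return sorted(
        name for name, last_login in last_logins.items()
        if today - last_login > limit
    )
```

A scheduled job could call this function with the current date and pass the result to whatever account management tooling actually performs the disablement.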

Technical controls to secure PII include firewalls that protect and control network traffic that goes in and out of the IDRC system, including the data center where the IDRC system resides. For example, CCIC Continuous Logging and Monitoring and Alerting (AU&SI CMS ARS inherited controls).

Vulnerabilities and exploits are scanned for within the IDRC system by continuous independent vulnerability scans and monitoring. The IDRC also has host intrusion detection and antivirus services. Access into the IDRC system is only allowed through multi-factor authentication through CMS controlled VPNs.

Physical controls are in place (CMS AWS East/Snowflake AWS SaaS (PE-related CMS ARS inherited controls)).

Security controls are tested/substantiated by audits and assessments required by the ATO.

CMS AWS Virtual Private Cloud and the Snowflake AWS environment are both FedRAMP approved.