Cohort Identification

Cohort identification is the selection of patient cohorts from disparate data sources, based on specific inclusion and exclusion criteria.

We use all available modern data stores and tools to provide cohort discovery based on project-specific inclusion and exclusion criteria. Current data sources include all or part of the Cerner EHR, administrative billing records, and Medicaid data obtained through data use agreements. The Clinical Research Data Warehouse (CRDW) serves as the data repository (self-service access is coming soon). CRI will also manage the Clinical Informatics Toolbox and Services (CITS) program.
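
As a rough illustration of how inclusion and exclusion criteria narrow a patient population, the sketch below filters a small in-memory list. The `Patient` record, the `select_cohort` helper, and the sample ICD-10 codes are hypothetical and chosen only for the example; they do not reflect the CRDW schema or any CRI tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical, simplified patient record (not the CRDW schema).
    patient_id: str
    age: int
    diagnosis_codes: frozenset


def select_cohort(patients, inclusion, exclusion):
    """Keep patients who meet every inclusion rule and no exclusion rule."""
    return [
        p for p in patients
        if all(rule(p) for rule in inclusion)
        and not any(rule(p) for rule in exclusion)
    ]


# Made-up sample data for illustration only.
patients = [
    Patient("P1", 54, frozenset({"E11.9"})),
    Patient("P2", 17, frozenset({"E11.9"})),
    Patient("P3", 61, frozenset({"E11.9", "N18.6"})),
    Patient("P4", 45, frozenset({"I10"})),
]

# Example criteria: adults with type 2 diabetes, excluding end-stage renal disease.
inclusion = [lambda p: p.age >= 18, lambda p: "E11.9" in p.diagnosis_codes]
exclusion = [lambda p: "N18.6" in p.diagnosis_codes]

cohort_ids = [p.patient_id for p in select_cohort(patients, inclusion, exclusion)]
print(cohort_ids)  # ['P1']
```

Expressing each criterion as a separate rule keeps the project-specific logic easy to review and to change between studies.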

CDB & i2b2

The Clinical Data Base (CDB) is a comparative database containing discharge and line-item, patient-level detail data from more than 270 UHC principal members and affiliate hospitals. The database can be used to compare an organization’s clinical outcome performance with that of other hospitals and to run comparisons within the organization. The CDB is a flexible and powerful tool that supports an organization’s performance improvement efforts.

Informatics for Integrating Biology and the Bedside (i2b2) is a project developed under a National Centers for Biomedical Computing grant from NIH. i2b2 is an information system framework to allow clinical researchers to use existing clinical data for discovery research. It provides clinical investigators with the tools necessary to integrate medical record and clinical research data in the genomics age. i2b2 software may be used by an enterprise's research community to find sets of interesting patients from electronic patient medical record data, while preserving patient privacy through a query tool interface.

The CRI is working on the integration of CDB data with the i2b2 platform and will make this platform available to clinical investigators for research and quality improvement initiatives. The highlights of the integration project include the following:

• Set up a staging i2b2 build environment, with the Oracle database and the i2b2 application running in parallel on a VM
• Defined and implemented ETL processes to import CDB data files into the i2b2 platform
• Developed strategies to transform data from the CDB schema to the i2b2 star schema and populate the data tables
• Defined and implemented an i2b2 ontology structure based on the CDB data model
• Implemented patient data de-identification strategies to maintain patient privacy
• Moved the staging i2b2 platform to production for general accessibility
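
The transformation and de-identification steps above can be sketched roughly as follows. This is a minimal illustration, not the CRI's actual ETL: the input column names (`mrn`, `encounter_id`, `discharge_date`, `dx_codes`), the `ICD10:` concept prefix, and the `to_observation_facts` helper are all assumptions made for the example. Only the output field names (`patient_num`, `encounter_num`, `concept_cd`, `start_date`) follow columns of the i2b2 `observation_fact` table.

```python
import csv
import io

# Hypothetical CDB-style extract: one row per discharge,
# with diagnosis codes packed into a semicolon-delimited column.
CDB_EXTRACT = """mrn,encounter_id,discharge_date,dx_codes
A100,E1,2014-03-02,E11.9;I10
A100,E2,2014-07-19,I10
B200,E3,2014-05-11,N18.6
"""


def to_observation_facts(rows):
    """Flatten discharge rows into one fact per (encounter, diagnosis code).

    Source identifiers are replaced with surrogate integers so that
    the fact rows carry no medical record numbers.
    """
    patient_nums = {}    # source MRN -> surrogate patient_num
    encounter_nums = {}  # source encounter ID -> surrogate encounter_num
    facts = []
    for row in rows:
        patient_num = patient_nums.setdefault(row["mrn"], len(patient_nums) + 1)
        encounter_num = encounter_nums.setdefault(
            row["encounter_id"], len(encounter_nums) + 1
        )
        for code in row["dx_codes"].split(";"):
            facts.append({
                "patient_num": patient_num,
                "encounter_num": encounter_num,
                "concept_cd": f"ICD10:{code}",  # assumed prefix convention
                "start_date": row["discharge_date"],
            })
    return facts, patient_nums


facts, patient_map = to_observation_facts(csv.DictReader(io.StringIO(CDB_EXTRACT)))
for fact in facts:
    print(fact)
```

In this sketch the mapping from source identifiers to surrogate keys is returned separately so it can be stored apart from the research data. Replacing identifiers is only one part of de-identification; a real pipeline would also need to address dates and other potentially identifying fields.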