Data models
In the context of health data, a data model is the schema that represents patient data in a clinical data warehouse. STARR supports the data models listed below.
- Epic Clarity data model: This is our raw data from the two hospitals. There are two separate Clarity data models, which are similar but not identical. The data model is complex, with over 10,000 tables. It lacks strong typing; for example, many numerical data fields are captured as text fields. Nearly 1,000 tables hold data that is frequently used for research. You can access Clarity data via data consultation.
- Stanford in-house data model: When STARR Tools (formerly known as STRIDE) first launched in 2008, Common Data Models didn't exist. This first-generation research Clinical Data Warehouse (r-CDW) developed an in-house data model. It is specific to Stanford and has been refined over the years to meet the needs of the Stanford community. The typical research use cases for the STRIDE data model are chart review and prospective studies. You can access this data via STARR Tools self-service.
- OMOP Common Data Model (CDM): In 2019, STARR launched its next-generation r-CDW using the Observational Medical Outcomes Partnership (OMOP) CDM from the OHDSI consortium. This data model is particularly suitable for advanced electronic phenotyping because of the richness of its vocabulary and the high quality of Stanford's mapping to standard concepts. Our OMOP database is used for CTSA CLIC data quality reporting. The OHDSI analytical tools also support collaborative and reproducible research, allowing Stanford to participate in a large number of network studies. Furthermore, STARR-OMOP is specifically suited to researchers interested in Big Data analytics and method development, such as Natural Language Processing. You can access a PHI-scrubbed, non-human-subject OMOP r-CDW via self-service or request eProtocol-approved data via data consultation.
- PEDSNet Data Model: The PEDSNet Common Data Model is similar in spirit to OMOP and is optimized for pediatric research. Research Technology develops a PEDSNet r-CDW using LPCH Clarity data; where SHC provides services to LPCH, SHC Clarity is used to pull the relevant data. Since 2022, Research Technology has sent a Limited Data Set version of the PEDSNet r-CDW to the PEDSNet Data Coordinating Center at CHOP. Please request a data consultation if you want to learn more.
- PCORNet Data Model: Since 2024, Research Technology has developed a PCORNet r-CDW using adult hospital data. The r-CDW is maintained at Stanford, and the Research Technology team runs feasibility queries on it in response to requests from the PCORNet Data Coordinating Center. Please request a data consultation if you want to learn more.
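
The Epic Clarity entry above notes that many numerical fields are captured as text. As a rough illustration of what that can mean for downstream analysis, the sketch below shows one way to coerce such text values to numbers in Python. The function name `parse_numeric_text` and the sample values are hypothetical and do not come from any STARR or Clarity schema; real fields may need additional handling (units, comparators such as "<", locale-specific separators).

```python
import re
from typing import Optional

# Accepts plain integers or decimals, with optional sign and surrounding whitespace.
# Anything else (free text, comparators, thousands separators) is treated as non-numeric.
_PLAIN_NUMBER = re.compile(r"^\s*-?\d+(?:\.\d+)?\s*$")


def parse_numeric_text(raw: Optional[str]) -> Optional[float]:
    """Return the value as a float, or None if the text is not a plain number."""
    if raw is None:
        return None
    if _PLAIN_NUMBER.match(raw) is None:
        return None
    return float(raw)


# Hypothetical sample values, as they might appear in a text-typed result column.
samples = ["7.2", " 140 ", "pending", "", None, "-3"]
parsed = [parse_numeric_text(value) for value in samples]
```

Returning `None` for anything that is not a plain number keeps unparseable entries visible as missing values instead of silently guessing at their meaning.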