
STARR-OMOP is Stanford Electronic Health Record (EHR) data from its two hospitals, represented in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). Use OMOP for observational science, population health science, collaborative network studies, and reproducible data science.




There are a number of popular CDMs to choose from, including i2b2, Pediatric Learning Healthcare System (PEDSNet), Patient-Centered Clinical Research Network (PCORNet), Health Care Systems Research Network, and the US Food and Drug Administration Sentinel. Choosing one CDM over another is a matter of meeting specific research objectives. It is not uncommon for an academic medical center to support more than one.

Our second-generation research clinical data warehouse (r-CDW) needs to support a large number of use cases. For this r-CDW, we chose the OMOP CDM. The OMOP CDM has demonstrated applicability to many different use cases, including a) claims and EHR data (link), b) EHR-based longitudinal registries (link), and c) hospital transactional databases (link). It has demonstrated strong results in comparative effectiveness research (link) with minimal information loss during data transformation (link), speeds up implementation of clinical phenotypes across networks (link), and promotes research reproducibility (link). There is demonstrated interoperability between different CDMs (link), so choosing OMOP does not exclude support for other CDMs in the future.

Furthermore, the OHDSI community has a strong focus on data quality and provides broad support for the analytical toolkits (the methods library), which together strive to deliver consistency in cohort definition, analysis design, and reporting of results. Perhaps the most appealing aspect is that OHDSI is an open source public-private partnership that welcomes community participation. There is a robust community of end users, developers, and thought leaders who are actively engaged in various shared repositories, discussion forums, training, and workshops. The collection of learning resources is vast (link) and includes FAQs, code snippets, and video lectures. Finally, OMOP has been adopted at other CTSA sites, e.g., Albert Einstein College of Medicine – Montefiore Health, Columbia University, and Icahn School of Medicine at Mt. Sinai.

OMOP artifacts

The OMOP database in the STARR portfolio is derived from the two Epic Clarity EHRs.


OMOP pipeline

Research IT receives raw Clarity data from each of the two hospitals and builds a filtered Clarity database (learn more) for each. These filtered Clarity databases then become the source for the OMOP ETLs. Patients at the two hospitals are linked via their MRN. We first build an OMOP database with all PHI present. Then we build two PHI-scrubbed databases that are accessible to Stanford researchers without an IRB: STARR-OMOP-deid and STARR-OMOP-deid-lite. The former contains clinical text (NOTES) that has been PHI-scrubbed using TiDE. The OHDSI cohort analysis tool, ATLAS, and the Advanced Cohort Engine (ACE) run on STARR-OMOP-deid-lite.
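
The linking step can be illustrated with a minimal sketch. It assumes only what is stated above, namely that records from the two hospitals share an MRN; the actual ETL is not described here, so the `SourcePatient` type, the `link_by_mrn` function, the hospital labels, and the toy records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcePatient:
    """One patient row from a source system (hypothetical shape)."""
    hospital: str    # hypothetical source label, e.g. "A" or "B"
    mrn: str         # medical record number used as the link key
    birth_year: int


def link_by_mrn(records):
    """Group source records by MRN, giving each distinct MRN one person_id."""
    person_ids = {}  # mrn -> person_id
    linked = {}      # person_id -> list of source records
    for rec in records:
        # The first time an MRN is seen, allocate the next integer person_id.
        pid = person_ids.setdefault(rec.mrn, len(person_ids) + 1)
        linked.setdefault(pid, []).append(rec)
    return linked


# Toy data: MRN "1001" appears at both hospitals, MRN "2002" at one.
records = [
    SourcePatient("A", "1001", 1980),
    SourcePatient("B", "1001", 1980),
    SourcePatient("B", "2002", 1975),
]
linked = link_by_mrn(records)
```

In this sketch the two records with MRN "1001" end up under a single person, which is the property a linked OMOP database needs so that a patient's history is not split across hospitals.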

STARR-OMOP-deid is refreshed monthly and STARR-OMOP-deid-lite is refreshed weekly. Patient identifiers stay stable between refreshes. Both STARR-OMOP-deid and STARR-OMOP-deid-lite are available as self-service resources. For the OMOP database with PHI, or for any other variation such as a Limited Data Set, linkage with Clarity, or sharing with non-Stanford researchers, please request a consultation service. For documentation and training, please refer to research support.
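
To give a sense of the kind of query a common data model enables, here is a minimal sketch that counts distinct patients with a given condition. It runs against an in-memory SQLite database with toy rows and does not connect to STARR-OMOP. The `person` and `condition_occurrence` tables and their columns follow the OMOP CDM, but only a small subset of columns is shown; the data, the concept IDs (111, 222), and the function name are made up for illustration.

```python
import sqlite3


def count_patients_with_condition(conn, concept_id):
    """Count distinct persons having at least one occurrence of a condition."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT p.person_id)
        FROM person AS p
        JOIN condition_occurrence AS co ON co.person_id = p.person_id
        WHERE co.condition_concept_id = ?
        """,
        (concept_id,),
    ).fetchone()
    return row[0]


# Build a toy database with a reduced subset of OMOP CDM columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id INTEGER,
        condition_concept_id INTEGER,
        condition_start_date TEXT
    );
    INSERT INTO person VALUES (1, 1980), (2, 1975), (3, 1990);
    INSERT INTO condition_occurrence VALUES
        (10, 1, 111, '2020-01-05'),
        (11, 1, 111, '2021-03-09'),
        (12, 2, 222, '2019-07-21'),
        (13, 3, 111, '2022-11-30');
    """
)

# Person 1 has two occurrences of concept 111 but is counted once.
n = count_patients_with_condition(conn, 111)
```

Because every OMOP database uses the same table and column names, a query written this way could in principle be reused across sites, which is what makes network studies practical.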