The Stanford in-house data model contains Stanford Electronic Health Record (EHR) data from its two hospitals in a set of tables related by join keys. Use the Stanford in-house data model for investigator-initiated studies, clinical trials, quality improvement initiatives, activities preparatory to research, and other secondary uses that do not require outside collaboration based on standard data models.

In-house data model


Stanford has been supplying clinical data for research and other secondary uses to the Stanford Medicine community since 2008, long before standard data models became popular. Stanford’s original research repository, STRIDE, was designed around the simple forms of clinical data available at the time: clinical notes and reports, lab orders and results, medication orders and administration records, and patient identifiers and demographics.

The in-house data model is derived from Epic’s reporting database, Clarity, using the process diagrammed below. First, the dataset is filtered to remove person and encounter data that may not be used for secondary purposes because of legal or contractual obligations. Then, selected clinical data elements are extracted and transformed from each of the two source systems and merged into a shared data model. Because the two systems share a single service for issuing and managing medical record numbers, data for patients seen at both hospitals is aggregated into a single unified patient record. This data is then made available to researchers through the STARR Tools applications.
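As a rough illustration of the filter, transform, and merge steps described above, the following Python sketch merges records from two source systems on a shared medical record number. It is a minimal sketch under stated assumptions, not the actual STARR pipeline: the function name `build_unified_records`, the field names (`mrn`, `encounter_id`, `lab_result`), the source labels, and the restricted-ID sets are all hypothetical and do not correspond to real Clarity columns.

```python
from collections import defaultdict


def build_unified_records(source_a, source_b, restricted_mrns, restricted_encounters):
    """Filter, transform, and merge records from two source systems by MRN.

    All field names here are hypothetical, chosen only for illustration.
    """
    unified = defaultdict(list)
    for system_name, records in (("hospital_a", source_a), ("hospital_b", source_b)):
        for record in records:
            # Step 1: drop person and encounter data that may not be
            # used for secondary purposes.
            if record["mrn"] in restricted_mrns:
                continue
            if record["encounter_id"] in restricted_encounters:
                continue
            # Step 2: extract selected elements into one shared shape.
            unified[record["mrn"]].append({
                "source": system_name,
                "encounter_id": record["encounter_id"],
                "lab_result": record.get("lab_result"),
            })
    # Step 3: because both systems issue MRNs from the same service,
    # one MRN key collects a patient's data from both hospitals.
    return dict(unified)


# Made-up example data.
hospital_a = [
    {"mrn": "100", "encounter_id": "A1", "lab_result": "7.2"},
    {"mrn": "200", "encounter_id": "A2", "lab_result": "5.1"},
]
hospital_b = [
    {"mrn": "100", "encounter_id": "B1", "lab_result": "6.9"},
    {"mrn": "300", "encounter_id": "B2", "lab_result": "4.4"},
    {"mrn": "300", "encounter_id": "B3", "lab_result": "4.8"},
]

merged = build_unified_records(
    hospital_a,
    hospital_b,
    restricted_mrns={"200"},
    restricted_encounters={"B3"},
)
```

In this sketch, the patient with MRN `100` ends up with one record containing encounters from both hospitals, while the restricted patient and the restricted encounter are excluded before any merging takes place.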

The data model has evolved over the years in response to research needs. Today, the in-house data model contains minimally modified clinical data: it is filtered for compliance and de-identified as needed, but otherwise faithfully represents the information as captured in the original clinical system.

Learn more about STARR Tools

In-house artifacts