


STARR manuscripts and peer-reviewed publications

Note that some of our work is published as open-access arXiv manuscripts. arXiv gives end users open (and free) access, is version controlled, provides a DOI, and allows us to focus on the resource description. Please work with your publisher so that our resource descriptions can be cited in a suitable manner. NIH provides guidelines on how to cite preprints (link).

The following manuscripts describe the STARR ecosystem and are listed in reverse chronological order:

  1. American Family Cohort, a data resource description, Balraj D, Vala A, Hao S, Philofsky M, Tsvetkova A, Trach E, Narra SP, Zhuk O, Shamkhorskaya M, Singer J, Mesterhazy J, Datta S, Chu I, Rehkopf D, arXiv:2309.13175 (link)
  2. The Stanford Medicine data science ecosystem for clinical and translational research (2023), Callahan A, Ashley E, Datta S, Desai P, Ferris TA, Fries JA, Halaas M, Langlotz CP, Mackey S, Posada JD, Pfeffer MA, Shah NH, doi:10.1093/jamiaopen/ooad054 (link)
  3. A study linking patient EHR data to external death data at Stanford Medicine (2022), Peralta AA, Desai P, Datta S, arXiv:2211.01488 (link)
  4. Integrating Flowsheet Data in OMOP Common Data Model for Clinical Research (2021), Seto T, Sung L, Posada J, Desai P, Weber S, Datta S, Sep 2021, arXiv:2109.08235 (link)  
  5. The Advanced Cohort Engine for searching longitudinal patient records (2021), Callahan A, Polony V, Posada JD, Banda JM, Gombar S, Shah NH, J Am Med Inform Assoc. 2021 Jul 14;28(7):1468-1479. doi:10.1093/jamia/ocab027 (link)
  6. A highly scalable repository of waveform and vital signs data from bedside monitoring devices (2021), Malunjkar S, Weber S, Datta S, Jun 2021, arXiv:2106.03965 (link)
  7. High performance on-demand de-identification of a petabyte-scale medical imaging data lake (2020), Mesterhazy J, Olson G, Datta S, Aug 2020, arXiv:2008.01827 (link)
  8. A new paradigm for accelerating clinical data science at Stanford Medicine (2020), Datta S, Posada J, Olson G, Li W, O'Reilly C, Balraj D, Mesterhazy J, Pallas J, Desai P, Shah N, Mar 2020, arXiv:2003.10534 (link)
  9. Oncoshare: Lessons learned from building an integrated multi-institutional database for comparative effectiveness research, Weber SC, Seto T, Olson C, Kenkare P, Kurian AW, Das AK, American Medical Informatics Association Annual Symposium Proceedings 2012:970-8. (link)
  10. Implementing a Real-time Complex Event Stream Processing System to Help Identify Potential Participants in Clinical and Translational Research Studies (2010), Weber SC, Lowe HJ, Malunjkar S, Quinn J, AMIA Annu Symp Proc. 2010 Nov 13;2010:472-6. (link)
  11. STRIDE--An integrated standards-based translational research informatics platform (2009), Lowe HJ, Ferris TA, Hernandez PM, Weber SC, AMIA Annu Symp Proc. 2009 Nov 14;2009:391-5. (link)

STARR Posters and Talks

These posters were presented at external conferences and are listed in reverse chronological order.

  1. Enabling Innovation at the Bedside using STARR-OMOP, Desai P, Callahan A, Banda JM, Kotecha N, Shah S, Datta S, OHDSI Global Symposium 2023 (Link)
  2. Making OHDSI Tooling accessible to Researchers and Students in a HIPAA Compliant Platform, Morgan-Cooper H, Black A, Naderalvojoud B, Minty E, Desai P, OHDSI Global Symposium 2023 (Link)
  3. Linking Analysis Ready Multi-modal Clinical data, Desai P, Datta S, OHDSI Global Symposium 2021 (Link)
  4. ATLAS with a BigQuery backend running Execution Engine – a Software demo, Posada J, Desai P, Yaroshovets K, Klebanov G, OHDSI Global Symposium 2021 (Link)
  5. Open Source Text de-identification Pipeline for Clinical Notes in the OMOP-CDM, Posada J, OHDSI Global Symposium 2020 (Link)

STARR mentioned in newsletters and elsewhere

These articles are written for a general audience and are listed in reverse chronological order.

  1. Power of the Commons, D. Balraj and S. Datta, TDS Connection Newsletter, May 2023 (link)
  2. Groundbreaking TiDE Uses NLP to ‘Clean’ Patient Data, S. Datta, TDS Connection Newsletter, Feb 2023 (link)
  3. Smarter Data Warehousing Preserves a ‘Goldmine’ of Information, S. Malunjkar, TDS Connection Newsletter, Jan 2023 (link)
  4. Matching Clinical Trial Subjects with Researchers in Real Time, S. Malunjkar, TDS Connection Newsletter, Sep 2022 (link)
  5. STARR and Our Journey Towards Better Data Quality and Accessibility, P. Desai, TDS Connection Newsletter, Aug 2022 (link)
  6. Move Toward a More Synergistic Model Helps Improve Data Quality, P. Desai, Center for Leading Collaboration and Innovation (CLIC) Insights to Inspire 2022 blog, May 2022. (link)
  7. Stanford’s School of Medicine reimagines its clinical data warehouse with Google Cloud, Mar 2021, Google Higher Ed Blog (link)