Large Database Analysis

The Large Database Analysis Cluster facilitates the use of large datasets for Patient-Centered Outcomes Research (PCOR) and Comparative Effectiveness Research (CER) studies by providing access to numerous databases. The Cluster also provides training, programming, and analytic support for high-quality PCOR, CER, health services, geospatial, epidemiologic, and behavioral research. It maintains a directory of locally available datasets and of the faculty and programmers skilled in using them. It also maintains a directory of past and ongoing large dataset studies to facilitate interdisciplinary collaboration.

Sandi Pruitt, Ph.D., leads the Large Database Analysis Cluster. 

The Cluster will assist investigators and trainees in:

  • Selecting and using appropriate in-house databases to answer PCOR/CER questions
  • Requesting and obtaining access to available databases not currently held in-house
  • Collaborating with biostatisticians to choose appropriate advanced analytic approaches
  • Securing programming resources and IT storage infrastructure

The Cluster has access to:

  • National and state administrative databases (e.g. Medicare, National Inpatient Sample)
  • National and state clinical registries (e.g. SEER, National Surgical Quality Improvement Program)
  • Large population-based epidemiology cohort studies (e.g. Dallas Heart Study)
  • National VA administrative and clinical datasets
  • National population-based probability sample surveys (e.g. NHIS, NHANES, BRFSS)

We also facilitate access to:

  • Dallas-Fort Worth Hospital Council database (all-payer hospitalization and emergency department visit data for all 75 hospitals in North Texas)
  • Texas State Discharge Dataset (all 583 Texas hospitals)
  • Texas Cancer Registry
  • Cooper Center Longitudinal Study-Medicare linked cohort (25,000 patients with clinical and fitness data linked with Medicare claims)