
Data Science

Our Goal

To provide comprehensive informatics, data analytics, data integration, and data management support for Simmons Cancer Center members.

Location

Dan Danciger Research Building (H)
H9.124 - South Campus
5323 Harry Hines Blvd.
Dallas, TX 75235

The Data Science Shared Resource (DSSR) provides:

  • Access to high-performance computing systems
  • Support for bioinformatics and data analyses, including high-throughput molecular data pre-processing, quality assessment and analysis, and cancer image data analysis
  • Support for data integration and risk prediction modeling, including integrative analysis, biomarker discovery, and development of prediction models for clinical outcomes
  • Support for data management and data-sharing, including developing comprehensive databases and facilitating investigators’ use of publicly available datasets and bioinformatics tools
  • Data science support for grant applications
  • Analytical tools/software distribution and education

Services

  • Experimental design and grant preparation, especially for projects that involve high-throughput genomics data and/or data management
  • Database development
  • Design and implementation of interactive web applications
  • Data storage, sharing, and management
  • Access to software/tools hosted in the DSSR high-performance clusters
  • Data analysis
    • Next-generation sequencing (NGS) data processing and analysis, including DNA-seq, RNA-seq, ChIP-seq, RIP-seq, and single-cell genomics
    • High-throughput data, including high-throughput screening (HTS), proteomics, metabolomics, and imaging data
    • Biomarker discovery and validation
    • Bioinformatics and systems biology analysis
    • Genome-wide association studies (GWAS)
    • Secure handling and analysis of local clinical data, including electronic medical records from the UT Southwestern data warehouse

Equipment and Technology

DSSR works with the BioHPC facility at UT Southwestern to provide Simmons Cancer Center members with data storage and access to HPC. BioHPC hardware currently consists of:

  • HPC cluster with 28,000 CPU cores
  • 60 large-memory, GPU-equipped nodes
  • 5,400 TB high-end Lustre file system
  • 5,530 TB high-end GPFS parallel file system

In addition, a separate HPC cluster with 256 cores across 36 processors, more than 400 TB of disk storage, and a tape backup system is available to Simmons Cancer Center members. The shared resource has also developed and maintains servers for secured clinical data, as well as database and web servers located in the demilitarized zone at the UT Southwestern Information Resource to allow for data sharing with investigators outside the institution.

Fees

The DSSR supports the development of grant applications, new analysis pipelines, and informatics infrastructure to facilitate research led by Simmons Cancer Center members. The DSSR also provides fee-based data science support and consultation for projects with extramural or other sources of support. Long-term collaborations with members are supported by a commitment of DSSR staff effort to the project, which is negotiated at the time of grant submission or at the start of the collaboration.

Leadership and Contact