Exon Capture/Whole Genome

Exon Capture, or Whole Exome Sequencing, is an efficient approach to sequencing the coding regions of the human genome. Many researchers are interested only in the regions responsible for protein coding, which make up about 1-2 percent of the genome. Targeting these regions during sequencing allows deeper coverage at lower cost than whole genome sequencing, which aims to capture the entire genetic spectrum. In general, knowing the complete DNA sequence of an individual's genome does not, on its own, provide useful clinical information, but this may change as scientific studies continue to be published detailing clear associations between specific genetic variants and disease. Whole genome sequencing is also preferable when questions cannot be answered by looking only at the coding regions. Sequencing aims to identify many different types of genetic variation, including germline and somatic mutations, insertions, deletions, and copy number variations.
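To make the cost argument concrete, here is a rough back-of-the-envelope sketch of how much sequence each approach needs. The genome size, the 1.5 percent exome fraction, and the 30x and 100x coverage targets are illustrative assumptions, not figures from our workflow. The calculation also ignores off-target reads and capture efficiency.

```python
# Rough estimate of sequencing yield needed for whole genome vs. exome.
# All constants below are illustrative assumptions.

GENOME_SIZE_BP = 3_100_000_000      # assumed approximate human genome size
EXOME_PER_THOUSAND = 15             # assumed 1.5 percent, midpoint of 1-2 percent


def bases_needed(target_size_bp: int, mean_coverage: int) -> int:
    """Bases of on-target sequence required to reach a given mean coverage."""
    return target_size_bp * mean_coverage


# Integer arithmetic keeps the exome size exact.
exome_size_bp = GENOME_SIZE_BP * EXOME_PER_THOUSAND // 1000

wgs_bases = bases_needed(GENOME_SIZE_BP, 30)    # assumed 30x whole genome
wes_bases = bases_needed(exome_size_bp, 100)    # assumed 100x exome

ratio = wgs_bases / wes_bases

print(f"Exome target size: {exome_size_bp:,} bp")
print(f"Whole genome at 30x: {wgs_bases:,} bases")
print(f"Exome at 100x: {wes_bases:,} bases")
print(f"Whole genome needs about {ratio:.0f} times more sequence")
```

Under these assumptions, the exome is sequenced more than three times deeper while using a small fraction of the sequence.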

Gene Mutation

Most of the biologically significant events arise due to variations at the DNA sequence level. These sequence variations can be successfully correlated to various phenotypes, disease conditions, experimental conditions, etc. Identifying these sequence variants is the most fundamental feature of modern genomic studies. These changes can now be identified with confidence at a reasonable price using high throughput genomic sequencing. At the McDermott Sequencing Core, high-throughput genomic sequencing is done on Illumina machines and is available in two flavors: Whole Genome Sequencing and Exome Sequencing. Which of the two options should be chosen depends on the desired results.

Whole genome sequencing is mostly done when variations are expected at the whole genome level. These include Structural Variations, Copy Number Variations, variations in Non-Coding regions, and variations in other regions not targeted by Exome Sequencing. It is also used when the organism to be sequenced has a relatively small genome or does not have exome capture kits available. Whole genome sequencing can be expensive and takes longer than Exome Sequencing. Often, researchers are interested only in variants in exonic regions, and these can be sequenced specifically, at high coverage, using Exome Sequencing. Exome Sequencing involves targeted capture of an organism's exons using one of various kits, followed by sequencing of the captured exome.

Our team in this field of genetic research includes geneticists, statisticians, and computational biologists. We provide cutting-edge sequencing, data analysis, and support to numerous researchers at UT Southwestern and beyond. The following is a basic workflow that we employ for the analysis of such data.

Please contact us if you would like more details about the workflow, including specific software parameters, genome versions used, etc. "Exome sequencing: the sweet spot before whole genomes" (Teer and Mullikin, 2010) is an accessible introduction to exome sequencing.

The Broad GATK Variant Calling Workflow

Files provided with Exon Capture/Whole genome analysis

  • Raw unprocessed gzipped FASTQ files
  • FASTQC report with basic sequencing quality statistics
  • Mapped BAM files (sorted -> low-quality reads filtered -> duplicates removed -> realigned -> recalibrated)
  • Mapping statistics
  • PICARD metrics
  • Coverage statistics (plots, target region coverage, mean coverage, etc.)
  • Annotated variant calls in VCF and tab-delimited formats
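
As a starting point for working with the delivered variant calls, the sketch below tallies variant types in a VCF using only the Python standard library. It assumes the standard VCF column order (REF in column 4, ALT in column 5). The function names and the inline example records are hypothetical and are not taken from our pipeline output. Symbolic and breakend alleles are handled only crudely.

```python
import io
from collections import Counter


def classify(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair by allele length (a simplification)."""
    if alt.startswith("<") or alt in {"*", "."}:
        return "other"  # symbolic, spanning-deletion, or missing allele
    if len(ref) == len(alt):
        return "snv" if len(ref) == 1 else "mnv"
    return "insertion" if len(alt) > len(ref) else "deletion"


def count_variant_types(handle) -> Counter:
    """Count variant types across all ALT alleles in an open VCF text stream."""
    counts = Counter()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        for alt in alts.split(","):  # multi-allelic sites list several ALTs
            counts[classify(ref, alt)] += 1
    return counts


# Hypothetical example records, for illustration only.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t2000\t.\tAT\tA\t60\tPASS\t.\n"
    "chr2\t3000\t.\tC\tCGG,T\t70\tPASS\t.\n"
)

counts = count_variant_types(io.StringIO(EXAMPLE_VCF))
for variant_type, n in sorted(counts.items()):
    print(variant_type, n)
```

For a gzipped VCF, the same function can be given a handle opened with `gzip.open(path, "rt")`. For anything beyond a quick tally, a dedicated VCF library is likely a better choice.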