Samples that pass QC are ready to be mapped and analyzed. We consult the researchers on which genome version they would like to map to; by default, we use the latest version available. On special request, we can try to use an older genome or even a custom one.
Reads are mapped to the reference genome using the Burrows-Wheeler Aligner (BWA), and the output is produced in BAM format. The mapping workflow follows the one employed by the 1000 Genomes Project. Other software can be used if the experiment calls for it. Given the high-quality sequencing at the McDermott Center, the mapping rate is very high, so almost all of the data produced is used.
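As a rough illustration of the mapping step, the sketch below builds a BWA command and pipes its output into a sorted BAM file. It is not the facility's actual pipeline: the choice of the `bwa mem` algorithm, the use of `samtools sort` to produce the BAM, paired-end input, the thread count, and all file names are assumptions made for the example.

```python
import subprocess


def build_mapping_commands(reference, reads_1, reads_2, out_bam, threads=4):
    """Return (bwa_cmd, sort_cmd) argument lists for one paired-end sample."""
    # Assumption: the `bwa mem` algorithm; the text only says "BWA".
    bwa_cmd = ["bwa", "mem", "-t", str(threads),
               str(reference), str(reads_1), str(reads_2)]
    # Assumption: samtools converts the SAM stream ("-" = stdin) into a
    # coordinate-sorted BAM file.
    sort_cmd = ["samtools", "sort", "-o", str(out_bam), "-"]
    return bwa_cmd, sort_cmd


def run_mapping(bwa_cmd, sort_cmd):
    """Run `bwa | samtools sort`; both tools must be installed and on PATH."""
    with subprocess.Popen(bwa_cmd, stdout=subprocess.PIPE) as bwa:
        subprocess.run(sort_cmd, stdin=bwa.stdout, check=True)
    # Leaving the `with` block waits for bwa, so returncode is set here.
    if bwa.returncode != 0:
        raise subprocess.CalledProcessError(bwa.returncode, bwa_cmd)


if __name__ == "__main__":
    # Hypothetical file names, shown only to print the resulting pipeline.
    bwa_cmd, sort_cmd = build_mapping_commands(
        "reference.fa", "sample_1.fq.gz", "sample_2.fq.gz", "sample.bam")
    print(" ".join(bwa_cmd), "|", " ".join(sort_cmd))
```

Keeping command construction separate from execution makes it easy to inspect or log the exact command line before any data is processed.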
Duplicates are removed from the mapped data using Picard, and reads with a quality below a Phred score of 10 are also filtered out. This leaves us with an alignment file containing analysis-ready data.
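The filtering logic described above can be sketched on SAM-formatted text records. This is a minimal illustration, not the production workflow. It makes two assumptions: that "quality" refers to the mapping quality (MAPQ, column 5 of a SAM record), and that duplicates are identified by the SAM duplicate flag bit (0x400), which Picard sets when it marks duplicates. The function names and example records are hypothetical.

```python
DUPLICATE_FLAG = 0x400  # SAM flag bit: PCR or optical duplicate
MIN_QUALITY = 10        # Phred-scaled threshold stated in the text


def keep_read(sam_line, min_quality=MIN_QUALITY):
    """Return True if an alignment record is not a duplicate and passes MAPQ."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: bitwise FLAG
    mapq = int(fields[4])   # column 5: MAPQ (assumed meaning of "quality")
    is_duplicate = bool(flag & DUPLICATE_FLAG)
    return not is_duplicate and mapq >= min_quality


def filter_sam(lines, min_quality=MIN_QUALITY):
    """Yield header lines unchanged and only the alignment records to keep."""
    for line in lines:
        if line.startswith("@") or keep_read(line, min_quality):
            yield line


if __name__ == "__main__":
    # Hypothetical records: one good read, one duplicate, one low-quality read.
    records = [
        "@HD\tVN:1.6\tSO:coordinate",
        "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "read2\t1024\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "read3\t0\tchr1\t200\t5\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    for line in filter_sam(records):
        print(line)
```

Reads with a quality of exactly 10 are kept, since the text filters only reads below that score.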