Once samples pass the initial sequencing quality metrics generated by the sequencers, they are assessed with FastQC, which checks per-base sequence quality, GC content, and N content, among other things. If the data look sub-par, the samples are immediately re-prepped and re-sequenced. Data trimming is done if needed using any one of:
Our quality control process also includes assessing per-base coverage, mean coverage, and on-target percentages, among other metrics; these are discussed in the Useful Metrics section. We also screen for contamination from other genomes and for other contaminant classes such as ribosomal and mitochondrial sequences, adapters, and vectors.
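To make the coverage metrics named above concrete, here is a minimal Python sketch. It is an illustration only, not the pipeline's actual code: the function names are hypothetical, reads and targets are assumed to be 0-based half-open intervals on a single reference region, and "on target" is assumed to mean that a read overlaps any target interval by at least one base (other definitions, such as counting on-target bases, are also in common use).

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based, half-open; an assumption of this sketch


def per_base_coverage(reads: Sequence[Interval], region_length: int) -> List[int]:
    """Return the read depth at every position of a region of the given length."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (region_length + 1)
    for start, end in reads:
        s, e = max(start, 0), min(end, region_length)
        if s < e:
            diff[s] += 1
            diff[e] -= 1
    coverage, depth = [], 0
    for delta in diff[:region_length]:
        depth += delta
        coverage.append(depth)
    return coverage


def mean_coverage(coverage: Sequence[int]) -> float:
    """Average depth over all positions (0.0 for an empty region)."""
    return sum(coverage) / len(coverage) if coverage else 0.0


def on_target_percentage(reads: Sequence[Interval], targets: Sequence[Interval]) -> float:
    """Percentage of reads overlapping at least one target interval."""
    if not reads:
        return 0.0
    hits = sum(
        1
        for r_start, r_end in reads
        if any(r_start < t_end and t_start < r_end for t_start, t_end in targets)
    )
    return 100.0 * hits / len(reads)
```

In practice these values would be derived from the aligned reads and the capture target file rather than from hand-built intervals; the sketch only shows how the three quantities relate to one another.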
An important part of our quality control is quantifying the ribosomal content of each sample. For each sample, a random selection of reads is mapped to the ribosomal sequences of the pertinent species. If the percentage of mapped reads is too high, the sample is flagged as contaminated and the sequencing core is notified that it needs re-prepping and re-sequencing.
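The ribosomal check described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function names, the fixed random seed, and the 10% cutoff are all assumptions made for the example. The mapping step itself, which an aligner would perform against the species' ribosomal sequences, is represented here only by the count of reads that mapped.

```python
import itertools
import random
from typing import Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")
FastqRecord = Tuple[str, str, str, str]  # header, sequence, separator, quality


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records from an iterable of lines (e.g. an open file)."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in itertools.islice(it, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield (record[0], record[1], record[2], record[3])


def reservoir_sample(items: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Draw up to k items uniformly at random in one pass (reservoir sampling)."""
    rng = random.Random(seed)
    sample: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            sample.append(item)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


def ribosomal_fraction(n_mapped: int, n_sampled: int) -> float:
    """Fraction of the sampled reads that mapped to ribosomal sequences."""
    if n_sampled <= 0:
        raise ValueError("n_sampled must be positive")
    return n_mapped / n_sampled


def is_contaminated(n_mapped: int, n_sampled: int, max_fraction: float = 0.10) -> bool:
    """Flag a sample whose ribosomal fraction exceeds the (assumed) cutoff."""
    return ribosomal_fraction(n_mapped, n_sampled) > max_fraction
```

A typical use would be to sample reads from a FASTQ file with `reservoir_sample(read_fastq(handle), k)`, map the sample with an aligner, and pass the mapped and sampled counts to `is_contaminated`. The appropriate cutoff depends on the library preparation and should be set by the lab rather than taken from this sketch.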
It is necessary to remove mitochondrial sequences from the sample before sequencing so that they do not interfere with downstream analysis.