Expression Analysis: RNA-Seq

We use Cufflinks and featureCounts to generate both normalized and raw counts for RNA-Seq data. These counts are used in downstream analysis to generate cluster plots and to test for differential expression. We currently use Cuffdiff and edgeR for differential expression analysis.

Transcript assembly and expression

The raw counts for each of the classifications that we have described are used in our edgeR analysis, both to test for differential expression and to output normalized CPM values for the GENCODE and igenomes classifications. Cufflinks is also used to generate normalized FPKM values for the igenomes-classified set of transcripts. Cufflinks writes its output files for each sample/alignment, while featureCounts outputs raw counts in the format described in Section "6.2.8 Program output".
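To make the two normalized units concrete, the following is a minimal Python sketch of the textbook definitions of CPM and FPKM. It is an illustration only and not what either tool computes internally: edgeR can additionally scale library sizes by normalization factors, and Cufflinks estimates FPKM with a statistical model over transcripts. The function names, gene names, counts, and lengths are all hypothetical.

```python
def cpm(counts):
    """Counts per million: raw count scaled by the total counted reads."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("library has no counted reads")
    return {gene: n * 1e6 / library_size for gene, n in counts.items()}


def fpkm(counts, lengths_bp):
    """Fragments per kilobase of feature per million counted fragments."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("library has no counted reads")
    return {
        gene: n * 1e9 / (lengths_bp[gene] * library_size)
        for gene, n in counts.items()
    }


# Hypothetical raw counts for one sample and feature lengths in base pairs.
raw_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

cpm_values = cpm(raw_counts)
fpkm_values = fpkm(raw_counts, lengths)
```

In this toy sample geneB has three times the raw count of geneA but is also three times as long, so the two end up with the same FPKM while their CPM values differ. This is why length-normalized values are used when comparing features within a sample.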

Differential expression

Differential expression analysis is carried out by both Cuffdiff (using the igenomes GTF) and edgeR (using both the igenomes GTF and the GENCODE GTFs). The output of Cuffdiff is described online. CummeRbund is also run on the data to produce various useful metrics pertaining to the experiment. The edgeR analysis produces fold change smear plots, tagwise dispersion plots, cluster plots, mean variance plots, normalized counts and, most importantly, a table of differentially expressed genes/transcripts. The different outputs are described in more detail in the edgeR user guide.
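As an illustration of how the table of differentially expressed genes might be used downstream, here is a minimal Python sketch that keeps rows passing a false discovery rate cutoff and a minimum absolute log2 fold change. It assumes the table has columns named logFC, PValue and FDR, as in edgeR's topTags output; the column names in the delivered files, the thresholds, and the example rows are assumptions made for this sketch.

```python
def significant_genes(rows, fdr_cutoff=0.05, min_abs_logfc=1.0):
    """Return rows passing both thresholds, most significant first."""
    hits = [
        row for row in rows
        if row["FDR"] <= fdr_cutoff and abs(row["logFC"]) >= min_abs_logfc
    ]
    return sorted(hits, key=lambda row: row["FDR"])


# Hypothetical rows standing in for a parsed differential expression table.
table = [
    {"gene": "geneA", "logFC": 2.3, "PValue": 1e-6, "FDR": 1e-4},
    {"gene": "geneB", "logFC": -0.4, "PValue": 0.001, "FDR": 0.02},
    {"gene": "geneC", "logFC": -1.8, "PValue": 0.0005, "FDR": 0.01},
    {"gene": "geneD", "logFC": 3.1, "PValue": 0.2, "FDR": 0.4},
]

hits = significant_genes(table)
hit_names = [row["gene"] for row in hits]
```

Here geneB is dropped because its fold change is small despite a low FDR, and geneD is dropped because its large fold change is not statistically supported. Suitable cutoffs depend on the experiment.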