To support the research projects in the Kraus lab, we have developed and/or applied a broad set of genomic tools, including novel computational pipelines designed to integrate, analyze, and visualize data from a wide variety of genomic and proteomic platforms. These include groHMM, a hidden Markov model-based algorithm for predicting primary transcription units from GRO-seq data. We have used groHMM, which we deposited as an R package in Bioconductor for the community to use freely, to annotate thousands of previously unannotated noncoding RNA transcripts of unknown function. Furthermore, we have used genomic assays to examine the molecular mechanisms that drive signal-regulated transcriptional responses. These studies have characterized (1) the robust and rapid changes that occur across the genome in response to estrogen and TNFα and (2) the expression of thousands of previously unannotated noncoding RNA transcripts, significantly altering our view of signal-regulated transcriptional responses.
We have recently developed TFSEE, a computational pipeline that integrates data from GRO-seq, RNA-seq, histone modification ChIP-seq, and motif searches, allowing for the simultaneous identification of putative subtype-specific enhancers and their cognate transcription factors. In addition to generating useful tools, our studies have helped to elucidate new facets of the genome and transcriptome.
Danko C.G., Chae M., Martins A., Kraus W.L. (2014) groHMM: GRO-seq Analysis Pipeline. R package version 1.0.0. Bioconductor. (Software) www.bioconductor.org/packages/release/bioc/html/groHMM.html
Chae M, Danko CG, Kraus WL (2015). groHMM: a computational tool for identifying unannotated and cell type-specific transcription units from global run-on sequencing data. BMC Bioinformatics. 16(222). PMCID: PMC4502638
Danko C.G., Hyland S.L., Core L.J., Martins A.L., Waters C.T., Lee H.W., Cheung V.G., Kraus W.L., Lis J.T., Siepel A. (2015). Identification of active transcriptional regulatory elements from GRO-seq data. Nat Methods. 12(5), 433-438. PMCID: PMC4507281
Nagari, A., Murakami, S., Malladi, V. S., & Kraus, W. L. (2017). Computational approaches for mining GRO-Seq data to identify and characterize active enhancers. Methods Mol Biol. 1468, 121-138. PMCID: PMC5522910
Franco, H. L., Nagari, A., Malladi, V. S., Li, W., Xi, Y., Richardson, D., Allton, K. L., Tanaka, K., Li, J., Murakami, S., Keyomarsi, K., Bedford, M. T., Shi, X., Li, W., Barton, M. C., Dent, S. Y. R., Kraus, W. L. (2018). Enhancer transcription reveals subtype-specific gene expression programs controlling breast cancer pathogenesis. Genome Res. 28(2), 159-170. PMID: 29273624