2020
Gap Year Projects
Undergraduates from across the nation were invited to apply for one of the projects listed below in our department's faculty labs. They worked on these projects remotely, learned how to conduct computational biology research, and many of them then chose to come to UT Southwestern for graduate education.
Making Deep Learning Models that are interactive and self-interpreting
Jian Zhou's group
Interactivity is a cornerstone of exploratory data analysis, and as deep learning models grow ever more complex, it is critical to build intuitive human-model interfaces that allow users to understand and extract knowledge from models. In this project, you will work on interface design, mostly with web technologies, and build web-based applications that use our deep learning models. For example, you may implement a web interface that allows the user to interact with a deep learning genomic sequence model (by generating predictions, identifying important sequence features, introducing mutations, etc.). Deep learning models are typically black boxes that require significant effort to interpret how they make any given prediction. Interpreting highly nonlinear functions can be inherently challenging, so why don't we make the deep learning model interpret itself? In this project, you will design and train "self-interpreting" deep learning networks that provide not just a prediction but also directly output the important input features.
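As a toy illustration of the "self-interpreting" idea, the sketch below pairs each prediction with per-position importance scores; for a linear scorer, gradient-times-input recovers each position's exact additive contribution. The random weights and function names are hypothetical, not our actual models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a genomic sequence model: a linear scorer over a
# one-hot encoded DNA sequence (4 bases x L positions).
L = 20
W = rng.normal(size=(4, L))          # "learned" weights (random here)

def one_hot(seq):
    """Encode an ACGT string as a 4 x L one-hot matrix."""
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = np.zeros((4, len(seq)))
    for j, base in enumerate(seq):
        x[idx[base], j] = 1.0
    return x

def predict_and_explain(seq):
    """Return (prediction, per-position importance).

    For a linear model, gradient x input gives each position's exact
    contribution to the score -- the 'self-interpretation'.
    """
    x = one_hot(seq)
    score = float((W * x).sum())
    importance = (W * x).sum(axis=0)   # contribution of each position
    return score, importance

seq = "ACGTACGTACGTACGTACGT"
score, imp = predict_and_explain(seq)
# The importances decompose the prediction exactly:
assert np.isclose(imp.sum(), score)
```

A real self-interpreting network would instead learn an explanation head jointly with the predictor, but the contract is the same: every forward pass returns both a prediction and its attribution.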
Building a Deep Learning-Based Shape Selector
Gaudenz Danuser's group, Mentor: Xinxin Wang
Using computational models to interpret microscopy images can be valuable in studying morphological changes at the cellular level. However, the geometric complexity of cells makes it difficult to select and calibrate models to match experiments, whether by hand or via simple algorithms. This challenge motivates an autonomous shape-selection pipeline for several ongoing projects in our lab. Here, we invite undergraduate students interested in machine learning to help us design a deep learning-based shape selector that can support our state-of-the-art computational and experimental studies of cell morphology (CM). We expect the selector to: 1) determine, among many computationally generated CM models, which one best represents the experimental observation; and 2) with help from experts, classify different experimentally obtained CMs into groups. Achieving these two goals will be highly valuable in our endeavour to discover molecular causes of CM changes and their role in cancer progression.
Computer Graphics of Cancer Cells
Gaudenz Danuser's group, Mentors: Meghan Driscoll & Hanieh Mazloom-Farsibaf
Cancer proliferation and metastasis are governed in part by the spatial localization of signaling molecules within cells. Powerful new microscopy technologies, in particular light-sheet microscopes developed at UT Southwestern, now enable the 3D visualization of cell signaling. However, cancer cells have convoluted dynamic morphologies, which complicate quantitative descriptions of signaling distributions. The field of computer graphics has developed mathematical and algorithmic techniques that are potentially useful for describing signaling distributions on the cell surface. However, these tools have been developed primarily to visualize objects at the human scale, such as characters in video games, and it is unclear how best to translate them to the peculiar world of cancer cells. In this project, you will adapt computer graphics algorithms for use on cancer cells. In particular, you will develop a tool to measure the spatial correlations of signaling distributions defined on the irregular manifold that is the cell surface, or, alternatively, choose to work on another mutually agreed upon application of computer graphics to cell morphology.
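To make the goal concrete, here is a minimal, hypothetical sketch of one such measurement: correlating a vertex signal as a function of graph (hop) distance, a crude stand-in for geodesic distance on a cell-surface mesh. The ring "mesh" and function names are illustrative only:

```python
import numpy as np
from collections import deque

def hop_distances(adj, src):
    """BFS hop distance from src to every vertex (unweighted graph)."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def correlation_by_distance(adj, signal, max_d):
    """Pearson correlation of a vertex signal, binned by hop distance.

    Pairs of vertices d hops apart contribute one sample
    (signal[i], signal[j]) to the bin for distance d.
    """
    pairs = {d: ([], []) for d in range(1, max_d + 1)}
    for i in range(len(adj)):
        for j, d in hop_distances(adj, i).items():
            if 1 <= d <= max_d:
                pairs[d][0].append(signal[i])
                pairs[d][1].append(signal[j])
    return {d: float(np.corrcoef(a, b)[0, 1]) for d, (a, b) in pairs.items()}

# Ring "mesh" with a slowly varying signal: correlation should decay
# as hop distance grows.
n = 60
adj = [[(i - 1) % n, (i + 1) % n] for i in range(n)]
signal = np.sin(np.linspace(0, 2 * np.pi, n, endpoint=False))
corr = correlation_by_distance(adj, signal, max_d=15)
```

On a real triangulated cell surface, hop distance would be replaced by a proper geodesic distance and the signal by measured fluorescence intensity, but the correlation-versus-distance curve is computed the same way.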
Deep Learning Deconvolution
Reto Fiolka's group, Mentor: Bo-Jui Chang
Have you been impressed by the license plate recognition in the CSI (Crime Scene Investigation) TV series? Have you ever wondered how it is done? Similar tasks exist in biomedical imaging, where scientists try to enhance image quality and/or improve image resolution to resolve sub-cellular structures. Here, deconvolution is the key and most commonly used technique. Conventional techniques such as Lucy-Richardson deconvolution can theoretically improve the resolution by a factor of 1.41. However, they are iterative, which is time consuming, especially when dealing with large 3D image volumes. More critically, they require user input that is tedious to define and can significantly bias the content of the processed data. In this project we will use deep learning to perform image deconvolution (Ref 1). We anticipate that deep learning deconvolution will achieve the same or similar results as conventional deconvolution, but at a much faster speed (10-100x) and without the need for user input. In the future, deep learning could also be used to further enhance super-resolution microscopy. [Ref 1: Guo, M. et al. Rapid image deconvolution and multiview fusion for optical microscopy. Nat. Biotechnol. 1–10 (2020). doi:10.1038/s41587-020-0560-x].
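For reference, the conventional iterative approach that deep learning would replace can be sketched in a few lines. This is a minimal 1D Richardson-Lucy implementation in NumPy, not the production pipeline:

```python
import numpy as np

def richardson_lucy_1d(observed, psf, n_iter=50, eps=1e-12):
    """Minimal 1D Richardson-Lucy deconvolution (NumPy sketch).

    Iteratively refines an estimate u so that u convolved with the
    point-spread function (psf) matches the observed, blurred signal:
        u <- u * ((observed / (u conv psf)) conv psf_flipped)
    """
    psf_flipped = psf[::-1]
    u = np.full_like(observed, observed.mean())   # flat initial guess
    for _ in range(n_iter):
        blurred = np.convolve(u, psf, mode="same")
        ratio = observed / (blurred + eps)
        u = u * np.convolve(ratio, psf_flipped, mode="same")
    return u

# Ground truth: two sharp peaks; blur with a Gaussian PSF, then deconvolve.
truth = np.zeros(64)
truth[20], truth[40] = 1.0, 0.7
x = np.arange(-6, 7)
psf = np.exp(-x**2 / 4.0)
psf /= psf.sum()
observed = np.convolve(truth, psf, mode="same")
restored = richardson_lucy_1d(observed, psf, n_iter=100)
```

The per-iteration convolutions are exactly what makes the classical method slow on large 3D volumes, and the choice of PSF and iteration count is the user input that can bias results; a trained network replaces the whole loop with a single forward pass.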
Really Smart Microscopes for Cancer Cell Biology
Reto Fiolka's group, Mentor: Stephan Daetwyler
The Fiolka lab at UT Southwestern develops state-of-the-art microscopy to study biological processes such as the behavior of cancer cells in circulation. Many of our projects involve computational challenges, ranging from advanced hardware control and real-time GPU processing of microscopy data to neural network-based analysis and reconstruction of acquired data. If you are interested in microscopy and its computational aspects, we are happy to discuss potential projects with you in detail. Specifically, we aim to improve current multi-photon raster scanning microscopes to increase their acquisition speed by an order of magnitude. This project will be part of a group effort and involve systematic analysis of acquisition parameters, hyper-parameter optimization, 3D visualization, programming in Python, and advanced reconstruction algorithms including neural networks.
Deciphering Visual Evaluation on Reconstructive Surgery Outcomes using an Eye-Tracking Platform & Machine Learning Techniques
Jeon Lee's group
Human interactions begin with unconscious evaluation of the visual characteristics of one another, centered around the face. We automatically and almost immediately assess familiarity and attractiveness. When encountering a person who has had craniofacial reconstructive surgery, we make near-instantaneous evaluations of the presence of facial deformity, the outcome of which modifies our initial emotional responses and social behavior toward that person. Infants born with craniofacial differences (such as cleft lip or craniosynostosis) often undergo craniofacial reconstruction in an attempt to restore 'normal' appearance. However, little is known about how we recognize 'normal' and how laypersons assess craniofacial differences. Eye-tracking technology may provide the proxy needed to evaluate whether reconstructive surgery for craniosynostosis or cleft lip has achieved the ultimate goal of surgery: reconstructing a face perceived as normal during social interaction. In this study, we aim to combine eye-tracking technology with machine learning techniques to decipher the visual evaluation of reconstructive surgery outcomes. An eye-tracking platform will be used to collect subjects' gaze patterns and durations over pre- and post-surgery images. These 'human' behavior-driven features, rather than mathematically derived features, will then be used to train the machine learners.
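As a hypothetical illustration of such behavior-driven features, the sketch below converts raw gaze samples into the fraction of viewing time spent on each facial region. The region names and boxes are invented for the example:

```python
import numpy as np

# Axis-aligned region-of-interest boxes (x0, y0, x1, y1) in image
# coordinates; names and coordinates are made up for illustration.
ROIS = {
    "eyes":  (30, 20, 90, 40),
    "nose":  (45, 40, 75, 65),
    "mouth": (40, 65, 80, 85),
}

def dwell_fractions(gaze_xy, rois=ROIS):
    """Fraction of gaze samples falling in each ROI.

    Samples are (x, y) points assumed to be uniformly spaced in time,
    so a fraction of samples approximates a fraction of dwell time.
    """
    gaze_xy = np.asarray(gaze_xy, dtype=float)
    feats = {}
    for name, (x0, y0, x1, y1) in rois.items():
        inside = ((gaze_xy[:, 0] >= x0) & (gaze_xy[:, 0] < x1) &
                  (gaze_xy[:, 1] >= y0) & (gaze_xy[:, 1] < y1))
        feats[name] = float(inside.mean())
    return feats

# A viewer who looks mostly at the eyes:
gaze = [(60, 30)] * 8 + [(60, 50)] * 1 + [(60, 70)] * 1
feats = dwell_fractions(gaze)
```

Feature vectors of this kind, one per subject per image, would then be fed to a standard classifier in place of purely mathematical image descriptors.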
Developing an Automated Surgical Skill Analysis Platform
Andrew Jamieson's group
UTSW hosts nationally recognized specialists in robotic surgery who perform numerous procedures using advanced systems such as the Da Vinci robot from Intuitive Surgical (https://www.intuitive.com/en-us). As a large training and research hospital, we typically record these procedures and keep them for later review. The potential abundance of surgical video data, coupled with the limitations of reviewing it manually in a timely and comprehensive fashion, immediately suggests the potential benefit of applying artificial intelligence (AI) technology to this domain. Help build the essential infrastructure for an AI-based system that could provide timely, quantitative, consistent, and scalable surgical skill analyses, greatly enhancing the quality of surgery as well as the effectiveness and efficiency of training. (Collaboration between the Lyda Hill Department of Bioinformatics and the Department of Surgery.)
Image Segmentation, Deep Learning Architectures, & Predictive Modeling to Advance Neuroscience
Albert Montillo's Deep Learning for Precision Health Lab
Segmentation of deep brain structures impacted by neurodegenerative disorders; Mentors: Son Nguyen & Alex Treacher
The putative initial sites impacted in multiple neurodegenerative diseases, including movement disorders such as Parkinson's disease and atypical parkinsonian disorders, are deep-seated structures in the midbrain. These structures include the substantia nigra pars compacta and pars reticulata, the red nucleus, the globus pallidus, and the putamen, among others. 3D MRI can reveal these structures; however, there are no publicly available pipelines to automatically label them on individual patient scans. This is a significant obstacle to quantifying their degradation in disease with measures of neurite density, neuronal demyelination, and iron concentration, and thereby forming a biomarker of disease progression. In this project we will use 3D MRI from the NIH/NINDS PDBP dataset, manually labeled with the support of collaborating neuroradiologists at UTSW, to develop a fully automated 3D deep learning algorithm for neuroanatomical structure labeling. Segmentation quality will be quantified with standardized metrics against the neuroradiologists' labels and externally validated on a second dataset from the UTSW AIRC. The student will work closely with the PI, a PhD student, and a neuroradiologist; a first-author publication is anticipated within 6-8 months.
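Standardized segmentation metrics of the kind used for such comparisons include the Dice similarity coefficient; a minimal sketch, with toy volumes standing in for labeled structures:

```python
import numpy as np

def dice_score(pred, truth):
    """Dice similarity coefficient between two binary label masks,
    the standard overlap metric for comparing an automated segmentation
    against a manual (e.g. neuroradiologist-drawn) segmentation."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Toy 3D volumes: two overlapping cubes standing in for a labeled structure.
truth = np.zeros((10, 10, 10), bool)
pred = np.zeros((10, 10, 10), bool)
truth[2:6, 2:6, 2:6] = True    # 64 voxels
pred[3:7, 3:7, 3:7] = True     # 64 voxels, partially overlapping
score = dice_score(pred, truth)  # 2*27 / (64+64) = 0.421875
```

Dice ranges from 0 (no overlap) to 1 (perfect agreement) and is typically reported per structure alongside surface-distance metrics.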
Novel deep learning architectures for decoding 4D brain activation; Mentors: Son Nguyen & Kevin Nguyen
Patterns of brain activity can be measured non-invasively in 4D using fMRI, EEG, and MEG, among other modalities. In mental disorders such as schizophrenia, bipolar disorder, and schizoaffective disorder, these data are being used to identify biomarkers capable of objectively informing diagnoses. The standard approach to building a predictive model begins with preprocessing the brain activity data, which requires choosing many image processing parameters, such as the atlas used to divide the brain into a predetermined set of regions and the kernels used to smooth regional signals. The extracted inter-regional brain connectivity is then provided as input to a machine learning algorithm trained to decode the connectivity into a diagnosis. Deep learning models, however, have shown the ability to learn a hierarchy of kernels that act directly on the raw data, are optimal for a given classification task, and can outperform such hand-crafted features. Accordingly, the goal of this project is to develop novel deep learning architectures that operate directly on the raw brain activity data. This will entail layers that integrate temporal information and layers that integrate this information spatially in a multiresolution manner. The extensive, multisite BSNIP 1 dataset will be used to develop the models and predict the mental disorder diagnosed by board-certified psychiatrists. The model will be externally validated on the multi-site BSNIP 2 database and through leave-one-site-out cross validation. The student will work closely with the PI, a postdoc, and psychiatrists from UTSW and collaborating sites. A first-author publication is anticipated within 6-8 months.
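Leave-one-site-out cross validation can be sketched in a few lines; the subject and site labels below are illustrative:

```python
# Minimal sketch of leave-one-site-out cross validation: each fold
# holds out every subject from one acquisition site, so the model is
# always tested on a site it never saw during training.
def leave_one_site_out(site_labels):
    """Yield (held_out_site, train_indices, test_indices), one per site."""
    sites = sorted(set(site_labels))
    for held_out in sites:
        train = [i for i, s in enumerate(site_labels) if s != held_out]
        test = [i for i, s in enumerate(site_labels) if s == held_out]
        yield held_out, train, test

# Six subjects scanned at three sites:
sites = ["A", "A", "B", "B", "C", "C"]
folds = list(leave_one_site_out(sites))
# Three folds; the fold holding out site "A" tests on subjects 0 and 1.
```

Splitting by site rather than by subject guards against a model learning scanner-specific artifacts instead of disease-related signal.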
Interpretability of tensor decomposition methods of brain connectivity measures predictive of treatment response in depression; Mentors: Kevin Nguyen & Cooper Mellema
In major depressive disorder, treatments include antidepressants and psychotherapy. There are >20 antidepressants to choose from, and selecting the best one for each patient currently entails a trial-and-error process, leaving 40% of patients without an effective treatment for a year or more. It is widely believed that patterns of brain activity in pre-treatment EEG and MRI contain information that may be used to predict the treatment response profile of individual patients. However, these brain activity patterns are extremely high dimensional (e.g. 120K voxels by 180 time frames), while the number of subjects in a given study is typically on the order of 300. The development of dimensionality reduction methods is therefore essential to fully exploit this information. Tensor decomposition methods are an attractive approach and include probabilistic PCA, probabilistic CCA, and non-negative matrix factorization. Which of these yields the best predictive power while also yielding an interpretable factorization remains to be discovered. Accordingly, this project will compare and optimize these methods to predict recovery slopes in EMBARC, the largest randomized placebo-controlled clinical trial of antidepressants, from UTSW. Results will be externally validated on UTSW's D2K dataset. The student will work closely with the PI, MD/PhD students, and psychiatrists from UTSW and collaborating sites. A first-author publication is anticipated within 6-8 months.
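As an example of the matrix decomposition methods under consideration, here is a minimal NumPy sketch of non-negative matrix factorization via the classic Lee-Seung multiplicative updates (toy data, not EMBARC):

```python
import numpy as np

def nmf(X, k, n_iter=500, eps=1e-9, seed=0):
    """Non-negative matrix factorization, X ~= W @ H, via the classic
    Lee-Seung multiplicative updates (a minimal NumPy sketch)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update factors in turn;
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # updates preserve nonnegativity
    return W, H

# A low-rank non-negative matrix (think: subjects x connectivity features)
# recovered from k=2 latent factors.
rng = np.random.default_rng(1)
X = rng.random((30, 2)) @ rng.random((2, 40))
W, H = nmf(X, k=2)
err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

The nonnegativity of the factors is what makes NMF attractive here: each latent component is an additive, parts-based pattern that can be inspected directly, unlike PCA components with mixed signs.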
Visit Deep Learning for Precision Health Lab
Deep Learning for Histopathology and Broad Utility Biomedical Tools
Satwik Rajaram's group
Adversarial Attacks on Deep Learning Models for Clinical Grade Histopathology
Deep learning has the potential to revolutionize histopathological diagnoses, yet in practice, few models reach the level of reliability needed for clinical adoption. Given the difficulty of generating diverse training datasets, most models tend to be overly sensitive to experimental modalities (e.g. microscope or staining parameters) and fail to generalize to variations they would experience in the field. Current approaches meant to overcome these limitations are largely based on simplistic models of image variation and fail to be sufficiently challenging. The Rajaram Lab (https://www.rajaramlab.org) aims to develop data-driven adversarial attacks designed precisely to identify blind spots of existing models, thereby forcing these models to become robust against the variations that are inevitable in practical settings. In this way, we hope to generate the next generation of histopathology models with clinical-grade reliability. We are looking for motivated undergraduates with previous experience in deep learning: comfortable in TensorFlow or PyTorch and familiar with the theory of deep learning. No previous experience in biology or histopathology is required, but some exposure to adversarial approaches would be a definite advantage.
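As a minimal illustration of an adversarial attack (much simpler than the data-driven attacks proposed here), the sketch below applies the Fast Gradient Sign Method to a logistic classifier in NumPy:

```python
import numpy as np

def loss(x, w, b, y):
    """Cross-entropy loss of a logistic classifier on one example."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def fgsm_attack(x, w, b, y, eps):
    """Fast Gradient Sign Method: perturb the input by eps in the
    direction that increases the loss."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # predicted probability
    grad_x = (p - y) * w                     # d(cross-entropy)/dx
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w, b = rng.normal(size=16), 0.0
x, y = rng.normal(size=16), 1.0
x_adv = fgsm_attack(x, w, b, y, eps=0.1)
# The attacked input makes the classifier's loss strictly worse.
```

A model trained to resist such worst-case perturbations is forced to rely on more robust features; the project's attacks target realistic histopathology variations (stain, scanner) rather than raw pixel noise.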
Enable Reproducible Deep Learning for BioMedical Science
The bedrock of biomedical science is reproducibility. Yet the reality of cutting-edge research is that the data, models, and questions are highly dynamic. So, while deep learning approaches have shown great promise, the cascading set of choices involved (training data, model choice, hyper-parameters, choice of augmentation, etc.) has meant that these pipelines currently do not reach the level of accountability required by the scientific community. The Rajaram Lab (https://www.rajaramlab.org), in collaboration with the UTSW BioHPC (https://portal.biohpc.swmed.edu), aims to develop a framework optimized for generating reproducible deep-learning pipelines in biomedicine. By building on existing frameworks such as DVC (dvc.org), we aim to simultaneously version control data, code, and models, thereby allowing us to know exactly what went into developing a specific model and to compare different models. As the work will be developed within a high-performance computing environment to support ongoing research, we expect that it will address several issues specific to biomedical research and will be of broad scientific utility. We are looking for an undergraduate passionate about deep learning and reproducible research to spearhead this project.
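As a hypothetical sketch of what such a DVC pipeline definition might look like (the stage names, scripts, and paths below are invented for illustration, not an existing lab pipeline):

```yaml
# dvc.yaml -- each stage declares its command, dependencies, and outputs,
# so DVC can version data, code, and models together and rerun only the
# stages whose inputs changed.
stages:
  prepare:
    cmd: python prepare.py data/raw data/processed
    deps:
      - prepare.py
      - data/raw
    outs:
      - data/processed
  train:
    cmd: python train.py data/processed models/model.pt
    deps:
      - train.py
      - data/processed
    params:
      - train.learning_rate
      - train.augmentation
    outs:
      - models/model.pt
    metrics:
      - metrics.json:
          cache: false
```

Because every training run is pinned to exact data, code, and parameter versions, two models can be compared knowing precisely what differed between them.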
Interpretable Deep Learning for Cancer-Associated T-Cell Receptors
Bo Li's group
T cells are critical in mediating adaptive immunity by selectively killing target cells. Target cells are recognized through the binding of the T cell receptor (TCR) to antigens presented on the surface of the target cells. TCRs are genetically diversified through a biological process called V(D)J recombination, which can produce 10^15-10^16 different types of T cells in humans. Such diversity allows efficient recognition of a wide spectrum of antigens, including viruses, bacteria, cancer, etc. We developed a novel deep learning algorithm, DeepCAT, that is able to distinguish cancer from non-cancer TCRs with ~80% AUC. It can also de novo predict neoantigen-specific TCRs that were never seen in the training data. This indicates that cancer and non-cancer TCRs each share common biochemical signatures, a fact that was previously unknown. We want to learn these signatures through interpretable deep learning, looking inside the 'black box' of the neural network's layers to find which parameters reflect the cancer/non-cancer distinction. The findings from this research will provide useful insights into the co-evolution of tumors and T cells in the microenvironment and inspire novel diagnostic or therapeutic approaches.
3D Visualization of an Entire Cell
Daehwan Kim's group
This project aims to develop a 3D computer graphics program that can visualize an entire cell (e.g. E. coli) with its inner and outer membranes, cell walls, and numerous other molecules such as DNA, RNA, proteins, and metabolites. The program will also allow users to operate a "spaceship" within and outside a cell to explore various cellular components from different angles and at different scales, like the Magic School Bus. To do this, we will need to develop several key algorithms and data structures for efficiently storing molecular structural information, identifying the molecules currently visible from the viewpoint, and rapidly rendering molecules at different levels of detail in real time.
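One small piece of the rendering problem can be sketched directly: choosing a level of detail (LOD) for each molecule from its apparent size on screen, so distant molecules render cheaply and nearby ones in full detail. A minimal, hypothetical Python sketch:

```python
# Distance-based level-of-detail selection: thresholds and the
# "projected size ~ radius / distance" heuristic are illustrative.
def lod_level(radius, distance, thresholds=(0.1, 0.02, 0.005)):
    """Return 0 (full detail) .. len(thresholds) (coarsest), based on
    the molecule's apparent size from the camera."""
    apparent = radius / max(distance, 1e-9)
    for level, t in enumerate(thresholds):
        if apparent >= t:
            return level
    return len(thresholds)

# A molecule of radius ~1 unit viewed from increasing distances
# by the "spaceship" camera: detail drops off as it recedes.
levels = [lod_level(1.0, d) for d in (5, 20, 100, 500)]
```

In a full implementation, this selection would run per frame over a spatial index (e.g. an octree of molecules) so that only potentially visible molecules are even considered.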