Machine learning (ML) is an application of artificial intelligence (AI) in which computer systems learn and improve automatically from experience, that is, from collected data, without being explicitly programmed. As the size and complexity of biological data grow, it becomes very hard to extract meaningful features, which may themselves be high-dimensional, and to build prediction models. Hence, ML is drawing interest from a wide range of biomedical scientists, clinicians, and practitioners as a disruptive technique that augments human intuition in data analysis. One of the best-known ML techniques is Deep Learning (DL), which accomplishes automatic feature extraction and prediction through a deep, layered structure corresponding to a model of very high complexity. This has already produced numerous remarkable success stories for biological and medical data. For example, A. Esteva et al. (2017) developed a deep convolutional neural network for skin cancer classification. They simply fine-tuned a pre-trained DL architecture on their skin images and diagnostic labels, without having to decide which features work best for the task or how to extract them from the images. Their DL model achieved dermatologist-level skin cancer classification.
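The fine-tuning idea described above can be illustrated with a toy sketch. This is not the method of Esteva et al. or of any BioHPC project: it assumes a hand-written, fixed feature function (`frozen_features`, hypothetical) standing in for the pre-trained layers, and it trains only a small logistic classification head on top of it. The synthetic data, function names, and hyperparameters are all assumptions chosen for illustration, and the sketch uses only the Python standard library.

```python
import math
import random


def frozen_features(x):
    """Hypothetical stand-in for pre-trained layers: a fixed, non-trainable feature map."""
    a, b = x
    return [a, b, a * b, 1.0]  # the constant 1.0 acts as a bias input


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_head(samples, labels, epochs=200, lr=0.1):
    """Train only the classification head (logistic regression) on the frozen features."""
    feats = [frozen_features(x) for x in samples]  # the extractor is never updated
    weights = [0.0] * len(feats[0])
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, f)))
            # Stochastic gradient step on the log-loss for the head weights only
            weights = [w - lr * (p - y) * v for w, v in zip(weights, f)]
    return weights


def predict(weights, x):
    """Return the predicted class (0 or 1) for one input."""
    f = frozen_features(x)
    return 1 if sigmoid(sum(w * v for w, v in zip(weights, f))) >= 0.5 else 0


def make_toy_data(n=200, seed=0):
    """Synthetic two-class data: label 1 when both coordinates share a sign."""
    rng = random.Random(seed)
    samples = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    labels = [1 if a * b > 0 else 0 for a, b in samples]
    return samples, labels


if __name__ == "__main__":
    samples, labels = make_toy_data()
    weights = train_head(samples, labels)
    correct = sum(predict(weights, x) == y for x, y in zip(samples, labels))
    print(f"training accuracy: {correct / len(samples):.2f}")
```

The toy labels cannot be separated by a straight line in the raw inputs, but they can in the fixed feature space, which is the point of the sketch: when the reused features are informative, only a small head needs to be trained. In real fine-tuning, the frozen function would be replaced by the layers of a pre-trained network, and some of those layers may be updated as well.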
Detection of ear deformity in children: Ear deformities are congenital abnormalities that affect the aesthetic appearance of the external structure of the ear in 5% of the pediatric population. Patients with ear deformity often undergo surgical correction or neonatal ear molding to restore the ear’s appearance and improve the child’s quality of life. In this study, we used transfer learning with a convolutional neural network (CNN) to automatically classify ears as deformed or normal. This study is in collaboration with Dr. Rami Hallac, PhD, Assistant Professor in the Department of Plastic Surgery.
Prediction of fluorescent labels from unlabeled microscopic images: E.M. Christiansen et al. (2018) presented a computational ML approach named “In Silico Labeling” that reliably predicted the distribution of fluorescent labels from transmitted-light images of unlabeled biological samples. We are currently developing ideas for a DL model that will predict chromosome, spindle, and centrosome labels. This study is in collaboration with Dr. Hanzhi Wang, PhD, Instructor in Dr. Guo-Min Li's Lab, Department of Radiation Oncology.
Parallel Programming in MATLAB on BioHPC: MATLAB facilitates the implementation of various ML algorithms thanks to its built-in ML algorithms and Neural Network Toolbox. Furthermore, the MATLAB Parallel Computing Toolbox enables us to use the graphics processing units (GPUs) on BioHPC nodes for the high-speed computation required for ML-based tool development.
BioHPC OnDemand DIGITS: DIGITS from NVIDIA is a web-based platform that allows BioHPC users to easily harness the power of modern deep learning toolkits, including Caffe, Torch, and TensorFlow. DIGITS provides a web interface to upload and explore datasets; define, train, and test models; and interactively examine predictions. DIGITS can use multiple GPUs and allows easy setup of jobs with different parameters, minimizing coding so that you can concentrate on applying deep learning to your data.