I am interested in developing computational tools and statistical models for a systems-level understanding of tumor heterogeneity and epigenetic dysregulation in cancer.
During my postdoctoral work, I developed tools for inferring subpopulations from single-cell gene expression data while simultaneously normalizing the data and imputing dropouts. These tools enabled characterization of the tumor-immune microenvironment using single-cell RNA-seq data from tumor-infiltrating leukocytes collected from breast cancer patients. I am interested in extending this direction to develop and deploy machine learning tools that infer epigenetic dysregulation in Acute Myeloid Leukemia by integrating single-cell transcriptional and epigenetic data.
My PhD research focused on developing integrative models of regulatory programs in microbial organisms, in particular Mycobacterium tuberculosis, the causative agent of tuberculosis; the fungus Neurospora crassa, a model organism for studying circadian rhythms; and Escherichia coli.
My work involved:
• Constructing predictive models of gene expression by combining ChIP-Seq binding data with transcriptomic expression data in E. coli, M. tuberculosis and N. crassa;
• In M. tuberculosis, using the learned models to identify drug resistance mechanisms, drug synergies and conditions that potentiate drug treatments;
• In N. crassa, using the learned models to explain the diversity of phase differences in circadian rhythms;
• Experimental validation of the hypothesized regulatory programs.
I also worked on learning network models and developing Bayesian inference methods that integrate different data types to identify modular dependency structures, and I showed theoretical advantages of data integration.
At Microsoft Research, I worked on understanding regulation of the hematopoietic pathway using high-throughput data from Acute Myeloid Leukemia patients.