Genome-scale technologies can measure thousands or even millions of molecules, but the individual measurements are often noisy readouts of the internal biological state. For many computational approaches, the ultimate goal is to infer the true biological state parameters from their effects on a much larger set of experimental readouts. Indeed, much of genomic data analysis can be viewed as a special case of dimensionality reduction. Unlike a generic dimensionality reduction problem, however, in the biological case we have extensive prior knowledge about the data-generating process. In this talk we will discuss several methods that exploit this prior knowledge to create interpretable low-dimensional representations for genome-scale datasets.
Keywords: genomics, factor analysis, dimensionality reduction, prior information
Dr. Maria Chikina is an assistant professor of Computational and Systems Biology at the University of Pittsburgh School of Medicine. She received her BSc degree in Mathematics and Biology from the University of Chicago and a PhD in Computational Biology from Princeton University with Olga Troyanskaya as her advisor. She did her postdoctoral research with Stuart Sealfon at the Icahn School of Medicine at Mount Sinai. Her group works on diverse data problems ranging from molecular evolution to machine learning for large datasets.