November 23, 2020
Virtual
America/New_York timezone

Lab meeting will take place on Monday, November 23, 2020, at 10:00am.


Presenter: Vikram Mulligan, Ph.D., Research Scientist, Systems Biology Group, CCB

Computational methods for molecular design and prediction in the absence of large amounts of prior knowledge

The recent surge of interest in machine learning (ML) methods has given rise to many ML applications in protein design and in the prediction of protein structures and properties. While powerful, advanced ML methods such as deep neural networks require large pools of training data. Although the 150,000 experimentally determined protein structures in the PDB and the 195,000,000 amino acid sequences in the UniProt database have been invaluable for certain ML tasks, they are less useful for others. In particular, the design of folding molecules built from synthetic building blocks that are not represented in the structural or sequence databases is a challenge that is difficult to meet with methods that rely on prior knowledge alone. In this talk, I will review methods that I have been developing for sampling and predicting tertiary and quaternary structures of coiled-coil heteropolymers in the absence of structural templates. I will also discuss two ongoing machine learning projects. In the first, I am working to produce a fast estimator of a synthetic heteropolymer's fold propensity by training it not against data from known foldamers (of which there are very few), but against slow but reliable simulations. In the second, inspired by ongoing collaborations with researchers interested in predicting the future mutations of key SARS-CoV-2 proteins, I am trying to improve the accuracy of current protocols for quantifying the impact of mutation on a natural protein's fold propensity by training a very simple ML model against limited experimental data to provide a correction factor for simulation-based approaches. A long-range goal of all of these projects is to find general ways in which simulation-based and ML approaches may complement each other in realms where prior knowledge is limited.

For access, please contact Camille Norrell - cnorrell@flatironinstitute.org