|Instructor||Prof. Ellen Zhong|
|Time||Thursdays 3:00-5:00p, Julis Romo A02|
|"Precept" / student-only discussion||Wednesdays 1:00-2:00p, CS 301|
|Office hours||Mondays 4:00-5:00p, CS 314|
Recent breakthroughs in machine learning algorithms have transformed the study of the 3D structure of proteins and other biomolecules. This seminar will survey recent papers on ML applied to tasks in protein structure prediction, structure determination, computational protein design, physics-based modeling, and more. We will take a holistic approach when discussing papers, considering their historical context, algorithmic contributions, and potential impact on scientific discovery and on applications such as drug discovery.
For more information on the discussion format, expectations, and grading, see the course syllabus.
A non-exhaustive list of topics we will cover includes:
Selected papers will cover a broad range of algorithmic concepts and machine learning techniques including:
In addition to the assigned papers, optional primers or reviews on relevant topics will be made available for background reading.
Assignment 1. Due 11am, Friday, September 30th via Canvas
Thursday September 22nd, 3pm ET
Dr. Michael Figurnov (DeepMind)
Title: Highly accurate protein structure prediction with AlphaFold
Abstract: Predicting a protein’s structure from its primary sequence has been a grand challenge in biology for the past 50 years, holding the promise to bridge the gap between the pace of genomics discovery and resulting structural characterization. In this talk, we will describe work at DeepMind to develop AlphaFold, a new deep learning-based system for structure prediction that achieves high accuracy across a wide range of targets. We demonstrated our system in the 14th biennial Critical Assessment of Protein Structure Prediction (CASP14) across a wide range of difficult targets, where the assessors judged our predictions to be at an accuracy “competitive with experiment” for approximately 2/3rds of proteins. The talk will focus on the underlying machine learning ideas, while also touching on the implications for biological research.
Bio: Michael Figurnov is a Staff Research Scientist at DeepMind. He has been working with the AlphaFold team for the past four years. Before joining DeepMind, he did his Ph.D. in Computer Science at the Bayesian Methods Research Group under the supervision of Dmitry Vetrov. His research interests include deep learning, Bayesian methods, and machine learning for biology.
Thursday November 10th, 12:30p ET (CS 105)
Dr. John Ingraham (Generate Biomedicines)
|1||September 8||Course overview; Introduction to machine learning in structural biology||
1. Dill et al. The Protein Folding Problem. Annual Review of Biophysics 2008.
|2||September 15||Protein structure prediction; CASP; supervised learning; The alphabet soup of protein-specific terminology and acronyms||
1. Senior, A.W., Evans, R., Jumper, J. et al. Improved protein structure prediction using potentials from deep learning. Nature 2020.
2. Ingraham, J. et al. Learning Protein Structure with a Differentiable Simulator. ICLR 2019 Oral. [Talk]
Optional further reading:
4. trRosetta: Yang et al. Improved protein structure prediction using predicted interresidue orientations. PNAS 2020.
|3||September 22||Breakthroughs in protein structure prediction||
1. Jumper, J., Evans, R., Pritzel, A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021.
2. Tunyasuvunakool, K., Adler, J., Wu, Z. et al. Highly accurate protein structure prediction for the human proteome. Nature 2021.
Optional further reading:
3. AlphaFold2 slides. [CASP14 talk] [Michael Figurnov slides]
5. Baek et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 2021. [paper]
6. Primer on transformers:  
|Guest Seminar (Michael Figurnov) + Paper discussion||N/A|
|4||September 29||Complexes, integrative modeling, and limits of structure prediction||
1. Evans et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv
2. Terwilliger et al. Improved AlphaFold modeling with implicit experimental information. bioRxiv
Optional further reading:
3. Nuclear pore complexes: https://www.science.org/doi/full/10.1126/science.abq4792?intcmp=trendmd-sci
4. ClusPro: https://www.nature.com/articles/nprot.2016.169
|Paper discussion||Assignment 1 due at 11am Fri, Sept 30th|
|5||October 6||Computer vision and cryo-EM||
1. Zhong et al. Reconstructing continuous distributions of protein structure from cryo-EM images. ICLR 2020 Spotlight.
2. Zhong et al. CryoDRGN: reconstruction of heterogeneous cryo-EM structures using neural networks. Nature Methods 2021. [pdf]
3. Mildenhall, Srinivasan, Tancik et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. ECCV 2020 Oral.
Optional further reading: Stay tuned on Slack.
|6||October 13||Physics-based modeling||Paper discussion||Assignment 2 due at 11am Fri, Oct 14|
|7||October 20||No class -- Fall Recess||N/A||N/A|
|8||October 27||Geometric deep learning and drug discovery||Paper discussion||N/A|
|9||November 3||Computational protein design||Guest lecture||(Tentative) Assignment 3 due|
|10||November 10||Protein design continued; DNA/RNA structure||Guest Seminar (John Ingraham) + Paper Discussion||N/A|
|11||November 17||Protein language models||Paper discussion||(Tentative) Assignment 4 due|
|12||November 24||No class -- Thanksgiving||N/A||N/A|
|13||December 1||No class -- NeurIPS||N/A||(Tentative) Assignment 5 due|
|14||December 8||Generative modeling of sequence and structure||Paper discussion||(Tentative) Assignment 6 due|
|15||December 15||Structural bioinformatics||Paper discussion||N/A|