# CCB Virtual Retreat 2021

America/New_York
Zoom

#### Zoom

Topic: CCB Retreat - 1/20-1/22
Join from PC, Mac, Linux, iOS or Android: https://simonsfoundation.zoom.us/j/91900242103?pwd=Qmp0Uk00TDYrZmxab2J1QmVnQ09Vdz09
Passcode: 133184
Or Telephone: Dial (for higher quality, dial a number based on your current location):
US: +1 646 558 8656 or +1 312 626 6799 or +1 301 715 8592 or +1 669 900 6833 or +1 253 215 8782 or +1 346 248 7799
Meeting ID: 919 0024 2103
Passcode: 133184
Indira Goris
• Tuesday, January 19
• 5:00 PM 7:00 PM
CCB Happy Hour/Wine Tasting: Happy Hour/Wine Tasting
• Wednesday, January 20
• 10:00 AM 10:10 AM
Welcome & Opening 10m
Speaker: Michael J. Shelley
• 10:10 AM 10:30 AM
Systems Biology in 2021 20m
Speaker: Rich Bonneau
• 10:30 AM 10:50 AM
Genomics in 2021 20m
Speaker: Olga Troyanskaya
• 10:50 AM 11:10 AM
Biophysical Modeling in 2021 20m
Speaker: Michael Shelley
• 11:10 AM 11:30 AM
Developmental Dynamics in 2021 20m
Speaker: Stas Shvartsman
• 11:30 AM 11:40 AM
Break/Overflow Time 10m
• 11:40 AM 11:50 AM
Excited about Structural and Molecular Biophysics (Sonya Hanson, Structural and Molecular Biophysics) 10m

As Richard Feynman said in The Feynman Lectures on Physics, “if we were to name the most powerful assumption of all, which leads one on and on in an attempt to understand life, it is that all things are made of atoms, and that everything that living things do can be understood in terms of the jigglings and wigglings of atoms.” Understanding how these 'jigglings and wigglings of atoms' translate into the mechanisms of key biological processes is an exciting topic of modern science. I will be talking about upcoming projects in our new little team, most but not all of which relate to elucidating mechanisms of biological temperature sensing.

Speaker: Sonya Hanson
• 11:50 AM 12:00 PM
A Bayesian Approach for Extracting Free Energy Profiles from Cryo-electron Microscopy Experiments Using a Path Collective Variable (Pilar Cossio, Structural and Molecular Biophysics) 10m

Cryo-electron microscopy (cryo-EM) extracts single-particle density projections of individual biomolecules. Although cryo-EM is widely used for 3D reconstruction, due to its single-particle nature, it has the potential to provide information about a biomolecule's conformational variability and underlying free energy landscape. However, treating cryo-EM as a single-molecule technique is challenging because of the low signal-to-noise ratio in the individual particles. In this work, we developed a Bayesian framework that uses a path collective variable to extract free energy profiles and their uncertainties from cryo-EM images. We tested the framework over several systems, finding that for realistic cryo-EM environments, and relevant biomolecular systems, it is possible to recover the underlying free energy.

Speaker: Pilar Cossio
• 12:00 PM 12:10 PM
Cell Division, Energy and Embryos (Daniel Needleman, Biophysical Modeling) 10m
Speaker: Daniel Needleman
• 12:10 PM 12:20 PM
3D Instance Segmentation of Nuclei in Mouse Embryogenesis (Lisa Brown, Developmental Dynamics) 10m

We are developing 3D models for instance segmentation of nuclei during mouse embryogenesis. To that end, we are creating the first 3D dataset of nuclei acquired with light-sheet microscopy with 3D instances fully annotated. We are designing an end-to-end 3D instance segmentation model, using extensive data augmentation based on the noise properties of our light-sheet microscope and a statistical model of nuclei shape properties.

Speaker: Lisa Brown
• 12:20 PM 12:30 PM
Further Applications of Quantum Computing for Biology and Biochemistry (Vikram Mulligan, Systems Biology) 10m

Protein design involves hard combinatorial problems, and we have shown that these can be mapped to current-generation quantum computers. This talk will briefly summarize our past work in applying quantum computing to design molecules (including candidate COVID-19 therapeutics), and will introduce new directions in which we hope to expand this nascent field in the coming year.

Speaker: Vikram Mulligan
• 12:30 PM 2:00 PM
Lunch/Break 1h 30m
• 1:00 PM 2:00 PM
Yoga (Lucianna Silvestri) 1h
• 2:00 PM 2:10 PM
Building a Three-Dimensional Model of Human Nephrogenesis (Rachel Sealfon, Genomics) 10m

The function of organs is underpinned by their three-dimensional structures, but it is challenging to precisely characterize the molecular features of these structures. Here we address this challenge specifically as it relates to the complex process of nephrogenesis, which is critical to understand for improving treatment for congenital abnormalities of the kidney and building kidney organoids. We develop a three-dimensional model of the developing human kidney by integrating single-cell expression data with three-dimensional protein confocal microscopy images using a data-driven framework.

Speaker: Rachel Sealfon
• 2:10 PM 2:20 PM
Simulating Active Fluids (David Stein, Biophysical Modeling) 10m

Complex and active fluids can serve as simplified models for many biophysical systems. I'll discuss some methods we've developed to enable fast, scalable simulations of these fluids confined within evolving geometries, such as deforming droplets or vesicles.

Speaker: David Stein
• 2:20 PM 2:30 PM
AMSOS: Biology, Optimization, and HPC (Wen Yan, Biophysical Modeling) 10m

Some large scale bio-assembly simulations and how we implemented them.

Speaker: Wen Yan
• 2:30 PM 2:40 PM
The Impact of RNA-Binding Protein Dysregulation on Disease Risk (Christopher Park, Genomics) 10m

Despite the strong genetic basis of developmental disorders, the underlying molecular mechanisms are largely unmapped. RNA-binding proteins (RBPs) are responsible for most posttranscriptional regulation and thus act as key gatekeepers of cellular homeostasis. Here, we quantify the RBP target site dysregulation effects of inherited and de novo mutations and investigate their contribution to disease risk.

Speaker: Christopher Park
• 2:40 PM 2:50 PM
Break/Overflow Time 10m
• 2:50 PM 3:20 PM
Mapping the Microbiome by Large-Scale Structure and Function Prediction (Julia Koehler, Doug Renfrew, and Vladimir Gligorijevic, Systems Biology) 30m

The microbiome consists of the microbial cells in and around our body, which account for about 60% of its cells and give rise to 99% of its genes. We are just beginning to understand the role of the human gut microbiome in numerous diseases such as type 1 diabetes, immune system disorders and digestive disorders such as ulcerative colitis and Crohn’s disease. Here we take a structure-function perspective by running large-scale computational prediction of structures, functions and interactions of the proteins in these microbes to gain insights into disease mechanisms.

Speakers: Doug Renfrew, Julia Koehler, Vladimir Gligorijevic
• 3:20 PM 3:30 PM
SARS-CoV-2 receptor networks in diabetic and COVID-19–associated kidney disease (Aaron Wong, Genomics) 10m

COVID-19 morbidity and mortality are increased via unknown mechanisms in patients with diabetes and kidney disease. I will discuss our work developing HumanBase and how the system was used to find molecular network modules induced in kidney cells susceptible to SARS-CoV-2 infection in diabetic kidney disease and linked to viral processes, such as viral entry, immune activation, endomembrane reorganization, and RNA processing.

Speaker: Aaron Wong
• 3:30 PM 3:40 PM
Biophysical modeling of early embryogenesis in nematodes (Reza Farhadifar, Biophysical Modeling) 10m

The mitotic spindle is the central apparatus driving cell division in all eukaryotic cells. I will present a simple mathematical model of spindle dynamics and positioning, which quantitatively explains the spindle's universal behavior across nematode species.

Speaker: Reza Farhadifar
• 3:40 PM 3:50 PM
Single Cell Epigenetics and Transcriptomics Data Analysis for Viral and Bacterial Infection (Xi Chen, Genomics) 10m

The COVID-19 pandemic has already caused over 2 million deaths globally, mostly due to acute lung injury and acute respiratory distress syndrome. To better interpret immune and cellular responses associated with viral/bacterial infection, we collaborate with Mount Sinai and generate comprehensive cellular maps capturing more than ten different types of infections including SARS-CoV-2 through single-cell sequencing of human blood samples.

Speaker: Xi Chen
• 4:00 PM 4:10 PM
From Molecular Scale Interactions to Phenotypes: Design Principles for Living Materials (Sebastian Fürthauer, Biophysical Modeling) 10m

Our main concern is this: How can large functional structures emerge from smaller scale agents and how are these processes of self-organization controlled? I will give a brief overview of the current challenges and our recent progress in attacking these questions, focussing on the physics of the cytoskeleton.

Speaker: Sebastian Fürthauer
• 4:20 PM 4:35 PM
BREAKOUT SESSION PLANNING 15m
• Thursday, January 21
• 10:00 AM 10:10 AM
Algorithms for Describing and Predicting Chromatin Remodeling during Cortical Interneuron Development (Mariano Gabitto, Systems Biology) 10m

Interneurons within the mature cortex are remarkably diverse in their morphology, connectivity, and transcriptional signatures. How developmental changes in chromatin structure contribute to the emergence of distinct interneuron subtypes is unknown. In recent years, single-cell ATAC-seq has become the leading assay for probing the chromatin regulatory landscape. How do we discover structure in this high-dimensional data and use it to understand interneuron development? During this talk, I will describe Bayesian state-space models to characterize chromatin information. I will use these models to describe the chromatin developmental landscape of cortical interneurons. By focusing on two of the largest cardinal classes of interneurons, I identify three epochs during which chromatin is refined. After identifying enriched transcriptional regulators at different epochs, I harness transcriptional and chromatin information to calculate cell type-specific gene regulatory networks. I will show that remodeling of gene regulatory networks echoes chromatin reshaping during different epochs. Finally, I will briefly demonstrate that gene regulatory networks provide a framework to predict

Speaker: Mariano Gabitto
• 10:10 AM 10:20 AM
HumanBase and Sleipnir: Architectures for Large Scale Genomics (Jerry Vinokurov, Genomics) 10m

I will be talking about the work I have done in the last year of modernizing both the HumanBase pipeline and the associated Sleipnir library. I will cover the transition to a more principled architecture for both projects and future work.

Speaker: Jerry Vinokurov
• 10:20 AM 10:30 AM
Protein Structural Alignments From Sequence (Jamie Morton, Systems Biology) 10m

Computing sequence similarity is a fundamental task in biology, with alignment forming the basis for the annotation of genes and genomes and providing the core data structures for evolutionary analysis. Standard approaches are a mainstay of modern molecular biology and rely on variations of edit distance to obtain explicit alignments between pairs of biological sequences. However, sequence alignment algorithms struggle with remote homology tasks and cannot identify similarities between many pairs of proteins with similar structures and likely homology. Recent work suggests that using machine learning language models can improve remote homology detection. To this end, we introduce DeepBLAST, which obtains explicit alignments from residue embeddings learned from a protein language model integrated into an end-to-end differentiable alignment framework. This approach can be accelerated on GPU architectures and outperforms conventional sequence alignment techniques in terms of both speed and accuracy when identifying structurally similar proteins.

Speaker: Jamie Morton
• 10:30 AM 10:40 AM
Integrated view of Posterior Patterning in Drosophila Embryo (Maria Avdeeva, Developmental Dynamics) 10m

During early Drosophila embryogenesis, a network of gene regulatory interactions orchestrates terminal patterning responsible for the subsequent formation of the gut. We have used the MS2/MCP mRNA labeling system to obtain live imaging of gene expression at the posterior end of the fly embryo. We use this data to demonstrate our ability to construct an integrated view of multivariable gene expression dynamics in the embryo, with our analysis revealing low intrinsic dimensionality of posterior patterning.

Speaker: Maria Avdeeva
• 10:40 AM 10:50 AM
Mechanics of Planar Tissues with Active Surfaces (XinXin Du, Biophysical Modeling) 10m

Many biological developmental systems feature tissues that mold their own structure using active forces that live on their surfaces in the form of populations of force-producing molecules. I am presenting a 3D model of these tissues in a rectangular geometry that emphasizes tracking the internal 3D structure of how the material moves, deep into the plane of the tissue. This model benefits the study of cell deformations that are non-uniform along the direction perpendicular to a planar tissue.

Speaker: XinXin Du
• 10:50 AM 11:00 AM
AutoML for Genomics (Frank [Zijun] Zhang, Genomics) 10m

AMBER is a fully automated framework to efficiently design and apply CNNs for genomic sequences. Interpretation of AMBER architecture search revealed its design principles of utilizing the full space of computational operations for accurately modelling genomic sequences. We illustrated the use of AMBER to accurately discover functional genomic variants in allele-specific binding and disease heritability enrichment.

Speaker: Frank (Zijun) Zhang
• 11:00 AM 11:10 AM
What is hiding in chromosome conformation capture data? Searching for overlooked mechanisms that organize DNA 10m

Chromosome conformation capture technologies (3C, Hi-C, Micro-C, etc.), assays that map nucleotide separation of genomic regions to their average proximity in space, have greatly increased our understanding of how DNA is organized. However, a full picture of the physical mechanisms behind the hierarchical and dynamic structures of chromosomes remains elusive. I am developing methods from Multi-View Convolutional Neural Networks (MVCNN) to more efficiently search high-resolution 3C data for overlooked topological features. These features then inform physical theories on how chromosomes organize, why they follow certain principles, and what may drive organization in the first place.

• 11:10 AM 11:20 AM
Axisymmetric Membranes with Edges under External Force (Leroy Jia, Biophysical Modeling) 10m

A time-honored problem in the calculus of variations is to show that a soap film suspended between two symmetric rings forms a catenoid. Inspired by experiments from the Dogic (UCSB) and Sharma (IIS) Labs, we will explore the shapes of a membrane with a resistance to bending and a fixed area in this geometry. In so doing, we unify various classic shapes such as catenoids, thin tethers, and elastica as different limits of a single system.

Speaker: Leroy Jia
• 11:20 AM 11:30 AM
Common Coordinate Registration of High-Resolution Histology Images (Aidan Daly, Systems Biology) 10m

Registration of histology images from multiple sources is a pressing problem in large-scale studies of spatial -omics data. Researchers often perform “common coordinate registration,” akin to segmentation, in which samples are partitioned based on tissue type to allow for quantitative comparison of similar regions across samples. I will present a novel CNN architecture designed to address this problem and motivated in particular by the data acquired from spatial transcriptomics experiments.

Speaker: Aidan Daly
• 11:30 AM 11:40 AM
A Theoretical Framework for Active Chiral Filaments (Aleks Plochocka, Biophysical Modeling) 10m

The cell cytoskeleton - the active structure that drives cellular motion - is an active chiral fluid. Chirality refers to the breaking of left-right symmetry and can be observed experimentally. We develop a framework for coarse graining chiral filament-filament interactions to obtain an active gel theory for living chiral materials.

Speaker: Aleks Plochocka
• 11:40 AM 11:50 AM
Longitudinal Data Analysis of COVID-19 (Natalie Sauerwald, Genomics) 10m

I will share some preliminary results from data analysis work searching for patterns in gene expression data from a longitudinal COVID-19 study. Time permitting, I will also discuss other avenues for my future work here.

Speaker: Natalie Sauerwald
• 11:50 AM 12:00 PM
Learning to Count: The Problem of Transit Amplification (Hayden Nunley, Developmental Dynamics) 10m

Both in the germline and in embryos, proper development depends on the control of when cells start dividing, which cells divide in synchrony, and when divisions halt. The genesis of germline cysts via rounds of incomplete cytokinesis inspires a minimal model of cells that divide only a finite number of times and that remain coupled to their progeny.

Speaker: Hayden Nunley
• 12:00 PM 12:10 PM
Active Nuclear Transport in Living Cells (Alex Rautu, Biophysical Modeling) 10m

The transport of cargo into and out of the nucleus is crucial for the functioning of all eukaryotic cells. Although the migration of ions and small molecules can occur diffusively, the nuclear transport of larger macromolecules generally requires an active process. Here, a physical model of this mechanism is discussed that allows us to describe the temporal and spatial distribution of cargo within the nucleus, and its role in the regulation of other cellular processes such as chromatin remodeling and transcription.

Speaker: Alex Rautu
• 12:10 PM 12:20 PM
Building Tools For Biologists on the Web (Julien Funk, Genomics) 10m

I will share my work and process of designing, deploying and maintaining the tools of HumanBase.

Speaker: Julien Funk
• 12:20 PM 12:30 PM
A Minimal Model for Ciliary Coordination (Brato Chakrabarti, Biophysical Modeling) 10m

I will talk about a one-dimensional reduced model for ciliary beating. This model captures some salient features of hydrodynamic synchronization in ciliary arrays and also highlights the role of heterogeneity in metachronal coordination.

Speaker: Brato Chakrabarti
• 12:30 PM 12:40 PM
Sticky, Stretchy, Sliding Cells: A Minimal Model for Cell Intercalation (Tatyana Gavrilchenko, Developmental Dynamics) 10m

Tissue morphogenesis is driven by cell rearrangements, which stem from mechanical forces generated at membrane interfaces and stabilized by tricellular junctions. Existing vertex models are successful in describing bulk tissue flow, but do not resolve cellular interfaces consisting of two separate membranes. We present our progress on a model that captures single-cell-level dynamics during rearrangements in small clusters of cells, driven by adhesions between neighboring cell membranes.

Speaker: Tatyana Gavrilchenko
• 12:40 PM 2:00 PM
Lunch/Break 1h 20m
• 1:00 PM 2:00 PM
Trivia Challenge 1h
• 2:00 PM 3:30 PM
Keynote Speaker: Katie Pollard, PhD 1h 30m

Katie Pollard, PhD, is the Director of the Gladstone Institute of Data Science and Biotechnology, Professor of Epidemiology and Biostatistics at UC San Francisco, Director of the Biomedical Informatics Graduate Program at UC San Francisco, and an Investigator at the Chan Zuckerberg Biohub.

Katie Pollard earned her BA at Pomona College and her master’s degree and PhD in biostatistics from UC Berkeley. At Berkeley, she developed computationally intensive statistical methods for the analysis of microarray data with applications in cancer biology. She implemented these approaches in Bioconductor, an open-source software program used with high-throughput genomic data. As a comparative genomics postdoctoral fellow at UC Santa Cruz, Pollard participated in the Chimpanzee Genome Project and used this sequence to identify the fastest-evolving regions in the human genome, known as Human Accelerated Regions.

Before joining Gladstone, Pollard was an assistant professor in the Genome Center and Department of Statistics at UC Davis. She was awarded the Thomas J. Watson Fellowship in 1995 and the Sloan Research Fellowship in 2008. Pollard is a fellow of the California Academy of Sciences and a Chan Zuckerberg Biohub Investigator. In 2018, she became the founding director of the Gladstone Institute of Data Science and Biotechnology. Pollard is a member of the American Society of Human Genetics, the American Statistical Association, and the International Society for Computational Biology.

Speaker: Katie Pollard
• 3:30 PM 5:00 PM
BREAKOUT SESSIONS 1h 30m

(1) Interpretability of Machine Learning
(2) Cells and Tissues
(3) Molecular Dynamics

• Friday, January 22
• 9:30 AM 9:40 AM
Computational Platform for Simulation of Cellular Mechanics (Gokberk Kabacaoglu, Biophysical Modeling) 10m

I will review our computational platform for the large-scale three-dimensional simulations of flexible filaments, motor proteins and rigid bodies in a Stokesian fluid.

Speaker: Gokberk Kabacaoglu
• 9:40 AM 9:50 AM
Biophysics of Arrested Cytokinesis (Jaspreet Singh, Developmental Dynamics) 10m

Incomplete cytokinesis leads to the formation of daughter cells that are connected through arrested cleavage furrows. These develop into stable intercellular bridges that are critical for processes such as cell differentiation and cell cycle synchronization. We show that precise control of the magnitude and duration of contractility, together with the evolving stiffness of the F-actin ring, is essential for assembling intercellular bridges with experimentally observed aspect ratios.

Speaker: Jaspreet Singh
• 9:50 AM 10:00 AM
Pseudotemporal Ordering and Network Inference from Single Cell Technologies (Anders Rasmussen, Systems Biology) 10m

Single cell RNA-seq and ATAC-seq provide a means to capture transcriptomic and epigenomic information from individual cells. In cell populations undergoing dynamic processes, pseudotemporal ordering aims to arrange cells by their progression through this process. Many algorithms have been developed for this task, employing techniques in dimensionality reduction and graph construction; however, approaches to infer gene regulatory networks (GRNs) across cellular trajectories remain limited. In this brief talk, I will discuss prospects of learning GRNs in low-dimensional representations of pseudotemporally ordered transcriptomic and epigenomic data.

Speaker: Anders Rasmussen
• 10:00 AM 10:10 AM
Symmetry Breaking Mechanisms in Cell Polarization (Pearson Miller, Developmental Dynamics) 10m

Using numerical methods, we explore spatial-symmetry breaking in reaction-diffusion processes in a model cell, with an emphasis on the role cell geometry plays in localizing steady-state patterns. These results provide greater insight into mechanisms of cell polarization during development.

Speaker: Pearson Miller
• 10:10 AM 10:20 AM
Break/Overflow Time 10m
• 10:20 AM 10:30 AM
Inferring Single-Cell Gene Regulatory Networks in the Developing Drosophila Optic Lobe (Claudia Skok Gibbs, Systems Biology) 10m

Methods to elucidate single-cell gene regulatory networks must incorporate several orthogonal genomic datatypes. In my talk I will present gene regulatory networks inferred within the developing Drosophila optic lobe using the Inferelator network inference method.

Speaker: Claudia Skok Gibbs
• 10:30 AM 10:40 AM
Networks in Inflammation (Danxun Li, Systems Biology) 10m

Infiltration of CD4+ T cells into the inflamed intestines of inflammatory bowel disease (IBD) patients is important for disease pathogenesis. Despite many efforts to understand priming and differentiation of CD4+ T cells in IBD, epigenetic characterization of primary CD4+ T cells isolated from intestinal tissues in IBD patient samples is lacking. We integrated ATAC-seq and RNA-seq profiles from primary CD4+ T cells isolated from intestinal pinch biopsies of IBD patients to characterize regulatory networks in inflammation.

Speaker: Danxun Li
• 10:40 AM 10:50 AM
Identifying Turbulence Driving Scales Using the Bicoherence (Michael O'Brien, Biophysical Modeling) 10m

Higher-order spectral statistics like the bispectrum have proven to be natural tools for analysing magnetohydrodynamic turbulence by identifying non-linear scaling correlations and breaking degeneracy in lower order statistics like the power spectrum. We have found that the bicoherence, a related statistic, is able to pick out driving scales in supersonic turbulence with the potential to be used on galaxy observational data. These statistics can shed light on the parallels between classical turbulence and turbulence in living fluids such as active nematics.

Speaker: Michael O'Brien
• 11:00 AM 12:30 PM
Keynote Speaker: Apoorva Mandavilli 1h 30m

Apoorva Mandavilli is a health and science reporter at The New York Times, where she writes mostly about infectious diseases. She has also written for The Atlantic, The New Yorker online, Slate, Nature, Scientific American and others.
Apoorva is the 2019 winner of the Victor Cohn Prize for Excellence in Medical Reporting. She likes to tell stories about complex science through the lives of people directly affected by it — whether that’s the second person ever to be cured of H.I.V., women with autism, the people who live near the abandoned Bhopal factory or a man recovering from a traumatic brain injury. Apoorva was the founding editor and editor-in-chief of the autism news site Spectrum from its launch in 2008 through May 2020. With her colleague Nidhi Subbaraman, she launched Culture Dish, a nonprofit dedicated to enhancing diversity in science journalism. For four years, she also served as an adjunct professor at New York University’s Science Health and Environmental Reporting Program.

Speaker: Apoorva Mandavilli
• 12:30 PM 2:00 PM
Lunch/Break 1h 30m
• 1:00 PM 1:30 PM
Meditation 30m
• 2:00 PM 3:30 PM
BREAKOUT SESSIONS 1h 30m

(1) Combining Machine Learning and Dynamical Biophysical Models
(2) Reproducibility, Software Development, and Benchmarking
(3) CCB Group Activities

• 3:30 PM 3:40 PM
Closing Remarks 10m
Speaker: Michael J. Shelley