As Richard Feynman said in The Feynman Lectures on Physics, “if we were to name the most powerful assumption of all, which leads one on and on in an attempt to understand life, it is that all things are made of atoms, and that everything that living things do can be understood in terms of the jigglings and wigglings of atoms.” Understanding how these 'jigglings and wigglings of atoms' translate into the mechanisms of key biological processes is an exciting topic of modern science. I will be talking about upcoming projects in our new little team, most but not all of which relate to elucidating mechanisms of biological temperature sensing.
Cryo-electron microscopy (cryo-EM) extracts single-particle density projections of individual biomolecules. Although cryo-EM is widely used for 3D reconstruction, its single-particle nature gives it the potential to provide information about a biomolecule's conformational variability and underlying free energy landscape. However, treating cryo-EM as a single-molecule technique is challenging because of the low signal-to-noise ratio in the individual particles. In this work, we developed a Bayesian framework that uses a path collective variable to extract free energy profiles and their uncertainties from cryo-EM images. We tested the framework on several systems, finding that for realistic cryo-EM environments and relevant biomolecular systems, it is possible to recover the underlying free energy.
We are developing 3D models for instance segmentation of nuclei during mouse embryogenesis. To that end, we are creating the first 3D dataset of nuclei acquired with light-sheet microscopy with 3D instances fully annotated. We are designing an end-to-end 3D instance segmentation model, using extensive data augmentation based on the noise properties of our light-sheet microscope and a statistical model of nuclei shape properties.
Protein design involves hard combinatorial problems, and we have shown that these can be mapped to current-generation quantum computers. This talk will briefly summarize our past work in applying quantum computing to design molecules (including candidate COVID-19 therapeutics), and will introduce new directions in which we hope to expand this nascent field in the coming year.
The function of organs is underpinned by their three-dimensional structures, but it is challenging to precisely characterize the molecular features of these structures. Here we address this challenge specifically as it relates to the complex process of nephrogenesis, which is critical to understand for improving treatment for congenital abnormalities of the kidney and building kidney organoids. We develop a three-dimensional model of the developing human kidney by integrating single-cell expression data with three-dimensional protein confocal microscopy images using a data-driven framework.
Complex and active fluids can serve as simplified models for many biophysical systems. I'll discuss some methods we've developed to enable fast, scalable simulations of these fluids confined within evolving geometries, such as deforming droplets or vesicles.
I will present some large-scale bio-assembly simulations and how we implemented them.
Despite the strong genetic basis of developmental disorders, the underlying molecular mechanisms are largely unmapped. RNA-binding proteins (RBPs) are responsible for most posttranscriptional regulation and thus act as key gatekeepers of cellular homeostasis. Here, we quantify the RBP target site dysregulation effects of inherited and de novo mutations and investigate their contribution to disease risk.
The microbiome consists of the microbial cells in and around our body, which account for about 60% of our cells and give rise to 99% of the genes in our body. We are just beginning to understand the role of the human gut microbiome in numerous diseases such as type 1 diabetes, immune system disorders, and digestive disorders such as ulcerative colitis and Crohn’s disease. Here we take a structure-function perspective by running large-scale computational prediction of the structures, functions, and interactions of the proteins in these microbes to gain insights into disease mechanisms.
COVID-19 morbidity and mortality are increased via unknown mechanisms in patients with diabetes and kidney disease. I will discuss our work developing HumanBase and how the system was used to find molecular network modules induced in kidney cells susceptible to SARS-CoV-2 infection in diabetic kidney disease and linked to viral processes, such as viral entry, immune activation, endomembrane reorganization, and RNA processing.
The mitotic spindle is the central apparatus driving cell division in all eukaryotic cells. I will present a simple mathematical model of spindle dynamics and positioning, which quantitatively explains the spindle's universal behavior across nematode species.
The COVID-19 pandemic has already caused over 2 million deaths globally, mostly due to acute lung injury and acute respiratory distress syndrome. To better interpret the immune and cellular responses associated with viral and bacterial infection, we are collaborating with Mount Sinai to generate comprehensive cellular maps capturing more than ten different types of infection, including SARS-CoV-2, through single-cell sequencing of human blood samples.
Our main concern is this: How can large functional structures emerge from smaller-scale agents, and how are these processes of self-organization controlled? I will give a brief overview of the current challenges and our recent progress in attacking these questions, focussing on the physics of the cytoskeleton.
Interneurons within the mature cortex are remarkably diverse in their morphology, connectivity, and transcriptional signatures. How developmental changes in chromatin structure contribute to the emergence of distinct interneuron subtypes is unknown. In recent years, single-cell ATAC-seq has become the leading assay for probing the chromatin regulatory landscape. How do we discover structure in this high-dimensional data and use it to understand interneuron development? During this talk, I will describe Bayesian state-space models to characterize chromatin information. I will use these models to describe the chromatin developmental landscape of cortical interneurons. By focusing on two of the largest cardinal classes of interneurons, I identify three epochs during which chromatin is refined. After identifying enriched transcriptional regulators at different epochs, I harness transcriptional and chromatin information to calculate cell type-specific gene regulatory networks. I will show that remodeling of gene regulatory networks echoes chromatin reshaping during different epochs. Finally, I will briefly demonstrate that gene regulatory networks provide a framework to predict
I will be talking about the work I have done over the last year to modernize both the HumanBase pipeline and the associated Sleipnir library. I will cover the transition to a more principled architecture for both projects, as well as future work.
Computing sequence similarity is a fundamental task in biology, with alignment forming the basis for the annotation of genes and genomes and providing the core data structures for evolutionary analysis. Standard approaches are a mainstay of modern molecular biology and rely on variations of edit distance to obtain explicit alignments between pairs of biological sequences. However, sequence alignment algorithms struggle with remote homology tasks and cannot identify similarities between many pairs of proteins with similar structures and likely homology. Recent work suggests that machine learning language models can improve remote homology detection. To this end, we introduce DeepBLAST, which obtains explicit alignments from residue embeddings learned from a protein language model integrated into an end-to-end differentiable alignment framework. This approach can be accelerated on GPU architectures and outperforms conventional sequence alignment techniques in terms of both speed and accuracy when identifying structurally similar proteins.
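For readers unfamiliar with the "variations of edit distance" that standard aligners build on, the following is a minimal sketch of unit-cost edit-distance alignment with traceback. It illustrates only the classical baseline and is not DeepBLAST; the function name `edit_distance_alignment` and the unit costs are assumptions chosen for illustration, and real aligners use substitution matrices and gap penalties instead.

```python
def edit_distance_alignment(a: str, b: str):
    """Return (distance, aligned_a, aligned_b) under unit-cost edit distance.

    Illustrative baseline only: every substitution, insertion, and
    deletion costs 1 (an assumption, not a biological scoring scheme).
    """
    n, m = len(a), len(b)
    # dp[i][j] = minimum number of edits turning a[:i] into b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i  # delete all of a[:i]
    for j in range(1, m + 1):
        dp[0][j] = j  # insert all of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = dp[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            delete = dp[i - 1][j] + 1
            insert = dp[i][j - 1] + 1
            dp[i][j] = min(substitute, delete, insert)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            out_a.append(a[i - 1])
            out_b.append("-")  # gap in b
            i -= 1
        else:
            out_a.append("-")  # gap in a
            out_b.append(b[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    dist, row_a, row_b = edit_distance_alignment("kitten", "sitting")
    print(dist)
    print(row_a)
    print(row_b)
```

Because the score depends only on exact character matches, this kind of alignment loses signal when two proteins share structure but little sequence identity, which is the remote homology setting the abstract describes.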
During early Drosophila embryogenesis, a network of gene regulatory interactions orchestrates terminal patterning responsible for the subsequent formation of the gut. We have used the MS2/MCP mRNA labeling system to obtain live imaging of gene expression at the posterior end of the fly embryo. We use this data to demonstrate our ability to construct an integrated view of multivariable gene expression dynamics in the embryo, with our analysis revealing low intrinsic dimensionality of posterior patterning.
Many developing biological systems feature tissues that mold their own structure using active forces generated at their surfaces by populations of force-producing molecules. I will present a 3D model of these tissues in a rectangular geometry that emphasizes tracking how the material moves in three dimensions, deep into the tissue. This model is suited to studying cell deformations that are non-uniform along the direction perpendicular to a planar tissue.
AMBER is a fully automated framework to efficiently design and apply CNNs for genomic sequences. Interpretation of AMBER architecture search revealed its design principles of utilizing the full space of computational operations for accurately modelling genomic sequences. We illustrated the use of AMBER to accurately discover functional genomic variants in allele-specific binding and disease heritability enrichment.
Chromosome conformation capture technologies (3C, Hi-C, Micro-C, etc.), assays that map nucleotide separation of genomic regions to their average proximity in space, have greatly increased our understanding of how DNA is organized. However, a full picture of the physical mechanisms behind the hierarchical and dynamic structures of chromosomes remains elusive. I am developing methods based on Multi-View Convolutional Neural Networks (MVCNN) to more efficiently search high-resolution 3C data for overlooked topological features. These features then inform physical theories on how chromosomes organize, why they follow certain principles, and what may drive organization in the first place.
A time-honored problem in the calculus of variations is to show that a soap film suspended between two symmetric rings forms a catenoid. Inspired by experiments from the Dogic (UCSB) and Sharma (IIS) Labs, we will explore the shapes of a membrane with a resistance to bending and a fixed area in this geometry. In so doing, we unify various classic shapes such as catenoids, thin tethers, and elastica as different limits of a single system.
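As background, the classical soap-film result mentioned above can be sketched as follows. This is the textbook area-minimization argument only, without the bending resistance or fixed-area constraint discussed in the talk; the symbols $r(z)$, $R$, $h$, and $a$ are notation assumed here for illustration (two rings of radius $R$ at heights $z = \pm h$).

```latex
% Area of a surface of revolution with profile r(z) between the two rings
A[r] = 2\pi \int_{-h}^{h} r \sqrt{1 + r'^2}\, dz ,
\qquad r(\pm h) = R .

% The integrand F(r, r') = r\sqrt{1 + r'^2} has no explicit z-dependence,
% so the Beltrami identity F - r'\,\partial F/\partial r' = a holds:
r\sqrt{1 + r'^2} - \frac{r\, r'^2}{\sqrt{1 + r'^2}}
  = \frac{r}{\sqrt{1 + r'^2}} = a .

% Solving r' = \pm\sqrt{r^2/a^2 - 1} with the symmetry r(z) = r(-z) gives the catenoid
r(z) = a \cosh\!\left(\frac{z}{a}\right),
\qquad R = a \cosh\!\left(\frac{h}{a}\right) .
```

The last condition fixes the neck radius $a$ in terms of the ring radius and separation.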
Registration of histology images from multiple sources is a pressing problem in large-scale studies of spatial -omics data. Researchers often perform “common coordinate registration,” akin to segmentation, in which samples are partitioned based on tissue type to allow for quantitative comparison of similar regions across samples. I will present a novel CNN architecture designed to address this problem and motivated in particular by the data acquired from spatial transcriptomics experiments.
The cell cytoskeleton - the active structure that drives cellular motion - is an active chiral fluid. Chirality refers to the breaking of left-right symmetry and can be observed experimentally. We develop a framework for coarse graining chiral filament-filament interactions to obtain an active gel theory for living chiral materials.
I will share some preliminary results from data analysis work searching for patterns in gene expression data from a longitudinal COVID-19 study. Time permitting, I will also discuss other avenues for my future work here.
Both in the germline and in embryos, proper development depends on the control of when cells start dividing, which cells divide in synchrony, and when divisions halt. The genesis of germline cysts via rounds of incomplete cytokinesis inspires a minimal model of cells that divide only a finite number of times and that remain coupled to their progeny.
The transport of cargo into and out of the nucleus is crucial for the functioning of all eukaryotic cells. Although the migration of ions and small molecules can occur diffusively, the nuclear transport of larger macromolecules generally requires an active process. Here, a physical model of this mechanism is discussed that allows us to describe the temporal and spatial distribution of cargo within the nucleus, and its role in the regulation of other cellular processes such as chromatin remodeling and transcription.
I will share my work and process of designing, deploying and maintaining the tools of HumanBase.
I will talk about a one-dimensional reduced model for ciliary beating. This model captures some salient features of hydrodynamic synchronization in ciliary arrays and also highlights the role of heterogeneity in metachronal coordination.
Tissue morphogenesis is driven by cell rearrangements, which stem from mechanical forces generated at membrane interfaces and stabilized by tricellular junctions. Existing vertex models are successful in describing bulk tissue flow, but do not resolve cellular interfaces consisting of two separate membranes. We present our progress on a model that captures single-cell-level dynamics during rearrangements in small clusters of cells, driven by adhesions between neighboring cell membranes.
Katie Pollard, PhD, is the Director of the Gladstone Institute of Data Science and Biotechnology, Professor of Epidemiology and Biostatistics at UC San Francisco, Director of the Biomedical Informatics Graduate Program at UC San Francisco, and an Investigator at the Chan Zuckerberg Biohub.
Katie Pollard earned her BA at Pomona College and her master’s degree and PhD in biostatistics from UC Berkeley. At Berkeley, she developed computationally intensive statistical methods for the analysis of microarray data with applications in cancer biology. She implemented these approaches in Bioconductor, an open-source software program used with high-throughput genomic data. As a comparative genomics postdoctoral fellow at UC Santa Cruz, Pollard participated in the Chimpanzee Genome Project and used this sequence to identify the fastest-evolving regions in the human genome, known as Human Accelerated Regions.
Before joining Gladstone, Pollard was an assistant professor in the Genome Center and Department of Statistics at UC Davis. She was awarded the Thomas J. Watson Fellowship in 1995 and the Sloan Research Fellowship in 2008. Pollard is a fellow of the California Academy of Sciences and a Chan Zuckerberg Biohub Investigator. In 2018, she became the founding director of the Gladstone Institute of Data Science and Biotechnology. Pollard is a member of the American Society of Human Genetics, the American Statistical Association, and the International Society for Computational Biology.
(1) Interpretability of Machine Learning
(2) Cells and Tissues
(3) Molecular Dynamics
I will review our computational platform for large-scale three-dimensional simulations of flexible filaments, motor proteins, and rigid bodies in a Stokesian fluid.
Incomplete cytokinesis leads to the formation of daughter cells that are connected through arrested cleavage furrows. These develop into stable intercellular bridges that are critical for processes such as cell differentiation and cell cycle synchronization. We show that precise control of the magnitude and duration of contractility, together with the evolving stiffness of the F-actin ring, is essential for assembling intercellular bridges with experimentally observed aspect ratios.
Single-cell RNA-seq and ATAC-seq provide a means to capture transcriptomic and epigenomic information from individual cells. In cell populations undergoing dynamic processes, pseudotemporal ordering aims to arrange cells by their progression through the process. Many algorithms have been developed for this task, employing techniques from dimensionality reduction and graph construction; however, approaches to infer gene regulatory networks (GRNs) across cellular trajectories remain limited. In this brief talk, I will discuss prospects for learning GRNs in low-dimensional representations of pseudotemporally ordered transcriptomic and epigenomic data.
Using numerical methods, we explore spatial symmetry breaking in reaction-diffusion processes in a model cell, with an emphasis on the role cell geometry plays in localizing steady-state patterns. These results provide greater insight into mechanisms of cell polarization during development.
Methods to elucidate single-cell gene regulatory networks must incorporate several orthogonal genomic data types. In my talk, I will present gene regulatory networks inferred within the developing Drosophila optic lobe using the network inference method, the Inferelator.
Infiltration of CD4+ T cells into the inflamed intestines of inflammatory bowel disease (IBD) patients is important for disease pathogenesis. Despite many efforts to understand priming and differentiation of CD4+ T cells in IBD, epigenetic characterization of primary CD4+ T cells isolated from intestinal tissues in IBD patient samples is lacking. We integrated ATAC-seq and RNA-seq profiles from primary CD4+ T cells isolated from intestinal pinch biopsies of IBD patients to characterize regulatory networks in inflammation.
Higher-order spectral statistics like the bispectrum have proven to be natural tools for analysing magnetohydrodynamic turbulence by identifying non-linear scaling correlations and breaking degeneracy in lower-order statistics like the power spectrum. We have found that the bicoherence, a related statistic, is able to pick out driving scales in supersonic turbulence, with the potential to be used on galaxy observational data. These statistics can shed light on the parallels between classical turbulence and turbulence in living fluids such as active nematics.
Apoorva Mandavilli is a health and science reporter at The New York Times, where she writes mostly about infectious diseases. She has also written for The Atlantic, The New Yorker online, Slate, Nature, Scientific American and others.
Apoorva is the 2019 winner of the Victor Cohn Prize for Excellence in Medical Reporting. She likes to tell stories about complex science through the lives of people directly affected by it — whether that’s the second person ever to be cured of H.I.V., women with autism, the people who live near the abandoned Bhopal factory or a man recovering from a traumatic brain injury. Apoorva was the founding editor and editor-in-chief of the autism news site Spectrum from its launch in 2008 through May 2020. With her colleague Nidhi Subbaraman, she launched Culture Dish, a nonprofit dedicated to enhancing diversity in science journalism. For four years, she also served as an adjunct professor at New York University’s Science Health and Environmental Reporting Program.
(1) Combining Machine Learning and Dynamical Biophysical Models
(2) Reproducibility, Software Development, and Benchmarking
(3) CCB Group Activities