“Cosmic Connections: a Symposium to Explore the Intersection of Astrophysics and Machine Learning” at the Simons Foundation’s Flatiron Institute will take place from May 22 to May 24, 2023 in New York City. The symposium aims to bring together machine learning researchers and astrophysicists to discuss which astrophysical challenges are the most interesting and where state-of-the-art machine learning models are expected to outperform conventional methods. We aim for a relatively small symposium with ~100 participants to allow for brainstorming and collaboration building in a relatively informal atmosphere.
The overarching theme this year is unsupervised and generative models, which have surprised us with recent successes in transformers, diffusion models and foundation models. Related areas of interest include simulation-based inference and DL-accelerated simulations.
Invited Speakers include:
Peter Battaglia (Deepmind)
Josh Bloom (Berkeley)
Joan Bruna (NYU)
Diana Cai (Princeton)
Kyunghyun Cho (NYU)
Kate Storey-Fisher (NYU)
Marylou Gabrie (École Polytechnique (CMAP))
Shirley Ho (Simons Foundation/ NYU/ Princeton)
Tomasz Kacprzak (ETH)
Julia Kempe (NYU)
François Lanusse (CNRS)
Yann LeCun (NYU/ Meta)
Pablo Lemos (MILA/CCA)
Stephane Mallat (Simons Foundation/ ENS/ College De France)
Stephan Mandt (UCI)
Siddharth Mishra-Sharma (MIT)
Laurence Perreault Levasseur (University of Montreal / MILA)
Siamak Ravanbakhsh (McGill / MILA)
Irina Rish (University of Montreal/ MILA)
Anna Scaife (University of Manchester)
Jeff Schneider (CMU)
Uros Seljak (UC Berkeley)
David Spergel (Simons Foundation)
Yang Song (OpenAI)
Kimberly Stachenfeld (Deepmind)
Ashley Villar (PSU)
Ingo Waldmann (UCL)
Greg Yang (Microsoft Research)
Chair: Julia Kempe
Time-domain astrophysics, the study of cosmic phenomena which evolve on human timescales, is undergoing a Big Data revolution thanks to public, wide-field surveys of the night sky. Here, I will provide a broad overview of the technical and astrophysical challenges on the horizon. I will highlight recent examples which utilize unsupervised and generative models to identify, classify and analyze transient events in real time.
Chair: Julia Kempe
Physicians make critical time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing structured data based clinical predictive models have limited use in everyday practice due to complexity in data processing, model development, and deployment. Here, we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing to train a large language model for medical language (NYUTron), and subsequently finetune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system for five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay prediction, and insurance denial prediction. We show that NYUTron has an AUC of 78.7%-94.9%, with an improvement of 5.36%-14.7% AUC compared to traditional models. We additionally demonstrate the benefits of pretraining with clinical text, potential for increasing generalizability to different sites through finetuning, and demonstrate full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.
Chair: Julia Kempe
Recently, the theory of infinite-width neural networks led to the first technology, muTransfer, for tuning enormous neural networks that are too expensive to train more than once. For example, this allowed us to tune the 6.7 billion parameter version of GPT-3 using only 7% of its pretraining compute budget, and with some asterisks, we get a performance comparable to the original GPT-3 model with twice the parameter count. In this talk, I will explain the core insight behind this theory. In fact, this is an instance of what I call the Optimal Scaling Thesis, which connects infinite-size limits for general notions of “size” to the optimal design of large models in practice, illustrating a way for theory to reliably guide the future of AI. I'll end with several concrete key mathematical research questions whose resolutions will have incredible impact on how practitioners scale up their NNs.
Chair: Julia Kempe
10:10 AM Ching Yao Lai: Discovering blow-up solutions to fluids equations using PDE-constrained neural networks
10:20 AM Biwei Dai: Multiscale Flow for Robust and Optimal Cosmological Analysis
10:30 AM Christian Kragh Jespersen: Graphs and Galaxies
Chair: Julia Kempe
Simulation is one of the most important tools in science and engineering. However, accurate simulation faces two challenges: (1) heavy compute requirements, and (2) sophisticated underlying equations which require deep expertise to formulate. Recent advances in machine learning-based simulation are now addressing both challenges by (1) allowing dynamics to be modeled with cheaper representations and computations, and (2) learning dynamics models directly from data. This talk will survey advances in graph-based learned simulation from the past few years, then take a deep dive into recent advances in machine learning-based weather prediction that have resulted in learned simulators that outperform the top operational forecasting systems in the world.
Chair: Julia Kempe
In recent years, there have been significant advancements in data-driven approaches to global weather forecasting that have demonstrated accuracy competitive with modern operational systems. While current state-of-the-art learned models achieve lower errors at medium-range lead-times, physics-based models like IFS feature superior physical consistency. In this talk I’ll describe our ongoing research effort where we are developing a hybrid atmospheric model based on a differentiable dynamical core augmented with learned physics parameterizations, trained end-to-end. Specifically, I’ll discuss the rationale behind our model formulation and show preliminary results on accuracy, physical consistency and emergent long-term atmospheric phenomena.
Chair: Julia Kempe
The laws of physics obey exact symmetries. Equivariant machine learning (ML) approaches aim to encode these symmetries in our models, which is important for ensuring our astrophysical analyses are physically motivated as well as improving performance. In this talk, I will outline the physical symmetries we care about, current equivariant ML techniques, and relevant astrophysical and cosmological problems. I will present a recently developed approach based on invariant scalars and its application to characterizing the dark matter distribution in cosmological simulations, and discuss future directions and opportunities.
Chair: Julia Kempe
12:10 PM Ronan Legin: Posterior Sampling of the Initial Conditions of the Universe using Score-Based Generative Models
12:20 PM Moritz Muenchmeyer: Learning the Matter Distribution of the Universe with Normalizing Flows and Diffusion Models
12:30 PM Ruben Ohana: Direct Feedback Alignment as an alternative to Backpropagation
Chair: Uros Seljak
How could machines learn as efficiently as humans and animals?
How could machines learn how the world works and acquire common sense?
How could machines learn to reason and plan?
Current AI architectures, such as Auto-Regressive Large Language Models, fall short. I will propose a modular cognitive architecture that may constitute a path towards answering these questions. The centerpiece of the architecture is a predictive world model that allows the system to predict the consequences of its actions and to plan a sequence of actions that optimize a set of objectives. The world model employs a Hierarchical Joint Embedding Predictive Architecture (H-JEPA) trained with self-supervised learning. The JEPA learns abstract representations of the percepts that are simultaneously maximally informative and maximally predictable.
The corresponding working paper is available here:
https://openreview.net/forum?id=BZ5a1r-kVsf
Chair: Uros Seljak
The universe is a system of small components with local interactions, forming self-organizing patterns like stars and galaxies. Like a virtual universe, Lenia is an abstract complex system that generates a huge variety of self-organizing patterns. These patterns demonstrate biology-like behaviors, e.g. self-replication, regeneration, swarming, and also physics-like ones, e.g. geometric symmetry, elastic collisions, and invariance under various substrates. I will describe several ongoing research efforts on Lenia, including searching for Lenia patterns using machine learning and evolutionary algorithms, measuring their spatio-temporal stability, and experimenting with new system dynamics like mass conservation and particle energy minimization.
Chair: Uros Seljak
Latent variable models have been an integral part of probabilistic machine learning, ranging from simple mixture models to variational autoencoders to powerful diffusion probabilistic models at the center of recent media attention. Perhaps less well-appreciated is the intimate connection between latent variable models and compression, and the potential of these models for advancing natural science. I will begin by showcasing connections between variational methods and the theory and practice of neural data compression, ranging from constructing learnable codecs to assessing the fundamental compressibility of real-world data, such as images and particle physics data. I will then connect this lossy compression perspective to climate science problems, which often involve distribution shifts between unlabeled datasets, such as simulation data from different models or data simulated under different assumptions (e.g., global average temperatures). I will show that a combination of non-linear dimensionality reduction and vector quantization can assess the magnitude of these shifts and enable intercomparisons of different climate simulations. Additionally, when combined with physical model assumptions, this approach can provide insights into the implications of global warming on extreme precipitation.
Stephan Mandt is an Associate Professor of Computer Science and Statistics at the University of California, Irvine. From 2016 until 2018, he was a Senior Researcher and Head of the statistical machine learning group at Disney Research in Pittsburgh and Los Angeles. He held previous postdoctoral positions at Columbia University and Princeton University. Stephan holds a Ph.D. in Theoretical Physics from the University of Cologne in Germany, where he received the National Merit Scholarship. He received the NSF CAREER Award, the UCI ICS Mid-Career Excellence in Research Award, the German Research Foundation's Mercator Fellowship, and a Kavli Fellowship of the U.S. National Academy of Sciences. He is a member of the ELLIS Society and a former visiting researcher at Google Brain. Stephan will serve as the Program Chair of the AISTATS 2024 conference, currently serves as an Action Editor for JMLR and TMLR, and frequently serves as Area Chair for NeurIPS, ICML, AAAI, and ICLR.
Chair: Uros Seljak
4:10 PM Helen Qu: Enabling Time Domain Science with Deep Learning: From CNNs to Foundation Models
4:20 PM Niall Jeffrey: Solving scientific model comparison with Evidence Networks
Moderator: Shirley Ho
What can/will AI/AGI do for science?
Chair: Kyunghyun Cho
Democratization of AI/ML in astronomy has been fostered by increased awareness, powerful software tools, and improving education. Yet as diverse AI/ML methods begin to be infused into workflows and inference chains it is legitimate to ask how AI/ML has fundamentally and uniquely contributed to novel science. I address this question in the context of AI as an assistive tool in three contexts: 1) to leapfrog people-centric bottlenecks, 2) as a model-based computational accelerant, and 3) as a hypothesis generation engine. One recent effort of ours surfaces insights of large language models (LLMs) with a focus on user experience (UX). Another demonstrates an unexpected fundamental breakthrough in our understanding of the theory of microlensing via simulation-based inference.
Chair: Kyunghyun Cho
Simulation is important for countless applications in science and engineering, and there has been increasing interest in using machine learning for efficiency in prediction and optimization. In the first part of the talk, I will describe our work on training learned models for efficient turbulence simulation. Turbulent fluid dynamics are chaotic and therefore hard to predict, and classical simulators typically require expertise to produce and take a long time to run. We found that learned CNN-based simulators can learn to efficiently capture diverse types of turbulent dynamics at low resolutions, and that they capture the dynamics of a high-resolution classical solver more accurately than a classical solver run at the same low resolution. We also provide recommendations for producing stable rollouts in learned models, and improving generalization to out-of-distribution states. In the second part of the talk, I will discuss work using learned simulators for inverse design. In this work, we combine Graph Neural Network (GNN) learned simulators [Sanchez-Gonzalez et al 2020, Pfaff et al 2021] with gradient-based optimization in order to optimize designs in a variety of complex physics tasks. These include challenges designing objects in 2D and 3D to direct fluids in complex ways, as well as optimizing the shape of an airfoil. We find that the learned model can support design optimization across 100s of timesteps, and that the learned models can in some cases permit designs that lead to dynamics apparently quite different from the training data.
Chair: Kyunghyun Cho
The main goal of cosmology is to perform parameter inference and model selection from astronomical observations. Uniquely, however, it is a field that has to do this with only a single experiment: the Universe that we live in. With very powerful existing and upcoming cosmological surveys, we need to leverage state-of-the-art inference techniques to extract as much information as possible from our data. In this talk, I will begin by presenting machine learning-based methods to perform inference in cosmology, such as simulation-based inference and stochastic control sampling approaches. I will finish by showing how these methods are being used to improve our knowledge of the Universe.
Chair: Kyunghyun Cho
10:10 AM Chang Hoon Hahn: Simulation-Based Inference of Higher-Order Galaxy Clustering
10:20 AM Devina Mohan: Bayesian Deep Learning for Radio Galaxy Classification
10:30 AM Marc Huertas-Company: Exploring the diversity of galaxy morphology at z>3 with self supervised learning and JWST
Chair: Kyunghyun Cho
The field of AI has been advancing at unprecedented speed in the past few years, due to the rise of large-scale, self-supervised pre-trained models (a.k.a. “foundation models”), such as GPT-3, GPT-4, ChatGPT, Chinchilla, LLaMA, CLIP, DALL-E, Stable Diffusion and many others. Impressive few-shot generalization capabilities of such models on a very wide range of novel tasks appear to emerge primarily due to the drastic increase in the size of the models, training data and compute resources. Thus, predicting how a model’s performance and other important characteristics (e.g., robustness, truthfulness, etc.) scale with the data, model and compute has become a rapidly growing area of research in the past couple of years. Neural scaling laws serve as “investment tools” that suggest optimal allocation of compute resources w.r.t. various aspects of a training process (e.g., model size vs. data size ratio) and help to better compare different architectures and algorithms, predicting which ones will stand the test of time as larger compute resources become available. Last, but not least, accurate prediction of emerging behaviors in large-scale AI systems is essential from an AI safety perspective. In this talk, we will present a brief history of neural scaling laws, with a focus on our recent Broken Neural Scaling Laws that generalize previously observed scaling laws to more complex behaviors, metrics, and settings.
While the recent advances in foundation models are truly exciting, they also pose a new challenge to academic and non-profit AI research organizations, which historically have had no access to the level of compute resources available in industry. This motivated us (a rapidly growing international collaboration across several universities and non-profit organizations, including U of Montreal/Mila, LAION, EleutherAI, and many others) to join forces and initiate an effort towards developing common objectives and tools for advancing the field of open-source foundation models, in order to avoid the concentration of state-of-the-art AI in a small set of large companies and to facilitate the democratization of AI. We will give an overview of our recent effort in obtaining large compute resources (e.g., the Summit supercomputer) and of the ongoing large-scale projects we are working on.
Chair: Kyunghyun Cho
In large scale structure (LSS) cosmology, the information about the cosmological parameters governing the evolution of the universe is contained in the complex and rich structure of the dark matter density field. To date, this information has been probed using simple human-designed statistics, such as the 2-pt functions, which are not guaranteed or expected to capture the full information content of the LSS maps. I will discuss the recent applications of AI to map-level analysis of large scale structure probes: weak lensing analysis and galaxy clustering. I will present the AI-oriented CosmoGridV1 cosmological simulation set that we recently made available to the community.
Chair: Kyunghyun Cho
Reinforcement Learning (RL) has had many recent successes in games and other applications where accurate, cheap simulations are available or a large number of learning trials are possible. In doing so, it has demonstrated its ability to learn control and decision-making policies for large-scale, nonlinear, hidden-state systems for which it is too challenging for scientists to design policies manually. The problem of designing control policies for nuclear fusion in tokamaks contains all these features of a difficult dynamic system, with the added challenge of not having good simulators or the ability to run many experiments.
In this talk, I will describe our recent efforts at overcoming these challenges on the DIII-D tokamak. I will first present our algorithms for learning dynamic models with uncertainty quantification and then the corresponding RL and Bayesian optimization algorithms that build on those models to produce control policies. Finally, I will summarize the results of testing these policies on the DIII-D tokamak over the past 6 months.
Chair: Kyunghyun Cho
12:10 PM Benjamin Remy: Hierarchical Bayesian Models with generative and physical components for inference with corrupted data
12:20 PM Daniel Tian Levy: Using Multiple Vector Channels Improves E(n)-Equivariant Graph Neural Networks
12:30 PM Bingjie Wang: SBI++: Flexible, Ultra-fast Likelihood-free Inference Customized for Astronomical Applications
Chair: Laurence Perreault Levasseur
The Square Kilometre Array (SKA) will be the world's largest radio telescope, producing data volumes approaching exa-scale within a few years of operation. Extracting scientific value from those data in a timely manner will be a challenge that quickly goes beyond traditional analyses and instead requires robust domain-specific AI solutions. Here I will discuss how we have been building foundation models that can be adapted across different SKA precursor instruments, by applying self-supervised learning with instance differentiation to learn a multi-purpose representation for use in radio astronomy. For a standard radio astronomy use case, our models exceed baseline supervised classification performance by a statistically significant margin for most label volumes in the in-distribution classification case and for all label volumes in the out-of-distribution case. I will also show how such learned representations can be more widely scientifically useful, for example in similarity searches that allow us to find hybrid radio galaxies without any pre-labelled examples.
Chair: Laurence Perreault Levasseur
Known for their impressive performance in generative modeling, diffusion models are attractive candidates for density-based anomaly detection. This paper investigates different variations of diffusion modeling for unsupervised and semi-supervised anomaly detection. In particular, we find that Denoising Diffusion Probabilistic Models (DDPM) are performant on anomaly detection benchmarks yet computationally expensive. By simplifying DDPM in application to anomaly detection, we are naturally led to an alternative approach called Diffusion Time Probabilistic Model (DTPM). DTPM estimates the posterior distribution over diffusion time for a given input, enabling the identification of anomalies due to their higher posterior density at larger timesteps. We derive an analytical form for this posterior density and leverage a deep neural network to improve inference efficiency. Through empirical evaluations on the ADBench benchmark, we demonstrate that all diffusion-based anomaly detection methods perform competitively. Notably, DTPM achieves orders of magnitude faster inference time than DDPM, while outperforming it on this benchmark. These results establish diffusion-based anomaly detection as an interpretable and scalable alternative to traditional methods and recent deep-learning techniques.
Chair: Laurence Perreault Levasseur
2:50 PM Kaze Wong: Machine learning in gravitational wave
3:00 PM Ben Wandelt: Scientific Reasoning with ML
Chair: Laurence Perreault Levasseur
Generative modeling for high-dimensional data, such as images and audio, is extremely challenging due to the curse of dimensionality. To overcome this difficulty, we introduce a homotopic approach inspired by numerical equation solving, which involves designing a homotopy of probability distributions that smoothly progresses from a simple noise distribution to a complex data distribution. I will present two families of approaches that rely on such homotopies: score-based diffusion models and consistency models. Both approaches use a differential equation to convert data to noise and learn to estimate the time reversal with deep neural networks. These models allow for flexible neural networks, enable zero-shot image editing, and generate high-quality samples that achieve state-of-the-art performance in many generative modeling benchmarks.
Chair: Laurence Perreault Levasseur
I will demonstrate a framework for interpretable machine learning, using physically-motivated inductive biases and a technique we have termed “symbolic distillation”. This method allows a practitioner to translate a trained neural network model into an interpretable symbolic expression via the use of symbolic regression with a basis set of operators. I will first discuss the deep learning strategy for performing this distillation, and then review “symbolic regression,” an algorithm for optimizing symbolic expressions using evolutionary algorithms. In particular, I will describe the PySR/SymbolicRegression.jl software framework (github.com/MilesCranmer/PySR), which is an easy-to-use high-performance symbolic regression package in Python and Julia. Tangential to this, I will discuss several physically-motivated inductive biases which make this technique more effective. In the second half of this talk, I will review a variety of applications of this and other interpretable machine learning techniques, focusing on a few problems in astrophysics.
Chair: Laurence Perreault Levasseur
4:10 PM Tri Nguyen: FLORAH: Planting better merger trees with generative models
4:20 PM Yixin Wang: Representation Learning: A Causal Perspective
Chair: Shirley Ho
The exploration of extrasolar planets, which are planets orbiting stars other than our own, holds great potential for unravelling long-standing mysteries surrounding planet formation, habitability, and the emergence of life in our galaxy. By studying the atmospheres of these exoplanets, we gain valuable insights into their climates, chemical compositions, formation processes, and past evolutionary paths. The recent launch of the James Webb Space Telescope (JWST) marks the beginning of a new era of high-quality observations that have already challenged our existing understanding of planetary atmospheres. Over its lifetime, the JWST will observe approximately 50 to 100 planets. Furthermore, in the coming decade, the European Space Agency's Ariel mission will build on this progress by studying in detail the atmospheres of an additional 1000 exoplanets.
In this talk, I will outline three fundamental challenges to exoplanet characterisation that lend themselves well to machine-learning approaches. Firstly, we encounter the issue of extracting useful information from data with low signal-to-noise ratios. When the noise from instruments surpasses the signal from exoplanets, we must rely on self-supervised deconvolution techniques to learn accurate instrument models that go beyond our traditional calibration methods. Secondly, in order to interpret these alien worlds, we must employ highly complex models encompassing climate, chemistry, stellar processes, and radiative transfer. However, these models demand significant computational resources, necessitating the use of machine learning surrogate modelling techniques to enhance efficiency. Lastly, the Bayesian inverse problem, which traditionally relies on methods like Markov Chain Monte Carlo (MCMC) and nested sampling, becomes particularly challenging in high-dimensional parameter spaces. In this regard, simulation-based inference techniques offer potential solutions.
It is evident that many of the modelling and data analysis challenges we face in the study of exoplanets are not unique to this field but are actively investigated within the machine learning community. However, interdisciplinary collaboration has often been hindered by jargon and a lack of familiarity with each other's domains. In order to bridge this gap, as part of the ESA Ariel Space mission, we have successfully organized four machine learning challenges hosted at ECML-PKDD and NeurIPS (https://www.ariel-datachallenge.space). These challenges aim to provide novel solutions to long-standing problems and foster closer collaboration between the exoplanet and machine learning communities. I will end this talk with a brief discussion of the lessons learned from running these interdisciplinary data challenges.
Chair: Shirley Ho
Generative modeling of high-dimensional phenomena such as natural images has gone through a remarkable experimental revolution over the past years, spearheaded by diffusion-based models [Sohl-Dickstein et al. 15, Song & Ermon 20].
In this talk, I will describe two open questions raised by these models that I haven't been able to resolve yet: (i) what is the class of high-dimensional densities that can be provably learnt by diffusion models but not by other models, and (ii) how can we provably leverage time-dependent scores to regularise inverse problems. I will describe joint work with Etienne Lempereur, Florentin Guth, Stephane Mallat, Carles Domingo, Jaume de Dios, Jiequn Han and Maarten de Hoop.
Chair: Shirley Ho
Deep generative models parametrize very flexible families of distributions able to fit complicated datasets of images or text. These models provide independent samples from complex high-dimensional distributions at negligible cost. On the other hand, sampling exactly from a target distribution, such as a Bayesian posterior or the Boltzmann distribution of a physical system, is typically challenging because of dimensionality, multi-modality, ill-conditioning or a combination of these. In this talk, I will discuss recent works proposing to enhance traditional inference and sampling algorithms with learning. I will present in particular flowMC, an adaptive MCMC with normalizing flows.
Chair: Shirley Ho
10:10 AM David Shih: Via Machinae: Full-Sky, Model-Agnostic Search for Stellar Streams in Gaia DR2
10:20 AM Adrian Bayer: Sampling for Field-Level Inference
10:30 AM Tara Akhound-Sadegh: Lie Point Symmetry in Neural PDE Solvers
Chair: Shirley Ho
An outstanding issue is to understand how deep networks circumvent the curse of dimensionality to generate or classify data. Inspired by the renormalization group in physics, we explain how deep networks can separate phenomena which appear at different scales, and capture scale interactions. This provides high-dimensional models, which approximate the probability distributions of complex physical fields such as turbulence or structured images. For classification, learning becomes similar to a compressed sensing problem, where low-dimensional discriminative structures are identified with random projections. I will introduce a multiscale random model of deep networks for classification, and its numerical validation.
Chair: Shirley Ho
The application of Machine Learning in Cosmology nowadays is pervasive (CNN-based classification, Bayesian ML, normalizing-flow based inference, diffusion models, ...). Some concepts cherished by (parts of) the ML community might not (yet?) have found their way to astrophysics. In this talk, I will describe two of them, adversarial robustness and dataset distillation, to motivate their possible utility with the example of turbulence simulations and galaxy morphology extraction.
Chair: Shirley Ho
As we move towards the next generation of cosmological surveys, observational cosmology has reached an interesting stage in which making analytic predictions for the likelihood of the cosmological signal becomes intractable. Instead, physical models in the form of simulations offer an avenue to model the data in all of its complexity, but until very recently using such models to estimate physical fields and parameters remained an open problem.
In this talk, I will discuss two possible points of view on simulators, depending on whether they are “black-box” or “open-box” models, and the different methodologies and strategies which may be applied in each case to use these physical models within a Bayesian inference context.
In the case of black-box simulations (which can only be sampled from), I will discuss applications of deep generative models as a practical way to manipulate implicit distributions within a larger Bayesian framework. I will provide examples in particular of using a simulation-based prior captured by a neural score estimator to sample high-dimensional posterior distributions of cosmological fields.
In the case of open-box simulations, which can be seen as differentiable probabilistic models with an explicit joint log probability, I will discuss strategies and challenges for building large scale differentiable physical models of the Universe, touching in particular on distributed differentiable N-body solvers and on building accelerated hybrid physical/ML simulations leveraging neural ODE methodologies.
Chair: Shirley Ho
12:10 PM Nayantara Ganapati Mudur: Diffusion generative models for astrophysical fields
12:20 PM Alex Adams: Posterior samples for general inverse problems with GFlowNet and score-based models
12:30 PM Matthew Liam Sampson: Score-matching diffusion models for prior informed galaxy deblending
Chair: François Lanusse
High dimensional sampling from a known target distribution is a ubiquitous problem across many fields of science and engineering. In astrophysics it is commonly used for Bayesian data analysis, and a few applications in Machine Learning are Bayesian Neural Networks, energy models and diffusion models. Many of the best known samplers have been inspired by physics ideas such as detailed balance, Langevin and Hamiltonian dynamics. Here I will describe two new physics inspired gradient based samplers called MicroCanonical Hamiltonian and Langevin Monte Carlo (MCHMC, MCLMC), and present some of their applications: 1) high dimensional sampling of initial conditions of our universe 2) field level inference of gravitational lensing of CMB 3) statistical physics and lattice QCD 4) Molecular Dynamics. On many of the applications MCLMC outperforms HMC by 1-3 orders of magnitude. I will also discuss the ongoing work with these samplers such as ensembling, stochastic gradients and optimization.
Chair: François Lanusse
2:30 PM Uddipta Bhardwaj: Peregrine: Sequential simulation based inference for gravitational waves
2:40 PM Aviad Levis: Neural Tomography of Gravitationally Lensed Emission Orbiting a Black Hole
2:50 PM Wolfgang Kerzendorf: Uncertainty Estimation in Machine Learning Emulators for Connecting Simulations and Observations
3:00 PM Yesukhei Jagvaral: Score-matching and denoising diffusion generative models for SO(3): an application for galaxy orientations/alignments
Chair: François Lanusse
The complexity of astrophysical data and the presence of unknowable systematics pose significant challenges to robustly extracting information about fundamental physics using conventional methods. I will describe how overcoming these challenges will require a qualitative shift in our approach to statistical inference, bringing together several recent advances in generative modeling, differentiable programming, and simulation-based inference. As case studies, I will show examples of using simulation-based inference to extract the dark matter content from dwarf galaxies, and the use of diffusion-based generative modeling to encode the likelihood of galaxy clustering statistics.
Chair: François Lanusse
3:50 PM Joonas Nattila: Do Neural Networks Dream of Electric Sheets: Using Computer Vision to Analyze MHD Turbulence
4:00 PM Justine Zeghal: Simulation-Efficient Implicit Inference with Differentiable Simulators
4:10 PM Vicente Amado Olivo: Computational Meta-Research in Astrophysics: Optimizing Resource Allocation and Enhancing Research Management through Machine Learning