Cosmic Connections: A ML X Astrophysics Symposium at Simons Foundation

Name: Cosmic Connections: A ML X Astrophysics Symposium at Simons Foundation
Start: 2023-05-22T08:50:00-04:00
End: 2023-05-24T17:00:00-04:00
Location: 162 5th Avenue

May 22, 2023, 8:50 AM → May 24, 2023, 5:00 PM America/New_York

Ingrid Daubechies Auditorium/2-IDA (162 5th Avenue)


Description

“Cosmic Connections: a Symposium to Explore the Intersection of Astrophysics and Machine Learning” at the Simons Foundation’s Flatiron Institute will take place from May 22 to May 24, 2023 in New York City. The symposium aims to bring together machine learning researchers and astrophysicists to discuss the most interesting astrophysical challenges and where state-of-the-art machine learning models are expected to outperform conventional methods. We aim for a relatively small symposium with ~100 participants to allow for brainstorming and collaboration building in a relatively informal atmosphere.

The overarching theme this year is unsupervised and generative models, which have surprised us with recent successes in transformers, diffusion models and foundation models. Related areas of interest include simulation-based inference and DL-accelerated simulations.

Organizers:

Shirley Ho (Simons Foundation/NYU/Princeton)

Julia Kempe (NYU)

Kyunghyun Cho (NYU)

Francois Lanusse (CNRS)

Laurence Perreault Levasseur (MILA)

Uros Seljak (Berkeley)

Fatima Fall (CCA admin Contact) ffall@flatironinstitute.org

Invited Speakers include:

Peter Battaglia (DeepMind)

Josh Bloom (Berkeley)

Joan Bruna (NYU)

Diana Cai (Princeton)

Kyunghyun Cho (NYU)

Kate Storey-Fisher (NYU)

Marylou Gabrié (École Polytechnique (CMAP))

Shirley Ho (Simons Foundation/ NYU/ Princeton)

Tomasz Kacprzak (ETH)

Julia Kempe (NYU)

Francois Lanusse (CNRS)

Yann LeCun (NYU/ Meta)

Pablo Lemos (MILA/CCA)

Stéphane Mallat (Simons Foundation/ ENS/ Collège de France)

Stephan Mandt (UCI)

Siddharth Mishra-Sharma (MIT)

Laurence Perreault Levasseur (University of Montreal / MILA)

Siamak Ravanbakhsh (McGill / MILA)

Irina Rish (University of Montreal/ MILA)

Anna Scaife (University of Manchester)

Jeff Schneider (CMU)

Uros Seljak (UC Berkeley)

David Spergel (Simons Foundation)

Yang Song (OpenAI)

Kimberly Stachenfeld (DeepMind)

Ashley Villar (PSU)

Ingo Waldmann (UCL)

Greg Yang (Microsoft Research)

Dmitrii Kochkov (Google Research)

Bert Chan (Google Research)


Location:

Flatiron Institute (Center for Computational Astrophysics)

Ingrid Daubechies Auditorium

162 5th Avenue 2nd Floor,

New York, NY 10010

Monday, May 22
- Welcome: David Spergel and the Organizing Committee
  
- Plenary Talk: Ashley Villar: The Big Data Revolution in Time-Domain Astrophysics
  
  Chair: Julia Kempe
  
  Time-domain astrophysics, the study of cosmic phenomena that evolve on human timescales, is undergoing a Big Data revolution thanks to public, wide-field surveys of the night sky. Here, I will provide a broad overview of the technical and astrophysical challenges on the horizon. I will highlight recent examples that use unsupervised and generative models to identify, classify and analyze transient events in real time.
- Invited Talk: Kyunghyun Cho: Health system scale language models for clinical and operational decision making
  
  Chair: Julia Kempe
  
  Physicians make critical time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing structured data based clinical predictive models have limited use in everyday practice due to complexity in data processing, model development, and deployment. Here, we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing to train a large language model for medical language (NYUTron), and subsequently finetune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system for five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay prediction, and insurance denial prediction. We show that NYUTron has an AUC of 78.7%-94.9%, with an improvement of 5.36%-14.7% AUC compared to traditional models. We additionally demonstrate the benefits of pretraining with clinical text and the potential for increasing generalizability to different sites through finetuning, and we demonstrate full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.
- Invited Talk: Greg Yang: The unreasonable effectiveness of mathematics in large scale deep learning
  
  Chair: Julia Kempe
  
  Recently, the theory of infinite-width neural networks led to the first technology, muTransfer, for tuning enormous neural networks that are too expensive to train more than once. For example, this allowed us to tune the 6.7 billion parameter version of GPT-3 using only 7% of its pretraining compute budget, and with some asterisks, we get a performance comparable to the original GPT-3 model with twice the parameter count. In this talk, I will explain the core insight behind this theory. In fact, this is an instance of what I call the Optimal Scaling Thesis, which connects infinite-size limits for general notions of “size” to the optimal design of large models in practice, illustrating a way for theory to reliably guide the future of AI. I'll end with several concrete key mathematical research questions whose resolutions will have incredible impact on how practitioners scale up their NNs.
- Contributed Talks
  
  Chair: Julia Kempe
  
  10:10 AM Ching Yao Lai: Discovering blow-up solutions to fluids equations using PDE-constrained neural networks
  10:20 AM Biwei Dai: Multiscale Flow for Robust and Optimal Cosmological Analysis
  10:30 AM Christian Kragh Jespersen: Graphs and Galaxies
- 10:40 AM
  
  Coffee Break
- Plenary Talk: Peter Battaglia: Learning simulation: graphs, physics, and weather
  
  Chair: Julia Kempe
  
  Simulation is one of the most important tools in science and engineering. However, accurate simulation faces two challenges: (1) heavy compute requirements, and (2) sophisticated underlying equations which require deep expertise to formulate. Recent advances in machine learning-based simulation are now addressing both challenges by (1) allowing dynamics to be modeled with cheaper representations and computations, and (2) learning dynamics models directly from data. This talk will survey advances in graph-based learned simulation from the past few years, then take a deep dive into recent advances in machine learning-based weather prediction that have resulted in learned simulators that outperform the top operational forecasting systems in the world.
- Invited Talk: Dmitrii Kochkov: Differentiable numerics + ML for simulation of atmospheric physics
  
  Chair: Julia Kempe
  
  In recent years, there have been significant advancements in data-driven approaches to global weather forecasting that have demonstrated accuracy competitive with modern operational systems. While current state-of-the-art learned models achieve lower errors at medium-range lead-times, physics-based models like IFS feature superior physical consistency. In this talk I’ll describe our ongoing research effort where we are developing a hybrid atmospheric model based on a differentiable dynamical core augmented with learned physics parameterizations, trained end-to-end. Specifically, I’ll discuss the rationale behind our model formulation and show preliminary results on accuracy, physical consistency and emergent long-term atmospheric phenomena.
- Invited Talk: Kate Storey-Fisher: Symmetry-preserving machine learning for cosmology
  
  Chair: Julia Kempe
  
  The laws of physics obey exact symmetries. Equivariant machine learning (ML) approaches aim to encode these symmetries in our models, which is important for ensuring our astrophysical analyses are physically motivated as well as improving performance. In this talk, I will outline the physical symmetries we care about, current equivariant ML techniques, and relevant astrophysical and cosmological problems. I will present a recently developed approach based on invariant scalars and its application to characterizing the dark matter distribution in cosmological simulations, and discuss future directions and opportunities.
- Contributed Talks
  
  Chair: Julia Kempe
  
  12:10 PM Ronan Legin: Posterior Sampling of the Initial Conditions of the Universe using Score-Based Generative Models
  12:20 PM Moritz Muenchmeyer: Learning the Matter Distribution of the Universe with Normalizing Flows and Diffusion Models
  12:30 PM Ruben Ohana: Direct Feedback Alignment as an alternative to Backpropagation
- 12:40 PM
  
  Lunch
- Plenary Talk: Yann LeCun: Towards Machines that can Learn, Reason, and Plan
  
  Chair: Uros Seljak
  
  How could machines learn as efficiently as humans and animals?
  How could machines learn how the world works and acquire common sense?
  How could machines learn to reason and plan?
  Current AI architectures, such as Auto-Regressive Large Language Models, fall short. I will propose a modular cognitive architecture that may constitute a path towards answering these questions. The centerpiece of the architecture is a predictive world model that allows the system to predict the consequences of its actions and to plan a sequence of actions that optimize a set of objectives. The world model employs a Hierarchical Joint Embedding Predictive Architecture (H-JEPA) trained with self-supervised learning. The JEPA learns abstract representations of the percepts that are simultaneously maximally informative and maximally predictable.
  The corresponding working paper is available here:
  https://openreview.net/forum?id=BZ5a1r-kVsf
- Invited Talk: Bert Chan: Self-Organizing Patterns in Lenia, an Abstract Complex System
  
  Chair: Uros Seljak
  
  The universe is a system of small components with local interactions, forming self-organizing patterns like stars and galaxies. Like a virtual universe, Lenia is an abstract complex system that generates a huge variety of self-organizing patterns. These patterns demonstrate biology-like behaviors, e.g. self-replication, regeneration, swarming, and also physics-like ones, e.g. geometric symmetry, elastic collisions, and invariance under various substrates. I will describe several ongoing research efforts on Lenia, including searching for Lenia patterns using machine learning and evolutionary algorithms, measuring their spatio-temporal stability, and experimenting with new system dynamics like mass conservation and particle energy minimization.
- 3:10 PM
  
  Coffee Break
- Invited Talk: Laurence Perreault Levasseur: Strong Lensing Data Analysis in the Era of Large Sky Surveys
  
  Chair: Uros Seljak
- Invited Talk: Stephan Mandt: From Compression to Convection: a Latent Variable Perspective
  
  Chair: Uros Seljak
  
  Latent variable models have been an integral part of probabilistic machine learning, ranging from simple mixture models to variational autoencoders to powerful diffusion probabilistic models at the center of recent media attention. Perhaps less well-appreciated is the intimate connection between latent variable models and compression, and the potential of these models for advancing natural science. I will begin by showcasing connections between variational methods and the theory and practice of neural data compression, ranging from constructing learnable codecs to assessing the fundamental compressibility of real-world data, such as images and particle physics data. I will then connect this lossy compression perspective to climate science problems, which often involve distribution shifts between unlabeled datasets, such as simulation data from different models or data simulated under different assumptions (e.g., global average temperatures). I will show that a combination of non-linear dimensionality reduction and vector quantization can assess the magnitude of these shifts and enable intercomparisons of different climate simulations. Additionally, when combined with physical model assumptions, this approach can provide insights into the implications of global warming on extreme precipitation.
  
  Stephan Mandt is an Associate Professor of Computer Science and Statistics at the University of California, Irvine. From 2016 until 2018, he was a Senior Researcher and Head of the statistical machine learning group at Disney Research in Pittsburgh and Los Angeles. He held previous postdoctoral positions at Columbia University and Princeton University. Stephan holds a Ph.D. in Theoretical Physics from the University of Cologne in Germany, where he received the National Merit Scholarship. He received the NSF CAREER Award, the UCI ICS Mid-Career Excellence in Research Award, the German Research Foundation's Mercator Fellowship, and a Kavli Fellowship of the U.S. National Academy of Sciences. He is a member of the ELLIS Society and a former visiting researcher at Google Brain. Stephan will serve as the Program Chair of the AISTATS 2024 conference, currently serves as an Action Editor for JMLR and TMLR, and frequently serves as Area Chair for NeurIPS, ICML, AAAI, and ICLR.
- Contributed Talks
  
  Chair: Uros Seljak
  
  4:10 PM Helen Qu: Enabling Time Domain Science with Deep Learning: From CNNs to Foundation Models
  4:20 PM Niall Jeffrey: Solving scientific model comparison with Evidence Networks
- Panel Discussion: Yann LeCun, David Spergel, Bhuv Jain, and Irina Rish
  
  Moderator: Shirley Ho
  
  What can/ will AI/ AGI do for science?
- 6:00 PM
  
  Dinner
Tuesday, May 23
- Plenary Talk: Josh Bloom: AI Assisted Discovery: from UX to Eureka
  
  Chair: Kyunghyun Cho
  
  Democratization of AI/ML in astronomy has been fostered by increased awareness, powerful software tools, and improving education. Yet as diverse AI/ML methods begin to be infused into workflows and inference chains it is legitimate to ask how AI/ML has fundamentally and uniquely contributed to novel science. I address this question in the context of AI as an assistive tool in three contexts: 1) to leapfrog people-centric bottlenecks, 2) as a model-based computational accelerant, and 3) as a hypothesis generation engine. One recent effort of ours surfaces insights of large language models (LLMs) with a focus on user experience (UX). Another demonstrates an unexpected fundamental breakthrough in our understanding of the theory of microlensing via simulation-based inference.
- Invited Talk: Kimberly Stachenfeld: Learned simulation of fluid dynamics for prediction and design
  
  Chair: Kyunghyun Cho
  
  Simulation is important for countless applications in science and engineering, and there has been increasing interest in using machine learning for efficiency in prediction and optimization. In the first part of the talk, I will describe our work on training learned models for efficient turbulence simulation. Turbulent fluid dynamics are chaotic and therefore hard to predict, and classical simulators typically require expertise to produce and take a long time to run. We found that learned CNN-based simulators can learn to efficiently capture diverse types of turbulent dynamics at low resolutions, and that they capture the dynamics of a high-resolution classical solver more accurately than a classical solver run at the same low resolution. We also provide recommendations for producing stable rollouts in learned models, and improving generalization to out-of-distribution states. In the second part of the talk, I will discuss work using learned simulators for inverse design. In this work, we combine Graph Neural Network (GNN) learned simulators [Sanchez-Gonzalez et al 2020, Pfaff et al 2021] with gradient-based optimization in order to optimize designs in a variety of complex physics tasks. These include challenges designing objects in 2D and 3D to direct fluids in complex ways, as well as optimizing the shape of an airfoil. We find that the learned model can support design optimization across 100s of timesteps, and that the learned models can in some cases permit designs that lead to dynamics apparently quite different from the training data.
- Invited Talk: Pablo Lemos: Machine Learning Powered Inference in Cosmology
  
  Chair: Kyunghyun Cho
  
  The main goal of cosmology is to perform parameter inference and model selection from astronomical observations. But, uniquely, it is a field that has to do this limited to a single experiment, the Universe that we live in. With very powerful existing and upcoming cosmological surveys, we need to leverage state-of-the-art inference techniques to extract as much information as possible from our data. In this talk, I will present Machine Learning based methods to perform inference in cosmology, such as simulation-based inference and stochastic control sampling approaches. I will finish by showing how these methods are being used to improve our knowledge of the Universe.
- Contributed Talks
  
  Chair: Kyunghyun Cho
  
  10:10 AM Chang Hoon Hahn: Simulation-Based Inference of Higher-Order Galaxy Clustering
  10:20 AM Devina Mohan: Bayesian Deep Learning for Radio Galaxy Classification
  10:30 AM Marc Huertas-Company: Exploring the diversity of galaxy morphology at z>3 with self-supervised learning and JWST
- 10:40 AM
  
  Coffee Break
- Plenary Talk: Irina Rish: Recent Advances in Foundation Models: Scaling Laws, Emergent Behaviors, and AI Democratization
  
  Chair: Kyunghyun Cho
  
  The field of AI has been advancing at unprecedented speed over the past few years, due to the rise of large-scale, self-supervised pre-trained models (a.k.a. “foundation models”), such as GPT-3, GPT-4, ChatGPT, Chinchilla, LLaMA, CLIP, DALL-E, Stable Diffusion and many others. Impressive few-shot generalization capabilities of such models on a very wide range of novel tasks appear to emerge primarily due to the drastic increase in the size of the models, training data and compute resources. Thus, predicting how the model’s performance and other important characteristics (e.g., robustness, truthfulness, etc.) scale with the data, model and compute has become a rapidly growing area of research in the past couple of years. Neural scaling laws serve as “investment tools”, suggesting optimal allocation of compute resources w.r.t. various aspects of a training process (e.g., model size vs data size ratio), and help to better compare different architectures and algorithms, predicting which ones will stand the test of time as larger compute resources become available. Last, but not least, accurate prediction of emerging behaviors in large-scale AI systems is essential from an AI safety perspective. In this talk, we will present a brief history of neural scaling laws, with the focus on our recent Broken Neural Scaling Laws that generalize previously observed scaling laws to more complex behaviors, metrics, and settings.
  
  While the recent advances in foundation models are truly exciting, they also pose a new challenge to academic and non-profit AI research organizations which historically had no access to the level of compute resources available in industry. This motivated us – a rapidly growing international collaboration across several universities and non-profit organizations, including U of Montreal/Mila, LAION, EleutherAI, and many others – to join forces and initiate an effort towards developing common objectives and tools for advancing the field of open-source foundation models, in order to avoid accumulation of state-of-the-art AI in a small set of large companies and facilitate democratization of AI. We will give an overview of our recent effort in obtaining large compute resources (e.g., Summit supercomputer) and the ongoing large-scale projects we are working on.
- Invited Talk: Tomasz Kacprzak: Large Scale Structure Cosmology with Deep Learning
  
  Chair: Kyunghyun Cho
  
  In large scale structure cosmology, the information about the cosmological parameters governing the evolution of the universe is contained in the complex and rich structure of the dark matter density field. To date, this information has been probed using simple human-designed statistics, such as the 2-pt functions, which are not guaranteed or expected to capture the full information content of the LSS maps. I will discuss the recent applications of AI to map-level analysis of large scale structure probes: weak lensing analysis and galaxy clustering. I will present the AI-oriented CosmoGridV1 cosmological simulation set that we recently made available to the community.
- Invited Talk: Jeff Schneider: Reinforcement Learning for Nuclear Fusion
  
  Chair: Kyunghyun Cho
  
  Reinforcement Learning (RL) has had many recent successes in games and other applications where accurate, cheap simulations are available or a large number of learning trials are possible. In doing so, it has demonstrated its ability to learn control and decision making policies for large-scale, nonlinear, hidden state systems for which it is too challenging for scientists to manually design policies. The problem of designing control policies for nuclear fusion in tokamaks contains all these features of a difficult dynamic system and the challenge of not having good simulators or the ability to run many experiments.
  
  In this talk, I will describe our recent efforts at overcoming these challenges on the DIII-D tokamak. I will first present our algorithms for learning dynamic models with uncertainty quantification and then the corresponding RL and Bayesian optimization algorithms that build on those models to produce control policies. Finally, I will summarize the results of testing these policies on the DIII-D tokamak over the past 6 months.
- Contributed Talks
  
  Chair: Kyunghyun Cho
  
  12:10 PM Benjamin Remy: Hierarchical Bayesian Models with generative and physical components for inference with corrupted data
  12:20 PM Daniel Tian Levy: Using Multiple Vector Channels Improves E(n)-Equivariant Graph Neural Networks
  12:30 PM Bingjie Wang: SBI++: Flexible, Ultra-fast Likelihood-free Inference Customized for Astronomical Applications
- 12:40 PM
  
  Lunch
- Plenary Talk: Anna Scaife: Towards the SKA: Foundation Models for Radio Astronomy
  
  Chair: Laurence Perreault Levasseur
  
  The Square Kilometre Array (SKA) will be the world's largest radio telescope, producing data volumes approaching exa-scale within a few years of operation. Extracting scientific value from those data in a timely manner will be a challenge that quickly goes beyond traditional analyses and instead requires robust domain-specific AI solutions. Here I will discuss how we have been building foundation models that can be adapted across different SKA precursor instruments, by applying self-supervised learning with instance differentiation to learn a multi-purpose representation for use in radio astronomy. For a standard radio astronomy use case, our models exceed baseline supervised classification performance by a statistically significant margin for most label volumes in the in-distribution classification case and for all label volumes in the out-of-distribution case. I will also show how such learned representations can be more widely scientifically useful, for example in similarity searches that allow us to find hybrid radio galaxies without any pre-labelled examples.
- Invited Talk: Siamak Ravanbakhsh: Diffusion Modeling for Anomaly Detection
  
  Chair: Laurence Perreault Levasseur
  
  Known for their impressive performance in generative modeling, diffusion models are attractive candidates for density-based anomaly detection. This paper investigates different variations of diffusion modeling for unsupervised and semi-supervised anomaly detection. In particular, we find that Denoising Diffusion Probabilistic Models (DDPM) are performant on anomaly detection benchmarks yet computationally expensive. By simplifying DDPM in application to anomaly detection, we are naturally led to an alternative approach called Diffusion Time Probabilistic Model (DTPM). DTPM estimates the posterior distribution over diffusion time for a given input, enabling the identification of anomalies due to their higher posterior density at larger timesteps. We derive an analytical form for this posterior density and leverage a deep neural network to improve inference efficiency. Through empirical evaluations on the ADBench benchmark, we demonstrate that all diffusion-based anomaly detection methods perform competitively. Notably, DTPM achieves orders of magnitude faster inference time than DDPM, while outperforming it on this benchmark. These results establish diffusion-based anomaly detection as an interpretable and scalable alternative to traditional methods and recent deep-learning techniques.
- Contributed Talks
  
  Chair: Laurence Perreault Levasseur
  
  2:50 PM Kaze Wong: Machine learning in gravitational wave
  3:00 PM Ben Wandelt: Scientific Reasoning with ML
- 3:10 PM
  
  Coffee Break
- Invited Talk: Yang Song: Breaking the Curse of Dimensionality in Generative Modeling: A Homotopic Approach
  
  Chair: Laurence Perreault Levasseur
  
  Generative modeling for high-dimensional data, such as images and audio, is extremely challenging due to the curse of dimensionality. To overcome this difficulty, we introduce a homotopic approach inspired by numerical equation solving, which involves designing a homotopy of probability distributions that smoothly progresses from a simple noise distribution to a complex data distribution. I will present two families of approaches that rely on such homotopies: score-based diffusion models and consistency models. Both approaches use a differential equation to convert data to noise and learn to estimate the time reversal with deep neural networks. These models allow for flexible neural networks, enable zero-shot image editing, and generate high-quality samples that achieve state-of-the-art performance in many generative modeling benchmarks.
- Invited Talk: Miles Cranmer: Symbolic Distillation of Neural Networks
  
  Chair: Laurence Perreault Levasseur
  
  I will demonstrate a framework for interpretable machine learning, using physically-motivated inductive biases and a technique we have termed “symbolic distillation”. This method allows a practitioner to translate a trained neural network model into an interpretable symbolic expression via the use of symbolic regression with a basis set of operators. I will first discuss the deep learning strategy for performing this distillation, and then review “symbolic regression,” an algorithm for optimizing symbolic expressions using evolutionary algorithms. In particular, I will describe the PySR/SymbolicRegression.jl software framework (github.com/MilesCranmer/PySR), which is an easy-to-use high-performance symbolic regression package in Python and Julia. Tangential to this, I will discuss several physically-motivated inductive biases which make this technique more effective. In the second half of this talk, I will review a variety of applications of this and other interpretable machine learning techniques, focusing on a few problems in astrophysics.
- Contributed Talks
  
  Chair: Laurence Perreault Levasseur
  
  4:10 PM Tri Nguyen: FLORAH: Planting better merger trees with generative models
  4:20 PM Yixin Wang: Representation Learning: A Causal Perspective
- Breakout Discussion on Challenges
- 6:00 PM
  
  Poster Session and Refreshments
Wednesday, May 24
- Plenary Talk: Ingo Waldmann: Machine Learning in Exoplanet Characterisation
  
  Chair: Shirley Ho
  
  The exploration of extrasolar planets, which are planets orbiting stars other than our own, holds great potential for unravelling long-standing mysteries surrounding planet formation, habitability, and the emergence of life in our galaxy. By studying the atmospheres of these exoplanets, we gain valuable insights into their climates, chemical compositions, formation processes, and past evolutionary paths. The recent launch of the James Webb Space Telescope (JWST) marks the beginning of a new era of high-quality observations that have already challenged our existing understanding of planetary atmospheres. Over its lifetime, the JWST will observe approximately 50 to 100 planets. Furthermore, in the coming decade, the European Space Agency's Ariel mission will build on this progress by studying in detail the atmospheres of an additional 1000 exoplanets.
  
  In this talk, I will outline three fundamental challenges to exoplanet characterisation that lend themselves well to machine-learning approaches. Firstly, we encounter the issue of extracting useful information from data with low signal-to-noise ratios. When the noise from instruments surpasses the signal from exoplanets, we must rely on self-supervised deconvolution techniques to learn accurate instrument models that go beyond our traditional calibration methods. Secondly, in order to interpret these alien worlds, we must employ highly complex models encompassing climate, chemistry, stellar processes, and radiative transfer. However, these models demand significant computational resources, necessitating the use of machine learning surrogate modelling techniques to enhance efficiency. Lastly, the Bayesian inverse problem, which traditionally relies on methods like Markov Chain Monte Carlo (MCMC) and nested sampling, becomes particularly challenging in high-dimensional parameter spaces. In this regard, simulation-based inference techniques offer potential solutions.
  
  It is evident that many of the modelling and data analysis challenges we face in the study of exoplanets are not unique to this field but are actively investigated within the machine learning community. However, interdisciplinary collaboration has often been hindered by jargon and a lack of familiarity with each other's domains. In order to bridge this gap, as part of the ESA Ariel Space mission, we have successfully organized four machine learning challenges hosted at ECML-PKDD and NeurIPS (https://www.ariel-datachallenge.space). These challenges aim to provide novel solutions to long-standing problems and foster closer collaboration between the exoplanet and machine learning communities. I will end this talk with a brief discussion of the lessons learned from running these interdisciplinary data challenges.
- Invited Talk: Joan Bruna: Mysteries in High-dimensional Generative Modeling
  
  Chair: Shirley Ho
  
  Generative modeling of high-dimensional phenomena such as natural images has gone through a remarkable experimental revolution over the past years, spearheaded by diffusion-based models [Sohl-Dickstein et al. 15, Song & Ermon 20].
  In this talk, I will describe two open questions raised by these models that I haven't been able to resolve yet: (i) what is the class of high-dimensional densities that can be provably learnt by diffusion models that other models cannot, and (ii) how can we provably leverage time-dependent scores to regularise inverse problems. I will describe joint work with Etienne Lempereur, Florentin Guth, Stephane Mallat, Carles Domingo, Jaume de Dios, Jiequn Han and Maarten de Hopp.
- Invited Talk: Marylou Gabrié: Approximate transport with flows for sampling: training without data and MC(MC) correction
  
  Chair: Shirley Ho
  
  Deep generative models parametrize very flexible families of distributions able to fit complicated datasets of images or text. These models provide independent samples from complex high-dimensional distributions at negligible cost. On the other hand, sampling exactly from a target distribution, such as a Bayesian posterior or the Boltzmann distribution of a physical system, is typically challenging because of dimensionality, multi-modality, ill-conditioning, or a combination of these. In this talk, I will discuss recent work that proposes to enhance traditional inference and sampling algorithms with learning. I will present in particular flowMC, an adaptive MCMC with normalizing flows.
- Contributed Talks
  
  Chair: Shirley Ho
  
  10:10 AM David Shih: Via Machinae: Full-Sky, Model-Agnostic Search for Stellar Streams in Gaia DR2
  10:20 AM Adrian Bayer: Sampling for Field-Level Inference
  10:30 AM Tara Akhound-Sadegh: Lie Point Symmetry in Neural PDE Solvers
- 10:40 AM
  
  Coffee Break
- Plenary Talk: Stéphane Mallat: Multiscale Random Projections of Physical Energies and Deep Networks
  
  Chair: Shirley Ho
  
  An outstanding issue is to understand how deep networks circumvent the curse of dimensionality to generate or classify data. Inspired by the renormalization group in physics, we explain how deep networks can separate phenomena which appear at different scales, and capture scale interactions. This provides high-dimensional models, which approximate the probability distributions of complex physical fields such as turbulence or structured images. For classification, learning becomes similar to a compressed sensing problem, where low-dimensional discriminative structures are identified with random projections. I will introduce a multiscale random model of deep networks for classification, and its numerical validation.
- Invited Talk: Julia Kempe: Some ML concepts for cosmology
  
  Chair: Shirley Ho
  
  The application of Machine Learning in Cosmology nowadays is pervasive (CNN-based classification, Bayesian ML, normalizing-flow based inference, diffusion models, ...). Some concepts cherished by (parts of) the ML community might not (yet?) have found their way to astrophysics. In this talk, I will describe two of them, adversarial robustness and dataset distillation, to motivate their possible utility with the example of turbulence simulations and galaxy morphology extraction.
- Invited Talk: Francois Lanusse: Implicit and Explicit Simulation-Based Bayesian Inference for Cosmology
  
  Chair: Shirley Ho
  
  As we move towards the next generation of cosmological surveys, observational cosmology has reached an interesting stage in which making analytic predictions for the likelihood of the cosmological signal becomes intractable. Instead, physical models in the form of simulations offer an avenue to model the data in all of its complexity, but until very recently using such models to estimate physical fields and parameters remained an open problem.
  
  In this talk, I will discuss two possible points of view on simulators, depending on whether they are “black-box” or “open-box” models, and the different methodologies and strategies which may be applied in each case to use these physical models within a Bayesian inference context.
  
  In the case of black-box simulations (which can only be sampled from), I will discuss applications of deep generative models as a practical way to manipulate implicit distributions within a larger Bayesian framework. I will provide examples in particular of using a simulation-based prior captured by a neural score estimator to sample high-dimensional posterior distributions of cosmological fields.
  
  In the case of open-box simulations, which can be seen as differentiable probabilistic models with an explicit joint log probability, I will discuss strategies and challenges for building large-scale differentiable physical models of the Universe, touching in particular on distributed differentiable N-body solvers and on building accelerated hybrid physical/ML simulations that leverage neural ODE methodologies.
- Contributed Talks
  
  Chair: Shirley Ho
  
  12:10 PM Nayantara Ganapati Mudur: Diffusion generative models for astrophysical fields
  12:20 PM Alex Adams: Posterior samples for general inverse problems with GFlowNet and score-based models
  12:30 PM Matthew Liam Sampson: Score-matching diffusion models for prior informed galaxy deblending
- 12:40 PM
  
  Lunch
- Plenary Talk: Uros Seljak: Physics for ML: new physics inspired Monte Carlo samplers
  
  Chair: François Lanusse
  
  High-dimensional sampling from a known target distribution is a ubiquitous problem across many fields of science and engineering. In astrophysics it is commonly used for Bayesian data analysis, and a few applications in Machine Learning are Bayesian Neural Networks, energy models and diffusion models. Many of the best-known samplers have been inspired by physics ideas such as detailed balance, Langevin and Hamiltonian dynamics. Here I will describe two new physics-inspired, gradient-based samplers called MicroCanonical Hamiltonian and Langevin Monte Carlo (MCHMC, MCLMC), and present some of their applications: 1) high-dimensional sampling of the initial conditions of our universe, 2) field-level inference of gravitational lensing of the CMB, 3) statistical physics and lattice QCD, and 4) Molecular Dynamics. In many of these applications MCLMC outperforms HMC by 1-3 orders of magnitude. I will also discuss ongoing work with these samplers, such as ensembling, stochastic gradients and optimization.
- Contributed Talks
  
  Chair: François Lanusse
  
  2:30 PM Uddipta Bhardwaj: Peregrine: Sequential simulation based inference for gravitational waves
  2:40 PM Aviad Levis: Neural Tomography of Gravitationally Lensed Emission Orbiting a Black Hole
  2:50 PM Wolfgang Kerzendorf: Uncertainty Estimation in Machine Learning Emulators for Connecting Simulations and Observations
  3:00 PM Yesukhei Jagvaral: Score-matching and denoising diffusion generative models for SO(3): an application for galaxy orientations/alignments
- 3:10 PM
  
  Coffee Break
- Invited Talk: Siddharth Mishra Sharma: Generative modeling for inference and emulation in cosmology
  
  Chair: François Lanusse
  
  The complexity of astrophysical data and the presence of unknowable systematics pose significant challenges to robustly extracting information about fundamental physics using conventional methods. I will describe how overcoming these challenges will require a qualitative shift in our approach to statistical inference, bringing together several recent advances in generative modeling, differentiable programming, and simulation-based inference. As case studies, I will show examples of using simulation-based inference to extract the dark matter content from dwarf galaxies, and the use of diffusion-based generative modeling to encode the likelihood of galaxy clustering statistics.
- Contributed Talks
  
  Chair: François Lanusse
  
  3:50 PM Joonas Nattila: Do Neural Networks Dream of Electric Sheets: Using Computer Vision to Analyze MHD Turbulence
  4:00 PM Justine Zeghal: Simulation-Efficient Implicit Inference with Differentiable Simulators
  4:10 PM Vicente Amado Olivo: Computational Meta-Research in Astrophysics: Optimizing Resource Allocation and Enhancing Research Management through Machine Learning