Panelist: Molei Tao, Georgia Tech
Title: Can diffusion generative modeling help physical sciences?
Abstract:
In this presentation, the speaker will share some personal perspectives, largely on a recent generative AI methodology known as the denoising diffusion model. It has already demonstrated great success in generating images, videos, and audio, and there are also significant ongoing efforts to explore its potential as a candidate for the next generation of large language models. This discussion, therefore, will not aim at pure data science applications, but will instead explore, together with the DDDAS community, the following question: can denoising diffusion help the physical sciences?
Some natural directions include Bayesian inference, data assimilation, and uncertainty quantification. For example, the technique of denoising diffusion was recently applied to construct new Monte Carlo methods that can sample very efficiently from multimodal distributions, which is critical for these tasks, especially when they are needed in real time. It also provides a competitive way to estimate the probability distribution of high-dimensional data, which is helpful for uncertainty quantification.
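(As a generic, textbook-style sketch of the mechanism, not specific to the methods above: the standard variance-preserving diffusion noises data via a forward SDE and generates samples by simulating a learned reverse-time SDE, where \beta(t) denotes a noise schedule and s_\theta approximates the score \nabla_x \log p_t:

    dX_t = -\frac{1}{2}\beta(t) X_t \, dt + \sqrt{\beta(t)} \, dW_t                                     (forward: data to noise)
    d\bar{X}_t = [ -\frac{1}{2}\beta(t) \bar{X}_t - \beta(t)\, s_\theta(\bar{X}_t, t) ] \, dt + \sqrt{\beta(t)} \, d\bar{W}_t     (reverse: noise to samples)

The associated probability-flow ODE, dx/dt = -\frac{1}{2}\beta(t) [ x + s_\theta(x, t) ], shares the same time marginals and admits exact log-likelihood evaluation via the change-of-variables formula, which underlies the density-estimation use mentioned above.)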
The above, however, are applications of denoising diffusion after some adaptation. As a generative model per se, it has also recently started attracting attention for motion and path planning. Meanwhile, if one would like to close the loop and use it in a real-time control context, for example, computational efficiency matters a great deal. Improvements in the generation speed of diffusion models have, in fact, come a long way, and the discussion will briefly summarize approaches based on both numerical integration and the technique of distillation.
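(To make the numerical-integration route concrete, a minimal, hypothetical sketch follows; it is not the speaker's code. It integrates the probability-flow ODE of a variance-preserving diffusion with a coarse explicit Euler scheme, so the number of score-network evaluations, and hence the generation cost, is set directly by n_steps; the callable score_model and the linear beta schedule are illustrative assumptions.

    # Few-step deterministic sampling by Euler integration of the probability-flow ODE.
    # score_model(x, t) is a hypothetical learned score network, assumed given.
    import numpy as np

    def beta(t, beta_min=0.1, beta_max=20.0):
        # Linear noise schedule on t in [0, 1] (a common choice, assumed here).
        return beta_min + t * (beta_max - beta_min)

    def sample_prob_flow(score_model, dim, n_steps=20, t_min=1e-3, seed=0):
        """Integrate dx/dt = -0.5*beta(t)*(x + score(x, t)) from t = 1 down to t_min.
        Fewer steps (coarser integration) trades accuracy for speed."""
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(dim)              # start from the N(0, I) prior
        ts = np.linspace(1.0, t_min, n_steps + 1)
        for t_cur, t_next in zip(ts[:-1], ts[1:]):
            dt = t_next - t_cur                   # negative: integrating backward in time
            drift = -0.5 * beta(t_cur) * (x + score_model(x, t_cur))
            x = x + drift * dt                    # explicit Euler step
        return x

As a sanity check, for standard-Gaussian data the exact score is s(x, t) = -x, and sample_prob_flow(lambda x, t: -x, dim=2) returns approximately standard-normal samples. Distillation-based speedups instead train a student network to reproduce many such integration steps in one or a few evaluations.)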
In addition, many physical problems correspond to non-Euclidean spaces. For example, in robotics one often has to deal with the Lie groups SO(3) or SE(3); physics may require various quantities to lie in constrained sets; certain variables may even take discrete values. Selected progress on diffusion generative modeling in these non-Euclidean spaces will be briefly discussed.
Finally, can we trust the information, knowledge, and decisions produced by diffusion generative models? Rigorous, quantitative error controls of standard diffusion models are in fact possible. But can these models be world models that learn from data how the world operates? In general, it might be preferable to hardwire domain knowledge (along the lines of DDDAS-based methods) into a (deep) learning model, rather than accounting for everything with a black box (i.e., a generic neural network). Should we, and how do we, enable this for diffusion models, possibly even inside a feedback loop? This open question could hopefully be elaborated on together with the DDDAS community.
Panelist Bio-Overview:
Molei Tao received a B.S. in Math & Physics in 2006 from Tsinghua Univ. (Beijing) and a Ph.D. in Control & Dynamical Systems with a minor in Physics in 2011 from Caltech. Afterwards, he was a postdoc in Computing & Mathematical Sciences at Caltech from 2011 to 2012, and then a Courant Instructor at NYU from 2012 to 2014. Since 2014, he has been an assistant and then associate professor in the School of Mathematics at Georgia Tech. He is also a core faculty member of the GT Machine Learning Center, the Machine Learning Ph.D. Program, the GT Algorithms & Randomness Center (ARC), the Algorithms, Combinatorics & Optimization (ACO) Ph.D. Program, and the GT Decision & Control Lab.
His recent research mainly focuses on the theoretical and algorithmic foundations of machine learning. Topics include: generative modeling, sampling, and measure transport; non-Euclidean machine learning; optimization and deep learning theory (often through the lens of dynamical systems); scientific machine learning (i.e. AI4Science/Engineering/Computing).
His honors include the W.P. Carey Ph.D. Prize in Applied Mathematics (2011), Best Student Paper Finalist at the American Control Conference (2013), the NSF CAREER Award (2019), an AISTATS Best Paper Award (2020), Best Student Paper Finalist at IEEE EFTF-IFCS (2021), the Cullen-Peck Scholar Award (2022), the GT-Emory AI.Humanity Award (2023), a plenary talk at the Georgia Scientific Computing Symposium (2024), a keynote at the 2024 International Conference on Scientific Computing and Machine Learning, and the SONY Faculty Innovation Award (2024). He holds various editorial roles, including serving as an area chair for premier machine learning conferences such as NeurIPS and ICLR.