Michael Y. Li

I am a second-year Computer Science PhD student at Stanford University, where I'm advised by Noah Goodman and Emily Fox.

Previously, I graduated summa cum laude, Phi Beta Kappa, and Tau Beta Pi from Princeton University, where I was fortunate to work with Tom Griffiths and Ryan Adams.

This summer, I'll be interning at Microsoft Research in Redmond.

Email  /  GitHub  /  Google Scholar  /  LinkedIn

I'm broadly interested in probabilistic modeling and inference and in understanding large language models. Recently, I've also worked on language-model-driven statistical model discovery.

Automated Statistical Model Discovery with Language Models

Michael Y. Li, Emily B. Fox, Noah D. Goodman
arXiv preprint, 2024

We propose a language-model-driven system for automated statistical model discovery.

NAS-X: Neural Adaptive Smoothing via Twisting

Dieterich Lawson*, Michael Y. Li*, Scott W. Linderman
NeurIPS, 2023
Advances in Approximate Bayesian Inference, 2023 [Oral Presentation]
paper  /  website

We introduce a new method for inference and model learning that combines reweighted wake-sleep and smoothing sequential Monte Carlo. We theoretically analyze the bias and consistency of our method and then apply it to discrete latent variable modeling and to fitting mechanistic models of neural dynamics.

Why think step-by-step? Reasoning emerges from the locality of experience

Ben Prystawski, Michael Y. Li, Noah D. Goodman
NeurIPS, 2023 [Oral Presentation, top 0.5%]

We empirically and theoretically study when chain-of-thought reasoning emerges in large language models.

Gaussian Process Surrogate Models for Neural Networks

Michael Y. Li, Erin Grant, Thomas L. Griffiths
UAI, 2023

We propose a framework that uses Gaussian processes to approximate neural networks. We use this framework to analyze neural network training dynamics and identify influential data points.

Learning to Learn Functions

Michael Y. Li, Fred Callaway, William D. Thompson, Ryan P. Adams, Thomas L. Griffiths
Cognitive Science, 2023

We propose hierarchical Bayesian models of how people learn to learn functions and validate them in behavioral experiments.

Design and source code from Jon Barron's website