Lei Wu (吴磊)

Assistant Professor
School of Mathematical Sciences
Center for Machine Learning Research
Peking University

Office: Room 205, Jingyuan Courtyard 6 (静园6院)
Email: leiwu (at) math (dot) pku (dot) edu (dot) cn

Google Scholar     Github      CV

About Me

I am an Assistant Professor in the School of Mathematical Sciences and Center for Machine Learning Research at Peking University.

Previously, I was a postdoctoral researcher at PACM, Princeton University and at the Wharton School, University of Pennsylvania. I received my Ph.D. in Computational Mathematics from Peking University in 2018, advised by Prof. Weinan E, and my B.S. in Mathematics from Nankai University in 2012.

Research Vision

My research studies the interplay between representation, optimization, and generalization in modern machine learning, viewed through the lens of scaling.

  • Representation (function spaces): Characterize neural networks via their induced function spaces, including expressivity, approximation, and inductive bias.

  • Optimization (stochastic dynamical systems): Analyze training algorithms as high-dimensional stochastic dynamical systems, focusing on stability, implicit bias, and critical phenomena.

  • Generalization (statistical learning): Quantify finite-sample generalization and its underlying mechanisms.

Research Highlights

(See also: Full publication list)

Recruiting

I am actively seeking self-motivated postdocs, Ph.D. students, and undergraduate interns to join my group. If you are interested, please email me your CV, transcript, and a brief description of your background and research interests.

Recent News

  • 2026-02: Constant-depth network with smooth activations released on arXiv.

    • Establishes that smooth activations (e.g., GELU, SiLU) enable smoothness adaptivity in constant-depth neural networks, achieving optimal approximation and statistical rates.

  • 2026-02: Fast catch-up, late switching accepted to ICLR 2026.

    • Studies batch-size scheduling under functional scaling laws (FSL), revealing a fast catch-up effect that holds from linear regression to LLM pre-training.

  • 2025-09: Functional Scaling Laws accepted to NeurIPS 2025 (Spotlight).

    • Introduces the functional scaling law (FSL) framework, which characterizes the entire loss trajectory rather than only the final-step behavior described by classical scaling laws, covering settings from linear regression to LLM pre-training.