Department of Mathematics

University of California, San Diego

9500 Gilman Dr.

La Jolla, CA 92093

Phone: 858-534-2640

E-mail: wez243@ucsd.edu

Office: AP&M 6131

I am an Assistant Professor in the Department of Mathematics at the University of California, San Diego [Google Scholar].

My research uses tools and ideas from probability theory (concentration phenomena, empirical process theory), functional and geometric analysis, and numerical optimization to understand high-dimensional and/or large-scale estimation and inference problems as well as complex machine learning tasks. It is driven by several core challenges in statistics and data science, such as robustness, heterogeneity, model uncertainty, and statistical and computational trade-offs. Questions of this sort include: (a) Can we develop statistical methods that are robust to violations of classical yet stringent assumptions, such as normality and homogeneity? (b) Given a complex statistical problem, how much data is required (sample size versus model complexity) to guarantee an effective solution? (c) For a given statistical problem, can we develop a statistically optimal method that can be computed via efficient algorithms?

In a democracy it is important to discriminate influence from authority.

-- Charles W. Eliot


## Teaching

MATH 281A: Mathematical Statistics (classical)

[syllabus] [website]

MATH 281C: Mathematical Statistics (modern)

[syllabus]

MATH 287D: Statistical Learning

[syllabus]

MATH 185: Introduction to Computational Statistics

[syllabus]

MATH 189: Exploratory Data Analysis and Inference

[syllabus]

MATH 181A: Introduction to Mathematical Statistics I

MATH 181B: Introduction to Mathematical Statistics II

[syllabus]

## Preprints

FarmTest: An R package for factor-adjusted robust multiple testing

with

Preprint, 2019

[pdf] [software]

On the asymptotic distribution of the scan statistic for empirical distributions

with

Preprint, 2019

[arXiv]

Nonconvex regularized robust regression with oracle properties in polynomial time

with

Preprint, 2019

[arXiv] [software]

A new principle for tuning-free Huber regression

with

Preprint, 2018

[pdf] [supplement] [software] [slides]

## Publications

Multiplier bootstrap for quantile regression: Non-asymptotic theory under random design

with

*Information and Inference: A Journal of the IMA*, to appear, 2020+

[pdf] [software]

Robust inference via multiplier bootstrap

with

*The Annals of Statistics*, to appear, 2020+

[pdf] [supplement] [Matlab code]

Adaptive Huber regression

with

*Journal of the American Statistical Association*, **115**, 254-265, 2020

[DOI] [arXiv] [software] [slides]

FarmTest: Factor-adjusted robust multiple testing with approximate false discovery control

with

*Journal of the American Statistical Association*, **114**, 1880-1893, 2019

[DOI] [software]

User-friendly covariance estimation for heavy-tailed distributions

with

*Statistical Science*, **34**, 454-471, 2019

[DOI]

Principal component analysis for big data

with

*Wiley StatsRef: Statistics Reference Online*, 2018

[DOI] [arXiv]

A new perspective on robust *M*-estimation: Finite sample theory and applications to dependence-adjusted multiple testing

with

*The Annals of Statistics*, **46**, 1904-1931, 2018

[DOI] [arXiv]

Are discoveries spurious? Distributions of maximum spurious correlations and their applications

with

*The Annals of Statistics*, **46**, 989-1017, 2018

[DOI] [arXiv] [slides]

Max-norm optimization for robust matrix recovery

with

*Mathematical Programming, Series B*, **167**, 5-35, 2018

[DOI] [arXiv]

On Gaussian comparison inequality and its application to spectral analysis of large random matrices

with

*Bernoulli*, **24**, 1787-1833, 2018

[DOI] [arXiv]

Simulation-based hypothesis testing of high dimensional means under covariance heterogeneity

with

*Biometrics*, **73**, 1300-1310, 2017

[DOI] [arXiv] [slides]

Self-normalization: Taming a wild population in a heavy-tailed world

with

*Applied Mathematics - A Journal of Chinese Universities*, **32**, 253-269, 2017

[DOI]

Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering

with

*Biometrics*, **73**, 31-41, 2017

[DOI] [arXiv]

Two-sample smooth tests for the equality of distributions

with

*Bernoulli*, **23**, 951-989, 2017

[DOI] [arXiv]

Guarding against spurious discoveries in high dimensions

with

*Journal of Machine Learning Research*, **17**(203), 1-34, 2016

[DOI] [arXiv] [slides]

Nonparametric covariate-adjusted regression

with

*The Annals of Statistics*, **44**, 2190-2220, 2016

[DOI]

Cramér-type moderate deviations for Studentized two-sample *U*-statistics with applications

with

*The Annals of Statistics*, **44**, 1931-1956, 2016

[DOI] [slides]

Matrix completion via max-norm constrained optimization

with

*Electronic Journal of Statistics*, **10**, 1493-1525, 2016

[DOI]

Cramér type moderate deviation theorems for self-normalized processes

with

*Bernoulli*, **22**, 2029-2079, 2016

[DOI]

Stein’s method for nonlinear statistics: A brief survey and recent progress

with

*Journal of Statistical Planning and Inference*, **168**, 68-89, 2016

[DOI]

Nonparametric and parametric estimators of prevalence from group testing data with aggregated covariates

with

*Journal of the American Statistical Association*, **110**, 1785-1796, 2015

[DOI]

Necessary and sufficient conditions for the asymptotic distributions of coherence of ultra-high dimensional random matrices

with

*The Annals of Probability*, **42**, 623-648, 2014

[DOI] [arXiv] [slides]

A max-norm constrained minimization approach to 1-bit matrix completion

with

*Journal of Machine Learning Research*, **14**, 3619-3647, 2013

[DOI]

## Bio

**2017-Present**: Assistant Professor, Department of Mathematics, University of California, San Diego

**2015-17**: Postdoctoral Research Associate, Department of Operations Research and Financial Engineering, Princeton University

**2013-15**: Research Fellow, School of Mathematics and Statistics, University of Melbourne

**2009-13**: PhD Student, Department of Mathematics, Hong Kong University of Science and Technology