Fall 2024: Random Matrix Theory in Data Science and Statistics (EN.553.796)

Course Description

This is a first course in random matrix theory: the study of the eigenvalues and eigenvectors of matrices with random entries, which is foundational to high-dimensional statistics and data science. Aside from the main ideas and modern applications of random matrices, a key goal will be to introduce you to the main concepts of probability in high dimensions: concentration of measure, the geometry of high-dimensional spaces and convex sets, Gaussian measure, and sharp transitions and threshold phenomena. The following is a (very) tentative ordered list of specific topics to be covered:

1. Gaussian matrices and dimensionality reduction

  • Direct analysis with concentration inequalities
  • Invariance and relation to random projection
  • Application: Johnson-Lindenstrauss transform and dimensionality reduction
  • Application: Compressed sensing
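As a small taste of the first topic, here is a numerical sketch (not part of the course materials; the dimensions and tolerances are my own illustrative choices) of the phenomenon behind the Johnson-Lindenstrauss transform: multiplying by a suitably scaled Gaussian matrix approximately preserves all pairwise distances among a set of points.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, d, k = 50, 10_000, 400            # n points in R^d, projected down to R^k

X = rng.standard_normal((n, d))      # data points (rows)
G = rng.standard_normal((k, d)) / np.sqrt(k)   # scaled Gaussian matrix
Y = X @ G.T                          # projected points in R^k

# Ratio of projected to original distance for every pair of points;
# all ratios should be close to 1 once k is of order log(n) / eps^2.
ratios = [
    np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
    for i, j in combinations(range(n), 2)
]
print(min(ratios), max(ratios))
```

Note that the target dimension k depends only on the number of points n and the distortion tolerance, not on the ambient dimension d.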

2. Classical theory of i.i.d. random matrices

  • Moment method for eigenvalue limit theorems
  • Semicircle law for Wigner matrices
  • Marchenko-Pastur law for Wishart matrices
  • Elements of universality
  • Elements of free probability theory
  • Application: Neural networks and random optimization landscapes
  • Application: Subsampling in frame and coding theory
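The semicircle law above is easy to see numerically. The following simulation (illustrative only; the matrix size and moment checks are my own choices) builds a Wigner matrix with entry variance 1/n and compares low moments of its empirical spectral distribution to those of the semicircle density on [-2, 2], whose even moments are the Catalan numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
A = rng.standard_normal((n, n))
W = (A + A.T) / np.sqrt(2 * n)       # symmetric, off-diagonal variance 1/n
eigs = np.linalg.eigvalsh(W)

# Semicircle moments: E[x^2] = 1 and E[x^4] = 2 (Catalan numbers 1 and 2).
print(np.mean(eigs**2), np.mean(eigs**4))   # ≈ 1 and ≈ 2
print(eigs.min(), eigs.max())               # ≈ -2 and ≈ 2
```

Matching moments in this way is exactly the strategy of the moment method: the k-th moment of the empirical spectral distribution is tr(W^k)/n, which can be analyzed combinatorially.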

3. Spiked matrix models and principal component analysis (PCA)

  • BBP phase transition in the performance of PCA
  • Application: Community detection
  • General covariance estimation, eigenvalue shrinkage, and free deconvolution
  • PCA variants, computational challenges, and statistical-computational gaps
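The BBP phase transition can also be observed directly in simulation. Below is a toy sketch (my own parameters, not course material) of a spiked Wigner model W + λvvᵀ with a planted unit vector v: below the transition (λ < 1) the top eigenvalue sticks to the bulk edge 2 and the top eigenvector is nearly uncorrelated with the spike, while above it an outlier eigenvalue appears near λ + 1/λ and the correlation approaches √(1 − 1/λ²).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1500
v = rng.standard_normal(n)
v /= np.linalg.norm(v)               # planted unit-norm spike

def top_eig_corr(lam):
    """Top eigenvalue of W + lam * v v^T and its eigenvector's overlap with v."""
    A = rng.standard_normal((n, n))
    W = (A + A.T) / np.sqrt(2 * n)
    vals, vecs = np.linalg.eigh(W + lam * np.outer(v, v))
    return vals[-1], abs(vecs[:, -1] @ v)

val_b, corr_b = top_eig_corr(0.5)    # below the transition
val_a, corr_a = top_eig_corr(2.0)    # above the transition
print(val_b, corr_b)                 # near 2, overlap small
print(val_a, corr_a)                 # near 2 + 1/2 = 2.5, overlap near sqrt(3)/2
```

This is the statistical content of the transition for PCA: below λ = 1 the leading eigenvector carries essentially no information about the planted signal.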

4. Matrix concentration inequalities

  • General concentration inequalities and applications to random matrix norms (bounded differences, Lipschitz concentration, etc.)
  • General-purpose bounds on expected random matrix norms (matrix Chernoff, Bernstein, etc.)
  • Application: Spectral sparsification of graphs
  • Non-commutative Khintchine inequality and its exciting recent extensions
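As a quick numerical illustration of the kind of phenomenon that matrix concentration inequalities quantify (parameters here are my own, for illustration): the empirical covariance of m i.i.d. standard Gaussian samples in dimension d concentrates around the identity in operator norm, with error on the order of √(d/m).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 100
errs = []
for m in [1_000, 10_000, 100_000]:
    X = rng.standard_normal((m, d))
    Sigma_hat = X.T @ X / m                          # empirical covariance
    err = np.linalg.norm(Sigma_hat - np.eye(d), ord=2)  # operator norm error
    errs.append(err)
    print(m, err, np.sqrt(d / m))                    # error tracks sqrt(d/m)
```
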

Contact & Office Hours

The instructor for this course is Tim Kunisky, and the teaching assistant is AMS PhD student Yue Wu.

The best way to contact us is by email, at kunisky [at] jhu.edu and ywu166 [at] jhu.edu, respectively. We will settle on a time for office hours in the first few weeks of class; in the meantime, please email me to schedule an appointment if you have anything pressing to discuss.

Schedule

Class will meet Tuesdays and Thursdays, 9:00am to 10:15am in Gilman 55.

Below is a tentative schedule, to be updated as the semester progresses.

Week 1

  • Aug 27: Course logistics. Review of PCA and SVD as exploratory data analysis tools. Eckart-Young-Mirsky theorem. Multiplying by a Gaussian matrix: what does it do? What is it good for? How to think about Gaussian processes.
  • Aug 29: Random matrices for dimensionality reduction: the Johnson-Lindenstrauss transform and lemma. Concentration inequalities and union bound arguments.

Week 2

  • Sep 3: Application of Johnson-Lindenstrauss to nearest neighbors. Ailon and Chazelle's fast Johnson-Lindenstrauss transform. Connection between Gaussian matrices and random projection. Uniformity of singular vectors.
  • Sep 5: Concentration of singular values of short wide matrices. Interpretation as a "matrix concentration" inequality. Epsilon net arguments for discretizing matrix norms.

Week 3

  • Sep 10: Non-constructive existence of epsilon nets over the unit sphere. Finish singular values of short wide matrices.
  • Sep 12: Application: compressed sensing with random sensing matrices and the restricted isometry property. First steps of limit theorems: what does convergence of empirical spectral distributions mean? Statistical meaning of Wishart matrix limit theorems.

Week 4

  • Sep 17: Statement of semicircle and Marchenko-Pastur limit theorems. Moment method for proving limit theorems. Review of central limit theorem. First steps of Wigner's semicircle limit theorem.
  • Sep 19: Finishing Wigner's semicircle limit theorem. Sketch of Marchenko-Pastur limit theorem. Controlling matrix norms and extreme eigenvalues by moments.
Lecture Notes and Materials

You do not need to buy any books for this course. You can find my lecture notes here, which will be updated as the semester proceeds.

The following are freely available books or lecture notes that cover some similar material and might be useful to you in addition to my notes:

Grading

Grades will be based on a small number of written homework assignments, class participation, and a final project concerning a recent research paper, open problem, or topic of interest related to the material we cover.

Assignments

Homework will be posted here. Submission instructions to follow soon. Please try to talk to me in advance if you need more time for an assignment.

  • Assignment 1: assigned Sep 12, due Sep 30.