Spring 2023: Modern Probability for Theoretical Computer Science (CPSC 664)

Course Description

This course presents a selection of advanced topics of recent interest in probability theory and their applications to the theory of algorithms and computational complexity. We will try to cover a wide range of techniques and ideas from probability; while we do not have time to go into great depth on any particular one, my goal is to put students into contact with many useful and practical methods for analyzing computational problems involving randomness. The main unifying theme is understanding asymptotic methods in probability theory for deriving the approximate behavior of large random objects, often using a combination of combinatorics, analysis, and bits of "physicists' reasoning" borrowed from statistical physics.

I am aware that several similar-looking probability courses have been taught recently and are being taught at the same time as this one. While some of the topics listed below may appear close to what was covered in those courses, I will make an effort to avoid any substantial overlap with their specific material.

The course will tentatively be organized into four "vignettes" covering different themes from probability. I outline the topics I hope to cover below, but this list is subject to change depending on both time and student background and interests.

Part 1: "Entropy vs. energy" calculations

  • First and second moment methods
  • Discrete application: phase transitions in random constraint satisfisfaction problems (CSPs)
  • Continuous application: Kac-Rice formula and random low-dimensional optimization landscapes
  • Time permitting: A bit of large deviations theory

Part 2: Advanced notions of convergence

  • Methods of moments for convergence of probability distributions
  • Elements of random matrix theory and free probability
  • Discrete application: locally tree-like graphs, the Kesten-McKay law and its role in free probability, and heuristics for problems on sparse graphs
  • Continuous application: working with random matrices and the Kac-Rice formula revisited in high dimension
  • Time permitting: local weak convergence and applications in analysis of combinatorial optimization problems

Part 3: Statistical physics

  • Statistical physics dictionary: Gibbs measures, partition functions, entropies, energies, glassiness, etc., and why computer scientists should care about them
  • Physicists' toy models: random energy and mean-field approximations
  • Basics of the replica and cavity methods
  • Discrete application: survey of the Sherrington-Kirkpatrick model and its role as a limit of sparse random graphs
  • Continuous application: informal "physicists' derivations" in random matrix theory
  • Time permitting: message passing algorithms inspired by physics methods
  • Time permitting: random CSPs revisited

Part 4: Advanced properties of Gaussian measure

  • Gaussian interpolation
  • Gaussian comparison inequalities
  • Universality: Lindeberg exchange principle and/or Stein's method
  • Discrete application: ideas of rigorous analysis of the Sherrington-Kirkpatrick model
  • Continuous application: Gaussian analysis of regression and compressed sensing
  • Time permitting: Gaussian moment and cumulant combinatorics
Contact & Office Hours

The best way to contact me is by email, at dmitriy [dot] kunisky [at] yale.edu.

Office hours will be held on Wednesdays from 3-4pm. If you have a very specific question, it might be helpful and more efficient to email me in advance so that I can look up some information for you if need be. My office is Room 339 at 17 Hillhouse Avenue (YINS, the Yale Institute for Network Science). If this time does not work for you, you are welcome to email me to make a separate appointment, or to ask questions after class.

Schedule

Class will meet Tuesdays and Thursdays, 11:35am to 12:50pm, in Dunham Laboratory (DL) 120.

Below is a tentative schedule, to be updated as the semester progresses.

Date Details
Week 1
Jan 17 Course logistics. Phase transition in random integer partitioning. ("The easiest hard problem.") Heuristic threshold derivation. First moment method using Laplace method asymptotics. References (for this class and the next two): Notes: PDF | TeX
Jan 19 Integer partitioning II. Completing first moment method calculation. Second moment method. Overview of derivations of more detailed statistics. Notes: PDF | TeX
Week 2
Jan 24 Random energy model for integer partitioning. Phase transition in random \( k \)-SAT. Some intuition about integer partitioning algorithms from the optimization landscape and the "random energy model" analogy. Basic setup of random \( k \)-SAT. References (on \( k \)-SAT):
  • The Nature of Computation, Section 14.4
Notes: PDF | TeX
Jan 26 Random \( k \)-SAT II. First moment method. Ways to repair loose first moment calculations: restricted counting and conditioning. References: Notes: PDF | TeX
Week 3
Jan 31 Random \( k \)-SAT III. Direct second moment method calculation and its failure. Intuition about attraction between satisfying assignments. Notes: PDF | TeX
Feb 2 Random \( k \)-SAT IV. Ways to repair loose second moment calculations: symmetrization and weighting. Proofs of Achlioptas-Moore and Achlioptas-Peres threshold asymptotics. References:
Week 4
Feb 7 Moment problems. How to extract distributions and distributional limit theorems from moments. Example of the central limit theorem. A few ideas for working with random measures in random matrix theory. Lecture video: Google Drive link References: Notes: PDF | TeX
Feb 9

Continuous counting: Kac-Rice formula. Counting roots and critical points of random functions. The first moment method over uncountably many "possible occurrences" (it still works!).

(Note: For those who attended my reading group presentation about this a few semesters ago, the presentation will be substantially different and probably still interesting to you.) References: Notes: PDF | TeX
Week 5
Feb 14 Kac-Rice formula II. Kac-Rice analysis of a "noisy well" model (also the Hamiltonian of an "elastic manifold"). Decomposition into a random determinant and a signal-dependent "volume term". Appearance of sums of decorrelated random matrices. References: Notes: PDF | TeX
Feb 16 Semicircle law. Derivation of semicircle law for Gaussian orthogonal ensemble (GOE) using Catalan number combinatorics for moments and Stieltjes transform. Universality over Wigner matrices. References: Notes: PDF | TeX
Week 6
Feb 21 Free probability. "Renormalization approach" to the central limit theorem, Gaussian distribution, and the non-commutative analog of the GOE. Asymptotic freeness as a criterion for additive free convolution. When and why to expect asymptotic freeness. References (this lecture and next): Notes: PDF
Feb 23 Free probability II. Transform methods for computing additive free convolution. Rotational invariance and asymptotic freeness. Multiplicative free convolution. Examples of random matrix analysis. Notes: PDF | TeX
Week 7
Feb 28 Free probability and landscape complexity. Kac-Rice analysis of "noisy well" revisited. Solving for asymptotics of critical points using free convolution. Role of large deviations reasoning. Time permitting: analysis of generalized linear models. References: Notes: PDF | TeX
Mar 2 Free probability and graph theory. Abstract algebraic view of free probability. Spectrum of regular graphs of large girth and the regular tree. The Kesten-McKay law and its free probability interpretation. References:
Week 8
Mar 7 Replica method for random matrices. Our first "physicist's argument." Basic physics notions: temperature, partition function, free energy. Using the replica trick to calculate limiting eigenvalue statistics. GOE semicircle law revisited. References:
Mar 9 Replica method II. Some background on statistical physics language. Replica trick for spiked Wigner matrix model and predicting the BBP phase transition. Appearance of the "overlap matrix" formalism. References:
Spring Break: March 13-24
Week 9
Mar 28 Introduction to Sherrington-Kirkpatrick (SK) model. Basic definitions. Combinatorial motivation as limit of Max-Cut on sparse random graphs, via semicircle distribution as limit of Kesten-McKay distributions. References: Notes: PDF | TeX
Mar 30 Replica analysis of SK model. The initial replica calculations, again encountering the overlap matrix, and the replica-symmetric ansatz used by Sherrington and Kirkpatrick (1975). Some remarks on large deviations and generalizations of Cramér's theorem. References (some from next week also relevant here): Notes: PDF | TeX
Week 10
Apr 4 Replica symmetry breaking. Final details of replica symmetric "SK solution" of the SK model. Problems with SK solution: wrong ground state, negative entropy, instability. Basic premise of replica symmetry breaking ansatz. Notes: PDF
Apr 6 Replica symmetry breaking II. Parisi's replica symmetry breaking scheme and sketch of the derivation of the Parisi formula. Time permitting, some discussion of convexity and duality. References: Notes: PDF | TeX
Week 11
Apr 11 Rigorous analysis of SK model. Gaussian interpolation arguments. Sudakov-Fernique inequality. Recovering the replica-symmetric ground state as an upper bound. References:
  • Probability in High Dimension (van Handel), Section 6.1
  • High-Dimensional Probability (Vershynin), Chapter 7
Notes: PDF | TeX
Apr 13 Rigorous analysis of SK model II. Final details of replica-symmetric bound on SK ground state. Related results: bounds on random matrix eigenvalues and singular values. Gordon min-max theorem. Existence of limiting free energy in SK model. References:
Week 12
Apr 18 Cavity method. Cavity method analysis of replica-symmetric regime of SK model. Derivation of cavity relation for magnetizations and self-consistency equation for overlap. References: Notes: PDF | TeX
Apr 20 Cavity method II. Derivation of TAP equations for SK model. Interpretation as deformed power method and approximate message passing. Sketch of application to spiked matrix model with state evolution. References:
Week 13 (Student Presentations)
Apr 25 Student presentations. Schedule below.
  1. Peixin You: Perfect partitions of a random set of integers
  2. Alex Wardle-Solano: Going after the \( k \)-SAT threshold
  3. Jinzhao Wu: Proof of the satisfiability conjecture for large \( k \)
  4. Dantong Li: The algorithmic phase transition of random \( k \)-SAT for low degree polynomials
Apr 27 Student presentations II. Schedule below.
  1. Psi Vesely: Continuous learning with errors
  2. Abhinav Bhardwaj: How many zeros of a random sparse polynomial are real?
  3. Curtis McDonald: Optimization of the Sherrington-Kirkpatrick Hamiltonian
Lecture Notes and Materials

You do not need to buy any books for this course. I will write lecture notes as the course proceeds, based on edited versions of scribe notes written by enrolled students.

The following are books, courses, or lecture notes that I have ended up referring to explicitly above or basing some lectures on:

These are supplementary references that might also be helpful:

Grading

Grades will be based on a small number of written homework assignments, scribe notes to be prepared by a different student each week based on my lectures and typeset in LaTeX, and a final presentation concerning a recent research paper, open problem, or topic of interest related to the material we cover.

The relative weight of written assignments and scribe notes will be determined depending on enrollment: if there is a small number of students and each student needs to prepare more sets of notes, the homework assignments will count less (or not at all) towards grades, to keep the total workload reasonable.

Assignments

Homework will be posted here, and is to be submitted to me by email, in PDF format produced with LaTeX, by the due date indicated. Please try to talk to me in advance if you need more time for an assignment.

Assigned Due Link
Feb 15 Mar 10 Assignment 1
Feb 28 Apr 4 Final presentation paper choice (by email)
Apr 11 May 4 Assignment 2
Scribe Notes

When it is your turn to take scribe notes, make sure to attend the lecture, and take especially careful notes. Then, within about a week of the lecture, fill in the LaTeX template (see below) with your notes. To the best of your ability, try to fill in missing details, correct anything that seems not to make sense (or ask me about it), and add references.

When you are ready, email me the LaTeX file. I may ask you to make some minor revisions, and then will post your notes in the schedule above.

The template files for your notes are linked here:

scribe-template.tex   |   scribe-template.pdf

Final Presentation Topics

As the course goes on, I will list some reasonable options for your final presentation here (organized by topic); you are also welcome to find your own paper, or to present about a combination of papers or an open problem, but in that case please confirm with me first.

Number partitioning and REM conjectures:

\( k \)-SAT:

Kac-Rice formula:

Free probability:

Replica method:

Limits of random combinatorial problems:

Sherrington-Kirkpatrick model and related combinatorics: