Spring 2023: Modern Probability for Theoretical Computer Science (CPSC 664)

Course Description

This course presents a selection of advanced topics of recent interest in probability theory and their applications to the theory of algorithms and computational complexity. We will try to cover a wide range of techniques and ideas from probability; while we do not have time to go into great depth on any particular one, my goal is to put students into contact with many useful and practical methods for analyzing computational problems involving randomness. The main unifying theme is understanding asymptotic methods in probability theory for deriving the approximate behavior of large random objects, often using a combination of combinatorics, analysis, and bits of "physicists' reasoning" borrowed from statistical physics.

I am aware that several similar-looking probability courses have been taught recently and are being taught at the same time as this one. While some of the topics listed below may appear close to what was covered in those courses, I will make an effort to avoid any substantial overlap with their specific material.

The course will tentatively be organized into four "vignettes" covering different themes from probability. I outline the topics I hope to cover below, but this list is subject to change depending on both time and student background and interests.

Part 1: "Entropy vs. energy" calculations

  • First and second moment methods
  • Discrete application: phase transitions in random constraint satisfisfaction problems (CSPs)
  • Continuous application: Kac-Rice formula and random low-dimensional optimization landscapes
  • Time permitting: A bit of large deviations theory

Part 2: Advanced notions of convergence

  • Methods of moments for convergence of probability distributions
  • Elements of random matrix theory and free probability
  • Discrete application: locally tree-like graphs, the Kesten-McKay law and its role in free probability, and heuristics for problems on sparse graphs
  • Continuous application: working with random matrices and the Kac-Rice formula revisited in high dimension
  • Time permitting: local weak convergence and applications in analysis of combinatorial optimization problems

Part 3: Statistical physics

  • Statistical physics dictionary: Gibbs measures, partition functions, entropies, energies, glassiness, etc., and why computer scientists should care about them
  • Physicists' toy models: random energy and mean-field approximations
  • Basics of the replica and cavity methods
  • Discrete application: survey of the Sherrington-Kirkpatrick model and its role as a limit of sparse random graphs
  • Continuous application: informal "physicists' derivations" in random matrix theory
  • Time permitting: message passing algorithms inspired by physics methods
  • Time permitting: random CSPs revisited

Part 4: Advanced properties of Gaussian measure

  • Gaussian interpolation
  • Gaussian comparison inequalities
  • Universality: Lindeberg exchange principle and/or Stein's method
  • Discrete application: ideas of rigorous analysis of the Sherrington-Kirkpatrick model
  • Continuous application: Gaussian analysis of regression and compressed sensing
  • Time permitting: Gaussian moment and cumulant combinatorics
Contact & Office Hours

The best way to contact me is by email, at dmitriy [dot] kunisky [at] yale.edu.

Office hours will be held by appointment at first, until enrollment and attendance stabilize. Email me to schedule a time to talk. We will discuss choosing a time for regular office hours at one of the early class meetings. My office is Room 339 at 17 Hillhouse Avenue (YINS, the Yale Institute for Network Science); outside office hours and appointments, you're welcome to try your luck finding me there.

Schedule

Class will meet Tuesdays and Thursdays, 11:35am to 12:50pm, in Dunham Laboratory (DL) 120.

Below is a tentative schedule, to be updated as the semester progresses.

Date Details
Week 1
Jan 17 Course logistics. Phase transition in random integer partitioning. ("The easiest hard problem.") Heuristic threshold derivation. First moment method using Laplace method asymptotics. References (for this class and next):
  • The Nature of Computation, Section 14.5
  • Hayes (2002) - The easiest hard problem
  • Borgs, Chayes, Pittel (2001) - Phase transition and finite size scaling for the number partitioning problem
  • Mertens (2000) - A physicist's approach to number partitioning
Notes: PDF | TeX
Jan 19 Integer partitioning II. Completing first moment method calculation. Second moment method. Overview of derivations of more detailed statistics. Notes: PDF | TeX
Week 2
Jan 24 Random energy model for integer partitioning. Phase transition in random k-SAT. Some intuition about integer partitioning algorithms from the optimization landscape and the "random energy model" analogy. Basic setup of random k-SAT.
Jan 26 Random k-SAT II. First moment method. Ways to repair loose first moment calculations: restricted counting and conditioning. If time: failure of naive second moment method. Intuition about non-REM-like landscape structure.
Week 3
Jan 31 Random k-SAT III. Ways to repair loose second moment calculations: symmetrization and weighting. Proofs of Achlioptas-Moore and Achlioptas-Peres threshold asymptotics. Clustering / replica symmetry breaking transition.
Feb 2

Continuous counting: Kac-Rice formula in one dimension. A geometric viewpoint on counting real zeroes of random univariate polynomials. The first moment method over uncountably many "possible occurrences" (it still works!).

(Note: For those who attended my reading group presentation about this a few semesters ago, the presentation will be substantially different and probably still interesting to you.)
Week 4
Feb 7

Moment problems. How to extract distributions and distributional limit theorems from moments. Reminder of the central limit theorem. Details of integer partitioning landscape revisited. Ideas for working with random measures in random matrix theory.

(Note: This will most likely be a remote session—hopefully the only one of the semester—to be held either on Zoom or pre-recorded.)
Feb 9 Kac-Rice formula II: high dimension. Counting critical points ("landscape complexity") of random multivariate polynomials. Appearance of random matrix theory problems.
Week 5
Feb 14 Random matrix theory. Convergence of random measures. Basic calculations: Gaussian orthogonal ensemble spectrum. Wigner matrix universality. Challenges in bounding extreme eigenvalues (if time: Füredi-Komlós enumeration). Challenges in dealing with less symmetric ensembles.
Feb 16 Free probability. Basic notions: noncommutative probability spaces, freeness, asymptotic freeness, and why they are useful. Heuristics and examples about when to expect freeness or not.
Lecture Notes and Materials

You do not need to buy any books for this course. I will write lecture notes as the course proceeds, based on edited versions of scribe notes written by enrolled students.

The following are some books, courses, and lecture notes with content similar to some of the material we will cover:

Grading

Grades will be based on a small number of written homework assignments, scribe notes to be prepared by a different student each week based on my lectures and typeset in LaTeX, and a final presentation concerning a recent research paper, open problem, or topic of interest related to the material we cover.

The relative weight of written assignments and scribe notes will be determined depending on enrollment: if there is a small number of students and each student needs to prepare more sets of notes, the homework assignments will count less (or not at all) towards grades, to keep the total workload reasonable.

Scribe Notes

When it is your turn to take scribe notes, make sure to attend the lecture, and take especially careful notes. Then, within about a week of the lecture, fill in the LaTeX template (see below) with your notes. To the best of your ability, try to fill in missing details, correct anything that seems not to make sense (or ask me about it), and add references.

When you are ready, email me the LaTeX file. I may ask you to make some minor revisions, and then will post your notes in the schedule above.

The template files for your notes are linked here:

scribe-template.tex   |   scribe-template.pdf

Final Presentation Topics

As the course goes on, I will list some reasonable options for your final presentation here (organized by topic); you are also welcome to find your own paper, or to present about a combination of papers or an open problem, but in that case please confirm with me first.

Number partitioning and REM conjectures: