Matrix Concentration
Nick Harvey, University of British Columbia
The Problem
Given random n×n symmetric matrices Y1,…,Yk, show that Y = ∑i Yi is probably “close” to E[∑i Yi].
Why?
• A matrix generalization of the Chernoff bound.
• Much research studies the eigenvalues of a random matrix with independent entries; this setting is more general.
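A minimal numerical sketch of the setup (the distribution of the Yi here is an arbitrary choice for illustration: rank-1 outer products of random sign vectors, so that E[Yi] = I):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 10, 2000

def sample_Y():
    # Y = x x^T for x uniform in {-1,+1}^n, so E[Y] = E[x x^T] = I.
    x = rng.choice([-1.0, 1.0], size=n)
    return np.outer(x, x)

# Sum k independent samples; E[sum] = k * I.
S = sum(sample_Y() for _ in range(k))

# Spectral-norm deviation of the empirical average from its expectation.
deviation = np.linalg.norm(S / k - np.eye(n), ord=2)
```

For these parameters the deviation is small, which is exactly the kind of statement the tail bounds below quantify.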
Chernoff/Hoeffding Bound
• Theorem: Let Y1,…,Yk be independent random scalars in [0,R]. Let Y = ∑i Yi. Suppose that μL ≤ E[Y] ≤ μU. Then
  Pr[Y ≤ (1−ε)·μL] ≤ e^(−ε²·μL / 2R)  and  Pr[Y ≥ (1+ε)·μU] ≤ e^(−ε²·μU / 3R).
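A quick empirical check of the upper-tail bound, with Bernoulli(1/2) summands (so R = 1 and μL = μU = k/2); the constants follow the standard e^(−ε²μ/3R) form:

```python
import numpy as np

rng = np.random.default_rng(1)
R, k, eps, trials = 1.0, 500, 0.2, 2000

# Y = sum of k independent Bernoulli(1/2) scalars, so E[Y] = mu = k/2.
mu = k / 2
samples = rng.binomial(k, 0.5, size=trials).astype(float)

# Empirical frequency of the upper-tail event vs. the Chernoff bound.
upper_tail = np.mean(samples >= (1 + eps) * mu)
chernoff_bound = np.exp(-eps**2 * mu / (3 * R))
```

The empirical tail frequency should sit well below the bound, since the bound is not tight for sums of bounded i.i.d. variables.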
Rudelson’s Sampling Lemma
• Theorem: [Rudelson ’99] Let Y1,…,Yk be i.i.d. rank-1 PSD matrices of size n×n s.t. E[Yi] = I and ‖Yi‖ ≤ R. Let Y = ∑i Yi, so E[Y] = k·I. Then, with high probability,
  (1−ε)·k·I ⪯ Y ⪯ (1+ε)·k·I,  provided k = Ω(R log n / ε²).
• Example: balls and bins
  • Throw k balls uniformly into n bins.
  • Yi = n·eb·ebᵀ for a uniformly random bin b (rank-1, E[Yi] = I, ‖Yi‖ = n = R).
  • If k = O(n log n / ε²), all bins have the same load up to a factor 1±ε.
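The balls-and-bins example can be simulated directly. With Yi = n·eb·ebᵀ, the sum ∑i Yi is diagonal with entries n·(load of bin b), so “(1−ε)kI ⪯ Y ⪯ (1+ε)kI” says every bin load is within a 1±ε factor of k/n. The constant 8 in the sample count is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(2)
n, eps = 20, 0.5
k = int(8 * n * np.log(n) / eps**2)  # k = O(n log n / eps^2) balls

# Throw k balls uniformly into n bins.
bins = rng.integers(0, n, size=k)
loads = np.bincount(bins, minlength=n)

# Ratio of each bin's load to the average load k/n.
ratios = loads / (k / n)
```

With these parameters every ratio should land in (1−ε, 1+ε), matching the lemma’s conclusion.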
Rudelson’s Sampling Lemma
• Theorem: [Rudelson-Vershynin ’07] Let Y1,…,Yk be i.i.d. rank-1 PSD matrices s.t. E[Yi] = I and ‖Yi‖ ≤ R. Let Y = ∑i Yi, so E[Y] = k·I. Then the same concentration bound holds.
• Pros: We’ve generalized to PSD matrices.
• Mild issue: We assume E[Yi] = I.
• Cons:
  • The Yi’s must be identically distributed.
  • Rank-1 matrices only.
Rudelson’s Sampling Lemma
• Theorem: [Rudelson-Vershynin ’07] Let Y1,…,Yk be i.i.d. rank-1 PSD matrices s.t. E[Yi] = I. Let Y = ∑i Yi, so E[Y] = k·I. Assume Yi ⪯ R·I. Then the same concentration bound holds.
• Notation:
  • A ⪯ B ⇔ B−A is PSD.
  • α·I ⪯ A ⪯ β·I ⇔ all eigenvalues of A lie in [α, β].
• Mild issue: We assume E[Yi] = I.
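The Loewner-order notation is easy to check computationally: A ⪯ B exactly when every eigenvalue of B−A is nonnegative. A tiny sketch (the matrices are arbitrary examples):

```python
import numpy as np

def loewner_leq(A, B, tol=1e-12):
    # A <= B in the Loewner (PSD) order iff B - A is PSD,
    # i.e. all eigenvalues of the symmetric matrix B - A are >= 0.
    return bool(np.all(np.linalg.eigvalsh(B - A) >= -tol))

I = np.eye(2)
B = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # eigenvalues 1 and 3, so I <= B but not 3I <= B

ok1 = loewner_leq(I, B)       # B - I has eigenvalues 0 and 2
ok2 = loewner_leq(3 * I, B)   # B - 3I has eigenvalues -2 and 0
```

This matches the statement “α·I ⪯ A ⪯ β·I iff all eigenvalues of A lie in [α, β]” with α = 1, β = 3.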
Rudelson’s Sampling Lemma
• Theorem: [Rudelson-Vershynin ’07] Let Y1,…,Yk be i.i.d. rank-1 PSD matrices. Let Z = E[Yi] and Y = ∑i Yi, so E[Y] = k·Z. Assume Yi ⪯ R·Z. Then the same concentration bound holds.
• Proof idea: Apply the previous theorem to { Z^(−1/2)·Yi·Z^(−1/2) : i = 1,…,k }.
• Use the fact that A ⪯ B ⇔ Z^(−1/2)·A·Z^(−1/2) ⪯ Z^(−1/2)·B·Z^(−1/2).
• So (1−ε)·k·Z ⪯ ∑i Yi ⪯ (1+ε)·k·Z ⇔ (1−ε)·k·I ⪯ ∑i Z^(−1/2)·Yi·Z^(−1/2) ⪯ (1+ε)·k·I.
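The whitening step Yi ↦ Z^(−1/2)·Yi·Z^(−1/2) can be sketched numerically. Here the sampling scheme (Z built from a random Gram matrix, rank-1 samples x·xᵀ with x = Z^(1/2)·s for a random sign vector s, so E[x·xᵀ] = Z) is an assumed construction for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 5, 5000

# A fixed positive definite Z = E[Y_i] (arbitrary choice).
M = rng.standard_normal((n, n))
Z = M @ M.T + n * np.eye(n)

# Z^{1/2} and Z^{-1/2} via the eigendecomposition of Z.
w, V = np.linalg.eigh(Z)
Z_half = V @ np.diag(w**0.5) @ V.T
Z_inv_half = V @ np.diag(w**-0.5) @ V.T

# Rank-1 samples Y = x x^T with x = Z^{1/2} s; since E[s s^T] = I, E[Y] = Z.
S = np.zeros((n, n))
for _ in range(k):
    s = rng.choice([-1.0, 1.0], size=n)
    x = Z_half @ s
    S += np.outer(x, x)

# After whitening, the empirical mean should be close to the identity.
whitened_mean = Z_inv_half @ (S / k) @ Z_inv_half
err = np.linalg.norm(whitened_mean - np.eye(n), ord=2)
```

Conjugating by Z^(−1/2) preserves the Loewner order, which is exactly why the reduction in the last bullet works.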
Ahlswede-Winter Inequality
• Theorem: [Ahlswede-Winter ’02] Let Y1,…,Yk be i.i.d. PSD matrices of size n×n. Let Z = E[Yi] and Y = ∑i Yi, so E[Y] = k·Z. Assume Yi ⪯ R·Z. Then the same concentration bound holds.
• Pros:
  • We’ve removed the rank-1 assumption.
  • The proof is much easier than Rudelson’s proof.
• Cons:
  • Still needs the Yi’s to be identically distributed. (More precisely, the results are poor unless E[Ya] = E[Yb].)
Tropp’s User-Friendly Tail Bound
• Theorem: [Tropp ’12] Let Y1,…,Yk be independent n×n PSD matrices s.t. ‖Yi‖ ≤ R. Let Y = ∑i Yi. Suppose μL·I ⪯ E[Y] ⪯ μU·I. Then
  Pr[λmin(Y) ≤ (1−ε)·μL] ≤ n·e^(−ε²·μL / 2R),
  Pr[λmax(Y) ≥ (1+ε)·μU] ≤ n·( e^ε / (1+ε)^(1+ε) )^(μU / R).
• Pros:
  • The Yi’s do not need to be identically distributed.
  • Poisson-like bound for the right tail.
  • The proof is not difficult (but uses Lieb’s inequality).
• Mild issue: Poor results unless λmin(E[Y]) ≈ λmax(E[Y]).
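To see the “independent but not identically distributed” feature, here is a sketch with summands whose distributions cycle through the coordinate directions: Yi = ξi·eb(i)·eb(i)ᵀ with b(i) = i mod n and ξi a Bernoulli(1/2) coin, so ‖Yi‖ ≤ 1 = R and E[Y] = (k/2n)·I, i.e. μL = μU = k/2n (the parameter choices are arbitrary; k is a multiple of n so this is exact):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k, eps = 8, 3000, 0.3

# Y = sum_i Y_i is diagonal; track its diagonal (the eigenvalues) directly.
diag_counts = np.zeros(n)
for i in range(k):
    if rng.random() < 0.5:        # Bernoulli(1/2) coin xi_i
        diag_counts[i % n] += 1   # add e_b e_b^T with b = i mod n

mu = k / (2 * n)                  # = mu_L = mu_U since E[Y] = (k/2n) I
lam_min, lam_max = diag_counts.min(), diag_counts.max()
```

With these parameters both λmin(Y) > (1−ε)μL and λmax(Y) < (1+ε)μU should hold, as the theorem predicts with overwhelming probability.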
Tropp’s User-Friendly Tail Bound
• Theorem: [Tropp ’12] Let Y1,…,Yk be independent n×n PSD matrices. Let Y = ∑i Yi and Z = E[Y]. Suppose Yi ⪯ R·Z. Then a bound of the same multiplicative form holds.
Tropp’s User-Friendly Tail Bound
• Theorem: [Tropp ’12] Let Y1,…,Yk be independent n×n PSD matrices s.t. ‖Yi‖ ≤ R. Let Y = ∑i Yi. Suppose μL·I ⪯ E[Y] ⪯ μU·I. Then the bound above holds.
• Example: balls and bins
  • For b = 1,…,n:
    • For t = 1,…,8·log(n)/ε²:
      • With probability ½, throw a ball into bin b.
  • Let Yb,t = eb·ebᵀ with probability ½, otherwise 0.
Additive Error
• The previous theorems give multiplicative error: (1−ε)·E[∑i Yi] ⪯ ∑i Yi ⪯ (1+ε)·E[∑i Yi].
• Additive error is also useful: ‖∑i Yi − E[∑i Yi]‖ ≤ ε.
• Theorem: [Rudelson & Vershynin ’07] Let Y1,…,Yk be i.i.d. rank-1 PSD matrices. Let Z = E[Yi]. Suppose ‖Z‖ ≤ 1 and ‖Yi‖ ≤ R. Then the additive-error bound holds.
• Theorem: [Magen & Zouzias ’11] If instead rank(Yi) ≤ k := Θ(R·log(R/ε²)/ε²), then the analogous bound holds.
Proof of Ahlswede-Winter
• Key idea: bound the matrix moment generating function. Let Sk = ∑_{i=1}^k Yi. For a parameter t > 0,
  Pr[λmax(Sk) ≥ a] ≤ e^(−ta)·E[tr e^(t·Sk)].
• Golden-Thompson inequality: tr e^(A+B) ≤ tr( e^A · e^B ).
• By induction, E[tr e^(t·Sk)] ≤ ‖E[e^(t·Yk)]‖ · E[tr e^(t·S_{k−1})] ≤ … ≤ n·∏i ‖E[e^(t·Yi)]‖.
• Weakness: the repeated application of Golden-Thompson here is brutal (each step throws away information).
How to improve Ahlswede-Winter?
• Golden-Thompson inequality: tr e^(A+B) ≤ tr( e^A · e^B ) for all symmetric matrices A, B.
• It does not extend to three matrices: tr e^(A+B+C) ≤ tr( e^A · e^B · e^C ) is FALSE.
• Lieb’s inequality: for any symmetric matrix L, the map f : PSD cone → ℝ defined by
  f(A) = tr exp( L + log(A) )
  is concave.
• So f interacts nicely with expectation, via Jensen’s inequality.
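The two-matrix Golden-Thompson inequality is easy to sanity-check numerically on random symmetric matrices (the matrix exponential is computed via the symmetric eigendecomposition; the sizes and number of trials are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)

def sym_expm(A):
    # Matrix exponential of a symmetric matrix via its eigendecomposition.
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.exp(w)) @ V.T

def rand_sym(n):
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2

# Check tr e^{A+B} <= tr(e^A e^B) on random symmetric pairs.
checks = []
for _ in range(50):
    A, B = rand_sym(4), rand_sym(4)
    lhs = np.trace(sym_expm(A + B))
    rhs = np.trace(sym_expm(A) @ sym_expm(B))
    checks.append(lhs <= rhs + 1e-9)

all_hold = all(checks)
```

The inequality always holds for two matrices; the point of the slide is that no such trace inequality survives for three, which is why the proof switches to Lieb’s concavity result.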
Beyond the basics • Hoeffding (non-uniform bounds on Yi’s) [Tropp ‘12] • Bernstein (use bound on Var[Yi]) [Tropp ‘12] • Freedman (martingale version of Bernstein) [Tropp ‘12] • Stein’s Method (slightly sharper results) [Mackey et al. ‘12] • Pessimistic Estimators for Ahlswede-Winter inequality [Wigderson-Xiao ‘08]
Summary
• We now have a beautiful, powerful, flexible extension of the Chernoff bound to matrices.
• Ahlswede-Winter has a simple proof; Tropp’s inequality is very easy to use.
• Several important uses to date; hopefully more uses in the future.