Distributed Nuclear Norm Minimization for Matrix Completion
Morteza Mardani, Gonzalo Mateos, and Georgios Giannakis
ECE Department, University of Minnesota
Acknowledgments: MURI (AFOSR FA9550-10-1-0567) grant
Cesme, Turkey, June 19, 2012
Learning from “Big Data”
“Data are widely available, what is scarce is the ability to extract wisdom from them.” (Hal Varian, Google’s chief economist)
[Word cloud: fast, BIG, productive, ubiquitous, revealing, messy, smart]
K. Cukier, “Harnessing the data deluge,” Nov. 2011.
Context
Goal: Given a few incomplete rows per agent, impute the missing entries in a distributed fashion by leveraging the low rank of the data matrix.
Applications:
• Preference modeling
• Imputation of network data
• Smart metering
• Network cartography
Low-rank matrix completion
• Consider a matrix $X \in \mathbb{R}^{L \times T}$ and a set $\Omega$ of observed entries, with sampling operator $\mathcal{P}_\Omega(\cdot)$ that zeroes out the entries outside $\Omega$
• Given incomplete (noisy) data $\mathcal{P}_\Omega(Y) = \mathcal{P}_\Omega(X + V)$, where $X$ (as) has low rank
• Goal: denoise the observed entries, impute the missing ones
• Nuclear-norm minimization [Fazel’02], [Candes-Recht’09]
  Noise-free: $\min_X \|X\|_*$ s.t. $\mathcal{P}_\Omega(X) = \mathcal{P}_\Omega(Y)$
  Noisy: $\min_X \frac{1}{2}\|\mathcal{P}_\Omega(Y - X)\|_F^2 + \lambda\|X\|_*$
[Figure: matrix with “?” marking the missing entries]
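To make the noisy estimator concrete, below is a minimal centralized sketch in Python/NumPy that solves it by proximal gradient descent with singular-value soft-thresholding. This is a standard textbook solver, not the algorithm developed in these slides; the names `svt` and `complete_nuclear`, the unit step size, and the iteration count are illustrative assumptions.

```python
import numpy as np

def svt(Z, tau):
    """Singular-value soft-thresholding: the prox operator of tau*||.||_*."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def complete_nuclear(Y, mask, lam, n_iter=500):
    """Proximal gradient for min_X 0.5*||P_Omega(Y - X)||_F^2 + lam*||X||_*.
    mask is a boolean array marking the observed entries (Omega)."""
    X = np.zeros_like(Y)
    for _ in range(n_iter):
        grad = mask * (X - Y)      # gradient of the smooth fitting term
        X = svt(X - grad, lam)     # unit step is valid: the gradient is 1-Lipschitz
    return X
```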
Problem statement
• Network: undirected, connected graph $\mathcal{G}(\mathcal{N}, \mathcal{E})$; agent $n$ acquires a few incomplete rows $\mathcal{P}_{\Omega_n}(Y_n)$ of the data matrix
Goal: Given $\mathcal{P}_{\Omega_n}(Y_n)$ per node $n$ and single-hop exchanges, find

(P1)  $\min_X \frac{1}{2}\|\mathcal{P}_\Omega(Y - X)\|_F^2 + \lambda\|X\|_*$,  with $Y := [Y_1', \ldots, Y_N']'$

• Challenges
  – The nuclear norm is not separable across agents
  – $X$ is a global optimization variable
[Figure: network of agents, each observing part of the matrix]
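The following setup sketch instantiates this data model: a toy low-rank matrix observed through a mask, with its rows split across $N$ agents. All sizes, the rank, the noise level, and the 40% sampling rate are illustrative values, not the slides’ experimental settings.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, T, true_rank = 4, 20, 30, 2                 # toy sizes, not the paper's
X_true = rng.standard_normal((L, true_rank)) @ rng.standard_normal((true_rank, T))
mask = rng.random((L, T)) < 0.4                   # Omega: ~40% of entries observed
Y = mask * (X_true + 0.01 * rng.standard_normal((L, T)))

# Agent n holds a block of rows Y_n and the matching local mask Omega_n
row_blocks = np.array_split(np.arange(L), N)
Y_n = [Y[idx] for idx in row_blocks]
mask_n = [mask[idx] for idx in row_blocks]
```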
Separable regularization
• Key result [Recht et al’11]: $\|X\|_* = \min_{\{L,R\,:\,X = LR'\}} \frac{1}{2}\left(\|L\|_F^2 + \|R\|_F^2\right)$, where $L$ is $L \times \rho$, $R$ is $T \times \rho$, and $\rho \ge \mathrm{rank}[X]$
• New formulation, equivalent to (P1):

(P2)  $\min_{\{L,R\}} \frac{1}{2}\|\mathcal{P}_\Omega(Y - LR')\|_F^2 + \frac{\lambda}{2}\left(\|L\|_F^2 + \|R\|_F^2\right)$

• Nonconvex, but reduces complexity: $\rho(L + T)$ unknowns instead of $LT$
Proposition 1. If $\{\bar{L}, \bar{R}\}$ is a stationary point of (P2) and $\sigma_{\max}[\mathcal{P}_\Omega(Y - \bar{L}\bar{R}')] \le \lambda$, then $\bar{X} = \bar{L}\bar{R}'$ is a global optimum of (P1).
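A sketch of (P2) and of the Proposition 1 certificate follows. It uses plain simultaneous gradient descent as the local solver, which the slide does not prescribe, and `solve_P2`/`certify_global` are hypothetical helper names.

```python
import numpy as np

def solve_P2(Y, mask, lam, rho, n_iter=2000, step=0.01, seed=0):
    """Gradient descent on the nonconvex factorized objective (P2):
    min_{L,R} 0.5*||P_Omega(Y - L R')||_F^2 + (lam/2)*(||L||_F^2 + ||R||_F^2)."""
    rng = np.random.default_rng(seed)
    Lf = rng.standard_normal((Y.shape[0], rho))   # row factor, L x rho
    R = rng.standard_normal((Y.shape[1], rho))    # column factor, T x rho
    for _ in range(n_iter):
        E = mask * (Lf @ R.T - Y)                 # residual on observed entries only
        gL = E @ R + lam * Lf
        gR = E.T @ Lf + lam * R
        Lf -= step * gL
        R -= step * gR
    return Lf, R

def certify_global(Y, mask, Lf, R, lam):
    """Proposition 1 check: a stationary point of (P2) is a global optimum
    of (P1) if sigma_max[P_Omega(Y - L R')] <= lam."""
    return np.linalg.norm(mask * (Y - Lf @ R.T), 2) <= lam
```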
Distributed estimator
• Introduce a local copy $R^n$ of the column factor at each agent and enforce consensus with neighboring nodes: $R^n = R^m$, $m \in \mathcal{N}_n$
• Network connectivity ⇒ (P2) ⇔ (P3)

(P3)  $\min_{\{L_n, R^n\}} \sum_{n=1}^{N}\left[\frac{1}{2}\|\mathcal{P}_{\Omega_n}(Y_n - L_n (R^n)')\|_F^2 + \frac{\lambda}{2}\|L_n\|_F^2 + \frac{\lambda}{2N}\|R^n\|_F^2\right]$ s.t. $R^n = R^m$, $m \in \mathcal{N}_n$

• Alternating-direction method of multipliers (ADMM) solver
  – Method [Glowinski-Marrocco’75], [Gabay-Mercier’76]
  – Learning over networks [Schizas et al’07]
• Primal variables per agent $n$: $\{L_n, R^n\}$; message passing: $R^n$
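Below is a decentralized sketch in the spirit of (P3). For simplicity it replaces the exact ADMM multiplier recursions (which the slide only cites) with local gradient steps followed by neighborhood averaging of the copies $R^n$; it is therefore a consensus-gradient stand-in, not the slides’ algorithm, and `distributed_sketch` plus the adjacency-list argument `adj` are assumptions.

```python
import numpy as np

def distributed_sketch(Y_n, mask_n, adj, lam, rho, n_iter=3000, step=0.01, seed=0):
    """Each agent n keeps its row factor L_n and a local copy R^n of the
    column factor, takes a gradient step on its local term of (P3), then
    averages R^n with its single-hop neighbors (consensus averaging used
    here in place of ADMM multiplier updates)."""
    rng = np.random.default_rng(seed)
    N = len(Y_n)
    T = Y_n[0].shape[1]
    Lf = [rng.standard_normal((Yb.shape[0], rho)) for Yb in Y_n]
    R = [rng.standard_normal((T, rho)) for _ in range(N)]
    for _ in range(n_iter):
        for n in range(N):                            # local updates
            E = mask_n[n] * (Lf[n] @ R[n].T - Y_n[n])
            Lf[n] -= step * (E @ R[n] + lam * Lf[n])
            R[n] -= step * (E.T @ Lf[n] + (lam / N) * R[n])
        # single-hop message passing: average each R^n over its neighborhood
        R = [np.mean([R[m] for m in [n] + adj[n]], axis=0) for n in range(N)]
    return Lf, R
```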
Attractive features
• Highly parallelizable with simple recursions
  – Unconstrained QPs per agent
  – No SVD per iteration
• Low overhead for message exchanges
  – $R^n$ is $T \times \rho$ and $\rho$ is small
  – Communication cost is independent of the network size
Recap: (P1) ⇔ (P2) ⇔ (P3)
  (P1) centralized, convex; (P2) separable regularization, nonconvex; (P3) consensus, nonconvex
  Stationary point of (P3) ⇒ stationary point of (P2) ⇒ global optimum of (P1)
Optimality
Proposition 2. If the iterates $\{L_n[k], R^n[k]\}$ converge to $\{\bar{L}_n, \bar{R}^n\}$ and $\sigma_{\max}[\mathcal{P}_\Omega(Y - \bar{L}\bar{R}')] \le \lambda$, then:
 i) consensus is achieved, i.e., $\bar{R}^n = \bar{R}$ for all $n$
 ii) $\bar{X} = \bar{L}\bar{R}'$, with $\bar{L} := [\bar{L}_1', \ldots, \bar{L}_N']'$, is the global optimum of (P1).
• ADMM can converge even for nonconvex problems [Boyd et al’11]
• Simple distributed algorithm for optimal matrix imputation
• Centralized performance guarantees, e.g., [Candes-Recht’09], carry over
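A small helper mirroring Proposition 2 on converged iterates, under the row-block stacking convention of the setup sketch above; `check_optimality` is a hypothetical name.

```python
import numpy as np

def check_optimality(Y, mask, Lf, R, lam, tol=1e-6):
    """Returns (consensus, certificate): (i) whether all local copies R^n
    agree, and (ii) whether sigma_max[P_Omega(Y - Lbar Rbar')] <= lam for
    the stacked estimate [L_1 R^1'; ...; L_N R^N']."""
    consensus = all(np.allclose(R[0], Rn, atol=tol) for Rn in R)
    X_hat = np.vstack([Lf[n] @ R[n].T for n in range(len(Lf))])
    certificate = np.linalg.norm(mask * (Y - X_hat), 2) <= lam
    return consensus, certificate
```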
Synthetic data
• Random network topology; $N = 20$, $L = 66$, $T = 66$
• Low-rank data with missing entries generated per the model $\mathcal{P}_\Omega(Y) = \mathcal{P}_\Omega(X + V)$
[Figures: simulated convergence and estimation-error curves]
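An end-to-end toy run tying the sketches above together, on a ring topology; the sizes and $\lambda$ are illustrative (smaller than the slide’s $N = 20$, $L = T = 66$ setup), and the printed error is only a sanity check, not a reproduction of the slides’ results.

```python
# Continues from the setup sketch: Y, Y_n, mask, mask_n, X_true, N as defined there.
adj = [[(n - 1) % N, (n + 1) % N] for n in range(N)]   # ring topology
Lf, R = distributed_sketch(Y_n, mask_n, adj, lam=0.1, rho=4)
print(check_optimality(Y, mask, Lf, R, lam=0.1))
X_hat = np.vstack([Lf[n] @ R[n].T for n in range(N)])
rel_err = np.linalg.norm(X_hat - X_true, "fro") / np.linalg.norm(X_true, "fro")
print(f"relative error: {rel_err:.3f}")
```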
Real data
• Network distance prediction [Liao et al’12]
• Abilene network data (Aug. 18-22, 2011)
  – End-to-end latency matrix, $N = 9$, $L = T = N$
  – 80% missing data
• Relative error: 10%
Data: http://internet2.edu/observatory/archive/data-collections.html