Learning Probabilistic Graphical Models
Parameter Estimation: Maximum Likelihood Estimation
Biased Coin Example
P is a Bernoulli distribution: P(X=1) = θ, P(X=0) = 1−θ
• Tosses are independent of each other
• Tosses are sampled from the same distribution (identically distributed)
That is, the tosses are sampled IID from P
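To make this concrete, here is a minimal Python sketch (not from the slides; the toss data are hypothetical) of the likelihood of a sequence of IID tosses under a Bernoulli model:

    # Likelihood of IID coin tosses under a Bernoulli(theta) model.
    def bernoulli_likelihood(tosses, theta):
        # tosses: list of 0/1 outcomes; theta = P(X=1)
        likelihood = 1.0
        for x in tosses:
            likelihood *= theta if x == 1 else (1.0 - theta)
        return likelihood

    print(bernoulli_likelihood([1, 0, 1, 1, 0], 0.6))  # 0.6**3 * 0.4**2 ≈ 0.0346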
IID as a PGM
[Figure: plate model in which the parameter θ is a shared parent of X[1], ..., X[M], drawn inside a plate indexed by m over the Data]
Maximum Likelihood Estimation
• Goal: find θ ∈ [0,1] that predicts D well
• Prediction quality = likelihood of D given θ: L(D:θ) = P(D:θ) = ∏m P(x[m]:θ)
[Figure: the likelihood L(D:θ) plotted as a curve over θ from 0 to 1]
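As a sketch of this idea (dataset and grid resolution are illustrative choices), one can evaluate L(D:θ) on a grid over [0,1] and pick the maximizer:

    # Grid search for the theta that maximizes the likelihood of D.
    tosses = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]  # hypothetical D: 6 heads, 4 tails
    thetas = [i / 100 for i in range(101)]   # grid over [0, 1]

    def likelihood(data, theta):
        L = 1.0
        for x in data:
            L *= theta if x == 1 else 1.0 - theta
        return L

    best = max(thetas, key=lambda t: likelihood(tosses, t))
    print(best)  # 0.6, matching MH / (MH + MT)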
Maximum Likelihood Estimator
• Observations: MH heads and MT tails
• Find θ maximizing the likelihood L(D:θ) = θ^MH (1−θ)^MT
• Equivalent to maximizing the log-likelihood ℓ(D:θ) = MH log θ + MT log(1−θ)
• Differentiating the log-likelihood and solving for θ gives θ̂ = MH / (MH + MT)
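A minimal sketch of the resulting estimator with hypothetical counts; it also spot-checks that the log-likelihood at θ̂ is no smaller than at a nearby θ:

    import math

    M_H, M_T = 7, 3                # hypothetical observed counts
    theta_hat = M_H / (M_H + M_T)  # closed-form MLE derived above

    def log_likelihood(theta):
        return M_H * math.log(theta) + M_T * math.log(1 - theta)

    print(theta_hat)  # 0.7
    print(log_likelihood(theta_hat) >= log_likelihood(theta_hat + 0.01))  # True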
Sufficient Statistics
• For computing θ̂ in the coin toss example, we only needed MH and MT, since L(D:θ) = θ^MH (1−θ)^MT
• MH and MT are sufficient statistics
Sufficient Statistics: Definition
• A function s(x) from instances to vectors in R^k is a sufficient statistic if, for any two datasets D and D′ and any θ, we have: Σm s(x[m]) over D = Σm s(x[m]) over D′ implies L(D:θ) = L(D′:θ)
• In other words, datasets with the same summed statistics are indistinguishable to the likelihood
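To illustrate the definition (hypothetical data), two datasets with the same counts, and hence the same summed statistics, have identical likelihoods at every θ:

    D1 = [1, 1, 0, 1, 0]  # MH = 3, MT = 2
    D2 = [0, 1, 1, 0, 1]  # different order, same counts

    def likelihood(data, theta):
        L = 1.0
        for x in data:
            L *= theta if x == 1 else 1.0 - theta
        return L

    # Equal sufficient statistics imply equal likelihoods for any theta.
    print(all(abs(likelihood(D1, t) - likelihood(D2, t)) < 1e-12
              for t in (0.1, 0.3, 0.5, 0.9)))  # True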
Sufficient Statistic for Multinomial
• For a dataset D over a variable X with k values, the sufficient statistics are the counts <M1, ..., Mk>, where Mi is the number of times that X[m] = xi in D
• The sufficient statistic s(x) is a tuple of dimension k: s(xi) = (0, ..., 0, 1, 0, ..., 0), with the single 1 in position i
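A brief sketch (illustrative data, k = 3, values 0-indexed): summing the one-hot vectors s(x[m]) over the dataset recovers the count vector <M1, ..., Mk>:

    k = 3
    data = [0, 2, 1, 0, 2, 2]  # X[m] takes values in {0, 1, 2}

    def s(x):
        one_hot = [0] * k      # one-hot sufficient statistic
        one_hot[x] = 1
        return one_hot

    counts = [sum(s(x)[i] for x in data) for i in range(k)]
    print(counts)  # [2, 1, 3], i.e. <M1, M2, M3>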
Sufficient Statistic for Gaussian
• Gaussian distribution: p(x) = (1 / (√(2π) σ)) exp(−(x−μ)² / (2σ²))
• Rewrite as p(x) = (1 / (√(2π) σ)) exp(−x²/(2σ²) + xμ/σ² − μ²/(2σ²)), an exponential whose argument is linear in 1, x, and x²
• Sufficient statistics for Gaussian: s(x) = <1, x, x²>
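A minimal sketch with hypothetical samples: the three sums of the components of s(x) = <1, x, x²> are all the likelihood ever needs from the data:

    data = [2.1, 1.9, 2.4, 2.0, 1.6]   # hypothetical samples

    M      = sum(1 for x in data)      # sum of the constant component (= sample count)
    sum_x  = sum(x for x in data)      # sum of x
    sum_x2 = sum(x * x for x in data)  # sum of x^2

    print(M, sum_x, sum_x2)  # 5, 10.0, ~20.34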
Maximum Likelihood Estimation
• MLE Principle: choose θ to maximize L(D:θ)
• Multinomial MLE: θ̂i = Mi / Σj Mj
• Gaussian MLE: μ̂ = (1/M) Σm x[m], σ̂² = (1/M) Σm (x[m] − μ̂)²
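A sketch of both closed forms, computed from sufficient statistics (the datasets are illustrative):

    import math

    # Multinomial MLE from counts <M1, ..., Mk>.
    counts = [2, 1, 3]
    total = sum(counts)
    theta_hat = [M_i / total for M_i in counts]
    print(theta_hat)  # [0.333..., 0.166..., 0.5]

    # Gaussian MLE (equivalently computable from sum(x) and sum(x^2)).
    data = [2.1, 1.9, 2.4, 2.0, 1.6]
    M = len(data)
    mu_hat = sum(data) / M
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / M)
    print(mu_hat, sigma_hat)  # 2.0 and ~0.26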
Summary
• Maximum likelihood estimation is a simple principle for parameter selection given a dataset D
• The likelihood function is uniquely determined by the sufficient statistics that summarize D
• The MLE has a closed-form solution for many parametric distributions