
User:ChrisSwetenham/PPCA


Probabilistic Principal Component Analysis (PPCA) is a topic in machine learning and computer vision. Probabilistic PCA is a latent variable model. Since the latent variable space has a lower dimension than the data space, it is a form of dimensionality reduction.


Model

The model for a $D$-dimensional data space with an $M$-dimensional latent space is

$$\mathbf{x} = \mathbf{W}\mathbf{z} + \boldsymbol{\mu} + \boldsymbol{\epsilon}$$

where $\mathbf{W}$ is a $D \times M$ parameter matrix, $\mathbf{z}$ is an $M$-dimensional latent random variable distributed according to a multivariate Gaussian distribution with zero mean and unit variance, $\mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$, $\boldsymbol{\mu}$ is a $D$-dimensional constant vector, and $\boldsymbol{\epsilon}$ is a $D$-dimensional random variable distributed according to a multivariate Gaussian distribution with zero mean and independent, identically distributed components, $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2\mathbf{I})$.

Since linear transformations of Gaussian random variables are Gaussian, the random variable $\mathbf{x}$ is then distributed according to a Gaussian with the covariance matrix:

$$\mathbf{C} = \mathbf{W}\mathbf{W}^\mathsf{T} + \sigma^2\mathbf{I}.$$

The term $\mathbf{W}\mathbf{W}^\mathsf{T}$ implies that the resulting covariance matrix remains unchanged under orthonormal transformations $\mathbf{R}$ of $\mathbf{W}$, since $(\mathbf{W}\mathbf{R})(\mathbf{W}\mathbf{R})^\mathsf{T} = \mathbf{W}\mathbf{R}\mathbf{R}^\mathsf{T}\mathbf{W}^\mathsf{T} = \mathbf{W}\mathbf{W}^\mathsf{T}$.
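To make the generative process concrete, the following is a minimal NumPy sketch that draws a sample from the model; the dimensions and parameter values (D, M, W, mu, sigma2) are illustrative placeholders, not values from any reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

D, M = 5, 2                    # data and latent dimensions (illustrative)
W = rng.normal(size=(D, M))    # D x M parameter matrix (illustrative)
mu = np.zeros(D)               # data-space mean
sigma2 = 0.1                   # isotropic noise variance

# Generative process: z ~ N(0, I_M), eps ~ N(0, sigma2 * I_D), x = W z + mu + eps
z = rng.normal(size=M)
eps = rng.normal(scale=np.sqrt(sigma2), size=D)
x = W @ z + mu + eps

# Marginal covariance of x: C = W W^T + sigma2 * I_D
C = W @ W.T + sigma2 * np.eye(D)
```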

Inference

Given a value for $\mathbf{x}$, we can infer the posterior distribution in the latent space:

$$p(\mathbf{z} \mid \mathbf{x}) = \mathcal{N}\!\left(\mathbf{M}^{-1}\mathbf{W}^\mathsf{T}(\mathbf{x} - \boldsymbol{\mu}),\ \sigma^2\mathbf{M}^{-1}\right)$$

where:

$$\mathbf{M} = \mathbf{W}^\mathsf{T}\mathbf{W} + \sigma^2\mathbf{I}.$$
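As a sketch of this computation (a minimal NumPy example with illustrative placeholder parameters, mirroring the one in the Model section):

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 5, 2                       # illustrative dimensions
W = rng.normal(size=(D, M))       # illustrative parameters
mu = np.zeros(D)
sigma2 = 0.1
x = rng.normal(size=D)            # an observed data point (illustrative)

# M_mat = W^T W + sigma2 * I (the matrix M above); note that M is symmetric.
M_mat = W.T @ W + sigma2 * np.eye(M)
M_inv = np.linalg.inv(M_mat)

post_mean = M_inv @ W.T @ (x - mu)   # posterior mean E[z | x]
post_cov = sigma2 * M_inv            # posterior covariance Cov[z | x]
```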


Parameter Estimation

Given a set of samples $\{\mathbf{x}_n\}_{n=1}^{N}$ from $\mathbf{x}$, we can find the maximum likelihood solution for the parameters of the model. It can be shown that the maximum likelihood estimate of the matrix $\mathbf{W}$ is given by:

$$\mathbf{W}_{\mathrm{ML}} = \mathbf{U}_M \left(\boldsymbol{\Lambda}_M - \sigma^2\mathbf{I}\right)^{1/2} \mathbf{R}$$

where $\mathbf{U}_M$ is a $D \times M$ matrix of the eigenvectors of the sample covariance matrix with the $M$ largest eigenvalues, $\boldsymbol{\Lambda}_M$ is the $M$-dimensional diagonal matrix of the corresponding eigenvalues, and $\mathbf{R}$ is an arbitrary $M \times M$ orthogonal matrix, reflecting the rotational invariance noted above.

The remaining parameter of the model, $\sigma^2$, has a maximum likelihood estimate of:

$$\sigma^2_{\mathrm{ML}} = \frac{1}{D - M} \sum_{i=M+1}^{D} \lambda_i$$

where $\lambda_{M+1}, \ldots, \lambda_D$ are the remaining eigenvalues.
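A minimal sketch of this closed-form fit, assuming the data are the rows of an array X of shape (N, D) and taking the arbitrary rotation $\mathbf{R}$ to be the identity (function and variable names are illustrative):

```python
import numpy as np

def ppca_ml(X, M):
    """Closed-form maximum likelihood PPCA fit (with R taken as the identity)."""
    N, D = X.shape
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)   # sample covariance matrix

    # Eigendecomposition of S, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # sigma2_ML is the average of the D - M discarded eigenvalues.
    sigma2 = eigvals[M:].mean()

    # W_ML = U_M (Lambda_M - sigma2 I)^{1/2}, with R = I.
    W = eigvecs[:, :M] @ np.diag(np.sqrt(eigvals[:M] - sigma2))
    return W, mu, sigma2
```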

The parameters of the model can also be estimated using the EM algorithm, in which case the solution can be extended to mixtures of PPCA models.
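A sketch of a single EM iteration under the standard PPCA updates (illustrative, not a complete or optimized implementation; names are placeholders):

```python
import numpy as np

def ppca_em_step(X, mu, W, sigma2):
    """One EM iteration for PPCA; X has shape (N, D), W has shape (D, M)."""
    N, D = X.shape
    M = W.shape[1]
    Xc = X - mu                                   # centred data, shape (N, D)

    # E-step: posterior moments of the latent variables.
    M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(M))
    Ez = Xc @ W @ M_inv                           # rows are E[z_n | x_n]
    Ezz = N * sigma2 * M_inv + Ez.T @ Ez          # sum_n E[z_n z_n^T | x_n]

    # M-step: re-estimate W and sigma2.
    W_new = (Xc.T @ Ez) @ np.linalg.inv(Ezz)
    sigma2_new = (np.sum(Xc**2)
                  - 2.0 * np.sum((Xc @ W_new) * Ez)
                  + np.trace(Ezz @ W_new.T @ W_new)) / (N * D)
    return W_new, sigma2_new
```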

Relationship to Other Models

The maximum likelihood solution of the Probabilistic PCA model above corresponds to the projection performed in classical Principal Component Analysis: the columns of $\mathbf{W}_{\mathrm{ML}}$ span the principal subspace of the data.
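To make this correspondence explicit, in the zero-noise limit the posterior mean from the Inference section reduces to (a standard result, sketched here with the symbols defined above):

```latex
% With M = W^T W + sigma^2 I, letting sigma^2 -> 0:
\lim_{\sigma^2 \to 0} \mathbf{M}^{-1}\mathbf{W}^\mathsf{T}(\mathbf{x}-\boldsymbol{\mu})
  = \left(\mathbf{W}^\mathsf{T}\mathbf{W}\right)^{-1}\mathbf{W}^\mathsf{T}(\mathbf{x}-\boldsymbol{\mu})
```

so the corresponding reconstruction $\mathbf{W}(\mathbf{W}^\mathsf{T}\mathbf{W})^{-1}\mathbf{W}^\mathsf{T}(\mathbf{x}-\boldsymbol{\mu})$ is the orthogonal projection of the centred data point onto the principal subspace.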

Since the components of the data vector $\mathbf{x}$ are conditionally independent given the latent variable $\mathbf{z}$ (the noise covariance $\sigma^2\mathbf{I}$ is diagonal), Probabilistic PCA can be seen as an instance of the Naive Bayes model.

Factor Analysis is a similar model to Probabilistic PCA, but it allows each component of the error term to have a different variance, i.e. $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Psi})$ with $\boldsymbol{\Psi}$ diagonal.

Applications

The Eigenface technique consists of applying PCA to the recognition of human faces. Using Probabilistic PCA can make the technique more robust to outliers (for example, images in the dataset which are not faces). More generally, PPCA can be used to model an underlying space of features that contribute to the appearance of an object.

Bibliography

  • de la Torre, F.; Black, M.J. (2001). "Robust principal component analysis for computer vision" (PDF). Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001). Vol. 1. pp. 362–369. doi:10.1109/ICCV.2001.937541. ISBN 0-7695-1143-0.
