PCA: why maximize variance?

Another answer is that, in this view, we do not really care about maximizing variance at all. Rather than defining PCs as linear combinations that maximize variance, a somewhat obscure stream of literature defines them as linear combinations that maximize the average squared correlation between the linear combinations and the original variables. Then neither variance maximization nor the unit-length constraint is needed. Re-scaling is allowed, even encouraged, and non-singular rotations are allowed as well; all of them still give maximum average squared correlation with the original variables.
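The scale-invariance part of that claim is easy to check numerically. The sketch below is my own illustration (toy data, a helper named avg_sq_corr, and PCA on standardized variables, i.e. the correlation matrix), not code from the answer:

```python
# Sketch: average squared correlation between a linear combination and the
# original variables. Toy data and helper names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 3))    # correlated toy data
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)           # standardized columns

def avg_sq_corr(scores, data):
    """Mean squared Pearson correlation between `scores` and each column of `data`."""
    r = np.array([np.corrcoef(scores, data[:, j])[0, 1] for j in range(data.shape[1])])
    return float(np.mean(r ** 2))

# First PC of the correlation matrix: leading eigenvector of corr(X).
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
pc1 = Z @ eigvecs[:, -1]

print(avg_sq_corr(pc1, X))            # criterion value for the first PC
print(avg_sq_corr(5.0 * pc1, X))      # identical: correlation ignores re-scaling
print(avg_sq_corr(Z @ rng.normal(size=3), X))   # an arbitrary direction: typically lower
```

Because correlation is unchanged when either variable is re-scaled, multiplying the component scores by any constant leaves the criterion untouched, which is why no unit-length constraint is needed under this definition.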

Yet another answer: the first distribution we encounter is the normal distribution. In many important examples of statistics in use (linear regression, I'm looking at you) some hypothesis of normality is tacit. And when statistics is taught, the normal distribution is taught as the first and foremost example of a continuous distribution.

The connection is that the normal distribution is completely characterized by its mean and variance, which are its first and second moments. So in any attempt to understand a distribution in terms of its moments, reaching for higher moments won't do anything sensible if the normal distribution is among the possible outcomes.
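To make that concrete, the normal density depends on nothing beyond those two quantities:

$$
f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
$$

so every higher moment of a normal variable is a fixed function of $\mu$ and $\sigma^2$ (the skewness is 0, the excess kurtosis is 0, and so on); once normality is in the picture, nothing beyond the first two moments carries additional information.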

Why do we want to maximize the variance in Principal Component Analysis?

A somewhat flippant answer would be that if you were to sequentially maximize any other measure of spread, you wouldn't be doing PCA. Lurking behind this is a deeper idea: many methods to analyze and reduce the dimensions of a multivariate dataset have been invented, each with their own purposes, mathematical and statistical properties, and intended applications.

PCA could be considered one of them. This suggests interpreting your question as concerning what properties distinguish PCA from the rest. That turns our attention away from moments and towards other aspects of PCA.

If, to exaggerate, you were to select a single principal component, you would want it to account for the most variability possible: hence the search for maximum variance, so that the one component collects the most "uniqueness" from the data set.

Note that PCA does not actually increase the variance of your data. Rather, it rotates the data set in such a way as to align the directions in which it is spread out the most with the principal axes. This enables you to remove those dimensions along which the data is almost flat. This decreases the dimensionality of the data while keeping the variance or spread among the points as close to the original as possible.
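Here is a minimal numpy sketch of that rotate-then-truncate picture; the toy data, which is deliberately almost flat along one direction, is my own illustration:

```python
# Sketch: PCA as a rotation. Total variance is preserved by the rotation,
# and dropping the flattest direction loses almost none of it.
import numpy as np

rng = np.random.default_rng(1)
# 500 points in 3D, nearly flat along the third coordinate.
X = rng.normal(size=(500, 3)) * np.array([5.0, 2.0, 0.05])

Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]                 # sort components by variance, descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xc @ eigvecs                             # data expressed in the rotated axes

print(np.var(Xc, axis=0, ddof=1).sum())          # total variance before rotation
print(np.var(scores, axis=0, ddof=1).sum())      # the same total after rotation
print(eigvals[:2].sum() / eigvals.sum())         # fraction of variance kept with 2 of 3 axes
```

The last number is close to 1 here because the third principal direction carries almost no spread, which is exactly the situation in which discarding it is harmless.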

Maximizing the component vector variances is the same as maximizing the 'uniqueness' of those vectors. Thus your vectors are as distant from each other as possible. That way, if you only use the first N component vectors, you capture more of the space with highly varying vectors than with similar ones.

Think about what a principal component actually means. Take, for example, a situation where you have two lines that are orthogonal in a 3D space. You can capture the environment much more completely with those orthogonal lines than with two lines that are parallel or nearly parallel. When you represent a very high-dimensional space with very few vectors, this relationship among the vectors becomes much more important to maintain.
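A small numpy check of that non-redundancy, using toy data of my own: the principal axes are orthonormal, and the component scores are uncorrelated with one another.

```python
# Sketch: principal axes are orthonormal and component scores are uncorrelated.
# Toy data is illustrative only.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))   # correlated columns
Xc = X - X.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
scores = Xc @ eigvecs

print(np.allclose(eigvecs.T @ eigvecs, np.eye(4)))        # True: axes are orthonormal
corr = np.corrcoef(scores, rowvar=False)
print(np.allclose(corr, np.eye(4), atol=1e-8))            # True: scores are uncorrelated
```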

In a linear algebra sense, you want PCA to produce independent rows; otherwise some of those rows will be redundant.

Why do we maximize variance during Principal Component Analysis?

I think it can be easier to understand if you think of it as maximising the explained variance.
