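PCA

Trying to encode m points in R^n: X is an m x n matrix with the points as rows.

Want an encoder f: R^n -> R^l and a decoder g: R^l -> R^n, with l << n, so that g(f(x)) is as close as possible to x for each row x of X.

If g is linear, g(c) = Dc with D an n x l matrix, and D has orthonormal columns (D^TD = I), then f(x) = c = D^Tx is the best code.
(In general, could it be divided by D^TD somewhere? Yes: for any D with linearly independent columns the least-squares code is c = (D^TD)^{-1}D^Tx, and the extra factor drops out when D^TD = I.)

Now choose D to minimize sum ||x - g(f(x))||_2^2 = sum ||x - DD^Tx||_2^2 over all rows x of X, subject to D^TD = I.

The columns of D will be the l eigenvectors of X^TX with the largest eigenvalues.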
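
A quick numerical sanity check of the above, as a rough sketch in NumPy. The function name pca_encode_decode, the random test data, the comparison matrix Q and the rescaled matrix D2 are all made up for illustration. It assumes the reconstruction error is the sum of squared 2-norms, takes D as the l eigenvectors of X^TX with the largest eigenvalues, and uses X as given (no centering), as in the notes.

```python
import numpy as np

def pca_encode_decode(X, l):
    """Hypothetical helper: return (D, C, X_hat) for a rank-l linear code."""
    # X^T X is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)
    # Reverse the column order and keep the l eigenvectors with the largest eigenvalues.
    D = eigvecs[:, ::-1][:, :l]   # n x l, orthonormal columns
    C = X @ D                     # row i is the code f(x_i) = D^T x_i
    X_hat = C @ D.T               # row i is the reconstruction g(c_i) = D c_i
    return D, C, X_hat

# Made-up data: 200 points in R^5 with correlated coordinates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

D, C, X_hat = pca_encode_decode(X, l=2)
err_pca = np.sum((X - X_hat) ** 2)

# Compare against an arbitrary orthonormal n x l matrix Q.
Q, _ = np.linalg.qr(rng.normal(size=(5, 2)))
err_rand = np.sum((X - X @ Q @ Q.T) ** 2)

# The D^T D question: rescale D so its columns are no longer orthonormal,
# then encode with the least-squares formula c = (D2^T D2)^{-1} D2^T x.
D2 = D @ np.array([[2.0, 1.0], [0.0, 3.0]])      # same column space as D
C2 = np.linalg.solve(D2.T @ D2, D2.T @ X.T).T    # m x l codes
X_hat2 = C2 @ D2.T

print("PCA error:", err_pca, " random-subspace error:", err_rand)
print("same reconstruction with general D:", np.allclose(X_hat, X_hat2))
```

The error for the eigenvector choice should never exceed the error for Q, and it should equal the sum of the n - l smallest eigenvalues of X^TX. D2 spans the same subspace as D, so with the (D^TD)^{-1} factor in the encoder it should give the same reconstructions.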