Linear Algebra Recap
Almost every operation in machine learning reduces to a matrix multiplication. This page is a fast tour of the linear-algebra concepts that recur throughout the rest of the curriculum: vector spaces, the four fundamental subspaces, matrix factorisations (eigendecomposition, SVD), and the geometric interpretations that let you read matrix expressions as transformations rather than as index bookkeeping.
Vectors and inner products
Vectors $x, y \in \mathbb{R}^n$ come with the inner product $\langle x, y \rangle = x^\top y = \sum_{i=1}^n x_i y_i$, which induces the Euclidean norm $\|x\|_2 = \sqrt{x^\top x}$ and angles via $\cos\theta = x^\top y / (\|x\|_2 \|y\|_2)$; two vectors are orthogonal when $x^\top y = 0$.
A basis of $\mathbb{R}^n$ is a set of $n$ linearly independent vectors: every vector has unique coordinates in a given basis, and in an orthonormal basis those coordinates are simply inner products with the basis vectors.
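A minimal NumPy sketch of these definitions (the vector values are arbitrary illustrations):

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([3.0, 0.0, 4.0])

inner = x @ y                      # <x, y> = sum_i x_i y_i  -> 11.0
norm_x = np.sqrt(x @ x)            # ||x||_2 = 3.0, same as np.linalg.norm(x)
cos_theta = inner / (np.linalg.norm(x) * np.linalg.norm(y))   # 11/15 ≈ 0.733

z = np.array([2.0, -1.0, 0.0])
assert np.isclose(x @ z, 0.0)      # x and z are orthogonal
```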
Matrices as linear maps
A matrix $A \in \mathbb{R}^{m \times n}$ is a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$. There are two complementary ways to read $y = Ax$ (both checked in the sketch below):

- Column view — $y = \sum_j x_j a_j$ is a linear combination of $A$'s columns $a_j$.
- Row view — each $y_i$ is the inner product of the $i$-th row of $A$ with $x$.
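A small NumPy sketch of both readings (the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])                     # a map from R^2 to R^3
x = np.array([10.0, 1.0])

y = A @ x

# Column view: y is a combination of A's columns, weighted by x.
y_cols = x[0] * A[:, 0] + x[1] * A[:, 1]

# Row view: each y_i is the inner product of the i-th row with x.
y_rows = np.array([row @ x for row in A])

assert np.allclose(y, y_cols) and np.allclose(y, y_rows)
```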
Matrix multiplication
The entrywise rule is $(AB)_{ij} = \sum_k A_{ik} B_{kj}$, but the reading that matters is composition of linear maps: $(AB)x = A(Bx)$. Multiplication is associative but not commutative, and for $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{n \times p}$ it costs $O(mnp)$ scalar operations.
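A sketch checking both facts on arbitrary random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 2))
x = rng.standard_normal(2)

# Composition: applying AB is applying B, then A.
assert np.allclose((A @ B) @ x, A @ (B @ x))

# Entrywise rule: (AB)_ij = sum_k A_ik B_kj.
C = np.array([[A[i] @ B[:, j] for j in range(B.shape[1])]
              for i in range(A.shape[0])])
assert np.allclose(C, A @ B)
```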
The four fundamental subspaces
Every $A \in \mathbb{R}^{m \times n}$ carries four subspaces:

- Column space $\mathcal{C}(A) \subseteq \mathbb{R}^m$ — span of the columns.
- Row space $\mathcal{C}(A^\top) \subseteq \mathbb{R}^n$ — span of the rows.
- Null space $\mathcal{N}(A) = \{x \in \mathbb{R}^n : Ax = 0\}$.
- Left null space $\mathcal{N}(A^\top) = \{y \in \mathbb{R}^m : A^\top y = 0\}$.
Two orthogonality relations tie them together: $\mathcal{C}(A^\top) \perp \mathcal{N}(A)$ in $\mathbb{R}^n$ and $\mathcal{C}(A) \perp \mathcal{N}(A^\top)$ in $\mathbb{R}^m$, giving the decompositions $\mathbb{R}^n = \mathcal{C}(A^\top) \oplus \mathcal{N}(A)$ and $\mathbb{R}^m = \mathcal{C}(A) \oplus \mathcal{N}(A^\top)$.
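One way to see all four subspaces concretely is to extract orthonormal bases from the SVD (introduced formally below); a sketch with a deliberately rank-deficient matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0]])    # row 3 = row 1 + row 2, so rank(A) = 2

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))         # numerical rank

col_space  = U[:, :r]              # orthonormal basis for C(A)
left_null  = U[:, r:]              # basis for N(A^T)
row_space  = Vt[:r].T              # basis for C(A^T)
null_space = Vt[r:].T              # basis for N(A)

assert np.allclose(A @ null_space, 0)            # A kills its null space
assert np.allclose(col_space.T @ left_null, 0)   # C(A) ⟂ N(A^T)
assert np.allclose(row_space.T @ null_space, 0)  # C(A^T) ⟂ N(A)
```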
Eigendecomposition
For square $A \in \mathbb{R}^{n \times n}$ with $n$ linearly independent eigenvectors, $A = V \Lambda V^{-1}$, where the columns of $V$ are eigenvectors and $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$ holds the eigenvalues.
For symmetric $A$, the spectral theorem sharpens this to $A = Q \Lambda Q^\top$ with real eigenvalues and an orthonormal eigenbasis $Q$.
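A sketch of the symmetric case; `np.linalg.eigh` is NumPy's routine for symmetric/Hermitian matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M + M.T                        # symmetrise to get a symmetric matrix

# eigh returns real eigenvalues (ascending) and orthonormal eigenvectors.
lam, Q = np.linalg.eigh(A)

assert np.allclose(Q @ np.diag(lam) @ Q.T, A)   # A = Q Λ Q^T
assert np.allclose(Q.T @ Q, np.eye(4))          # Q is orthogonal
```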
Singular Value Decomposition
For any $A \in \mathbb{R}^{m \times n}$, $A = U \Sigma V^\top$, where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal and $\Sigma \in \mathbb{R}^{m \times n}$ is diagonal with non-negative singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge 0$.
- PCA — principal components are the right singular vectors of the centred data matrix (see PCA & SVD).
- Pseudo-inverse — $A^{+} = V \Sigma^{+} U^\top$ (invert the non-zero singular values), the right thing to use for over-/under-determined least squares.
- Low-rank approximation — Eckart–Young theorem: the best rank-$k$ approximation is $A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^\top$ (the top-$k$ singular triplets).
- Numerical conditioning — $\kappa(A) = \sigma_{\max}/\sigma_{\min}$ is the condition number; large values mean inversion is unstable.
SVD is the one matrix factorisation worth being fluent in.
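A sketch exercising the pseudo-inverse, Eckart–Young, and conditioning claims above, assuming a generic (full-rank) random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))                  # full rank with probability 1
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Pseudo-inverse: invert the non-zero singular values.
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T
assert np.allclose(A_pinv, np.linalg.pinv(A))

# Eckart–Young: truncating to the top-k triplets is the best rank-k
# approximation; the spectral-norm error is exactly the next singular value.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
assert np.isclose(np.linalg.norm(A - A_k, 2), s[k])

# Condition number σ_max / σ_min.
assert np.isclose(s[0] / s[-1], np.linalg.cond(A))
```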
Norms and conditioning
For a matrix $A$ with singular values $\sigma_1 \ge \dots \ge \sigma_r > 0$:

- Frobenius norm — $\|A\|_F = \sqrt{\sum_{i,j} A_{ij}^2} = \sqrt{\sum_i \sigma_i^2}$.
- Spectral norm — $\|A\|_2 = \sigma_1$, the largest factor by which $A$ stretches any vector.
- Nuclear norm — $\|A\|_* = \sum_i \sigma_i$ — the convex envelope of rank on the spectral-norm unit ball.
Each plays a different role in regularisation: Frobenius for ridge, nuclear for low-rank, spectral for stability constraints (e.g., Lipschitz networks, WGAN).
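All three norms are available directly in NumPy; a quick check that they match the singular-value formulas:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
s = np.linalg.svd(A, compute_uv=False)

assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))
assert np.isclose(np.linalg.norm(A, 2),     s[0])        # spectral norm
assert np.isclose(np.linalg.norm(A, 'nuc'), np.sum(s))   # nuclear norm
```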
What to read next
- Probability & Statistics — random vectors and covariance live in this language.
- Multivariate Calculus & Gradients — gradients are best understood as covectors / 1-forms.
- PCA & SVD — the most important application.