Now let's talk about another related problem: Ax = b, where A is an nx3, x is a 3x2, and b is an nx2. This is how I'm used to solving for the affine transformation matrix between two sets of points. The nx3 matrix is all my points in one image in homogeneous coordinates, the nx2 matrix is all my points in the other image, and I'm trying to solve for x: the affine matrix. If I tried to use Gaussian elimination between A and b (an nx3 and an nx2, where n > 3), I'd find that b generally doesn't live in A's column space: with more equations than unknowns, the system is overdetermined, and there's usually no exact solution. It's also hard to do Gaussian elimination when the matrices aren't square.
So if we can't solve for b, we can solve for proj(b, A) instead. Then our solution is going to be the best x such that Ax = proj(b, A). As a reminder, for vectors, proj(b, a) = (b.a/a.a)*a; the projection is the closest point to b in the span of a, so by solving Ax = proj(b, A) we minimize the distance between Ax and b. We can solve this with geometry or calculus. Geometry tends to be easier, but calculus will come back up for non-linear least squares.
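Here's a minimal NumPy sketch of that projection formula for two ordinary vectors (the vectors `a` and `b` below are just illustrative values): the projection lands in the span of a, and the residual b - proj(b, a) is orthogonal to a.

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([3.0, 0.0, 0.0])

# proj(b, a) = (b.a / a.a) * a
proj = (b @ a) / (a @ a) * a

# the leftover part of b is orthogonal to a
residual = b - proj
print(proj)            # the closest point to b along a
print(residual @ a)    # ~0: orthogonality
```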
We're solving for the new 'best' x which makes Ax = proj(b, A). We use the constraint that the error (b - Ax) must be orthogonal to the columns of A, since the error is exactly the component of b that falls outside A's column space. Writing that constraint out gives the normal equations, A'(b - Ax) = 0, which rearrange to x = (A'A)^-1 A'b. That x is about as close to solving Ax = b as we can possibly get.
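The normal-equations solution can be checked numerically. This is a small sketch with random data (the sizes and the `rng` seed are arbitrary): solve A'Ax = A'b, then verify that the residual really is orthogonal to every column of A.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))   # n x 3, n > 3: overdetermined
b = rng.standard_normal((10, 2))   # n x 2

# Normal equations: A'(b - Ax) = 0  =>  (A'A) x = A'b
x = np.linalg.solve(A.T @ A, A.T @ b)

# The residual b - Ax should be orthogonal to A's column space
residual = b - A @ x
print(A.T @ residual)   # ~0 in every entry
```

In practice `np.linalg.lstsq(A, b)` does the same job with better numerical behavior than forming A'A explicitly.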
**Note that when I solve for the affine, I usually think of it as points2 = Affine\*points1, but to put it in terms of linear least squares, I have to put the affine in the x position of b = Ax. Matrix multiplication isn't commutative, but (AB)' = B'A', so transposing both sides gives points2' = points1'\*Affine'. That's why, in MATLAB, I usually have a lot of these: Affine = (points1'\points2')'.**
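The NumPy equivalent of that MATLAB one-liner looks like this. Everything here is illustrative (the point sets, `true_affine`, and the shapes mirror the MATLAB convention where points1 is 3xn homogeneous and points2 is 2xn); the backslash becomes `np.linalg.lstsq`, and the transposes land in the same places.

```python
import numpy as np

# hypothetical data: a known 2x3 affine applied to homogeneous points
n = 8
rng = np.random.default_rng(1)
pts = rng.standard_normal((2, n))
points1 = np.vstack([pts, np.ones(n)])       # 3 x n, homogeneous
true_affine = np.array([[1.0,  0.2,  3.0],
                        [-0.1, 0.9, -2.0]])  # 2 x 3
points2 = true_affine @ points1              # 2 x n

# MATLAB: Affine = (points1' \ points2')'
# NumPy: least-squares solve points1' * x = points2', then transpose
x, *_ = np.linalg.lstsq(points1.T, points2.T, rcond=None)
affine = x.T                                 # recovered 2 x 3 affine
```

With noise-free correspondences the recovered matrix matches `true_affine` to machine precision; with real, noisy points it's the least-squares best fit.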