§5.1 Introduction
Today we will talk about orthogonality and least squares. Given a basis for a vector space, a linear algebra student should begin salivating, much like Pavlov's dog, and ask, can I convert this into an orthonormal basis? (Please avoid salivating on the computer equipment). The idea is, of course, that orthogonal bases are much easier to work with and lend themselves nicely to performing various computations such as finding vector coordinates with respect to this basis and projecting vectors on various subspaces of a vector space. Closely tied to this idea is the concept of orthogonal matrices: square matrices whose columns form an orthonormal set. Despite this simple definition, they possess many nice properties, some of which we will discuss in this lab. As far as applications go, we will talk about least squares. It is a problem that often arises with experimental data wherein we try to find a function that most closely models our collected data. So, sit back, relax, and enjoy the last lab of the quarter.
§5.2 Orthogonality
We start with vector orthogonality. Two vectors x and y are said to be orthogonal if their inner product is 0; i.e.,
< x, y > = x • y = xTy = 0.Remember that the MATLAB command that calculates inner products looks like
>> x'*y
or alternatively, you could enter
>> dot(x,y)
where, "dot" stands for "dot product".
Enter the following vectors into MATLAB. Remember to use semicolons between entries so that you produce column vectors.

Let us take a closer look at the resulting matrix W. It is an example of what are called orthogonal matrices. They are square matrices whose columns form an orthonormal set.
Remark 5.1 Orthogonal matrices are incredibly important in geometry and physics. One of the reasons why, is that orthogonal 2 x 2 matrices with positive determinant represent rigid motions of the plane that keep the origin fixed. Similarly, 3 x 3 orthogonal matrices with positive determinant represent rigid motions of space keeping the origin fixed. If you allow the determinant to be negative, you also allow reflections through a line in the case of the plane, or through a plane in the case of 3-space. In any case, n x n orthogonal matrices are precisely the distance preserving transformations of n-space that keep the origin fixed. As an easy exercise, try to see geometrically why all distance-preserving transformations of the plane that keep the origin fixed must either be reflections or rotations. As a more challenging exercise, try to classify all distance preserving transformations of the plane.
§5.3 Orthogonal Projections and Gram-Schmidt Process
First, we will look at orthogonal projections onto a vector. Hopefully, you were able to determine in §5.2 above that the vectors v and w are not orthogonal. (In this section, we will look at vectors that are not orthogonal. This is because orthogonal projections are trivial if the vectors we start with are orthogonal. Projecting a vector onto an orthogonal vector gives 0, the zero vector, as its orthogonal projection.) Suppose we want to find the orthogonal projection of v onto w, let's denote it by v. The projection formula tells us that
v = ( < v, w > / < w, w > ) w
Note that if w is normalized to length 1, you need not divide by <w, w> since it will be equal to 1 already.
Here's a pictorial depiction of what we are doing

It is not possible to enter the symbol v into MATLAB. Instead, when defining the vector v, name the variable "vbar".
The idea of projecting a vector onto a subspace is easily extended from the above case by looking at an orthogonal basis for the subspace, projecting the vector onto each basis element, and then adding them all up to form the orthogonal projection.
|
|
Exercise 5.4 Use MATLAB to find the projection of the vector (3, 3, 3)T onto the subspace spanned by the vectors {x, y}. (These vectors are from exercise 5.1.) |
As you can see, in order for us to project a vector onto a subspace, we must be able to come up with an orthogonal basis for that subspace. As you know the Gram-Schmidt process accomplishes just that. The MATLAB command that performs the Gram-Schmidt process is called qr(). (Note that this is qr, and not gr.) Given a matrix A, it performs the QR-factorization algorithm, and returns an orthogonal matrix Q whose columns are obtained from columns of A by using Gram-Schmidt, and an upper triangular matrix R, such that A = QR.
Example 5.1
Let's try to find an orthonormal basis for the subspace spanned by the vectors a and b from exercise 5.2. We start by defining a new matrix A with a and b as its columns.
>> A = [a b]
A =
1 2
1 0
0 3
Next, we apply the qr() command.
>> [Q,R]=qr(A)
Q =
-0.7071 0.3015 -0.6396
-0.7071 -0.3015 0.6396
0
0.9045 0.4264
R =
-1.4142 -1.4142
0 3.3166
0 0
Remark 5.2 There is a slight ambiguity above that results from the fact that our matrix is not square. The output consists of a 3 x 3 matrix Q. However, when we perform Gram-Schmidt on our two vectors we should get two (instead of three) orthonormal vectors. That is why we are going to use a shortened version of the command. Type the following and notice the difference in output
>> [Q,R]=qr(A,0)
Q =
-0.7071 0.3015
-0.7071 -0.3015
0 0.9045
R =
-1.4142 -1.4142
0 3.3166
We get that the orthonormal basis for the subspace
spanned by a and b is

Remark 5.3 (Euler's
Theorem) It is always the case that the eigenvalues of any
orthogonal matrix have absolute value 1. Now, suppose we have a 3x3
orthogonal matrix T whose determinant is equal to 1. Its
characteristic polynomial will be of degree 3, and hence will have a
real root. This root must be ±1, since the absolute value of any
eigenvalue of an orthogonal matrix is 1. Thus we have the following
options for the set of eigenvalues of T:
{±1, ±1, ±1} or {±1, z,
z}
where z is a complex number and z is its conjugate. In any case, the product of eigenvalues must equal to the determinant of the matrix which is 1, and so the only possibilities are:
{1, 1, 1}, {1, -1, -1} and {1, z, z}.
It follows, that any 3x3 orthogonal matrix whose determinant is equal to one, must have an eigenvalue 1. If v is an eigenvector corresponding to the eigenvalue 1, then T(tv) = tTv = tv for any real number t. So T fixes the entire axis on which v lies. Now, remember that T is an orthogonal matrix so it represents some rigid motion of space, and since it leaves some axis fixed, it must be a rotation around that axis!
This fact lies in the heart of Euler's celebrated theorem which states that a motion of a rigid body with one point fixed must be a rotation around some axis. This is quite striking if you think about it. Imagine that you are moving your linear algebra textbook around in your hands in such a way that its center always stays in the same place. You may be doing very chaotic things to the book: turning it to the right or left, rotating it up or down. You may stop moving it completely, only to start again moments later. But as long as the center of the book stays in the same place, in the end of all the moving around, you will have done nothing to the book but a very orderly rotation about some axis.
§5.4 Least Squares
Let us now turn our attention to something a bit
more practical. Suppose we are interested in studying the
relationship between temperature and pressure. The first step is
to obtain some experimental data. In order to do that, we attach
a temperature and pressure sensor to a metal container. Starting
at room temperature (75°F) we
slowly heat the air inside our container to 195°F and record temperature and pressure
readings. We get a total of five data points :
| Temperature (°F) |
Pressure (lb/sq in.) |
| 75 | 15 |
| 100 | 23 |
| 128 | 26 |
| 159 | 34 |
| 195 | 38 |
Since we suspect the relationship to be linear, we will be looking for a straight line with the equation y = mx +b, where y is the pressure, x the temperature, m the slope, and b the y-intercept. We would like this line to pass through the data points we collected from the experiment. Plugging in these data points into the equation of the line, we obtain the following system of equations:
b + 75m = 15
b + 100m = 23
b + 128m = 26
b + 159m = 34
b + 195m = 38
We can attempt to solve this system of equations, by first rewriting it in the matrix form:
Unfortunately, this system does not have a solution, even though the physical law is of linear order. This is due to the fact that it is not possible to get perfect accuracy when recording data.
Since it is not possible to solve the above system, we use the least squares method to find the closest solution. That is, we want to get the best fit line.
Remark 5.4 By "best fit line", we mean a line that minimizes the sum of the squares of the distances from each point to the line.
To simplify notation let us assign names to the matrices and vectors of the above system.

To remind you of the idea behind least squares, recall that we are trying to find a vector c that minimizes the distance || Bc - d ||. Of course, we can never hope for this norm to be zero since Bc = d has no solutions. However, the equation Bc = projV d, where V is the column space of B, does have a solution. This is because projV d lies in the column space of B, and B hits every vector in its column space. The solution to this equation turns out to minimize the norm || Bc - d || (You can look up why in your linear algebra book). In the following exercise we will solve the equation Bc = projV d directly.
In the next exercise we will plot our data
points, along with the best-fit line, to illustrate the effectiveness
of the least squares method.
Remark
5.5 polyval()returns
a polynomial of specific degree with coefficient specified by a vector.
§5.6
Conclusion
Last Modified:
06/10/2009