This is part of what has become a series on discovering some fairly basic mathematical results, and/or discovering their proofs. It’s mostly intended so that I start finding the results intuitive - having once found a proof myself, I hope to be able to reproduce it without too much effort in the exam.
Statement of the theorem
Sylvester’s Law of Inertia states that given a quadratic form $A$ on a real finite-dimensional vector space $V$, there is a diagonal matrix $D$, with entries $(1, \dots, 1, -1, \dots, -1, 0, \dots, 0)$ ($p$ ones followed by $q$ minus-ones, then zeros), to which $A$ is congruent; moreover, $p$ and $q$ are the same however we transform $A$ into this diagonal form.
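The very first thing we need to know is that $A$ is diagonalisable. (If it isn’t diagonalisable, we don’t have a hope of getting it into this nice form.) We know of a few classes of diagonalisable matrices - symmetric, Hermitian, etc. All we know about $A$ is that it is a real quadratic form. What does that mean? It means that $A(x) = x^T A x$ once we move into some coordinate system. The left-hand side is a scalar, so it equals its own transpose: $x^T A x = (x^T A x)^T = x^T A^T x$. That identity holds for every square matrix, so it does not by itself force $A^T = A$; what it does give is $x^T A x = x^T \tfrac{1}{2}(A + A^T) x$ for all $x$, so we may replace the matrix by its symmetric part $\tfrac{1}{2}(A + A^T)$ without changing the form. Hence $A$ has a symmetric matrix and so is diagonalisable: there is an orthogonal matrix $U$ such that $U^T A U = D$, where $D$ is diagonal. (Recall that a matrix $U$ is orthogonal if it satisfies $U^T = U^{-1}$.)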
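As a small sanity check of the symmetrisation step, here is a minimal sketch in plain Python. The helper names `quad` and `symmetrise` are made up for this illustration; the point is only that a non-symmetric matrix and its symmetric part define the same quadratic form.

```python
from fractions import Fraction


def quad(m, x):
    """Evaluate the quadratic form x^T m x for a square matrix m (list of rows)."""
    n = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(n) for j in range(n))


def symmetrise(m):
    """Return the symmetric part (m + m^T) / 2, using exact fractions."""
    n = len(m)
    return [[Fraction(m[i][j] + m[j][i], 2) for j in range(n)] for i in range(n)]


# A deliberately non-symmetric matrix and its symmetric part.
B = [[1, 4], [0, 3]]
S = symmetrise(B)

# Both matrices give the same value of the form on each test vector.
for x in ([1, 0], [0, 1], [1, 1], [2, -5]):
    assert quad(B, x) == quad(S, x)
```

The check on a handful of vectors is of course not a proof; the one-line algebraic argument above is.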
Now we might as well consider $A$ in diagonal form. Some of the diagonal elements are positive, some negative, and some zero - it’s easy to transform $A$ so that the positive ones are all together, the negative ones are all together and the zeros are all together, by swapping basis vectors. (For instance, if we want to swap the diagonal elements in positions $i, j$, just swap $e_i, e_j$.) Now we can scale every nonzero diagonal element down to $\pm 1$, by scaling the basis vectors - if we scale $e_i$ by $1/\sqrt{|A_{ii}|}$, calling the resulting basis $f_i$, we’ll get $A(f_i) = \pm 1$ as required. (The square root comes from the fact that $A$ is a *quadratic* form, so $A(\lambda x) = \lambda^2 A(x)$.)
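The reorder-and-rescale step can be sketched in a few lines of Python. The function name `normalise` is hypothetical, and the input is assumed to be the list of diagonal entries of a form that has already been diagonalised.

```python
import math


def normalise(diag):
    """Reorder diagonal entries as positives, negatives, zeros, then rescale.

    Returns (ordered, scaled): `ordered` is the diagonal after swapping basis
    vectors, `scaled` is the diagonal after scaling each basis vector e_i
    with a nonzero entry d by 1 / sqrt(|d|).
    """
    # Swapping basis vectors just permutes the diagonal entries.
    ordered = sorted(diag, key=lambda d: (d <= 0, d == 0))
    scaled = []
    for d in ordered:
        if d == 0:
            scaled.append(0.0)  # zero entries cannot be rescaled to +-1
        else:
            s = 1 / math.sqrt(abs(d))
            scaled.append(d * s * s)  # quadratic form: A(s e) = s^2 A(e)
    return ordered, scaled


ordered, scaled = normalise([-16, 4, 0, 0.25])
```

Here `ordered` comes out as `[4, 0.25, -16, 0]`, and `scaled` is `[1, 1, -1, 0]` up to floating-point rounding.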
Hence we’ve got into the right form. But how do we show that the number of positive and negative elements is an invariant?
All I remember from the notes is that there’s something to do with positive definite subspaces. It turns out that’s a really big hint, and I haven’t been able to think up how you might discover it. Sadly, I’ll just continue as if I’d thought it up for myself rather than remembering it.
The following section was my first attempt. My supervisor then told me that it’s a bit inaccurate (and some of it doesn’t make sense). In particular, I talk about the dimension of $V \setminus P$ for a subspace $P$ of $V$ - but $V \setminus P$ isn’t even a space (it doesn’t contain $0$). During the supervision I attempted to refine it by using the complement of $P$ in $V$, but even that is vague, not least because complements aren’t unique.
We have a subspace $P$ on which $A$ is positive definite - namely, make $A$ diagonal and then take the span of the first $p$ basis vectors. (Remember, $A$ is positive definite iff $A(x) > 0$ unless $x = 0$; but $A(x) > 0$ for nonzero $x \in P$ because $A(x)$ is a sum of positive things.) Similarly, we have a subspace $N$ on which $A$ is negative semi-definite (namely “everything which isn’t in $P$”). Then what we want is: for any other diagonal form of $A$, there is the same number of 1s on the diagonal, and the same number of -1s, and the same number of 0s. That is, we want to ensure that just by changing basis, we can’t alter the size of the subspace on which $A$ is positive-definite.
We’ll show that for any subspace $Q$ on which $A$ is positive-definite, we must have $\dim(Q) \le \dim(P)$. Indeed, let’s take $Q$ on which $A$ is positive definite. The easiest way to ensure that its dimension is at most that of $P$ is to show that it’s contained in $P$. Now, that might be hard - we don’t know anything about what’s in $Q$ - but we might do better in showing that nothing in $Q$ is also in $N$, because we know $A$ is negative semi-definite on $N$, and that’s inherently in tension with the positive-definiteness on $Q$.
Suppose $x \in N$ and $x \in Q$. Then $A(x) \le 0$ (by the first condition) and $A(x) > 0$ (by the second condition, since $A$ is positive-definite on $Q$) - contradiction.
That was quick - we showed, for all subspaces $Q$ on which $A$ is positive-definite, that $\dim(Q) \le \dim(P)$.
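Here is the corrected argument. We have a subspace $P$ on which $A$ is positive-definite - namely, make $A$ diagonal and take the span of the first $p$ basis vectors. We’ll call the set of basis vectors $\{e_1, \dots, e_n\}$; then $P$ is spanned by $\{e_1, \dots, e_p\}$.
Now, let’s take any subspace $Q$ on which $A$ is positive-definite. We want $\dim(Q) \le \dim(P)$; to that end, take $N$ spanned by $\{e_{p+1}, \dots, e_n\}$. We show that $Q \cap N = \{0\}$. Indeed, if $r \in Q \cap N$, with $r \neq 0$, then writing $r = \sum_{i=p+1}^{n} a_i e_i$ we get:
$$0 < A(r) = \sum_{i=p+1}^{n} a_i^2 A(e_i) = -\sum_{i=p+1}^{p+q} a_i^2 \le 0.$$
(The first inequality holds because $r$ is a nonzero element of $Q$; the middle equalities hold because $A$ is diagonal in this basis, with $A(e_i) = -1$ for $p < i \le p+q$ and $A(e_i) = 0$ for $i > p+q$.)
But this is a contradiction. Hence $Q \cap N$ is the zero space, and so $\dim(Q) \le \dim(P)$, because $\dim(Q) + \dim(N) = \dim(Q \oplus N) \le n$ while $\dim(N) = n - p = n - \dim(P)$.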
Notice that my original version is conceptually quite close to correct: “take something in a positive-definite space, show that it can’t be in the negative-semi-definite bit and hence must be in $P$”. I was careless in not checking that what I had written made sense. I am slightly surprised that no alarm bells were triggered by my using $V \setminus P$ as a space - I hope that now my background mental checks will come to include this idea of “make sure that when you transform objects, you retain their properties”.
Completion (original and hopefully correct)
Identically we can show that for all subspaces $Q$ on which $A$ is negative-definite, $\dim(Q) \le \dim(M)$ (with $M$ defined analogously to $P$ but with negative-definiteness instead of positive-definiteness: it is the span of the basis vectors whose diagonal entry is $-1$). And we already know that congruence preserves matrix rank (because multiplying a matrix on either side by an invertible matrix does not change its rank, and a change of basis sends $A$ to $S^T A S$ with $S$ invertible), so the number of zeros in any diagonal representation of $A$ is the same.
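Hence in any diagonal representation of $A$ with $p', q', z'$ the numbers of $1, -1, 0$ respectively on the diagonal, we need $p' \le p$, $q' \le q$ and $z' = z$, where $p, q, z$ are the corresponding counts for the diagonal form we started with - but because the diagonal is the same size on each matrix (since the matrices don’t change dimension), $p' + q' + z' = p + q + z$, so we must have equality throughout.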
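To see the invariance in action, here is a rough sketch in plain Python that diagonalises a symmetric matrix by simultaneous row and column operations (each one a congruence) and counts the signs on the diagonal. This is a different route to a diagonal form from the orthogonal diagonalisation used above, which is rather the point: the theorem says the counts must agree anyway. The names `congruence_diagonal`, `signature` and `change_basis` are invented for this example, and the input is assumed to be a symmetric matrix with integer or rational entries.

```python
from fractions import Fraction


def congruence_diagonal(a):
    """Diagonal of E^T a E for some invertible E, found by symmetric elimination."""
    n = len(a)
    m = [[Fraction(v) for v in row] for row in a]

    def add_multiple(src, dst, c):
        # Row operation followed by the matching column operation.
        for j in range(n):
            m[dst][j] += c * m[src][j]
        for i in range(n):
            m[i][dst] += c * m[i][src]

    def swap(i, j):
        # Swap basis vectors i and j (rows, then columns).
        m[i], m[j] = m[j], m[i]
        for row in m:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if m[k][k] == 0:
            j = next((j for j in range(k + 1, n) if m[j][j] != 0), None)
            if j is not None:
                swap(k, j)  # bring a nonzero diagonal entry into position k
            else:
                j = next((j for j in range(k + 1, n) if m[k][j] != 0), None)
                if j is None:
                    continue  # row/column k is already decoupled
                add_multiple(j, k, 1)  # makes m[k][k] = 2 * m[k][j] != 0
        for i in range(k + 1, n):
            if m[i][k] != 0:
                add_multiple(k, i, -m[i][k] / m[k][k])
    return [m[i][i] for i in range(n)]


def signature(a):
    """Return (number of positive, negative, zero) diagonal entries."""
    d = congruence_diagonal(a)
    return (sum(x > 0 for x in d), sum(x < 0 for x in d), sum(x == 0 for x in d))


def change_basis(a, s):
    """Return s^T a s for square matrices given as lists of rows."""
    n = len(a)
    a_s = [[sum(a[i][k] * s[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[sum(s[k][i] * a_s[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


A = [[1, 2, 0], [2, 1, 0], [0, 0, 0]]
S = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]  # invertible (determinant 2)
B = change_basis(A, S)
assert signature(A) == signature(B) == (1, 1, 1)
```

For this particular $A$ the elimination produces the diagonal $(1, -3, 0)$, and the congruent matrix $S^T A S$ ends up with the same counts of positive, negative and zero entries, as the theorem promises.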