Friday, May 09, 2008

Another Brain Workout. Matrix Rank Problem

Thanks to Prof. Israel who helped me out twice.

**********************************************Wei
Suppose X is a random data matrix with dimension n by t

M=inv(X ' * X), i.e. M is the inverse of product of (transpose X and
X)

and P=I-X * M * X'

I is an identity matrix (n by n), is rank (P) = n-t?

I wrote a testing MATLAB program to verify it.

%%%%%%%%%%%rank_test.m
for n=7:15
for t=2:7
X=randn(n,t);
M=inv(X'*X);
I=eye(n,n);
P=I-X*M*X';
r=rank(P);
fprintf('the rank of p at n=%d, t=%d is %d\n',n,t,r);
end
end

And the result IS correct:
the rank of p at n=7, a=2 is 5
the rank of p at n=7, a=3 is 4
the rank of p at n=7, a=4 is 3
the rank of p at n=7, a=5 is 2
.........................

%%%%%%%%%%%%%%%%%

So far, I vaguely feel this identity is right by calculating the trace

tr(P)=tr(I)-tr(X * M * X')=tr(I)-tr(X'*X*M)=n-t

But is there any rigorous calculation or proof that anybody knows?
Thanks
*******************************************Robert Israel
writes:
> And, I verified that X does not have to have full rank by setting some
> rows of X to trivial one. It doesn't affect much except for occasional
> MATLAB complaining for singularities.

Rank testing using floating point arithmetic is dangerous.
> On May 8, 11:05 am, wrote:
> > Hum.... There ARE some exceptions. I think it's due to the numerical
> > subtlety.

> > On May 8, 10:58 am, wrote:

> > > Suppose X is a random data matrix with dimension n by t

> > > M=inv(X ' * X), i.e. M is the inverse of product of (transpose X and
> > > X)

... which of course requires X to have rank t.
> > > and P=I-X * M * X'

> > > I is an identity matrix (n by n), is rank (P) = n-t?

Note that PX = X - X M X' X = 0, while if Px = 0, x = X M X' x.
Thus Ker(P) = Ran(X). P is the orthogonal projection on the
orthogonal complement of Ran(X), so its rank is n - rank(X).
**********************************************Wei
OK. Dumb as I am, I think I've figured out your solution:

Because PX=0, Thus Ker(P) \inc Ran(X)

And, because for any x, Px=0, x is included in Ran(X)

so Ker(P) is included in Ran(X)

Thus Ker(P)=Ran(X).

Because of theorem (Dimension of kernel of A ) + (dimension of range
of A )=(dimension of U).

We get the answer.

That's smooth. I feel good.

No comments: