Monday, April 28, 2008

A pitfall that bothered me a whole weekend.

This question looks easy at first glance; the problem is that I worked out two contradictory solutions. Fortunately, I figured it out with the help of Ray Vickson in the Google group sci.math:

******************************************************Wei
Question: x1, x2, x3 are i.i.d. samples from the normal distribution
N(0, cosi^2).

Let y1 = x1 + x2 and y2 = x2 + x3. What is the conditional distribution
of y1 given y2, i.e. p(y1|y2)?

I worked out two solutions, but they contradict each other. Can anybody
help me figure out why?

Solution1:

Given y2 = a, we have x2 = a - x3, so y1 = x1 + a - x3. So given y2 = a,
it's easy to verify that y1 still follows a normal distribution
N(a, 2*cosi^2).

Solution2:

The vector [x1 x2 x3]' follows a multivariate normal distribution with
mean [0, 0, 0]' and covariance:

    | cosi^2   0        0      |
S = | 0        cosi^2   0      |
    | 0        0        cosi^2 |

vector [y1 y2]'= A * [x1 x2 x3]'

A is the matrix

A = | 1 1 0 |
    | 0 1 1 |

So, according to the theorem (Mardia, Multivariate Analysis, 3.2.1,
3.2.4), a linear transformation of a multivariate normal distribution
is still normal, and its mean and covariance matrix can be obtained by:

mean = A * [0,0,0]' = [0,0]';

variance = A*S*A'

         | 2*cosi^2   cosi^2   |
       = | cosi^2     2*cosi^2 |
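As a quick sanity check, the product A*S*A' can be computed in a few lines of NumPy (a sketch, taking cosi^2 = 1 for simplicity):

```python
import numpy as np

# Covariance of [y1, y2]' = A [x1, x2, x3]' when Cov([x1,x2,x3]') = I,
# i.e. taking cosi^2 = 1.
S = np.eye(3)
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

cov_y = A @ S @ A.T
print(cov_y)
# [[2. 1.]
#  [1. 2.]]
```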

And given y2 = a, y1 still follows a normal distribution, whose mean is

mean = u1 + Sigma21 * inverse(Sigma22) * (a - u2) = 0 + (1/2)*(a - 0)
= a/2 ?????? which does not agree with Solution 1???

And the same problem appears with the variance.

How could this happen? And which solution is right?

Please help.

Thanks

*************************************************Ray Vickson
Let's eliminate the constant cosi^2, because it serves no useful
purpose (and anyway it is not clear whether this should be cos(i^2) or
cos(i)^2); thus, take Xi ~ N(0,1). Starting from the multivariate
distribution of (X1,X2,X3) and using standard methods (I used moment-
generating functions, but there are other ways) we get the
multivariate distribution of (Y1,Y2) as
f(y1,y2) = C*exp(-(1/6)*[y1,y2] M [y1,y2]'),

where [y1,y2]' is a column vector and M is the matrix

M = |  2  -1 |
    | -1   2 |

The variance-covariance matrix is [[2,1],[1,2]], whose inverse is
(1/3)M, and C = sqrt(3)/(6*pi). The conditional distribution of Y1
given Y2 = y2 is f(y1|y2) = f(y1,y2)/f_Y2(y2), where

f_Y2(y2) = integral(f(y1,y2) dy1, y1 = -inf..inf)
         = exp(-y2^2/4)/(2*sqrt(pi)),

so we get

f(y1|y2) = exp(-(2*y1 - y2)^2/12)/sqrt(3*pi).

This gives E(Y1|Y2=y2) = y2/2 and var(Y1|Y2=y2) = 3/2
(letting Maple 9.5 do all the computations).
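These conditional moments can also be confirmed by simulation; the following Monte Carlo sketch in NumPy conditions on y2 falling in a narrow window around a chosen value a:

```python
import numpy as np

# Monte Carlo check of the conditional moments of Y1 given Y2 = a:
# sample x1, x2, x3 ~ N(0,1), form y1 = x1 + x2 and y2 = x2 + x3,
# then keep only the samples where y2 lands near a.
rng = np.random.default_rng(0)
n = 1_000_000
x = rng.standard_normal((n, 3))
y1 = x[:, 0] + x[:, 1]
y2 = x[:, 1] + x[:, 2]

a = 1.0
near = np.abs(y2 - a) < 0.05   # condition on y2 close to a

print(y1[near].mean())   # close to a/2 = 0.5
print(y1[near].var())    # close to 3/2
```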

So, Solution 2 appears to be correct. This leaves the question: what
is wrong with Solution 1? That is a good question, and I don't see the
answer at the moment. I am pretty sure that Solution 2 is OK because
it's obtained using known, well-proved formulas applied in a detailed
step-by-step manner.

R.G. Vickson

*************************************************Wei

Thanks, Ray. I think I have figured out this problem now.

Solution 1 is flawed, and the reason is this:

Given y2 = a, we have x2 = a - x3, so y1 = x1 + a - x3. No problem so
far, but at this point the distribution of x3 is not N(0,1) anymore. It
is the conditional distribution p(x3|y2=a), whose variance is reduced
by the conditioning: given y2 = a, x3 is distributed as N(a/2, 1/2).
On the other hand, p(x1|y2=a) is still N(0,1), because x1 and y2 are
independent.

So y1 is NOT what I assumed: its conditional mean is a - a/2 = a/2 and
its conditional variance is 1 + 1/2 = 3/2, in agreement with Solution 2.
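The corrected conditional distribution of x3 can be checked the same way by simulation (a NumPy sketch, again with unit variance):

```python
import numpy as np

# Check that, conditioned on y2 = x2 + x3 = a, x3 is distributed
# roughly as N(a/2, 1/2) rather than N(0, 1).
rng = np.random.default_rng(1)
n = 1_000_000
x2 = rng.standard_normal(n)
x3 = rng.standard_normal(n)
y2 = x2 + x3

a = 1.0
near = np.abs(y2 - a) < 0.05   # condition on y2 close to a

print(x3[near].mean())   # close to a/2 = 0.5
print(x3[near].var())    # close to 1/2, not 1
```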

Thank you. This paradox bothered me for a whole day!
***************************************Ray Vickson
Of course! I could kick myself for not seeing it. This is yet another
instance showing how careful one must be when using probability
arguments, and that is why the detailed, long-winded approach is
sometimes the best.
R.G. Vickson
