[RsR] Question re' MCD, MVE and Pearson's correlation

Uri Hasson uhasson at uchicago.edu
Tue Oct 3 05:40:53 CEST 2006


Dear group members,

I posted this question on the R-help mailing list, but did not
get any reply. Could anyone here point me in the right
direction?

Cheers,
Uri Hasson

Message:
I am trying to use MCD and MVE methods in the analysis of
functional imaging (fMRI) data. But, before doing that, I want
to understand the sampling distribution of the correlation
parameter given by MCD and MVE (cov.mcd$cor, cov.mve$cor).

To this end, I conducted a simulation where, in each of 100000
epochs, I
a.    construct a matrix from two vectors, each containing 40
      numbers randomly sampled from a normal distribution.
b.    apply cov.mve and cov.mcd to the resulting matrix.
c.    obtain the correlations in the subsets selected by
      cov.mve and cov.mcd: e.g., if the matrix is called
      cormat20.ans, I request:

current.mve20 <- round(cov.mve(cormat20.ans, cor = TRUE)$cor[1, 2], 3)

At the end of the day, I have the sampling distribution for
these correlations [i.e., what correlations exist in the
subsets that MVE and MCD tend to pick up when sampling from a
normal distribution].
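
A minimal sketch of the simulation steps described above, for
readers who want to reproduce the idea. It is not the code linked
further down. Assumptions: the MASS package supplies cov.mve and
cov.mcd and both are called with their default arguments; the
helper name sim.one, the seed, and the reduced epoch count (200
instead of 100000) are illustrative choices only; the N=20
Pearson value is taken from the first 20 rows of each matrix.

```r
library(MASS)  # provides cov.mve() and cov.mcd()

set.seed(1)        # assumption: fixed seed, only for reproducibility
n.epochs <- 200    # assumption: reduced from 100000 to keep the run short
n.points <- 40

# One epoch: draw two independent normal vectors, bind them into an
# n x 2 matrix, and record each correlation estimate.
sim.one <- function(n) {
  x <- cbind(rnorm(n), rnorm(n))
  c(mve       = cov.mve(x, cor = TRUE)$cor[1, 2],
    mcd       = cov.mcd(x, cor = TRUE)$cor[1, 2],
    pearson40 = cor(x[, 1], x[, 2]),
    pearson20 = cor(x[1:20, 1], x[1:20, 2]))
}

# Rows are epochs, columns are the four estimators.
res <- t(replicate(n.epochs, sim.one(n.points)))

# Empirical quantiles of each estimator when the true correlation is zero.
print(apply(res, 2, quantile, probs = c(0.025, 0.5, 0.975)))
```

Each column of res can then be passed to hist() or density() to
compare the shapes of the four sampling distributions.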

Here is my question: Because MVE and MCD select the most
central 20 points (of the 40), I wanted to compare the
resulting sampling distributions to that of Pearson's
correlation coefficient r with N=20; the goal was to establish
whether the significance thresholds are similar. However, the
three sampling distributions are quite different. That is, the
sampling distribution of Pearson's r (N=20) is very different
from those of cov.mve and cov.mcd (with N=20, 20 being the
subset selected from the 40 points). The sampling distribution
of Pearson's r with N=40 is also very different from those of
MVE and MCD.
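
For reference, the textbook two-sided significance threshold for
Pearson's r under bivariate normality with zero true correlation
follows from the t distribution with N-2 degrees of freedom. A
minimal sketch; the function name r.crit and the 0.05 level are
assumptions made for illustration:

```r
# Critical |r| for a two-sided test of zero correlation, obtained by
# inverting t = r * sqrt(df) / sqrt(1 - r^2) with df = n - 2.
r.crit <- function(n, alpha = 0.05) {
  df <- n - 2
  t.crit <- qt(1 - alpha / 2, df)
  t.crit / sqrt(t.crit^2 + df)
}

print(round(c(N20 = r.crit(20), N40 = r.crit(40)), 3))
```

An empirical threshold for a simulated estimator can be obtained
in the same spirit with quantile(abs(sims), 0.95), where sims is a
hypothetical vector of simulated correlations, and compared with
these values.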

If anyone knows of, or could point me to, sources of
information that discuss the sampling distributions of
cov.mve$cor and cov.mcd$cor and their relation to Pearson's r,
I would be very grateful.

I have put the simulation code I used here:
http://home.uchicago.edu/~uhasson/pearson-mcd-mve.R.txt

And an image of the resulting sampling distributions here:
http://home.uchicago.edu/~uhasson/correl.comparison.tiff


Sincerely,
Uri Hasson
The Brain Research Imaging Center
The University of Chicago



