[R] [PlainText Attempt] Sampling distribution of correlation estimations derived from robust MCD and MVE methods

Uri Hasson zvalim at gmail.com
Mon Sep 25 20:06:45 CEST 2006

```Dear R users,

I am trying to use MCD and MVE methods in the analysis of functional imaging
(fMRI) data. But, before doing that, I want to understand the sampling
distribution of the correlation parameter given by MCD and MVE (cov.mcd$cor,
cov.mve$cor).

To this end, I conducted a simulation where in each of 100000 epochs, I
a.    construct a matrix from two vectors, each containing 40 numbers
randomly sampled from a normal distribution.
b.    apply cov.mve and cov.mcd to the resulting matrix.
c.    obtain the correlations in the subsets selected by cov.mve: e.g., if
the matrix is called cormat20.ans, I request:

current.mve20 <- round(cov.mve(cormat20.ans, cor=TRUE)$cor[] ,3)

At the end of the day, I have the sampling distribution for these
correlations [i.e., what correlations exist in the subsets that MVE and MCD
tend to pick up when sampling from normal distribution].

Here is my question: Because MVE and MCD select the most central 20 points
(of the 40), I wanted to compare the resulting sampling distributions to
that of a Pearson's "r" correlation coefficient (i.e., a Pearson's r with
N=20; the goal was to establish whether the significance thresholds are
similar).  However the three sampling distributions are quite different.
That is, the sampling distribution of Pearson's r (N=20) is very different
from that of cov.mve and cov.mcd (with N=20 [20 being the subset selected of
the 40 points]).  The sampling distribution of Pearson's r with N=40 is also
very different from that of MVE and MCD.

If anyone knows of, or could point me to, sources of information that discuss
the sampling distribution of cov.mve$cor and cov.mcd$cor and their relation
to Pearson's r, I would be very grateful.

I have put the simulation code I used here:
http://home.uchicago.edu/~uhasson/pearson-mcd-mve.R.txt

And an image of the resulting sampling distributions here:
http://home.uchicago.edu/~uhasson/correl.comparison.tiff

Sincerely,
Uri Hasson
The Brain Research Imaging Center
The University of Chicago

```
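For readers who cannot reach the linked script, the simulation described in
steps a-c can be sketched roughly as follows. This is a minimal sketch, not
the code at the URL above: the names one.epoch, n.epochs, n.points, sims and
pearson20 are made up for illustration, the [1, 2] index is an assumption
about which element of the correlation matrix is wanted, and the number of
epochs is cut far below 100000 so that it runs quickly.

```r
library(MASS)  # provides cov.mcd() and cov.mve()

set.seed(1)
n.epochs <- 200   # hypothetical; the post used 100000 epochs
n.points <- 40    # points per vector, as in step a

# One epoch: two independent normal vectors bound into a matrix (step a),
# then the robust estimators applied to that matrix (steps b and c).
# Pearson's r on all 40 points is kept alongside for comparison.
one.epoch <- function(n) {
  x <- cbind(rnorm(n), rnorm(n))
  c(pearson40 = cor(x[, 1], x[, 2]),
    mcd       = cov.mcd(x, cor = TRUE)$cor[1, 2],
    mve       = cov.mve(x, cor = TRUE)$cor[1, 2])
}

# replicate() returns a 3 x n.epochs matrix; transpose to one row per epoch.
sims <- t(replicate(n.epochs, one.epoch(n.points)))

# Reference distribution: Pearson's r computed on 20 points.
pearson20 <- replicate(n.epochs, cor(rnorm(20), rnorm(20)))

# Compare the upper tails of |r|, which is what a significance threshold
# would be based on.
print(apply(abs(sims), 2, quantile, probs = 0.95))
print(quantile(abs(pearson20), probs = 0.95))
```

With so few epochs the quantiles are noisy; raising n.epochs gives smoother
estimates at the cost of run time, since cov.mcd and cov.mve each perform a
random subset search on every call.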