[R] test for whether dataset comes from a known MVN

Desmond Campbell desmondcampbell at yahoo.com
Thu Oct 11 20:28:28 CEST 2007

Dear Ben Bolker,


Thanks for replying and offering advice. Unfortunately, it doesn't solve my problem:

1) The mshapiro.test() function in the mvnormtest package appears to be
applicable only to datasets containing 3 to 5000 samples, whereas my dataset
contains 100,000 points.

2) As you said in your email, if my data is from the real world then any 
test is likely to reject the null hypothesis, because of the power that such a 
large dataset provides.

However, my data is not from the real world. I am conducting validation 
studies, and if the program I am testing is working correctly then the dataset 
will be perfectly normally distributed.
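
For a validation setting like this, where the mean vector u and covariance matrix s2 are known exactly, one possible approach can be sketched as follows. It relies on two facts that hold under H0: the squared Mahalanobis distances of the points follow a chi-squared distribution with p degrees of freedom, and the whitened coordinates are independent standard normals. The values of p, n, u and s2 below are made up for illustration, and x is simulated here rather than taken from the program under test. These checks only cover necessary conditions for MVN(u, s2), so this is a rough sketch rather than a definitive test.

```r
# Sketch: test H0: x ~ MVN(u, s2) when u and s2 are known in advance.
set.seed(1)                      # for a reproducible illustration only

p  <- 3                          # dimension (made-up example value)
n  <- 100000                     # number of points
u  <- c(1, -2, 0.5)              # known mean vector (hypothetical values)
s2 <- matrix(c(2.0, 0.5, 0.3,
               0.5, 1.0, 0.2,
               0.3, 0.2, 1.5), nrow = p)   # known covariance (hypothetical)

# Simulate data under H0; in practice x would be the program's output.
R <- chol(s2)                    # upper triangular, t(R) %*% R equals s2
x <- sweep(matrix(rnorm(n * p), n, p) %*% R, 2, u, "+")

# Check 1: squared Mahalanobis distances are chi-squared(p) under H0.
d2 <- mahalanobis(x, center = u, cov = s2)
p_radial <- ks.test(d2, "pchisq", df = p)$p.value

# Check 2: whitened coordinates are N(0, 1) in each column under H0.
z <- sweep(x, 2, u, "-") %*% solve(R)
p_coord <- apply(z, 2, function(col) ks.test(col, "pnorm")$p.value)

# Bonferroni-adjusted overall p-value across the p + 1 checks
# (valid even though the individual checks are not independent).
p_all     <- c(p_radial, p_coord)
p_overall <- min(1, min(p_all) * length(p_all))
```

Neither ks.test() nor mahalanobis() has the sample-size limit of mshapiro.test(), so this runs on 100,000 points. A small p_overall would suggest that the program's output departs from MVN(u, s2); a large one only means that these particular checks found no departure.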

Thanks anyway.



Desmond Campbell



> Campbell, Desmond wrote:
> Dear all,
> I have a multivariate dataset containing 100,000 or more points.
> I want to find the p-value for the dataset of points coming from a
> particular multivariate normal distribution
> With
> mean vector u
> Covariance matrix s2
> So
> H0: points ~ MVN( u, s2)
> H1: points not ~ MVN( u, s2)
> How do I find the p-value in R?

> Ben Bolker wrote: 

> >    Googling for "Shapiro-Wilk multivariate" brings up 
> > in the mvnormtest package.  However, I would strongly suspect that
> > if your data are from the real world that you will reject the null
> > hypothesis of multivariate normality when you have 100,000 points -- the
> > power to detect tiny (unimportant?) deviations from MVN will be very 
> > 
> >   cheers
> >     Ben Bolker

More information about the R-help mailing list