[R] test for whether dataset comes from a known MVN
bolker at ufl.edu
Fri Oct 12 19:33:04 CEST 2007
Desmond Campbell wrote:
> Dear Ben Bolker,
> Thanks for replying and offering advice; unfortunately it doesn't solve my
> problem.
> 1) The mshapiro.test() in the mvnormtest package appears only applicable
> for datasets containing 3-5000 samples, whereas my dataset contains
> 2) As you said in your email, if my data is from the real world then any
> test is likely to reject the null hypothesis, because of the power of such
> a large dataset.
> However my data is not from the real world. I am conducting validation
> studies, and if the program I am testing is working correctly then the
> will be perfectly normally distributed.
> Thanks anyway.
I would be tempted in this case to contact the package author and find
out what limits the size of the input data set. It does look like the
method requires a matrix inversion, in which case you might be in big
trouble (if it were sparse you could see if you could substitute in SparseM
functions, but I kind of doubt it would be ...).
Do you know if anyone has come up with a method that will do this
test for this size data set? i.e., is this a problem of developing a
method or a problem of implementation in R? (Are the methods discussed
in http://support.sas.com/ctx/samples/index.jsp?sid=480, such
as Mardia's multivariate skew or kurtosis, appropriate and less numerically
intensive? I don't know how to calculate MV skew, and R site search brings
up a lot about the MV skew-normal distribution but not a lot about MV skew
itself. I found an SPSS macro http://www.columbia.edu/~ld208/Mardia.sps but
that's as far as I got.)
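
For what it's worth, here is a rough sketch of Mardia's multivariate
skewness and kurtosis as they are usually defined. It is written in plain
Python rather than R only to keep it dependency-free, and the function names
(cholesky, mardia) and the returned dictionary keys are made up for
illustration. It assumes the data are a list of equal-length rows and uses
the divide-by-n covariance estimate. Working on whitened data means a single
pass over the n rows plus a p x p Cholesky factorisation, so the cost should
grow linearly in n rather than with the n x n matrix of pairwise distances.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L^T == a (a symmetric positive definite)."""
    p = len(a)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def mardia(rows):
    """Mardia's skewness b1 and kurtosis b2 plus their usual test statistics."""
    n, p = len(rows), len(rows[0])
    mean = [sum(row[a] for row in rows) / n for a in range(p)]
    centred = [[row[a] - mean[a] for a in range(p)] for row in rows]
    # Maximum-likelihood covariance (divide by n, not n - 1).
    cov = [[sum(row[a] * row[b] for row in centred) / n for b in range(p)]
           for a in range(p)]
    L = cholesky(cov)

    # Whiten each row: solve L z = x - mean by forward substitution.
    z = []
    for row in centred:
        zi = []
        for a in range(p):
            zi.append((row[a] - sum(L[a][k] * zi[k] for k in range(a))) / L[a][a])
        z.append(zi)

    # b1 = sum over (a, b, c) of the squared third moments of the whitened data;
    # algebraically equal to (1/n^2) * sum_ij (z_i . z_j)^3.
    b1 = 0.0
    for a in range(p):
        for b in range(p):
            for c in range(p):
                m = sum(zi[a] * zi[b] * zi[c] for zi in z) / n
                b1 += m * m

    # b2 = mean of the squared Mahalanobis distances, squared again.
    b2 = sum(sum(v * v for v in zi) ** 2 for zi in z) / n

    return {
        "skewness": b1,
        "kurtosis": b2,
        # Under normality, n * b1 / 6 is approximately chi-squared on skew_df df.
        "skew_stat": n * b1 / 6.0,
        "skew_df": p * (p + 1) * (p + 2) // 6,
        # Under normality, kurt_z is approximately standard normal.
        "kurt_z": (b2 - p * (p + 2)) / math.sqrt(8.0 * p * (p + 2) / n),
    }
```

If the mean and covariance of the MVN are known in advance, as in a
validation study, they could presumably be substituted for the sample
estimates before whitening, although the reference distributions above are
stated for the estimated-parameter case.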
Do you have to test the whole data set at once? Could you hack it
by testing subsets of the data and (e.g.) using Fisher's combined p values?
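
A minimal sketch of the combining step, assuming the data have already been
split into disjoint subsets (so the per-subset tests are independent) and
that each subset has produced a p value from whatever test handles that
size. It is plain Python rather than R, and fisher_combined_p is a made-up
name. Fisher's statistic is -2 * sum(log(p_i)), which is chi-squared with
2k degrees of freedom under the null; for even degrees of freedom the upper
tail has a closed form, so no statistics library is needed.

```python
import math


def fisher_combined_p(pvalues):
    """Combine independent p values with Fisher's method."""
    pvalues = list(pvalues)
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("need at least one p value, each in (0, 1]")
    k = len(pvalues)
    # Half of Fisher's statistic X = -2 * sum(log p_i).
    half_stat = -sum(math.log(p) for p in pvalues)
    # Upper tail of chi-squared with 2k df:
    #   P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    # Adequate for a moderate number of subsets; with very many subsets the
    # sum should be accumulated on the log scale to avoid overflow.
    return math.exp(-half_stat) * total
```

With a single subset the function returns the p value unchanged, which is a
quick sanity check. Bear in mind that a valid test applied to pieces of a
very large data set still has a lot of power once the pieces are combined.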