[R] Finding non-normal distributions per row of data frame?

Greg Snow Greg.Snow at imail.org
Fri Feb 4 22:34:14 CET 2011


It is fine; you just overthought the solution and used both apply and for loops (see another thread today where I made the same mistake of overthinking and combining two different methods).  I was just pointing out the errors so you can improve next time.
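
As a minimal sketch of the apply-only version, a single call can return one p-value per row with no for loop. The matrix `y` below is simulated, hypothetical data rather than your data set, and the names `pvals` and `flagged` are only for illustration:

```r
# Hypothetical data: 100 rows of 20 values each (not the original data set)
set.seed(1)
y <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)

# One Shapiro-Wilk p-value per row; apply() does the looping
pvals <- apply(y, 1, function(x) shapiro.test(x)$p.value)

# Indices of rows whose test is significant at alpha = 0.05
flagged <- which(pvals < 0.05)
```

With a data frame that has text columns, the numeric columns would need to be selected first, since apply() coerces its input to a matrix.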

But here are some things to think about.  If every row were from a normal distribution and you used an alpha of 0.05, then with 20,000 rows you would expect about 1,000 significant tests by chance alone.  Also, what if all your samples are from non-normal distributions?  How much power do you have to detect the non-normality with only 20 points?
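
Both points can be checked with a rough simulation. This is only a sketch: it assumes exponential data as a stand-in for "non-normal" (your rows may depart from normality in other ways, which would give a different power), and the number of simulated samples is arbitrary.

```r
# Expected number of false positives if every row is truly normal
n_rows <- 20000
alpha  <- 0.05
expected_false_pos <- n_rows * alpha   # 20000 * 0.05 = 1000

# Rough power estimate: share of non-normal samples of size 20 that are rejected.
# Assumption: exponential data is the alternative; this choice is illustrative only.
set.seed(42)
n_sim <- 2000
p_alt <- replicate(n_sim, shapiro.test(rexp(20))$p.value)
power_est <- mean(p_alt < alpha)

# Same simulation under normality: the rejection rate should be close to alpha
p_null <- replicate(n_sim, shapiro.test(rnorm(20))$p.value)
type1_est <- mean(p_null < alpha)
```

Whatever power_est comes out as, 1 - power_est is the share of truly non-normal rows that would be missed under this particular alternative.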

So, with the large number of type I and type II errors, what meaning can you get from all of this? 

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of DB1984
> Sent: Friday, February 04, 2011 12:00 PM
> To: r-help at r-project.org
> Subject: Re: [R] Finding non-normal distributions per row of data frame?
> 
> 
> Hi Greg,
> 
> In addition to the reply above, to address your questions - I fully
> appreciate that my understanding of the code is basic - this is my first
> attempt at putting this together...
> 
> My starting point is a data frame with numeric and text columns, but I can
> cut columns to make a fully numeric matrix if that is easier to handle.
> 
> "apply(y, 1, shapiro.test)" works for a second dataframe, yes. I guess that
> I chose a bad example dataset for 'nt'!
> 
> 
> The overall aim is to test the normality of the distribution of the values
> in each row. I would then subset out the non-normal distributions to
> interrogate further. The shapiro.test seems a simple first pass at this. I'd
> like to move on to plotting residuals of a QQplot next, to see if that is
> more or less sensitive at detecting non-normal distributions in the dataset.
> 
> If you would recommend an alternative approach, I'd appreciate the input,
> thanks..
> --
> View this message in context: http://r.789695.n4.nabble.com/Finding-
> non-normal-distributions-per-row-of-data-frame-tp3259439p3260812.html
> Sent from the R help mailing list archive at Nabble.com.
> 


