[R] Normal distribution

mandalic05 kopernik at sbb.rs
Mon Nov 9 00:18:10 CET 2009


Normality checks in R can be done with functions available in the nortest
package, which consists of several normality tests. To install the package,
type install.packages('nortest'), then load it with library(nortest). You
should consider running ks.test() (which lives in the stats package, not in
nortest) only if the mu and sigma parameters are known (these stand for the
population mean and standard deviation), i.e. only when you test against a
fully specified normal distribution. When the parameters have to be
estimated from the sample, which is the usual case, I therefore recommend
the lillie.test() function, which is Lilliefors' modification of the
Kolmogorov-Smirnov test. You can run ks.test(x, pnorm), but keep in mind
that without further arguments it compares x with the standard normal
distribution (mean 0, sd 1). If you get a warning about ties, these occur
due to rounding of values, or because your data come from a discrete
probability distribution...
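To make the difference between the two tests concrete, here is a minimal
sketch on simulated data. The values mean = 10 and sd = 2 are assumptions
chosen only for illustration, and the lillie.test() call runs only if the
nortest package is installed.

```r
set.seed(1)
x <- rnorm(200, mean = 10, sd = 2)   # simulated data, illustration only

# Kolmogorov-Smirnov test against a fully specified normal distribution:
# appropriate only because mu and sigma are treated as known here.
ks_known <- ks.test(x, "pnorm", mean = 10, sd = 2)
print(ks_known$p.value)

# Lilliefors test: mean and sd are estimated from the sample itself.
if (requireNamespace("nortest", quietly = TRUE)) {
  lf <- nortest::lillie.test(x)
  print(lf$p.value)
}
```

In both cases a small p-value (say below .05) is evidence against
normality; a large one only means the test found no evidence against it.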

You can also try the shapiro.test() function if your sample has fewer than
50 observations (the Shapiro-Wilk test is often recommended for small
samples), or ad.test() for the Anderson-Darling normality test. You should
review these statistical procedures in the statistical literature, but
there's also a lot of information on Wikipedia about these techniques.
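A short sketch of both calls on a small simulated sample follows. The data
are made up for illustration, and ad.test() is called only if nortest is
installed.

```r
set.seed(2)
small <- rnorm(40)                   # small simulated sample, N = 40

# Shapiro-Wilk test from the stats package; W close to 1 supports normality.
sw <- shapiro.test(small)
print(sw$statistic)
print(sw$p.value)

# Anderson-Darling test from nortest (needs more than 7 observations).
if (requireNamespace("nortest", quietly = TRUE)) {
  ad <- nortest::ad.test(small)
  print(ad$p.value)
}
```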

If the absolute value of skewness is larger than 1.96 * standard error of
skewness, your distribution differs significantly from normal. The same
holds for kurtosis (taken as excess kurtosis, which is 0 for a normal
distribution). An absolute ratio above 1.96 implies a p-value lower than
.05, and above 2.58 a p-value lower than .01.
Skewness is computed with the skew() function and kurtosis with the
kurtosi() function (both are in the psych package).

The standard error of skewness is approximated by se.sk <-
sqrt(6/length(x)) and the standard error of kurtosis by se.ku <-
sqrt(24/length(x)). Both are large-sample approximations.
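The rule above can be sketched in base R as follows. The helper name
normality_z() is hypothetical, and the simple moment-based estimates used
here are an assumption: they can differ slightly from what skew() and
kurtosi() return, depending on the estimator type those functions use.

```r
# Hypothetical helper: skewness and excess kurtosis divided by their
# approximate standard errors sqrt(6/n) and sqrt(24/n).
normality_z <- function(x) {
  x  <- x[!is.na(x)]
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- mean(d^2)                    # second central moment
  sk <- mean(d^3) / m2^1.5           # moment-based skewness
  ku <- mean(d^4) / m2^2 - 3         # moment-based excess kurtosis
  se.sk <- sqrt(6 / n)
  se.ku <- sqrt(24 / n)
  c(skewness = sk, z.skewness = sk / se.sk,
    kurtosis = ku, z.kurtosis = ku / se.ku)
}

set.seed(3)
res <- normality_z(rnorm(300))       # simulated data, illustration only
print(res)

# Flag a violation when either ratio exceeds 1.96 in absolute value.
print(abs(res[c("z.skewness", "z.kurtosis")]) > 1.96)
```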

If the mean is not located near the middle of the range, this can also hint
at a violation of normality, since a normal distribution is symmetric.

I strongly recommend reading the official help for the nortest package, and
consulting the statistical literature!

P.S.

shapiro.test() is located in the stats package, but running it on a large
sample (N >> 50) is not quite appropriate (and R's implementation accepts
at most 5000 values), hence use lillie.test() for those purposes, or
ks.test(x, pnorm) - where the x argument is the variable whose normality
you want to check, and pnorm is the normal cumulative distribution
function.
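One pitfall with the bare ks.test(x, pnorm) call is worth a small sketch.
The data below are simulated and the parameter values are assumptions for
illustration only.

```r
set.seed(4)
x <- rnorm(500, mean = 10, sd = 2)   # simulated, genuinely normal data

# No parameters given: x is compared with N(0, 1), so the test rejects
# even though x is normal.
p.default <- ks.test(x, "pnorm")$p.value

# Parameters estimated from the same sample: this runs, but the p-value
# is no longer exact (the situation lillie.test() is designed for).
p.plugin <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value

print(c(default = p.default, plugin = p.plugin))
```

So if ks.test() rejects data that look perfectly normal, first check
whether the mean and sd arguments were passed at all.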



Noela Sánchez-2 wrote:
> 
> Hi,
> 
> I am dealing with how to check in R whether some data that I have belong
> to a normal distribution or not. I am not interested in obtaining the
> theoretical frequencies. I am only interested in determining (by means of
> a test such as Kolmogorov's, or whatever) whether my data are normal or not.
> 
> But I have tried with ks.test() and I have not got it.
> 
> 
> -- 
> Noela
> Grupo de Recursos Marinos y Pesquerías
> Universidad de A Coruña
