[R] : automated levene test and other tests for variable datasets
Joachim Audenaert
Joachim.Audenaert at pcsierteelt.be
Wed Apr 15 14:23:34 CEST 2015
Thank you very much for the reply Thierry,
It was very useful for me, currently I updated my script as follows, to be
able to use the same script for different datasets:
adapting my dataset : y <- melt(dataset, na.rm=TRUE) where "na.rm = TRUE"
omits missing data points
variable <- y[,1]
value <- y[,2]
and then for the tests
leveneTest(value~variable,y)
apply(dataset,MARGIN=2,FUN=function(x) ks.test(x,pnorm)$p.value)
pairwise.t.test(value,variable,p.adjust.method = "none")
pairwise.wilcox.test(value,variable,p.adjust.method = "none")
Met vriendelijke groeten - With kind regards,
Joachim Audenaert
onderzoeker gewasbescherming - crop protection researcher
PCS | proefcentrum voor sierteelt - ornamental plant research
Schaessestraat 18, 9070 Destelbergen, België
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audenaert at pcsierteelt.be | W: www.pcsierteelt.be
From: Thierry Onkelinx <thierry.onkelinx at inbo.be>
To: Joachim Audenaert <Joachim.Audenaert at pcsierteelt.be>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Date: 15/04/2015 13:31
Subject: Re: [R] : automated levene test and other tests for
variable datasets
Dear Joachim,
Storing your data in a long format will make this a lot easier.
library(reshape2)
long.data <- melt(dataset, measure.vars = c("A", "B", "C", "D", "E"))
library(car)
leveneTest(value ~ variable, data = long.data)
library(plyr)
ddply(long.data, "variable", function(x) data.frame(p.value = ks.test(x$value, "pnorm")$p.value))
Best regards,
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data. ~ John Tukey
2015-04-14 10:07 GMT+02:00 Joachim Audenaert <
Joachim.Audenaert at pcsierteelt.be>:
Hello all,
I am writing a script for statistical comparison of means. I'm doing many
field trials with plants, where we have to compare the efficacy of
different treatments on different groups of plants. Therefore I would
like to automate this script so it can be used for different datasets of
different experiments (which will have different dimensions). An example
dataset is given below. I would like to test whether the data in the 5
columns (A,B,C,D,E) are statistically different from each other, where A,
B, C, D and E are different treatments of my plants and I have 5
replications for this experiment
dataset <- structure(list(A = c(62, 55, 57, 103, 59), B = c(36, 24, 61,
19, 79), C = c(33, 97, 54, 48, 166), D = c(106, 82, 116, 85, 94), E =
c(32, 16, 9, 7, 46)), .Names = c("A", "B", "C", "D", "E"), row.names =
c(NA, 5L), class = "data.frame")
1) First I would like to do a Levene test to check the equality of
variances of my datasets. Currently I do this as follows:
library("car")
attach(dataset)
y <- c(A,B,C,D,E)
group <- as.factor(c(rep(1, length(A)), rep(2, length(B)),rep(3,
length(C)), rep(4, length(D)),rep(5, length(E))))
leveneTest(y, group)
Is there a way to automate this for all types of datasets, so that I can
use the same script for a dataset with any number of columns of data to
compare? My above script only works for a dataset with 5 columns to
compare
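One way to avoid hard-coding the column names, sketched here in base R only, is to let stack() build the value and group vectors for however many columns the data frame has. The variance check below is a hand-written median-centred Levene (Brown-Forsythe) test, i.e. a one-way ANOVA on the absolute deviations from each group's median; it is an illustration of the idea, not the car implementation, and the names long, abs_dev and levene_p are made up for this example.

```r
# Example data from the post
dataset <- data.frame(
  A = c(62, 55, 57, 103, 59), B = c(36, 24, 61, 19, 79),
  C = c(33, 97, 54, 48, 166), D = c(106, 82, 116, 85, 94),
  E = c(32, 16, 9, 7, 46)
)

# stack() returns one row per observation: 'values' and 'ind' (the column name)
long  <- stack(dataset)
y     <- long$values
group <- long$ind          # factor with one level per column

# Absolute deviation of each value from its own group's median
abs_dev <- abs(y - ave(y, group, FUN = median))

# One-way ANOVA on those deviations (median-centred Levene test)
levene   <- anova(lm(abs_dev ~ group))
levene_p <- levene[["Pr(>F)"]][1]
print(levene)
```

Because nothing refers to A, B, C, D or E by name after the data frame is created, the same lines should run unchanged on a data frame with a different number of columns.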
2) For my boxplots I use
boxplot(dataset)
which gives me all the boxplots of each dataset, so this is how I want it
3) To check normality I currently use the Kolmogorov-Smirnov test as
follows
ks.test(A,pnorm)
ks.test(B,pnorm)
ks.test(C,pnorm)
ks.test(D,pnorm)
ks.test(E,pnorm)
Is there a way to replace the A, B, C, ... on the five lines with one line
of code so that the Kolmogorov-Smirnov test is done on all columns of my
dataset at once?
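A possible one-liner for this, shown as a sketch: sapply() visits every column of a data frame, so the test can be applied to all of them at once. One caveat worth flagging: ks.test(x, pnorm) with no further arguments compares against the standard normal (mean 0, sd 1), which data on this scale will always fail. The second call below assumes the intent is to compare each column with a normal distribution using that column's own mean and sd; note that estimating the parameters from the same data makes the reported p-values conservative.

```r
dataset <- data.frame(
  A = c(62, 55, 57, 103, 59), B = c(36, 24, 61, 19, 79),
  C = c(33, 97, 54, 48, 166), D = c(106, 82, 116, 85, 94),
  E = c(32, 16, 9, 7, 46)
)

# As in the original post: each column against the standard normal N(0, 1)
p_std <- sapply(dataset, function(x) ks.test(x, "pnorm")$p.value)

# Assumed intent: each column against a normal with its own mean and sd
p_fit <- sapply(dataset, function(x) ks.test(x, "pnorm", mean(x), sd(x))$p.value)

print(round(cbind(p_std, p_fit), 4))
```

Both results are named numeric vectors with one p-value per column, so they work for any number of columns.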
4) if data is normally distributed and the variances are equal I want to
do a t-test and do pairwise comparison, currently like this
pairwise.t.test(y,group,p.adjust.method = "none")
if data is not normally distributed or variances are unequal I do a
pairwise comparison with the wilcoxon test
pairwise.wilcox.test(y,group,p.adjust.method = "none")
But again I would like to make this easier. Is there a way to replace the
y and group in my code by something that works for any size of
dataset?
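A small wrapper can hide the reshaping so that only the data frame is passed in. This is a sketch in base R; compare_pairs and its parametric argument are hypothetical names chosen for this example, and the decision whether the data are "normal with equal variances" is left to the caller.

```r
dataset <- data.frame(
  A = c(62, 55, 57, 103, 59), B = c(36, 24, 61, 19, 79),
  C = c(33, 97, 54, 48, 166), D = c(106, 82, 116, 85, 94),
  E = c(32, 16, 9, 7, 46)
)

# Hypothetical helper: pairwise comparison of all columns of a data frame
compare_pairs <- function(data, parametric = TRUE) {
  long <- stack(data)  # 'values' = observations, 'ind' = column name
  test <- if (parametric) pairwise.t.test else pairwise.wilcox.test
  # p.adjust.method = "none" mirrors the original post (no multiplicity correction)
  test(long$values, long$ind, p.adjust.method = "none")
}

res_t <- compare_pairs(dataset)                      # pairwise t-tests
res_w <- compare_pairs(dataset, parametric = FALSE)  # pairwise Wilcoxon tests
print(res_t$p.value)
```

With five groups there are ten comparisons, so leaving the p-values unadjusted raises the chance of a false positive; passing another method such as "holm" to p.adjust.method is one option if that matters for the trial.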
5) Once I have my pairwise comparison results I know which groups are
statistically different from others, so I can add a and b and c to
different groups in my graph. Currently I do this on a sheet of paper by
comparing them one by one. Is there also a way to automate this? So R
gives me for example something like this
A: a
B: a
C: b
D: ab
E: c
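The lettering can be derived from the pairwise p-value matrix. The sketch below uses a simple brute-force idea, not an established package routine: find every maximal set of groups in which no pair differs significantly, give each such set a letter, and label each group with the letters of the sets it belongs to. cld_letters is a hypothetical helper written for this example; it enumerates all subsets, so it is only practical for a modest number of groups, and it may use more letters than strictly necessary. The multcompView package offers multcompLetters() for the same purpose.

```r
dataset <- data.frame(
  A = c(62, 55, 57, 103, 59), B = c(36, 24, 61, 19, 79),
  C = c(33, 97, 54, 48, 166), D = c(106, 82, 116, 85, 94),
  E = c(32, 16, 9, 7, 46)
)
long <- stack(dataset)
pw   <- pairwise.t.test(long$values, long$ind, p.adjust.method = "none")

# Hypothetical helper: letters from a lower-triangular p-value matrix
cld_letters <- function(pmat, groups, alpha = 0.05) {
  n <- length(groups)
  # same[i, j] is TRUE when groups i and j are NOT significantly different
  same <- matrix(TRUE, n, n, dimnames = list(groups, groups))
  for (r in rownames(pmat)) for (cc in colnames(pmat)) {
    p <- pmat[r, cc]
    if (!is.na(p)) same[r, cc] <- same[cc, r] <- p >= alpha
  }
  # Every non-empty subset of groups as a logical mask
  subsets <- lapply(seq_len(2^n - 1),
                    function(k) (k %/% 2^(0:(n - 1))) %% 2 == 1)
  # Keep subsets in which all pairs are "same" ...
  cliques <- Filter(function(m) all(same[m, m, drop = FALSE]), subsets)
  # ... and drop those contained in a larger such subset
  maximal <- Filter(function(m) {
    !any(vapply(cliques, function(o) all(o[m]) && sum(o) > sum(m), logical(1)))
  }, cliques)
  # Assign letters in order of each set's first member
  first   <- vapply(maximal, function(m) which(m)[1], integer(1))
  member  <- do.call(cbind, maximal[order(first)])
  out <- apply(member, 1, function(row) paste(letters[which(row)], collapse = ""))
  names(out) <- groups
  out
}

labels <- cld_letters(pw$p.value, levels(long$ind))
print(labels)
```

Two groups share a letter exactly when their pairwise p-value is at or above alpha, which is the rule being applied by hand in the question.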
All help and comments are welcome. I'm quite new to R and not a
statistical genius, so if I'm overlooking things or thinking in a wrong
way please let me know how I can improve my way of working. In short I
would like to build a script that can compare the means of different
groups of data and check if they are statistically different
Met vriendelijke groeten - With kind regards,
Joachim Audenaert
onderzoeker gewasbescherming - crop protection researcher
PCS | proefcentrum voor sierteelt - ornamental plant research
Schaessestraat 18, 9070 Destelbergen, België
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audenaert at pcsierteelt.be | W: www.pcsierteelt.be
Have you already applied for your individual fertilisation guidance (CVBB)? |
PCS on LinkedIn
Disclaimer | Please consider the environment before printing. Think green,
keep it on the screen!
[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.