[R] how to check if a variable is preferentially present in a sample
Tania Oh
tania.oh at bnc.ox.ac.uk
Tue Apr 8 17:24:25 CEST 2008
Dear All,
I do apologise if this question is out of place for this list but I've
tried searching mailing lists and read "Introductory Statistics with
R" by Peter Dalgaard, but couldn't find any hints on solving my
question below:
I have a data frame (d) of values which I will rank in decreasing
order of "val". Each value belongs to a group, either 'A', 'B', 'C',
'D', or 'E'. I then take the first 10 entries in data frame 'd' and
count the number of occurrences of each group. I want to
test whether certain groups occur more frequently among the first 10
entries than expected by chance. Would a chi-square test or a
hypergeometric test be more suitable? If neither, what would be an
alternative solution in R? Below is my data:
## data
L5 <- LETTERS[1:5]
d <- data.frame(cbind(val= rnorm(1:10)^2, group=sample(L5,100, repl=TRUE)))
str(d)
##'data.frame': 100 obs. of 2 variables:
##$ val : Factor w/ 10 levels "0.000169268449333046",..: 10 3 5 6 1 2 7 8 4 9 ...
##$ group: Factor w/ 5 levels "A","B","C","D",..: 4 4 4 5 3 1 5 2 1 2 ...
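For illustration, the ranking, counting and testing steps described above might be sketched as follows. This is only a rough sketch under stated assumptions, not a definitive answer. It assumes that "val" should stay numeric (so the columns are passed to data.frame() directly, since cbind() coerces everything to character, which is why str() reports "val" as a factor with only 10 levels), and that a one-sided hypergeometric tail probability per group is an acceptable notion of "more frequent than by chance". The seed and the names k, top, obs, total and p_over are hypothetical and chosen only for the example.

```r
set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible

L5 <- LETTERS[1:5]

# Build the columns directly so that val stays numeric
d <- data.frame(val   = rnorm(100)^2,
                group = factor(sample(L5, 100, replace = TRUE), levels = L5))

# Rank by decreasing val and keep the first k rows
k   <- 10
top <- d[order(d$val, decreasing = TRUE), ][seq_len(k), ]

# Counts per group in the top k and in the whole data frame
obs   <- as.vector(table(top$group))
total <- as.vector(table(d$group))

# One-sided hypergeometric p-value per group:
# P(X >= obs) when k rows are drawn without replacement from nrow(d) rows,
# of which 'total' belong to the group in question
p_over <- phyper(obs - 1, total, nrow(d) - total, k, lower.tail = FALSE)
names(p_over) <- L5

# Five groups are tested at once, so adjust for multiple comparisons
p_adj <- p.adjust(p_over, method = "holm")

print(rbind(observed = obs, total = total))
print(round(cbind(p_over, p_adj), 4))
```

With only 10 entries spread over five groups, the expected count per group is about 2, so the large-sample approximation behind a chi-square test may be unreliable here. The exact hypergeometric tail above should match a one-sided fisher.test() on the corresponding 2 x 2 table for each group.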
Many thanks in advance and apologies again,
tania
DPhil student
Department of Physiology, Anatomy and Genetics
University of Oxford