[R] Determining Total Number of Multiple Comparisons
Nick Negovetich
nj.negovetich at gmail.com
Fri Mar 14 19:15:22 CET 2014
Greetings,
I'm running a series of Chi-square tests to examine differences across
categorical variables. The situation is this:
I have three variables: sex (M/F), habitat (5 levels), and season
(W, Sp, Su, F). A Cochran-Mantel-Haenszel test detects non-independence
across my sex strata. I then subsetted my data into males (mat.M) and
females (mat.F). Within each sex, I investigated independence between
habitat and season (e.g., chisq.test(mat.M)). This is essentially a
multiple comparison problem, so I'm correcting my p-values using
p.adjust(). My question pertains to 'n' in this function, and how 'n'
should be calculated as subsets of the data are used to tease out the
differences in habitat use across seasons.
Q1. Am I correct to specify 'n=2' when performing the test of
independence for both male and female data?
example: p.adjust(chisq.test(mat.M)$p.value,n=2,method='bonferroni')
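For reference, a minimal sketch of what 'n' does mechanically in p.adjust(). The counts below are made up for illustration (they are not my data), and the object names p.raw, p.together and p.single are hypothetical. Adjusting the two p-values together as one vector gives the same Bonferroni result as adjusting one of them with n=2:

```r
# Hypothetical counts (rows = habitat, columns = season); illustration only.
habitats <- c("Forest", "Field", "Crops", "River", "Other")
seasons  <- c("W", "Sp", "Su", "F")
mat.M <- matrix(c(30, 22, 15, 28,
                  18, 25, 30, 20,
                  12, 20, 26, 14,
                  10, 12, 14, 11,
                  15, 14, 12, 16),
                nrow = 5, byrow = TRUE, dimnames = list(habitats, seasons))
mat.F <- matrix(c(25, 20, 18, 22,
                  20, 28, 35, 18,
                  10, 15, 20, 12,
                  14, 10,  9, 15,
                  12, 16, 14, 13),
                nrow = 5, byrow = TRUE, dimnames = list(habitats, seasons))

# One test of independence per sex, collected into a single vector.
p.raw <- c(M = chisq.test(mat.M)$p.value,
           F = chisq.test(mat.F)$p.value)

# Adjusting the vector uses n = length(p.raw) = 2 by default ...
p.together <- p.adjust(p.raw, method = "bonferroni")

# ... which matches adjusting a single p-value with n = 2 given explicitly.
p.single <- p.adjust(p.raw[["M"]], n = 2, method = "bonferroni")

print(p.together)
print(p.single)
```

This only shows how the function behaves; whether 2 is the right family size is the question being asked.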
Non-independence was detected for both male and female subsets. Now, I'm
interested in seasonal changes in habitat use, which would require
additional multiple comparison tests. Thus, I have another question
regarding the specification of 'n'.
Q2. If I examined the seasonal changes within males using prop.test(),
do I add up all multiple comparisons that will be performed (female
included), or just the number of tests that will be performed using the
male data? The difference is n=5 for male only vs n=10 for both sexes.
Here's an example. Habitat types are Forest, Field, Crops, River, and
Other, and these are the rownames of my matrix (males only):
pval <- prop.test(mat.M['Forest',], colSums(mat.M))$p.value
p.adjust(pval,n=5,method='bonferroni')
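The same step for all five habitats at once might be sketched as below, again with made-up counts and a hypothetical vector name (pvals). Collecting the five p-values into one vector lets p.adjust() pick up n=5 from the vector length; passing a larger 'n' (e.g., 10 for both sexes) is also accepted by the function:

```r
# Hypothetical male counts (rows = habitat, columns = season); illustration only.
mat.M <- matrix(c(30, 22, 15, 28,
                  18, 25, 30, 20,
                  12, 20, 26, 14,
                  10, 12, 14, 11,
                  15, 14, 12, 16),
                nrow = 5, byrow = TRUE,
                dimnames = list(c("Forest", "Field", "Crops", "River", "Other"),
                                c("W", "Sp", "Su", "F")))

# One test per habitat: is the proportion of animals in that habitat
# the same in all four seasons?
pvals <- sapply(rownames(mat.M), function(h) {
  prop.test(mat.M[h, ], colSums(mat.M))$p.value
})

# Bonferroni over the 5 male tests only (n defaults to length(pvals)).
p.male.only <- p.adjust(pvals, method = "bonferroni")

# Bonferroni counting the 5 female tests as well (n given explicitly).
p.both.sexes <- p.adjust(pvals, n = 10, method = "bonferroni")

print(p.male.only)
print(p.both.sexes)
```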
Lastly, I have detected differences in habitat use across seasons. I now
want to determine which seasons are different within a specific habitat
type. Like before, I can pull out the count data and run a series of
prop.test() for all 6 comparisons (W vs Sp, W vs Su, W vs F, Sp vs Su,
Sp vs F, Su vs F). This leads to my final question.
Q3. Does 'n' in this case refer to only the 6 comparisons within a
habitat type within a sex, or will I need to account for ALL tests that
will be performed (n = 2 sexes * 5 habitats * 6 pairwise seasonal
comparisons = 60 max)? I will not run pairwise seasonal comparisons for
any habitat type that gives a non-significant p-value according to Q2
above.
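As a sketch of the two options in Q3, the six pairwise comparisons for one habitat can be run with pairwise.prop.test() from the stats package, which adjusts only over the comparisons in that call. To adjust over a larger family, the raw p-values can be extracted with p.adjust.method='none' and passed to p.adjust() with an explicit 'n'. The counts are made up, and the names within, raw and wide are hypothetical:

```r
# Hypothetical male counts (rows = habitat, columns = season); illustration only.
mat.M <- matrix(c(30, 22, 15, 28,
                  18, 25, 30, 20,
                  12, 20, 26, 14,
                  10, 12, 14, 11,
                  15, 14, 12, 16),
                nrow = 5, byrow = TRUE,
                dimnames = list(c("Forest", "Field", "Crops", "River", "Other"),
                                c("W", "Sp", "Su", "F")))

x <- mat.M["Forest", ]   # Forest counts per season
n <- colSums(mat.M)      # total counts per season

# Option A: Bonferroni over the 6 seasonal pairs within this habitat and sex.
within <- pairwise.prop.test(x, n, p.adjust.method = "bonferroni")

# Option B: take the unadjusted p-values and correct for a larger family,
# here the maximum of 60 tests (2 sexes * 5 habitats * 6 pairs).
unadj <- pairwise.prop.test(x, n, p.adjust.method = "none")
raw   <- unadj$p.value[!is.na(unadj$p.value)]
wide  <- p.adjust(raw, n = 60, method = "bonferroni")

print(within$p.value)
print(wide)
```

Which of the two family sizes is appropriate is exactly what I am unsure about; the code only shows that both are straightforward to compute.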
Thanks for the help...