[R-sig-teaching] I need your thoughts on teaching with R
Bret Larget
larget at stat.wisc.edu
Tue Mar 31 20:11:10 CEST 2009
> I tend to use t-tests after examining normal probability plots and,
> possibly, considering transformation. I believe they would be more
> powerful than permutation tests but that may be incorrect. Can you
> describe situations in which you would prefer permutation tests to
> t-tests?
Here is the reason I prefer permutation tests, besides the conceptual
simplicity. T-tests are based on a normal sampling model, and my
perception is that very few data sets to which people apply t-tests
actually arise from random samples. Here I mean specifically that the
person gathering data used a genuine random sampling method to select
observations from a population. Survey samples would be an exception. I
think it is far more frequently the case that a study takes whatever
units/subjects are at hand and separates these into groups either through
random assignment or based on some categorical variable. The permutation
test directly answers the question of how the measured difference between
group averages might have differed had the groups been formed
in a different way. Then, instead of making the objectionable argument that
"we will treat the data as if it were a representative sample from the
population of interest", and then using a t-test justified by a false
random sampling argument and making inferences to some larger population,
I find it much more justifiable to model the randomness that was truly
part of the data gathering (random assignment). Any inference to other
populations is then justified on the basis of background information (the
groups I am interested in are similar to the groups in the study, so maybe
the results there apply here too) and not by random sampling. It is
important to describe how the units/subjects were selected and let the
reader determine how applicable the results are to other populations.
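
To make the idea concrete, here is a minimal sketch of a two-group permutation test on the difference in group averages. It is written in Python only for illustration. The function name permutation_test, its parameters, and the data values are all made up for this example and do not come from any real study. It approximates the full permutation distribution by random reshuffling rather than enumerating every possible regrouping.

```python
import random


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in group means.

    Hypothetical helper for illustration; returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    # Under the null hypothesis the group labels are exchangeable, so we
    # pool the values and re-form the two groups at random many times.
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            at_least_as_extreme += 1

    # Add-one correction: the observed grouping counts as one of the
    # possible assignments, so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up measurements for two randomly assigned groups (illustration only).
treatment = [24.1, 27.3, 22.8, 29.5, 26.0, 28.4]
control = [21.0, 23.5, 20.2, 24.8, 22.1, 19.7]

observed, p_value = permutation_test(treatment, control)
print(f"observed difference in means: {observed:.3f}")
print(f"approximate two-sided p-value: {p_value:.4f}")
```

The only randomness modeled here is the assignment of units to groups, which is the randomness that was actually part of the data gathering. Nothing in the calculation refers to a sampling distribution from a larger population.
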
When data are not collected by random sampling, the t-test can still be
justified in one of two ways: as an approximation to the permutation test (but, as Doug
would say, why approximate when you can use the computer to do the real thing),
or by a normal model for the data that is explicitly ASSUMED rather than derived from
the central limit theorem and random sampling that did not occur.
I would be very interested if readers of this message can send me a specific
reference to the use of a t-test with real data in an introductory textbook
book for which the individual objects were genuinely sampled at random
from populations.
-Bret