[R] analyze summary data
Ben Bolker
bolker at ufl.edu
Sun Jun 25 19:13:46 CEST 2006
Thierry Girard <thierry.girard <at> unibas.ch> writes:
> I do have summary data (mean, standard deviation and sample size n)
> and want to analyze this data.
> The summary data is supposed to be from a normal distribution.
>
> I need the following calculations on this summary data (no, I do not
> have the original data):
>
> - one sample t-test against a known mu
> - two sample t-test
> - analysis of variance between 4 groups.
>
> I would appreciate any help available.
>
> One possible solution could be to simulate the data using rnorm with
> the appropriate n, mu and sd, but I don't know if there would be a
> more accurate solution.
This is the kind of situation where you need to go back to the
basics: knowing what computations these statistical tests are
_actually doing_. You should be able to find that in any basic stats
book, or by digging into the guts of the R functions. The only other
thing you need to know is the R functions for cumulative distribution
functions: pt (for the t distribution) and pf (for the F distribution).
For example,
stats:::t.test.default
has lots of complicated stuff inside, but the key lines
(for a one-sample test) are
nx <- length(x)   # with summary data, this is just your sample size n
df <- nx - 1
stderr <- sqrt(vx/nx)   # vx is var(x); mx (below) is mean(x)
# if you already have the standard deviation then you want
# sqrt(sd^2/nx)
tstat <- (mx - mu)/stderr ## mu is the known mean you're testing against
pval <- 2 * pt(-abs(tstat), df)
(assuming 2-tailed)
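
For the summary-data case in the question, those lines can be wrapped
up in a small function. This is only a sketch: t_one_sample_summary is
a made-up name (not something that ships with R), and m, s and n stand
for the reported mean, standard deviation and sample size. The last few
lines compare it with t.test() on raw data as a sanity check.

```r
# Hypothetical helper (not part of R): one-sample t-test computed
# from a reported mean (m), standard deviation (s) and sample size (n).
t_one_sample_summary <- function(m, s, n, mu = 0) {
  df <- n - 1
  se <- s / sqrt(n)                 # same as sqrt(s^2 / n)
  tstat <- (m - mu) / se
  pval <- 2 * pt(-abs(tstat), df)   # two-tailed
  list(statistic = tstat, df = df, p.value = pval)
}

# Sanity check against t.test() on simulated raw data
set.seed(1)
x <- rnorm(12, mean = 5, sd = 2)
res <- t_one_sample_summary(mean(x), sd(x), length(x), mu = 4)
ref <- t.test(x, mu = 4)
all.equal(res$p.value, ref$p.value)   # should be TRUE
```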
You will find similar stuff for the two-sample t-test; the details
depend on your particular choices (e.g. whether or not you assume
equal variances in the two groups).
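
As a rough sketch of the two-sample case from summary data, the
following covers both the pooled-variance test and the Welch
(unequal-variance) version. The function name and its arguments
(m1, s1, n1, m2, s2, n2) are made up for illustration; only pt() and
t.test() are real R functions here.

```r
# Hypothetical helper (not part of R): two-sample t-test from the
# means (m1, m2), standard deviations (s1, s2) and sample sizes (n1, n2).
t_two_sample_summary <- function(m1, s1, n1, m2, s2, n2, var.equal = FALSE) {
  if (var.equal) {
    # classical test with a pooled variance estimate
    df <- n1 + n2 - 2
    v <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df
    se <- sqrt(v * (1 / n1 + 1 / n2))
  } else {
    # Welch test with the Welch-Satterthwaite degrees of freedom
    v1 <- s1^2 / n1
    v2 <- s2^2 / n2
    se <- sqrt(v1 + v2)
    df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  }
  tstat <- (m1 - m2) / se
  pval <- 2 * pt(-abs(tstat), df)   # two-tailed
  list(statistic = tstat, df = df, p.value = pval)
}

# Sanity check against t.test() on simulated raw data
set.seed(2)
x <- rnorm(10, mean = 5, sd = 2)
y <- rnorm(15, mean = 6, sd = 3)
res_welch  <- t_two_sample_summary(mean(x), sd(x), length(x),
                                   mean(y), sd(y), length(y))
res_pooled <- t_two_sample_summary(mean(x), sd(x), length(x),
                                   mean(y), sd(y), length(y),
                                   var.equal = TRUE)
ref_welch  <- t.test(x, y)
ref_pooled <- t.test(x, y, var.equal = TRUE)
all.equal(res_welch$p.value, ref_welch$p.value)     # should be TRUE
all.equal(res_pooled$p.value, ref_pooled$p.value)   # should be TRUE
```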
The 1-way ANOVA might be harder to dig out of the R code;
there you're better off going back and (re)learning from
a basic stats treatment how to compute the between-group
and (pooled) within-group variances.
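
For what it's worth, here is a minimal sketch of that textbook
calculation from per-group summaries. anova_summary is a made-up name,
and m, s, n are assumed to be vectors holding the group means, standard
deviations and sample sizes (length 4 for four groups). It assumes
equal variances across groups, as the classical 1-way ANOVA does.

```r
# Hypothetical helper (not part of R): one-way ANOVA F test from
# vectors of group means (m), standard deviations (s) and sizes (n).
anova_summary <- function(m, s, n) {
  m <- as.numeric(m); s <- as.numeric(s); n <- as.numeric(n)
  k <- length(m)                         # number of groups
  N <- sum(n)                            # total sample size
  grand <- sum(n * m) / N                # grand (weighted) mean
  ss_between <- sum(n * (m - grand)^2)   # between-group sum of squares
  ss_within  <- sum((n - 1) * s^2)       # pooled within-group sum of squares
  df1 <- k - 1
  df2 <- N - k
  Fstat <- (ss_between / df1) / (ss_within / df2)
  pval <- pf(Fstat, df1, df2, lower.tail = FALSE)
  list(statistic = Fstat, df = c(df1, df2), p.value = pval)
}

# Sanity check against oneway.test() on simulated raw data
set.seed(3)
g <- factor(rep(1:4, times = c(8, 10, 9, 12)))
y <- rnorm(length(g), mean = c(10, 11, 10.5, 12)[g], sd = 2)
res <- anova_summary(tapply(y, g, mean), tapply(y, g, sd), tapply(y, g, length))
ref <- oneway.test(y ~ g, var.equal = TRUE)
all.equal(res$p.value, ref$p.value)   # should be TRUE
```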
Bottom line is that, except for knowing about pt and pf,
this is really a basic statistics question rather than an
R question.
good luck
Ben Bolker
PS: it is too bad, but the increasing sophistication of R is
making it harder for beginners to explore the guts; e.g. you have
to know to look for "stats:::t.test.default" in order to find
the code ...