[R] analyze summary data
Uwe Ligges
ligges at statistik.uni-dortmund.de
Tue Jun 27 11:38:54 CEST 2006
Ben Bolker wrote:
> Thierry Girard <thierry.girard <at> unibas.ch> writes:
>
>> I do have summary data (mean, standard deviation and sample size n)
>> and want to analyze this data.
>> The summary data is supposed to be from a normal distribution.
>>
>> I need the following calculations on this summary data (no, I do not
>> have the original data):
>>
>> - one sample t-test against a known mu
>> - two sample t-test
>> - analysis of variance between 4 groups.
>>
>> I would appreciate any help available.
>>
>> One possible solution could be to simulate the data using rnorm with
>> the appropriate n, mu and sd, but I don't know if there would be a
>> more accurate solution.
>
>
> This is the kind of situation where you need to go back to the basics:
> knowing what computations these statistical tests are _actually
> doing_, which you should be able to find in any basic stats book
> or by digging into the guts of the R functions. The only other thing
> you need to know is the R functions for the cumulative distribution
> functions: pt (for the t distribution) and pf (for the F distribution).
>
> For example:
>
> stats:::t.test.default
>
> has lots of complicated stuff inside but the key lines are
> (for a one sample test)
>
> nx <- length(x)
> df <- nx - 1
> stderr <- sqrt(vx/nx)
> # if you already have the standard deviation then you want
> # sqrt(sd^2/nx)
> tstat <- (mx - mu)/stderr ## mu is the known mean you're testing against
> pval <- 2 * pt(-abs(tstat), df)
>
> (assuming 2-tailed)
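
The key lines above can be wrapped so that they take the summary statistics directly. This is only a minimal sketch: t_test_summary is a hypothetical helper name (not a base R function), and the numbers in the usage line are made up for illustration.

```r
# Hypothetical helper: one-sample, two-sided t-test from mean, sd and n.
t_test_summary <- function(m, s, n, mu = 0) {
  df <- n - 1
  stderr <- sqrt(s^2 / n)            # standard error of the mean
  tstat <- (m - mu) / stderr         # mu is the known mean tested against
  pval <- 2 * pt(-abs(tstat), df)    # two-tailed p-value
  list(statistic = tstat, parameter = df, p.value = pval)
}

# Illustrative (made-up) summary data
res1 <- t_test_summary(m = 5.2, s = 1.1, n = 20, mu = 5)
```

The result should agree with t.test() run on any raw data that has exactly this mean, standard deviation and sample size.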
>
> you will find similar stuff for the two-sample t-test,
> depending on your particular choices.
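
For the two-sample case, a sketch along the same lines might look as follows. It assumes the two usual choices (Welch's unequal-variance test, or the pooled-variance test) and a two-sided alternative; t_test2_summary and its argument names are hypothetical, chosen only for this example.

```r
# Hypothetical helper: two-sample, two-sided t-test from summary statistics.
t_test2_summary <- function(m1, s1, n1, m2, s2, n2, var.equal = FALSE) {
  if (var.equal) {
    # Pooled-variance (classical Student) test
    df <- n1 + n2 - 2
    v <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df
    stderr <- sqrt(v * (1 / n1 + 1 / n2))
  } else {
    # Welch test with Satterthwaite approximation for the degrees of freedom
    v1 <- s1^2 / n1
    v2 <- s2^2 / n2
    stderr <- sqrt(v1 + v2)
    df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  }
  tstat <- (m1 - m2) / stderr
  list(statistic = tstat, parameter = df,
       p.value = 2 * pt(-abs(tstat), df))
}

# Illustrative (made-up) summary data
res2 <- t_test2_summary(10, 2, 15, 12, 2.5, 12)
```

Setting var.equal = TRUE switches to the pooled test, mirroring the argument of the same name in t.test().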
>
> The 1-way ANOVA might be harder to dig out of the R code;
> there you're better off going back and (re)learning from
> a basic stats treatment how to compute the between-group
> and (pooled) within-group variances.
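
As a rough sketch of that textbook computation, assuming the classical one-way ANOVA with equal variances across groups: the between-group sum of squares comes from the group means and sizes, the within-group sum of squares from the group standard deviations, and pf gives the p-value. anova_summary is a hypothetical helper name, and the four groups in the usage line are made-up numbers.

```r
# Hypothetical helper: one-way ANOVA F-test from per-group mean, sd and n.
anova_summary <- function(means, sds, ns) {
  k <- length(means)                          # number of groups
  N <- sum(ns)                                # total sample size
  grand <- sum(ns * means) / N                # weighted grand mean
  ss_between <- sum(ns * (means - grand)^2)   # between-group sum of squares
  ss_within <- sum((ns - 1) * sds^2)          # pooled within-group sum of squares
  df1 <- k - 1
  df2 <- N - k
  fstat <- (ss_between / df1) / (ss_within / df2)
  list(statistic = fstat, df = c(df1, df2),
       p.value = pf(fstat, df1, df2, lower.tail = FALSE))
}

# Illustrative (made-up) summary data for 4 groups
res3 <- anova_summary(means = c(10, 12, 11, 14),
                      sds   = c(2, 2.5, 1.8, 2.2),
                      ns    = c(15, 12, 18, 10))
```

The F statistic and p-value should match anova(lm(y ~ group)) on raw data with the same group means, standard deviations and sizes.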
>
> Bottom line is that, except for knowing about pt and pf,
> this is really a basic statistics question rather than an
> R question.
>
> good luck
> Ben Bolker
>
> PS: it is too bad, but the increasing sophistication of R is
> making it harder for beginners to explore the guts, e.g.
> knowing to look for "stats:::t.test.default" in order to find
> the code ...
Thanks for the hint. I already had in mind writing an R Help Desk article
about "Finding the code", meaning both the R source code, as described
above, and the C code behind the .Primitive, .C, .Call and friends'
entry points.
Maybe for the next R News issue, if nobody is willing to contribute to
the Help Desk column (hint, hint!!!).
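
For the R-level part, a minimal sketch of how one might locate a non-exported S3 method such as t.test.default, assuming a reasonably current R with the stats and utils packages attached (as they are by default):

```r
# List the S3 methods registered for the generic t.test
m <- methods(t.test)

# Look a function up by name, even if it is not exported from its namespace
ga <- getAnywhere("t.test.default")

# Retrieve the method itself as a function object; print f to read its code
f <- getS3method("t.test", "default")

# Functions implemented as primitives have no R-level body to inspect
prim <- is.primitive(sum)
```

For primitives and for code reached through .C or .Call, the R sources themselves have to be consulted.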
Uwe Ligges