[R-sig-eco] classical statistics in R

Thu Nov 13 20:55:15 CET 2008

Although the subject line is 'classical statistics in R', the
discussion of sums of squares leads me to believe Tyler is looking for
a book on linear models in R.  There are several relevant books on
this page ( http://cran.r-project.org/other-docs.html ), and one that
comes to mind is Julian Faraway's.

Some confusion may come from the fact that historically ANOVA and
linear regression have been treated separately, when they are each
cases of the general linear model.  I don't think it's a bad thing
that ANOVA sums-of-squares tables are still taught -- understanding
how sources of variation are partitioned at various levels, along with
the associated degrees of freedom, is important no matter which software
is ultimately chosen to estimate parameters.  Also, it's worth noting
that even if you learn about linear models from a linear algebra
approach, which is what I am guessing Tyler means by 'the linear
models framework used in R', the code to fit models has little
resemblance to what is taught in a linear models book -- i.e., you
won't see something like solve(t(X) %*% X) %*% t(X) %*% y to estimate
the betas, because there are more computationally efficient methods
than explicit matrix inversion and cross-products (lm(), for example,
fits by QR decomposition).
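
To make that concrete, here is a minimal sketch using simulated data
(every object name and value below is made up for the example).  It
compares the textbook formula with two alternatives that avoid an
explicit inverse and with lm(), which is documented as fitting by QR
decomposition by default.  It is only meant to show that the estimates
agree, not to describe lm()'s internals in any detail.

```r
# Simulated data; all names and values here are invented for illustration.
set.seed(1)
n <- 50
x <- runif(n)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)

# Design matrix with an intercept column, as in a linear models text.
X <- cbind(1, x)

# Textbook normal-equations estimate of the betas.
beta_textbook <- solve(t(X) %*% X) %*% t(X) %*% y

# Same normal equations, solved without forming an explicit inverse.
beta_crossprod <- solve(crossprod(X), crossprod(X, y))

# Least-squares solution via a QR decomposition of X.
beta_qr <- qr.coef(qr(X), y)

# The high-level interface most users would actually call.
beta_lm <- coef(lm(y ~ x))

# Each comparison checks agreement up to floating-point tolerance.
all.equal(as.vector(beta_textbook), unname(beta_lm))
all.equal(as.vector(beta_crossprod), unname(beta_lm))
all.equal(as.vector(beta_qr), unname(beta_lm))
```

The point is that the linear algebra in the book and the call to lm()
describe the same estimator; only the numerical route differs.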

In general, I agree with the earlier statement that it's better to
first learn the statistics and then try to learn how to write the R
code.  My belief is that learning about AN(C)OVA-type analyses via
sums-of-squares tables is effective and not at odds with using R (with
one notable exception -- estimating variance components -- see the
last paragraph of the following post for my take on that:
http://tolstoy.newcastle.edu.au/R/e4/help/08/05/11410.html ).  If you
have gotten a solid grasp on the statistics, and it is the code you
are having trouble with, then hopefully Julian's book or one of the
many others will help to clarify things.
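
As a small illustration of why I don't see the two approaches as being
at odds, here is a sketch of a one-way ANOVA done both ways.  The data
are simulated and the object names are invented for the example; it is
a toy case, not a recipe for more complex designs.

```r
# Simulated one-way layout; names and values are invented for illustration.
set.seed(2)
g <- factor(rep(c("a", "b", "c"), each = 10))
y <- rnorm(30, mean = c(10, 12, 15)[as.integer(g)])

# Sums of squares computed the way a textbook ANOVA table is filled in.
grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)
n_per_group <- tapply(y, g, length)
ss_among    <- sum(n_per_group * (group_means - grand_mean)^2)
ss_within   <- sum((y - ave(y, g))^2)   # ave() gives each observation's group mean
df_among    <- nlevels(g) - 1
df_within   <- length(y) - nlevels(g)
f_by_hand   <- (ss_among / df_among) / (ss_within / df_within)

# The same table obtained from the linear model.
tab <- anova(lm(y ~ g))
tab

# The hand-computed pieces match the corresponding table entries.
all.equal(c(ss_among, ss_within), tab[["Sum Sq"]])
all.equal(f_by_hand, tab[["F value"]][1])
```

Once the partitioning is understood by hand, anova(lm(...)) is just a
quicker way to get the same numbers.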

hope this helps,

Kingsford Jones

On Mon, Nov 10, 2008 at 3:45 PM, tyler <tyler.smith at mail.mcgill.ca> wrote:
> "Sebastian P. Luque" <spluque at gmail.com>
> writes:
>
>> In general, I would not choose a book to learn basic statistics based on
>> whether it has R content or not.  What's important is to learn the
>> concepts.  Learning how to use them in a particular software is useful,
>> but secondary.  If we're careless about this distinction, we risk
>> falling into habits promoted by most commercial software, where one
>> points and clicks without understanding what one is doing.  The risk is
>> there even in GNU R, as the number of functions and packages keeps
>> growing to help us save time developing procedures.  There's a balance
>> to be reached between the help received and intellectual independence.
>> For classical statistics, many books have long series of editions that
>> have made them superb with age (like good wine).  Zar's Biostatistical
>> Analysis is my favorite in this domain, but I enjoyed Sokal & Rohlf too.
>>
>>
>
> That's an important point. I should clarify that, for myself, it's not
> so important to have actual R code. But the 'sums of squares' framework
> presented in S&R is, or at least appears to be, at odds with the linear
> model framework used in R. I would appreciate a reference that takes the
> same approach as that used in R, so that I can focus on learning the
> statistics.
>
> To use S&R as written, I can read through the examples, and implement
> them in low-level R code. This is tedious and inflexible. If I properly
> understood the linear modelling approach used in R, I expect I could use
> higher-level functions, and wouldn't have to re-implement each variation
> of a test from scratch. But there's a conceptual gap between R and S&R
> that I'm missing.
>
> Cheers,
>
> Tyler
>
>> Seb
>>
>>
>>
>> On Mon, 10 Nov 2008 16:11:47 -0500,
>> Brian Campbell <jacarebrazil98 at hotmail.com> wrote:
>>
>>> I conceded to R shift (mostly) last year and began Crawley (2005)
>>> Statistics: An Introduction using R.  Quinn and Keough: Experimental
>>> Design and Data Analysis for Biologists is very useful, but if given a
>>> choice of the two with the emphasis on learning R, Crawley might be
>>> preferable.  Better yet might even be the "R Book".
>>
>>> -Brian
>>
>>>> Date: Mon, 10 Nov 2008 12:30:22 -0800
>>>> From: cparker at pdx.edu
>>>> To: r-sig-ecology at r-project.org
>>>> Subject: Re: [R-sig-eco] classical statistics in R
>>
>>>> I agree with Jordan and will also throw in Gelman and Hill's "Data
>>>> Analysis Using Regression and Multilevel/Hierarchical Models". It's a
>>>> social-science-based book but is very relevant to ecologists and
>>>> includes R code (and BUGS code).  -Chris
>>
>>
>>>> Jordan Mayor wrote:
>>>> > Personally, I found G&E to be very helpful at only a cursory
>>>> > interest level.  Quinn & Keough's "Experimental Design and Data
>>>> > Analysis for Biologists" is a practical in-depth text that covers
>>>> > a lot more detail - but, alas, no R code is provided.  In fact, it
>>>> > is quite program-independent.
>>
>>>> > Cheers
>>
>>>> > On Mon, Nov 10, 2008 at 3:10 PM, tyler <tyler.smith at mail.mcgill.ca> wrote:
>>
>>
>>>> >>Hi,
>>
>>>> >>I've just received my copy of Ben Bolker's new book, "Ecological
>>>> >>Models and Data in R". I was a little surprised to see he
>>>> >>recommended Sokal and Rohlf's "Biometry" as an introduction to
>>>> >>classical stats. Not because there's anything wrong with S&R, it's
>>>> >>comprehensive and well-written.  My problem with this book is that
>>>> >>it's written from the perspective of filling out tables of sums of
>>>> >>squares according to fixed recipes, while R is geared towards more
>>>> >>flexible linear models. Trying to translate the more complex
>>>> >>recipes into R code is not a trivial task.
>>
>>>> >>In response to an email, Ben suggested that Gotelli and Ellison's
>>>> >>"Primer of Ecological Statistics" provides a more modern take on
>>>> >>the subject than S&R. I have to agree, G&E is one of the best
>>>> >>intros I've seen for ecologists. But it doesn't really go very far
>>>> >>into the possible complexities of ANOVA and linear regression, and
>>>> >>doesn't specifically address implementing tests in R.
>>
>>>> >>Ben and I are both curious as to what other r-sig-eco readers think
>>>> >>about this issue. What are the best sources for learning about
>>>> >>classical statistics as implemented in R? S&R has been the standard
>>>> >>reference for quite a while, but it now appears to be dated. Is
>>>> >>there a good standard text that covers the same breadth of material
>>>> >>with a modern, R-compatible approach? Ben also recommended several
>>>> >>books by Michael Crawley - any strong feelings on these, or other
>>>> >>suggestions?
>>
>>>> >>Thanks!
>>
>>>> >>Tyler
>>
>>>> >>--
>>>> >>Research is what I'm doing when I don't know what I'm doing.
>>>> >> --Wernher von Braun
>>
>>>> >>_______________________________________________
>>>> >>R-sig-ecology mailing list
>>>> >>R-sig-ecology at r-project.org
>>>> >>https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>> Cheers,
>
> --
> Better a botanist than a sociopath.
>                                       --Charlane Bishop