[R] Some questions about R's modelling algebra

Hadley Wickham hadley at rice.edu
Fri Jul 2 15:59:53 CEST 2010


Hi all,

In preparation for teaching a class next week, I've been reviewing R's
standard modelling algebra. I've used it for a long time and have a
pretty good intuitive feel for how it works, but would like to
understand more of the technical details. The best (online) reference
I've found so far is the section in "An Introduction to R"
(http://cran.r-project.org/doc/manuals/R-intro.html#Formulae-for-statistical-models).
Does anyone have any other suggestions?

I have a few questions about the definitions given in "An Introduction to R":

 * "M_1 : M_2 - The tensor product of M_1 and M_2. If both terms are
factors, then the “subclasses” factor."

   From my reading, the usual interpretation of a tensor product when
x and y are vectors is the outer product.  I don't see how that would
work here - how does a matrix work as a predictor in a linear model?
In what sense is the tensor product of x with itself equal to x?

  What is the subclasses factor? Is it interaction(M_1, M_2, sep = "")?

 * "M_1 %in% M_2 - Similar to M_1:M_2, but with a different coding."

  How is the coding different?

  Where is %in% documented within R?  I'm pretty sure it's a different
action to ?"%in%", and it's not mentioned in ?formula.
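
One way to probe both definitions empirically is to look at what terms()
and model.matrix() actually produce. The following is only a minimal
sketch on made-up data: the names d, a, b, x, z and y are hypothetical,
and it shows what R does on one toy example rather than how the algebra
is formally defined.

```r
# Toy data; every name here (d, a, b, x, z, y) is hypothetical.
set.seed(1)
d <- expand.grid(a = factor(c("a1", "a2")),
                 b = factor(c("b1", "b2", "b3")),
                 rep = 1:3)
d$x <- rnorm(nrow(d))
d$z <- rnorm(nrow(d))
d$y <- rnorm(nrow(d))

# Numeric:numeric yields a single column, the elementwise product,
# not an outer-product matrix.
mm_xz <- model.matrix(~ x:z, data = d)
prod_ok <- isTRUE(all.equal(unname(mm_xz[, "x:z"]), d$x * d$z))

# x:x collapses to the single term x when the formula is parsed.
xx_labels <- attr(terms(~ x:x), "term.labels")

# Factor:factor without an intercept yields one indicator column per
# cell, the same count as the levels of interaction(a, b).
mm_ab <- model.matrix(~ 0 + a:b, data = d)
n_cells <- nlevels(interaction(d$a, d$b))

# %in% versus ":" : inspect the term labels and compare the fits.
in_labels <- attr(terms(y ~ a + b %in% a), "term.labels")
fit_in    <- lm(y ~ a + b %in% a, data = d)
fit_colon <- lm(y ~ a + a:b, data = d)
same_fit  <- isTRUE(all.equal(unname(fitted(fit_in)),
                              unname(fitted(fit_colon))))

print(xx_labels)
print(in_labels)
```

Comparing model.matrix(fit_in) with model.matrix(fit_colon) column by
column would then show whether the coding really differs on this data,
or only the labelling of the term.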

I have also read G. N. Wilkinson and C. E. Rogers. Symbolic
descriptions of factorial models for analysis of variance. Journal of
the Royal Statistical Society. Series C (Applied Statistics),
22:392–399, 1973. Can anyone comment on any important differences
between their notation and R's modelling algebra? What does %in%
correspond to in Wilkinson and Rogers' framework?

Thanks!

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


