R-beta: formula() and model formulae

Ross Ihaka ihaka at stat.auckland.ac.nz
Fri May 9 00:42:30 CEST 1997

Bill Venables writes:
 > There is an anomaly in the way : and ^ terms are handled in the
 > sense that the logical and useful thing is obvious but does not
 > happen.  Let me give an example.  Suppose a and b are factors, x
 > and y are not.
 > A term such as (a + b + x + y)^2 should be expanded out binomial
 > fashion, coefficients stripped away and the remaining products
 > treated as : products.  Then S copes with terms like a:a, a:b and
 > a:x fine, even x:y is handled by having it generate a column of
 > xy-products, as it should.
 > But a term such as x:x does not generate a column of x-squares,
 > it is merely removed as it would be if it were a factor.  This is
 > a complete anomaly, and one that I don't think would be hard or
 > dangerous for R to rectify.  Indeed it would be very useful to
 > generate a complete second degree regression in three variables
 > using y ~ (1 + x1 + x2 + x3)^2.  As it is now it generates linear
 > and product terms only and omits the powers.  Go figure.

I agree that this is a problem (it's certainly bitten me), but
there is a difficulty in implementing things this way.  All the model
manipulations are carried out by the "terms" function, which sees only
a formula and is blissfully unaware of whether variables are factors
or numeric.  It then has no basis for deciding between
	a:a -> a	(factors)
	x:x -> x^2	(variables)
I suppose that we could make "terms" a bit more context aware, but
sometimes it's useful to use it for its purely symbolic effect.
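
To make the difficulty concrete, here is a minimal sketch in Python (not R, and not how "terms" is actually implemented) of a purely symbolic expansion of (1 + a + x)^2. All function names are hypothetical. The symbolic rule needs only the names in the formula. The context-aware rule cannot run without an extra table saying which names are factors, and that table is exactly the information a formula alone does not carry.

```python
from itertools import product

def collapse(term):
    """Purely symbolic rule: a repeated name is kept once (a:a -> a, x:x -> x)."""
    return tuple(dict.fromkeys(term))

def collapse_with_types(is_factor):
    """Hypothetical context-aware rule: needs to know which names are factors."""
    def rule(term):
        parts = []
        for name in dict.fromkeys(term):  # unique names, first-seen order
            n = term.count(name)
            # Factors collapse (a:a -> a); numeric variables become powers.
            parts.append(name if is_factor[name] or n == 1 else f"{name}^{n}")
        return tuple(parts)
    return rule

def expand_square(variables, rule):
    """Expand (v1 + ... + vn)^2, drop coefficients, simplify each product."""
    labels = []
    for pair in product(variables, repeat=2):
        # "1" is the constant term and vanishes from products; b:a equals a:b.
        names = sorted((v for v in pair if v != "1"), key=variables.index)
        label = ":".join(rule(tuple(names))) or "1"
        if label not in labels:
            labels.append(label)
    return labels

variables = ["1", "a", "x"]
symbolic = expand_square(variables, collapse)
typed = expand_square(variables, collapse_with_types({"a": True, "x": False}))
print(symbolic)  # ['1', 'a', 'x', 'a:x']
print(typed)     # ['1', 'a', 'x', 'a:x', 'x^2']
```

The two results differ only in the x^2 term, and only the second call was told that a is a factor and x is not.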