[Rd] terms function bug?
Patrick O'Reilly
patrick.a.oreilly at gmail.com
Wed Oct 29 12:31:20 CET 2014
Hi,
I've noticed something strange when using the terms {stats} function.
The R documentation describes the "factors" attribute of a terms.object as follows:
A matrix of variables by terms showing which variables appear in which terms.
The entries are 0 if the variable does not occur in the term, 1 if it does occur
and should be coded by contrasts, and 2 if it occurs and should be coded via
dummy variables for all levels (as when an intercept or lower-order term is
missing). If there are no terms other than an intercept and offsets, this is
numeric(0).
(http://stat.ethz.ch/R-manual/R-patched/library/stats/html/terms.object.html)
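To illustrate the "lower-order term is missing" case from that description, here is a minimal sketch using two hypothetical variable names, a and b, which are not taken from any data set. As far as I can tell, no data argument is needed, because the formula contains no dot. I would expect a 2 in the a row of the a:b column, because dropping a from a:b leaves b, and b does not appear as a term of its own.

```r
# a and b are hypothetical names used only for illustration.
tt  <- terms(~ a + a:b)
fac <- attr(tt, "factors")

# Rows are variables, columns are terms.
print(fac)

# Entry for variable a within the term a:b (expected to be 2, see above).
fac["a", "a:b"]
```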
In the example below, I would expect Species to have a value of 2 since the
intercept is omitted. Indeed, when using model.matrix it is clear that Species
has been coded with dummy variables for all three levels.
f <- ~ -1 + Species
attr(terms(f, data=iris), "factors")
#         Species
# Species       1
levels(iris$Species)
#[1] "setosa" "versicolor" "virginica"
colnames(model.matrix(f, iris))
#[1] "Speciessetosa" "Speciesversicolor" "Speciesvirginica"
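For comparison, here is a small sketch that puts the same formula next to its counterpart with the intercept kept. The names f0, f1, t0 and t1 are introduced only for this illustration. It checks whether the "factors" attribute differs between the two terms objects and how the resulting design-matrix columns differ.

```r
f0 <- ~ -1 + Species   # intercept omitted (the formula from above)
f1 <- ~ Species        # intercept kept, for comparison

t0 <- terms(f0, data = iris)
t1 <- terms(f1, data = iris)

# The intercept flag is stored separately from the factors matrix.
attr(t0, "intercept")
attr(t1, "intercept")

# Do the two factors matrices differ at all?
identical(attr(t0, "factors"), attr(t1, "factors"))

# Design-matrix columns for each formula.
colnames(model.matrix(f0, iris))
colnames(model.matrix(f1, iris))
```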
Is this a bug?
Many thanks in advance,
Pat