[Rd] Model frame when LHS is cbind (PR#14189)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu Jan 21 20:01:11 CET 2010
A few points.
0) This seems a Wishlist item, but it does not say so (see the section
on BUGS in the FAQ).
1) A formula does not need to have an lhs, and it is an assumption
that the response is the first element of 'variables' (an assumption
not made a couple of lines later when 'resp' is used).
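To illustrate point 1, here is a minimal sketch using the standard mtcars data. It only shows how a formula without an lhs looks to terms() and model.frame(); the object names (tt, resp, mf and so on) are mine and not part of models.R.

```r
# A one-sided formula has no lhs, so terms() records the response index as 0.
tt <- terms(~ wt + hp)
resp <- attr(tt, "response")

# The first variable in the resulting model frame is then a predictor,
# not a response, so variables[[1L]] cannot be assumed to be the response.
mf <- model.frame(~ wt + hp, data = mtcars)
first_name <- names(mf)[1L]

# For comparison, a two-sided formula has response index 1.
resp2 <- attr(terms(qsec ~ wt), "response")
```

This is why testing attr(formula, "response") > 0L, as the existing code does a couple of lines later, is safer than indexing the first element unconditionally.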
2) I don't think this is the best way to get names. If I do
fm <- lm(cbind(a=qsec,b=log(hp),sqrt(disp))~wt, data=mtcars)
I want a and b as names, but that is not what your code gives. And if
I do
> X <- with(mtcars, cbind(a = qsec, b = log(hp), c=sqrt(disp)))
> fm <- lm(X ~ wt, data=mtcars)
> model.frame(fm)[[1]]
[,1] [,2] [,3]
You've lost the names that the current code gives.
The logic is that if you use a lhs which is a matrix with column
names, then those names are used. If (as you did), you use one with
empty column names, that is what you get in the model frame. This
seems much more in the spirit of R than second-guessing that the
author actually meant to give column names and create them, let alone
renaming the columns to be different than the names supplied.
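A small sketch of the difference, again with mtcars. The variable names (f, current, proposed, currentX) are illustrative only; 'proposed' merely evaluates the expression from the suggested patch outside model.frame.default(), so treat it as an approximation of what the patch would assign.

```r
# lhs built with cbind(), two columns named by the user, one left unnamed.
f <- cbind(a = qsec, b = log(hp), sqrt(disp)) ~ wt
mf <- model.frame(f, data = mtcars)

# Current behaviour: the user's names are kept, the unnamed column stays empty.
current <- colnames(mf[[1L]])

# Expression used by the proposed patch: deparsed arguments of the lhs call,
# which would replace the user-supplied names 'a' and 'b'.
proposed <- as.character(f[[2L]])[-1L]

# lhs given as a pre-built matrix with column names: the names carry through.
X <- with(mtcars, cbind(a = qsec, b = log(hp), c = sqrt(disp)))
mfX <- model.frame(X ~ wt, data = mtcars)
currentX <- colnames(mfX[[1L]])
```

Supplying the names in the call, e.g. cbind(qsec, log_hp = log(hp)), gets legible columns today without any change to model.frame.default().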
3) It looks to me as if you wanted
cbind(qsec, log(hp), sqrt(disp), deparse.level=2)
but that does not give names (despite the description). I think that
is a bug, and one that can easily be fixed. That way we can fulfil your
wish without breaking other people's code.
On Tue, 19 Jan 2010, arnima at hafro.is wrote:
> The model frame shows the response and predictors in a data frame with
> nicely labelled columns:
>
> fm <- lm(wt~qsec+log(hp)+sqrt(disp), data=mtcars)
> model.frame(fm) # ok
>
> When the left hand side consists of more than one response, those response
> variables still look good, inside a matrix:
>
> fm <- lm(cbind(qsec,hp,disp)~wt, data=mtcars)
> model.frame(fm)[[1]] # ok
>
> A problem arises when some of the response variables are transformed:
>
> fm <- lm(cbind(qsec,log(hp),sqrt(disp))~wt, data=mtcars)
> model.frame(fm)[[1]] # ugh, missing column names
>
> The model frame is useful for many things, even more so when all column
> names are legible. Therefore I propose adding two new lines to
> model.frame.default() between lines 371 and 372 in
> R-patched_2010-01-12/src/library/stats/R/models.R:
>
> varnames <- sapply(vars, deparse, width.cutoff = 500)[-1L]
> variables <- eval(predvars, data, env)
>
> NEW if (is.matrix(variables[[1L]]))
> NEW colnames(variables[[1L]]) <- as.character(formula[[2L]])[-1L]
>
> if (is.null(rownames) && (resp <- attr(formula, "response")) >
> 0L) {
>
> With this fix, the above example returns legible column names:
>
> fm <- lm(cbind(qsec,log(hp),sqrt(disp))~wt, data=mtcars)
> model.frame(fm)[[1]] # nice column names
>
> I hope the R development team can either commit this fix or improve it.
>
> Thanks,
>
> Arni
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595