[Rd] issue with model.frame()

William Dunlap wdunlap at tibco.com
Tue May 1 20:38:18 CEST 2018


You run into the same problem when using 'non-syntactical' names:

> mfB <- model.frame(y ~ `Temp(C)` + `Pres(mb)`,
data=data.frame(check.names=FALSE, y=1:10, `Temp(C)`=21:30,
`Pres(mb)`=991:1000))
> match(attr(terms(mfB), "term.labels"), names(mfB))   # gives NA's
[1] NA NA
> attr(terms(mfB), "term.labels")
[1] "`Temp(C)`"  "`Pres(mb)`"
> names(mfB)
[1] "y"        "Temp(C)"  "Pres(mb)"

Note that names(mfB) does not give a hint as to whether they represent R
expressions or not (in this case they do not).  When they do represent R
expressions then one could parse() them and compare them to
as.list(attr(terms(mfB), "variables"))[-1].
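
The parse-and-compare idea can be sketched roughly as follows. This is only an illustration, not a tested fix for coxph: the helper names (d, vars, labs, lab_exprs, idx) are made up here, and it assumes the columns of the model frame come in the same order as the "variables" attribute of its terms, which holds for the example above.

```r
# Same non-syntactic example as above
d <- data.frame(check.names = FALSE, y = 1:10,
                `Temp(C)` = 21:30, `Pres(mb)` = 991:1000)
mfB <- model.frame(y ~ `Temp(C)` + `Pres(mb)`, data = d)

# Variables of the model frame as unevaluated R expressions
# ([-1] drops the leading `list` symbol of the call)
vars <- as.list(attr(terms(mfB), "variables"))[-1]

# Parse each term label back into an expression
labs <- attr(terms(mfB), "term.labels")
lab_exprs <- lapply(labs, function(s) parse(text = s, keep.source = FALSE)[[1]])

# Match on expressions instead of on character strings
# (assumption: position in vars == column position in mfB)
idx <- vapply(lab_exprs, function(e) {
  hit <- which(vapply(vars, identical, logical(1), e))
  if (length(hit)) hit[1] else NA_integer_
}, integer(1))

names(mfB)[idx]
```

Since parse() ignores line breaks inside a call, the same comparison might also cope with a term label that contains an embedded newline, as in the original report, but I have not checked that case.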


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, May 1, 2018 at 6:11 AM, Therneau, Terry M., Ph.D. via R-devel <
r-devel at r-project.org> wrote:

> A user sent me an example where coxph fails, and the root of the failure
> is a case where names(mf) is not equal to the term.labels attribute of the
> formula -- the latter has an extraneous newline. Here is an example that
> does not use the survival library.
>
> # first create a data set with many long names
> n <- 30  # number of rows for the dummy data set
> vname <- vector("character", 26)
> for (i in 1:26) vname[i] <- paste(rep(letters[1:i],2), collapse='')  # long variable names
>
> tdata <- data.frame(y=1:n, matrix(runif(n*26), nrow=n))
> names(tdata) <- c('y', vname)
>
> # Use it in a formula
> myform <- paste("y ~ cbind(", paste(vname, collapse=", "), ")")
> mf <- model.frame(formula(myform), data=tdata)
>
> match(attr(terms(mf), "term.labels"), names(mf))   # gives NA
>
> ----
>
> In the user's case the function is ridge(x1, x2, ....) rather than cbind,
> but the effect is the same.
> Any ideas for a work around?
>
> Aside: the ridge() function is very simple, it was added as an example to
> show how a user can add their own penalization to coxph.  I never expected
> serious use of it.  For this particular user the best answer is to use
> glmnet instead.   He/she is trying to apply an L2 penalty to a large number
> of SNP * covariate interactions.
>
> Terry T.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
