[Rd] backquotes and term.labels
Ben Bolker
bbolker at gmail.com
Wed Mar 7 14:39:08 CET 2018
I knew I had seen this before but couldn't previously remember where.
https://github.com/lme4/lme4/issues/441 ... I initially fixed it with
gsub(), but (pushed by Martin Maechler to do better) I eventually
fixed it by storing the original names of the model frame (without
backticks) as an attribute for later retrieval:
https://github.com/lme4/lme4/commit/56416fc8b3b5153df7df5547082835c5d5725e89.
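
For what it's worth, here is a minimal sketch of the "store the names for later retrieval" idea. It is not the code from the lme4 commit above: the attribute name "frame_names" and the helper cols_for_term() are made up for illustration, and it assumes that the rows of attr(Terms, "factors") are in the same order as the variable columns of the model frame.

```r
# Illustration only: "frame_names" and cols_for_term() are hypothetical names.
d <- data.frame(check.names = FALSE, y = 1/(1:5), `x y z` = cos(1:5) + 2)
mf <- model.frame(y ~ `x y z`, data = d)
Terms <- attr(mf, "terms")

# Record the model-frame column names (no backticks) once, by position.
# Assumption: rows of "factors" follow the order of the variable columns.
fac <- attr(Terms, "factors")
attr(Terms, "frame_names") <- names(mf)[seq_len(nrow(fac))]

# Look up the model-frame columns behind one term label by position,
# rather than by matching the (possibly backquoted) label text.
cols_for_term <- function(Terms, label) {
  j <- match(label, attr(Terms, "term.labels"))
  fac <- attr(Terms, "factors")
  attr(Terms, "frame_names")[fac[, j] > 0]
}

cols <- cols_for_term(Terms, attr(Terms, "term.labels")[1])
print(mf[cols])  # subscripting by the stored name works
```

The point of going by position is that nothing has to guess where backquotes were added.
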
On Wed, Mar 7, 2018 at 8:22 AM, Therneau, Terry M., Ph.D. via R-devel
<r-devel at r-project.org> wrote:
> Thanks to Bill Dunlap for the clarification. On follow-up it turns out that
> this will be an issue for many if not most of the routines in the survival
> package: a lot of them look at the terms structure and make use of the
> dimnames of attr(terms, 'factors'), which also keeps the unneeded
> backquotes. Others use the term.labels attribute. To dodge this I will
> need to create a fixterms() routine which I call at the top of every single
> routine in the library.
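
A rough sketch of what such a fixterms() might look like, in case it is useful. This is a guess at the idea, not code from the survival package. It strips the backquotes only from labels that consist of a single quoted name and leaves labels such as log(`b$a$d`) alone, since those keep their backquotes in the model frame as well. Whether other modelling functions accept the modified terms object is not checked here; the intent is only to make subscripting work.

```r
# Hypothetical helper: remove the outer backquotes from labels that are
# exactly one backquoted name; every other label is left untouched.
fixterms <- function(Terms) {
  strip <- function(x) sub("^`([^`]*)`$", "\\1", x)
  attr(Terms, "term.labels") <- strip(attr(Terms, "term.labels"))
  fac <- attr(Terms, "factors")
  if (is.matrix(fac)) {              # "factors" is not a matrix if there are no terms
    dimnames(fac) <- lapply(dimnames(fac), strip)
    attr(Terms, "factors") <- fac
  }
  Terms
}

# Data taken from Bill Dunlap's example further down the thread.
d <- data.frame(check.names = FALSE, y = 1/(1:5),
                `b$a$d` = sin(1:5) + 2, `x y z` = cos(1:5) + 2)
Terms <- terms(y ~ log(`b$a$d`) + `x y z`)
m <- model.frame(Terms, data = d)

fixed <- fixterms(Terms)
labs <- attr(fixed, "term.labels")
print(labs)
print(labs %in% names(m))            # do all labels now match a column?
```
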
>
> Is there a chance for a fix at a higher level?
>
> Terry T.
>
>
>
> On 03/05/2018 03:55 PM, William Dunlap wrote:
>>
>> I believe this has to do with terms() making "term.labels" (hence the
>> dimnames of "factors") with deparse(), so that the backquotes are
>> included for non-syntactic names. The backquotes are not in the column
>> names of the input data.frame (nor the model frame), so you get a
>> mismatch when subscripting the data.frame or model.frame with elements
>> of terms()$term.labels.
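
The two spellings can be seen directly by calling deparse() on a symbol. This is only an illustration of the mismatch, not a trace of what terms() does internally.

```r
nm <- as.name("x y z")                    # a non-syntactic name
quoted <- deparse(nm, backtick = TRUE)    # the backquoted form, as in term.labels
plain  <- deparse(nm, backtick = FALSE)   # the bare form, as in column names
print(c(quoted, plain))
print(quoted == plain)                    # FALSE, hence the failed subscript
```
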
>>
>> I think you can avoid the problem by adding right after
>> ll <- attr(Terms, "term.labels")
>> the line
>> ll <- gsub("^`|`$", "", ll)
>>
>> E.g.,
>>
>> > d <- data.frame(check.names=FALSE, y=1/(1:5), `b$a$d`=sin(1:5)+2,
>> +                 `x y z`=cos(1:5)+2)
>> > Terms <- terms( y ~ log(`b$a$d`) + `x y z` )
>> > m <- model.frame(Terms, data=d)
>> > colnames(m)
>> [1] "y" "log(`b$a$d`)" "x y z"
>> > attr(Terms, "term.labels")
>> [1] "log(`b$a$d`)" "`x y z`"
>> > ll <- attr(Terms, "term.labels")
>> > gsub("^`|`$", "", ll)
>> [1] "log(`b$a$d`)" "x y z"
>>
>> It is a bit of a mess.
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com <http://tibco.com>
>>
>> On Mon, Mar 5, 2018 at 12:55 PM, Therneau, Terry M., Ph.D. via R-devel
>> <r-devel at r-project.org> wrote:
>>
>> A user reported a problem with the survdiff function and the use of
>> variables that contain a space. Here is a simple example. The same
>> issue occurs in survfit for the same reason.
>>
>> lung2 <- lung
>> names(lung2)[1] <- "in st" # old name is inst
>> survdiff(Surv(time, status) ~ `in st`, data=lung2)
>> Error in `[.data.frame`(m, ll) : undefined columns selected
>>
>> In the body of the code the program wants to send all of the right-hand
>> side variables forward to the strata() function. The code looks more or
>> less like this, where m is the model frame:
>>
>> Terms <- terms(m)
>> index <- attr(Terms, "term.labels")
>> if (length(index) == 0) X <- rep(1L, n) # no covariates
>> else X <- strata(m[index])
>>
>> For the variable with a space in the name the term.label is "`in st`",
>> and the subscript fails.
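
The failure can be reproduced without the survival package. A small sketch with made-up data (the data frame d below is not from the original report):

```r
# Made-up data; only the column name with a space matters here.
d <- data.frame(y = c(2.1, 3.4, 1.9, 4.2), g = c("a", "b", "a", "b"))
names(d)[2] <- "in st"

m <- model.frame(y ~ `in st`, data = d)
index <- attr(terms(m), "term.labels")

print(names(m))   # column names of the model frame
print(index)      # term label for the same variable

# Subscripting the model frame by the term label fails; capture the message.
res <- tryCatch(m[index], error = function(e) conditionMessage(e))
print(res)
```
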
>>
>> Is this intended behaviour or a bug? The issue is that the name of this
>> column in the model frame does not have the backticks, while the terms
>> structure does have them.
>>
>> Terry T.
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>