[Rd] Fwd: Re: [EXTERNAL] Re: backquotes and term.labels

Therneau, Terry M., Ph.D. therneau at mayo.edu
Thu Mar 8 15:39:42 CET 2018


Ben,

Looking at your notes, it appears that your solution is to write your own terms() function
for lme.  It is easy to verify that the "varnames.fixed" attribute is not returned by the
usual terms function.
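
A minimal check of that claim, using only base R (the formula is borrowed from Bill's
example further down; nothing here is specific to lme4 or survival):

```r
# stats::terms() on a formula with non-syntactic names.
Terms <- terms(y ~ log(`b$a$d`) + `x y z`)

# The attributes that the standard terms() sets on the object.
names(attributes(Terms))

# No "varnames.fixed" attribute is present (exact = TRUE avoids partial matching).
is.null(attr(Terms, "varnames.fixed", exact = TRUE))
```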

Then I also need to write my own terms function for the survival and coxme packages?
Because of the need to treat strata() terms in a special way I manipulate the
formula/terms in nearly every routine.

Extrapolating: every R package that tries to examine formulas and partition them into bits
needs its own terms function?  This does not look like a good solution to me.
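
For concreteness, here is a rough sketch of what such a per-package workaround might look
like.  The name fixterms() is hypothetical and is not in base R or survival; it simply
applies the gsub() from Bill's note to both "term.labels" and the dimnames of "factors".
It is only safe when each term is a single backquoted name: an interaction of two such
names, or a call like log(`b$a$d`), would not be handled correctly.

```r
# Hypothetical helper: strip the outer backquotes that terms() puts around
# non-syntactic variable names.  An illustration only, not a general fix.
fixterms <- function(Terms) {
  strip <- function(x) gsub("^`|`$", "", x)
  attr(Terms, "term.labels") <- strip(attr(Terms, "term.labels"))
  fac <- attr(Terms, "factors")
  # "factors" is an empty integer vector when there are no right-hand side terms
  if (is.matrix(fac)) dimnames(fac) <- lapply(dimnames(fac), strip)
  attr(Terms, "factors") <- fac
  Terms
}

d <- data.frame(check.names = FALSE, y = 1/(1:5), `x y z` = cos(1:5) + 2)
Terms <- fixterms(terms(y ~ `x y z`))
m <- model.frame(y ~ `x y z`, data = d)

# The cleaned label now matches the model frame's column name.
m[attr(Terms, "term.labels")]
```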

On 03/07/2018 07:39 AM, Ben Bolker wrote:
> I knew I had seen this before but couldn't previously remember where.
> https://github.com/lme4/lme4/issues/441 ... I initially fixed with
> gsub(), but (pushed by Martin Maechler to do better) I eventually
> fixed it by storing the original names of the model frame (without
> backticks) as an attribute for later retrieval:
> https://github.com/lme4/lme4/commit/56416fc8b3b5153df7df5547082835c5d5725e89.
>
>
> On Wed, Mar 7, 2018 at 8:22 AM, Therneau, Terry M., Ph.D. via R-devel
> <r-devel at r-project.org> wrote:
>> Thanks to Bill Dunlap for the clarification.  On follow-up it turns out that
>> this will be an issue for many if not most of the routines in the survival
>> package: a lot of them look at the terms structure and make use of the
>> dimnames of attr(terms, 'factors'), which also keeps the unneeded
>> backquotes.  Others use the term.labels attribute.  To dodge this I will
>> need to create a fixterms() routine which I call at the top of every single
>> routine in the library.
>>
>> Is there a chance for a fix at a higher level?
>>
>> Terry T.
>>
>>
>>
>> On 03/05/2018 03:55 PM, William Dunlap wrote:
>>> I believe this has to do with terms() making "term.labels" (hence the
>>> dimnames of "factors") with deparse(), so that the backquotes are included
>>> for non-syntactic names.  The backquotes are not in the column names of the
>>> input data.frame (nor model frame) so you get a mismatch when subscripting
>>> the data.frame or model.frame with elements of terms()$term.labels.
>>>
>>> I think you can avoid the problem by adding right after
>>>       ll <- attr(Terms, "term.labels")
>>> the line
>>>       ll <- gsub("^`|`$", "", ll)
>>>
>>> E.g.,
>>>
>>>   > d <- data.frame(check.names=FALSE, y=1/(1:5), `b$a$d`=sin(1:5)+2, `x y z`=cos(1:5)+2)
>>>   > Terms <- terms( y ~ log(`b$a$d`) + `x y z` )
>>>   > m <- model.frame(Terms, data=d)
>>>   > colnames(m)
>>> [1] "y"            "log(`b$a$d`)" "x y z"
>>>   > attr(Terms, "term.labels")
>>> [1] "log(`b$a$d`)" "`x y z`"
>>>   >   ll <- attr(Terms, "term.labels")
>>>   > gsub("^`|`$", "", ll)
>>> [1] "log(`b$a$d`)" "x y z"
>>>
>>> It is a bit of a mess.
>>>
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com <http://tibco.com>
>>>
>>> On Mon, Mar 5, 2018 at 12:55 PM, Therneau, Terry M., Ph.D. via R-devel
>>> <r-devel at r-project.org <mailto:r-devel at r-project.org>> wrote:
>>>
>>>      A user reported a problem with the survdiff function and the use of
>>>      variables that contain a space.  Here is a simple example.  The same
>>>      issue occurs in survfit for the same reason.
>>>
>>>      lung2 <- lung
>>>      names(lung2)[1] <- "in st"   # old name is inst
>>>      survdiff(Surv(time, status) ~ `in st`, data=lung2)
>>>      Error in `[.data.frame`(m, ll) : undefined columns selected
>>>
>>>      In the body of the code the program wants to send all of the right-hand
>>>      side variables forward to the strata() function.  The code looks more or
>>>      less like this, where m is the model frame
>>>
>>>         Terms <- terms(m)
>>>         index <- attr(Terms, "term.labels")
>>>         if (length(index) == 0)  X <- rep(1L, n)  # no covariates
>>>         else X <- strata(m[index])
>>>
>>>      For the variable with a space in the name the term.label is "`in st`",
>>>      and the subscript fails.
>>>
>>>      Is this intended behaviour or a bug?  The issue is that the name of this
>>>      column in the model frame does not have the backticks, while the terms
>>>      structure does have them.
>>>
>>>      Terry T.
>>>
>>>      ______________________________________________
>>>      R-devel at r-project.org <mailto:R-devel at r-project.org> mailing list
>>>      https://stat.ethz.ch/mailman/listinfo/r-devel
>>>      <https://stat.ethz.ch/mailman/listinfo/r-devel>
>>>
>>>


