[Rd] Fwd: Re: [EXTERNAL] Re: backquotes and term.labels
Martin Maechler
maechler at stat.math.ethz.ch
Thu Mar 8 16:07:12 CET 2018
>>>>> Ben Bolker <bbolker at gmail.com>
>>>>> on Thu, 8 Mar 2018 09:42:40 -0500 writes:
> Meant to respond to this but forgot.
> I didn't write a new terms() function -- I added an attribute to the
> terms() (a vector of the names of the constructed model matrix), thus
> preserving the information at the point when it was available.
> I do agree that it would be preferable to have an upstream fix ...
Did anybody ever propose a small patch to the upstream sources,
including a reprex (or two: one for lme4, one for survival)?
I'm open to looking at one, though not in the next few days.
Martin
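
For reference, a minimal base-R sketch of the mismatch discussed in the quoted messages: the "term.labels" attribute keeps the backquotes, the model frame column names do not. The data frame d and the variable name `x y z` are made-up illustration data, and the gsub() call is the workaround quoted further down, not an upstream fix.

```r
# Hypothetical example data with a non-syntactic column name.
d <- data.frame(y = 1 / (1:5), `x y z` = cos(1:5) + 2, check.names = FALSE)

Terms <- terms(y ~ `x y z`)
m <- model.frame(Terms, data = d)

labs <- attr(Terms, "term.labels")  # deparsed labels, backquotes kept
cols <- colnames(m)                 # model frame names, no backquotes

# Subscripting the model frame with the raw label fails;
# capture the error message instead of stopping.
direct <- tryCatch(m[labs], error = function(e) conditionMessage(e))

# Workaround from the thread: strip one leading and one trailing backquote.
stripped <- gsub("^`|`$", "", labs)
ok <- m[stripped]
```

Note that the regular expression only touches backquotes at the very start or end of a label, so a label such as "log(`b$a$d`)" is left unchanged.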
> On Thu, Mar 8, 2018 at 9:39 AM, Therneau, Terry M., Ph.D. via R-devel
> <r-devel at r-project.org> wrote:
>> Ben,
>>
>>
>> Looking at your notes, it appears that your solution is to write your own
>> terms() function for lme. It is easy to verify that the "varnames.fixed"
>> attribute is not returned by the usual terms function.
>>
>> Then I also need to write my own terms function for the survival and coxme
>> packages? Because of the need to treat strata() terms in a special way, I
>> manipulate the formula/terms in nearly every routine.
>>
>> Extrapolating: every R package that tries to examine formulas and partition
>> them into bits needs its own terms function? This does not look like a good
>> solution to me.
>>
>> On 03/07/2018 07:39 AM, Ben Bolker wrote:
>>>
>>> I knew I had seen this before but couldn't previously remember where.
>>> https://github.com/lme4/lme4/issues/441 ... I initially fixed with
>>> gsub(), but (pushed by Martin Maechler to do better) I eventually
>>> fixed it by storing the original names of the model frame (without
>>> backticks) as an attribute for later retrieval:
>>>
>>> https://github.com/lme4/lme4/commit/56416fc8b3b5153df7df5547082835c5d5725e89.
>>>
>>>
>>> On Wed, Mar 7, 2018 at 8:22 AM, Therneau, Terry M., Ph.D. via R-devel
>>> <r-devel at r-project.org> wrote:
>>>>
>>>> Thanks to Bill Dunlap for the clarification. On follow-up it turns out
>>>> that this will be an issue for many if not most of the routines in the
>>>> survival package: a lot of them look at the terms structure and make use
>>>> of the dimnames of attr(terms, 'factors'), which also keeps the unneeded
>>>> backquotes. Others use the term.labels attribute. To dodge this I will
>>>> need to create a fixterms() routine which I call at the top of every
>>>> single routine in the library.
>>>>
>>>> Is there a chance for a fix at a higher level?
>>>>
>>>> Terry T.
>>>>
>>>>
>>>>
>>>> On 03/05/2018 03:55 PM, William Dunlap wrote:
>>>>>
>>>>> I believe this has to do with terms() making "term.labels" (hence the
>>>>> dimnames of "factors") with deparse(), so that the backquotes are
>>>>> included for non-syntactic names. The backquotes are not in the column
>>>>> names of the input data.frame (nor model frame), so you get a mismatch
>>>>> when subscripting the data.frame or model.frame with elements of
>>>>> terms()$term.labels.
>>>>>
>>>>> I think you can avoid the problem by adding right after
>>>>> ll <- attr(Terms, "term.labels")
>>>>> the line
>>>>> ll <- gsub("^`|`$", "", ll)
>>>>>
>>>>> E.g.,
>>>>>
>>>>> > d <- data.frame(check.names=FALSE, y=1/(1:5), `b$a$d`=sin(1:5)+2,
>>>>> +                 `x y z`=cos(1:5)+2)
>>>>> > Terms <- terms( y ~ log(`b$a$d`) + `x y z` )
>>>>> > m <- model.frame(Terms, data=d)
>>>>> > colnames(m)
>>>>> [1] "y" "log(`b$a$d`)" "x y z"
>>>>> > attr(Terms, "term.labels")
>>>>> [1] "log(`b$a$d`)" "`x y z`"
>>>>> > ll <- attr(Terms, "term.labels")
>>>>> > gsub("^`|`$", "", ll)
>>>>> [1] "log(`b$a$d`)" "x y z"
>>>>>
>>>>> It is a bit of a mess.
>>>>>
>>>>>
>>>>> Bill Dunlap
>>>>> TIBCO Software
>>>>> wdunlap tibco.com
>>>>>
>>>>> On Mon, Mar 5, 2018 at 12:55 PM, Therneau, Terry M., Ph.D. via R-devel
>>>>> <r-devel at r-project.org> wrote:
>>>>>
>>>>> A user reported a problem with the survdiff function and the use of
>>>>> variables whose names contain a space. Here is a simple example. The
>>>>> same issue occurs in survfit for the same reason.
>>>>>
>>>>> lung2 <- lung
>>>>> names(lung2)[1] <- "in st" # old name is inst
>>>>> survdiff(Surv(time, status) ~ `in st`, data=lung2)
>>>>> Error in `[.data.frame`(m, ll) : undefined columns selected
>>>>>
>>>>> In the body of the code the program wants to send all of the
>>>>> right-hand side variables forward to the strata() function. The code
>>>>> looks more or less like this, where m is the model frame:
>>>>>
>>>>> Terms <- terms(m)
>>>>> index <- attr(Terms, "term.labels")
>>>>> if (length(index) == 0) X <- rep(1L, n) # no covariates
>>>>> else X <- strata(m[index])
>>>>>
>>>>> For the variable with a space in the name the term.label is "`in st`",
>>>>> and the subscript fails.
>>>>>
>>>>> Is this intended behaviour or a bug? The issue is that the name of
>>>>> this column in the model frame does not have the backticks, while the
>>>>> terms structure does have them.
>>>>>
>>>>> Terry T.
>>>>>
>>>>> ______________________________________________
>>>>> R-devel at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>>