[Rd] Fwd: Re: [EXTERNAL] Re: backquotes and term.labels

Martin Maechler maechler at stat.math.ethz.ch
Thu Mar 8 16:07:12 CET 2018


>>>>> Ben Bolker <bbolker at gmail.com>
>>>>>     on Thu, 8 Mar 2018 09:42:40 -0500 writes:

    > Meant to respond to this but forgot.
    > I didn't write a new terms() function -- I added an attribute to the
    > terms() (a vector of the names of the constructed model matrix), thus
    > preserving the information at the point when it was available.
    > I do agree that it would be preferable to have an upstream fix ...

did anybody ever propose a small patch to the upstream sources ?

-- including a REPREX (or 2: one for lme4, one for survival) 

I'm open to look at one .. not for the next few days, though.

Martin
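
A minimal base-R reprex of the mismatch discussed in this thread might look like the sketch below. It reuses the `x y z` column from Bill Dunlap's example further down and does not depend on lme4 or survival. The object names (d, m, ll, res, ll2, ok) are illustrative only, and the gsub() step is the workaround quoted below, not a proposed upstream fix.

```r
# Data frame with a non-syntactic column name (it contains spaces).
d <- data.frame(check.names = FALSE, y = 1 / (1:5), `x y z` = cos(1:5) + 2)

m     <- model.frame(y ~ `x y z`, data = d)
Terms <- terms(m)
ll    <- attr(Terms, "term.labels")

print(ll)           # the term label keeps the backquotes
print(colnames(m))  # the model-frame column name does not

# Subscripting the model frame by the raw label fails; capture the message.
res <- tryCatch(m[ll], error = function(e) conditionMessage(e))
print(res)

# Stripping one leading and one trailing backquote makes the names match here.
# It only helps for labels that are a bare variable name, not log(`b$a$d`).
ll2 <- gsub("^`|`$", "", ll)
ok  <- m[ll2]
print(names(ok))
```

The same pattern, with survdiff() or an lme4 fit in place of the plain subscript, would give the package-specific variants.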


    > On Thu, Mar 8, 2018 at 9:39 AM, Therneau, Terry M., Ph.D. via R-devel
    > <r-devel at r-project.org> wrote:
    >> Ben,
    >> 
    >> 
    >> Looking at your notes, it appears that your solution is to write your own
    >> terms() function for lme.  It is easy to verify that the "varnames.fixed"
    >> attribute is not returned by the usual terms function.
    >> 
    >> Then I also need to write my own terms function for the survival and coxme
    >> packages?  Because of the need to treat strata() terms in a special way I
    >> manipulate the formula/terms in nearly every routine.
    >> 
    >> Extrapolating: every R package that tries to examine formulas and partition
    >> them into bits needs its own terms function?  This does not look like a good
    >> solution to me.
    >> 
    >> On 03/07/2018 07:39 AM, Ben Bolker wrote:
    >>> 
    >>> I knew I had seen this before but couldn't previously remember where.
    >>> https://github.com/lme4/lme4/issues/441 ... I initially fixed with
    >>> gsub(), but (pushed by Martin Maechler to do better) I eventually
    >>> fixed it by storing the original names of the model frame (without
    >>> backticks) as an attribute for later retrieval:
    >>> 
    >>> https://github.com/lme4/lme4/commit/56416fc8b3b5153df7df5547082835c5d5725e89.
    >>> 
    >>> 
    >>> On Wed, Mar 7, 2018 at 8:22 AM, Therneau, Terry M., Ph.D. via R-devel
    >>> <r-devel at r-project.org> wrote:
    >>>> 
    >>>> Thanks to Bill Dunlap for the clarification.  On follow-up it turns out
    >>>> that this will be an issue for many if not most of the routines in the
    >>>> survival package: a lot of them look at the terms structure and make use
    >>>> of the dimnames of attr(terms, 'factors'), which also keeps the unneeded
    >>>> backquotes.  Others use the term.labels attribute.  To dodge this I will
    >>>> need to create a fixterms() routine which I call at the top of every
    >>>> single routine in the library.
    >>>> 
    >>>> Is there a chance for a fix at a higher level?
    >>>> 
    >>>> Terry T.
    >>>> 
    >>>> 
    >>>> 
    >>>> On 03/05/2018 03:55 PM, William Dunlap wrote:
    >>>>> 
    >>>>> I believe this has to do with terms() making "term.labels" (hence the
    >>>>> dimnames of "factors") with deparse(), so that the backquotes are
    >>>>> included for non-syntactic names.  The backquotes are not in the column
    >>>>> names of the input data.frame (nor model frame) so you get a mismatch
    >>>>> when subscripting the data.frame or model.frame with elements of
    >>>>> terms()$term.labels.
    >>>>> 
    >>>>> I think you can avoid the problem by adding right after
    >>>>> ll <- attr(Terms, "term.labels")
    >>>>> the line
    >>>>> ll <- gsub("^`|`$", "", ll)
    >>>>> 
    >>>>> E.g.,
    >>>>> 
    >>>>> > d <- data.frame(check.names=FALSE, y=1/(1:5), `b$a$d`=sin(1:5)+2, `x y z`=cos(1:5)+2)
    >>>>> > Terms <- terms( y ~ log(`b$a$d`) + `x y z` )
    >>>>> > m <- model.frame(Terms, data=d)
    >>>>> > colnames(m)
    >>>>> [1] "y"            "log(`b$a$d`)" "x y z"
    >>>>> > attr(Terms, "term.labels")
    >>>>> [1] "log(`b$a$d`)" "`x y z`"
    >>>>> >   ll <- attr(Terms, "term.labels")
    >>>>> > gsub("^`|`$", "", ll)
    >>>>> [1] "log(`b$a$d`)" "x y z"
    >>>>> 
    >>>>> It is a bit of a mess.
    >>>>> 
    >>>>> 
    >>>>> Bill Dunlap
    >>>>> TIBCO Software
    >>>>> wdunlap tibco.com <http://tibco.com>
    >>>>> 
    >>>>> On Mon, Mar 5, 2018 at 12:55 PM, Therneau, Terry M., Ph.D. via R-devel
    >>>>> <r-devel at r-project.org> wrote:
    >>>>> 
    >>>>> A user reported a problem with the survdiff function and the use of
    >>>>> variables that contain a space.  Here is a simple example.  The same
    >>>>> issue occurs in survfit for the same reason.
    >>>>> 
    >>>>> lung2 <- lung
    >>>>> names(lung2)[1] <- "in st"   # old name is inst
    >>>>> survdiff(Surv(time, status) ~ `in st`, data=lung2)
    >>>>> Error in `[.data.frame`(m, ll) : undefined columns selected
    >>>>> 
    >>>>> In the body of the code the program wants to send all of the
    >>>>> right-hand side variables forward to the strata() function.  The code
    >>>>> looks more or less like this, where m is the model frame:
    >>>>> 
    >>>>> Terms <- terms(m)
    >>>>> index <- attr(Terms, "term.labels")
    >>>>> if (length(index) == 0)  X <- rep(1L, n)  # no covariates
    >>>>> else X <- strata(m[index])
    >>>>> 
    >>>>> For the variable with a space in the name the term.label is "`in st`",
    >>>>> and the subscript fails.
    >>>>> 
    >>>>> Is this intended behaviour or a bug?  The issue is that the name of
    >>>>> this column in the model frame does not have the backticks, while the
    >>>>> terms structure does have them.
    >>>>> 
    >>>>> Terry T.
    >>>>> 
    >>>>> ______________________________________________
    >>>>> R-devel at r-project.org mailing list
    >>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
    >>>>> 
    >>>>> 
    >> 
    >> 


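The approach Ben describes above (recording the un-backquoted names while they are still available and attaching them to the terms object) could be sketched roughly as follows. This is not lme4's actual code: the helper name tag_terms() is hypothetical, the attribute name "varnames.fixed" is borrowed from Terry's message purely for illustration, and the sketch assumes a model frame with no extra columns such as "(weights)".

```r
# Hypothetical helper: attach the model frame's own column names for the
# right-hand-side variables to its terms object.
tag_terms <- function(mf) {
  Terms <- attr(mf, "terms")
  resp  <- attr(Terms, "response")   # position of the response, 0 if none
  vars  <- names(mf)
  if (resp > 0) vars <- vars[-resp]  # drop the response column name
  attr(Terms, "varnames.fixed") <- vars
  attr(mf, "terms") <- Terms
  mf
}

d  <- data.frame(check.names = FALSE, y = 1 / (1:5), `x y z` = cos(1:5) + 2)
mf <- tag_terms(model.frame(y ~ `x y z`, data = d))

# Later code can subscript by the stored names instead of term.labels.
index <- attr(terms(mf), "varnames.fixed")
X     <- mf[index]
print(names(X))
```

As Terry points out above, every package that inspects formulas would need something like this, which is the argument for a fix in terms() itself.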

