[R] regex challenge
Frank Harrell
f.harrell at Vanderbilt.Edu
Fri Aug 16 04:46:59 CEST 2013
Bill that is very impresive. The only problem I'm having is that I want
the paste0(toupper(...)) to be a general function that returns a
character string that is a legal part of a formula object that can't be
converted to a 'name'.
Frank
-------------------------------
Oops, I left "(" out of the list of operators.
ff <- function(expr) {
if (is.call(expr) && is.name(expr[[1]]) &&
is.element(as.character(expr[[1]]),
c("~","+","-","*","/","%in%","("))) {
for(i in seq_along(expr)[-1]) {
expr[[i]] <- Recall(expr[[i]])
}
} else if (is.name(expr)) {
expr <- as.name(paste0(toupper(as.character(expr)), "z"))
}
expr
}
> ff(a)
CATz + (AGEz + Heading("Females") * (sex == "Female") * SBPz) *
Heading() * Gz + (AGEz + SBPz) * Heading() * TRIOz ~ Heading() *
COUNTRYz * Heading() * SEXz
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf
> Of William Dunlap
> Sent: Thursday, August 15, 2013 6:03 PM
> To: Frank Harrell; RHELP
> Subject: Re: [R] regex challenge
>
> Try this one
>
> ff <- function (expr)
> {
> if (is.call(expr) && is.name(expr[[1]]) &&
> is.element(as.character(expr[[1]]), c("~", "+", "-", "*",
"/", ":", "%in%"))) {
> # the above list should cover the standard formula operators.
> for (i in seq_along(expr)[-1]) {
> expr[[i]] <- Recall(expr[[i]])
> }
> }
> else if (is.name(expr)) {
> # the conversion itself
> expr <- as.name(paste0(toupper(as.character(expr)), "z"))
> }
> expr
> }
>
> > ff(a)
> CATz + (age + Heading("Females") * (sex == "Female") * sbp) *
> Heading() * Gz + (age + sbp) * Heading() * TRIOz ~ Heading() *
> COUNTRYz * Heading() * SEXz
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
> > -----Original Message-----
> > From: [hidden email] [mailto:[hidden email]] On Behalf
> > Of Frank Harrell
> > Sent: Thursday, August 15, 2013 4:45 PM
> > To: RHELP
> > Subject: Re: [R] regex challenge
> >
> > I really appreciate the excellent ideas from Bill Dunlap and Greg
Snow.
> > Both suggestions almost work perfectly. Greg's recognizes
expressions
> > such as sex=='female' but not ones such as age > 21, age < 21, a - b >
> > 0, and possibly other legal R expressions. Bill's idea is similar to
> > what Duncan Murdoch suggested to me. Bill's doesn't catch the case
when
> > a variable appears both in an expression and as a regular variable
(sex
> > in the example below):
> >
> > f <- function(formula) {
> > trms <- terms(formula)
> > variables <- as.list(attr(trms, "variables"))[-1]
> > ## the 'variables' attribute is stored as a call to list(),
> > ## so we changed the call to a list and removed the first element
> > ## to get the variables themselves.
> > if (attr(trms, "response") == 1) {
> > ## terms does not pull apart right hand side of formula,
> > ## so we assume each non-function is to be renamed.
> > responseVars <- lapply(all.vars(variables[[1]]), as.name)
> > variables <- variables[-1]
> > } else {
> > responseVars <- list()
> > }
> > ## omit non-name variables from list of ones to change.
> > ## This is where you could expand calls to certain functions.
> > variables <- variables[vapply(variables, is.name, TRUE)]
> > variables <- c(responseVars, variables) # all are names now
> > names(variables) <- vapply(variables, as.character, "")
> > newVars <- lapply(variables, function(v) as.name(paste0(toupper(v),
> > "z")))
> > formula(do.call("substitute", list(formula, newVars)),
> > env=environment(formula))
> > }
> >
> > a <- cat + (age + Heading("Females") * (sex == "Female") * sbp) *
> > Heading() * g + (age + sbp) * Heading() * trio ~ Heading() *
> > country * Heading() * sex
> > f(a)
> >
> > Output:
> >
> > CATz + (AGEz + Heading("Females") * (SEXz == "Female") * SBPz) *
> > Heading() * Gz + (AGEz + SBPz) * Heading() * TRIOz ~ Heading() *
> > COUNTRYz * Heading() * SEXz
> >
> > The method also doesn't work if I replace sex == 'Female' with x3 > 4,
> > converting to X3z > 4. I'm not clear on how to code what kind of
> > expressions to ignore.
> >
> > Thanks!
> > Frank
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
... [show rest of quote]
--
Frank E Harrell Jr Professor and Chairman School of Medicine
Department of Biostatistics Vanderbilt University
More information about the R-help
mailing list