[R] abbreviating words in a model formula

Mon Jul 8 20:02:27 CEST 2013

Try using all.names() to get all the names in the formula.  E.g.,

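f <- function (formula, minNameLength = 2, abbreviateFunctionNames = FALSE)
{
    # all names in the formula; functions = TRUE also returns log, sqrt, ~, +
    names <- all.names(formula, functions = abbreviateFunctionNames)
    # named list mapping each original name to its abbreviation, as a symbol
    abbrNames <- lapply(abbreviate(names, minlength = minNameLength),
        as.name)
    # swap the abbreviated symbols into the formula and convert it to text
    deparse(do.call("substitute", list(formula, abbrNames)))
}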

used as
  > f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor))
  [1] "MR ~ log(FP) + sqrt(SP)"
  > f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor), min=4)
  [1] "MyRs ~ log(FrsP) + sqrt(ScnP)"
  > f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor), abbreviateFunctionNames=TRUE)
  [1] "MR ~ lg(FP) + sq(SP)"
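
To see what f() is working with, here is a small sketch that prints the names all.names() extracts and then swaps in one abbreviated symbol by hand. The names fm, varNames, allNames, env and oneSwap are only for illustration.

```r
fm <- MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor)

# Variable names only (function names and operators are skipped).
varNames <- all.names(fm, functions = FALSE)
print(varNames)

# With functions = TRUE the operators and function names are included too.
allNames <- all.names(fm, functions = TRUE)
print(allNames)

# substitute() replaces symbols according to a named list; do.call() is
# used so that the formula itself, not the symbol 'fm', gets substituted into.
env <- list(MyResponse = as.name("MR"))
oneSwap <- deparse(do.call("substitute", list(fm, env)))
print(oneSwap)
```

Only MyResponse is listed in env, so the other two variables keep their full names in oneSwap.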

You could put that in a loop that stopped when nchar(f(...)) got small enough.
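
One way such a loop might look is sketched below. The wrapper abbrevFormula() and its arguments maxLen and startLen are hypothetical names, and f() is repeated only so the block runs on its own. It is not tuned for the "each word <= 6" requirement.

```r
# f() as defined above, repeated so this block is self-contained.
f <- function (formula, minNameLength = 2, abbreviateFunctionNames = FALSE)
{
    names <- all.names(formula, functions = abbreviateFunctionNames)
    abbrNames <- lapply(abbreviate(names, minlength = minNameLength),
        as.name)
    deparse(do.call("substitute", list(formula, abbrNames)))
}

# Hypothetical wrapper: try shorter and shorter minimum name lengths and
# stop as soon as the deparsed formula fits within maxLen characters.
abbrevFormula <- function(formula, maxLen = 40, startLen = 8)
{
    for (len in seq(startLen, 1)) {
        # deparse() can return several strings for a long formula;
        # join them and squeeze runs of whitespace to single spaces.
        s <- paste(f(formula, minNameLength = len), collapse = "")
        s <- gsub("\\s+", " ", s)
        if (nchar(s) <= maxLen) break
    }
    s  # the shortest attempt if nothing fit
}

s <- abbrevFormula(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor),
                   maxLen = 30)
print(s)
```

For a fitted model, something like abbrevFormula(formula(detg.m0)) should do the same for the model's formula. abbreviate() tries to keep abbreviations unique, so individual names can come out longer than the requested minimum.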

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Michael Friendly
> Sent: Monday, July 08, 2013 10:36 AM
> To: R-help
> Subject: [R] abbreviating words in a model formula
> 
> For an application, I need to get a character string representation of the
> formula or model call for glm objects, but also, for labeling output and
> plots, I want to be able to abbreviate the words (variables) in model terms.
> This requires some formula magic that I can't quite get, in particular
> extracting the terms from a formula and then the words in each term.
> 
> Perhaps there is some code for something similar I haven't found yet, or
> someone can suggest how to do this.
> 
> A runnable example to show what I mean:
> 
> Freq <- c(68,42,42,30, 37,52,24,43,
>      66,50,33,23, 47,55,23,47,
>      63,53,29,27, 57,49,19,29)
> 
> Temperature <- gl(2, 2, 24, labels = c("Low", "High"))
> Softness <- gl(3, 8, 24, labels = c("Hard","Medium","Soft"))
> M.user <- gl(2, 4, 24, labels = c("N", "Y"))
> Brand <- gl(2, 1, 24, labels = c("X", "M"))
> 
> detg <- data.frame(Freq,Temperature, Softness, M.user, Brand)
> detg.m0 <- glm(Freq ~ M.user*Temperature*Softness + Brand*M.user*Temperature,
>         family = poisson, data = detg)
> 
> detg.m1 <- glm(Freq ~ (M.user + Temperature + Softness + Brand),
>         family = poisson, data=detg)
> 
> detg.m2 <- glm(Freq ~ (M.user + Temperature + Softness + Brand)^2,
>         family = poisson, data=detg)
> 
> detg.m2a <- update(detg.m1, . ~ .^2)
> 
> In plot.lm, I found the following code to extract the model call from a glm
> object as a string and abbreviate it to a total length <=75.  I need a shorter
> total length, by abbreviating individual words in the model call, so the
> approach has to at least extract the terms in the model and then abbreviate
> the words in each term.
> 
> # from plot.lm: get model call as a string
> # TODO: how to use abbreviate to abbreviate the words in the model terms???
> mod.call <- function(x, max.len=75) {
>          cal <- x$call
>          if (!is.na(m.f <- match("formula", names(cal)))) {
>              cal <- cal[c(1, m.f)]
>              names(cal)[2L] <- ""
>          }
>          cc <- deparse(cal, max.len+5)
>          nc <- nchar(cc[1L], "c")
>          abbr <- length(cc) > 1 || nc > max.len
>          cap <- if (abbr)
>              paste(substr(cc[1L], 1L, min(max.len, nc)), "...")
>          else cc[1L]
>          cap
> }
> 
> Tests, & WANTED, say with max length of each word in the string <= 6 &
> maximum total length <= 40
> 
>  > mod.call(detg.m0)
> [1] "glm(Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature)"
> 
> WANTED, something like:
> "glm(Freq ~ M.user * Temp * Softne + Brand * M.user * Temp)"
> 
>  > mod.call(detg.m2a)
> [1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M ..."
>  >
>  > mod.call(detg.m2a, max.len=200)
> [1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M.user:Softness + M.user:Brand + Temperature:Softness + Temperature:Brand + Softness:Brand)"
>  >
> 
> WANTED, something closer to
> "glm(Freq ~ M + Tmp + Sft + Brnd + M:Tmp + M.:Sft + M.us:Brnd + Tmp:Sft + Tmp:Brnd + Sft:Brnd)"
> 
> TIA
> -Michael
> 
> 
> 
> --
> Michael Friendly     Email: friendly AT yorku DOT ca
> Professor, Psychology Dept. & Chair, Quantitative Methods
> York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
> 4700 Keele Street    Web:   http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA
> 