[R] abbreviating words in a model formula
William Dunlap
wdunlap at tibco.com
Mon Jul 8 20:54:54 CEST 2013
The call to all.names() below probably should have the unique=TRUE argument.
It doesn't make any difference in this particular code, but having repeated names
could cause problems in related code.
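
For reference, here is a minimal sketch of the same function with unique=TRUE added. The name abbrevFormula is made up for this illustration; the rest follows the code quoted below, and I have only checked it against the first example from that message.

```r
# Hypothetical name; same logic as f() quoted below, plus unique = TRUE.
abbrevFormula <- function(formula, minNameLength = 2,
                          abbreviateFunctionNames = FALSE)
{
    # unique = TRUE: each name is returned once, however often it occurs
    names <- all.names(formula, functions = abbreviateFunctionNames,
                       unique = TRUE)
    # abbreviate() keeps the original names as the names of its result,
    # so the list maps each original name to its abbreviated symbol
    abbrNames <- lapply(abbreviate(names, minlength = minNameLength),
                        as.name)
    deparse(do.call("substitute", list(formula, abbrNames)))
}

# A repeated variable is listed twice without unique = TRUE, once with it
all.names(y ~ x + I(x^2), functions = FALSE)
all.names(y ~ x + I(x^2), functions = FALSE, unique = TRUE)
```

The substitution list then has one entry per distinct name, which is the safer shape for any related code that looks names up in it.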
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of William Dunlap
> Sent: Monday, July 08, 2013 11:02 AM
> To: Michael Friendly; R-help
> Subject: Re: [R] abbreviating words in a model formula
>
> Try using all.names() to get all the names in the formula. E.g.,
>
> f <- function (formula, minNameLength = 2, abbreviateFunctionNames = FALSE)
> {
>     names <- all.names(formula, functions = abbreviateFunctionNames)
>     abbrNames <- lapply(abbreviate(names, minlength = minNameLength),
>         as.name)
>     deparse(do.call("substitute", list(formula, abbrNames)))
> }
>
> used as
> > f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor))
> [1] "MR ~ log(FP) + sqrt(SP)"
> > f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor), min=4)
> [1] "MyRs ~ log(FrsP) + sqrt(ScnP)"
> > f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor),
>       abbreviateFunctionNames=TRUE)
> [1] "MR ~ lg(FP) + sq(SP)"
>
> You could put that in a loop that stopped when nchar(f(...)) got small enough.
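
One way such a loop might look, as a sketch only. The function names abbrevOnce and shrinkFormula and the arguments maxChars and startLength are invented for this example; the idea is to lower minlength step by step and stop at the first result that fits.

```r
# Hypothetical helper: one pass of abbreviation at a given minlength.
abbrevOnce <- function(formula, minNameLength) {
    nms <- all.names(formula, functions = FALSE, unique = TRUE)
    abbr <- lapply(abbreviate(nms, minlength = minNameLength), as.name)
    # deparse() may split long results; glue the pieces back together
    paste(deparse(do.call("substitute", list(formula, abbr))), collapse = " ")
}

# Hypothetical wrapper: try shorter and shorter names until the string fits.
shrinkFormula <- function(formula, maxChars = 40, startLength = 8) {
    for (len in seq(startLength, 1)) {
        out <- abbrevOnce(formula, len)
        if (nchar(out) <= maxChars) break
    }
    out   # may still exceed maxChars if even minlength = 1 is too long
}

shrinkFormula(Freq ~ M.user*Temperature*Softness + Brand*M.user*Temperature)
```

Note that abbreviate() keeps its results unique by default, so the names can come out longer than the requested minlength, and the loop gives no guarantee that the target length is reached.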
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Friendly
> > Sent: Monday, July 08, 2013 10:36 AM
> > To: R-help
> > Subject: [R] abbreviating words in a model formula
> >
> > For an application, I need to get a character string representation of
> > the formula or model call for glm objects, but also, for labeling output
> > and plots, I want to be able to abbreviate the words (variables) in model
> > terms. This requires some formula magic that I can't quite get, in
> > particular extracting the terms from a formula and then the words in
> > each term.
> >
> > Perhaps there is some code for something similar that I haven't found
> > yet, or someone can suggest how to do this.
> >
> > A runnable example to show what I mean:
> >
> > Freq <- c(68,42,42,30, 37,52,24,43,
> >           66,50,33,23, 47,55,23,47,
> >           63,53,29,27, 57,49,19,29)
> >
> > Temperature <- gl(2, 2, 24, labels = c("Low", "High"))
> > Softness <- gl(3, 8, 24, labels = c("Hard","Medium","Soft"))
> > M.user <- gl(2, 4, 24, labels = c("N", "Y"))
> > Brand <- gl(2, 1, 24, labels = c("X", "M"))
> >
> > detg <- data.frame(Freq,Temperature, Softness, M.user, Brand)
> > detg.m0 <- glm(Freq ~ M.user*Temperature*Softness +
> >                       Brand*M.user*Temperature,
> >                family = poisson, data = detg)
> >
> > detg.m1 <- glm(Freq ~ (M.user + Temperature + Softness + Brand),
> >                family = poisson, data=detg)
> >
> > detg.m2 <- glm(Freq ~ (M.user + Temperature + Softness + Brand)^2,
> >                family = poisson, data=detg)
> >
> > detg.m2a <- update(detg.m1, . ~ .^2)
> >
> > In plot.lm, I found the following code to extract the model call from a
> > glm object as a string and abbreviate it to a total length <= 75. I need
> > a shorter total length, obtained by abbreviating individual words in the
> > model call, so the approach has to at least extract the terms in the
> > model and then abbreviate the words in each term.
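
A rough sketch of that "terms first, then words" route, using terms() to expand the formula and splitting each term label on ":". The function name abbrevTerms and its minlength argument are invented for this example, and it assumes a two-sided formula whose terms are plain variables and their interactions.

```r
# Hypothetical helper: abbreviate the variables inside each model term.
# Assumes a two-sided formula whose terms are plain variables and their
# interactions (no log(x), I(x^2), or "." in the formula).
abbrevTerms <- function(formula, minlength = 4) {
    labels <- attr(terms(formula), "term.labels")   # e.g. "M.user:Brand"
    words <- strsplit(labels, ":", fixed = TRUE)    # variables in each term
    abbr <- abbreviate(unique(unlist(words)), minlength = minlength)
    # abbr is named by the original variable names, so index it by name
    short <- vapply(words, function(w) paste(abbr[w], collapse = ":"),
                    character(1))
    response <- abbreviate(deparse(formula[[2]]), minlength = minlength)
    paste(response, "~", paste(short, collapse = " + "))
}

abbrevTerms(Freq ~ (M.user + Temperature + Softness + Brand)^2)
```

Because terms() expands operators such as * and ^2, the result lists every main effect and interaction separately, which can make it longer than an abbreviated version of the formula as originally written.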
> >
> > # from plot.lm: get model call as a string
> > # TODO: how to use abbreviate to abbreviate the words in the model terms???
> > mod.call <- function(x, max.len=75) {
> >     cal <- x$call
> >     if (!is.na(m.f <- match("formula", names(cal)))) {
> >         cal <- cal[c(1, m.f)]
> >         names(cal)[2L] <- ""
> >     }
> >     cc <- deparse(cal, max.len+5)
> >     nc <- nchar(cc[1L], "c")
> >     abbr <- length(cc) > 1 || nc > max.len
> >     cap <- if (abbr)
> >         paste(substr(cc[1L], 1L, min(max.len, nc)), "...")
> >     else cc[1L]
> >     cap
> > }
> >
> > Tests, and what is WANTED, say with the maximum length of each word in
> > the string <= 6 and a maximum total length <= 40:
> >
> > > mod.call(detg.m0)
> > [1] "glm(Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature)"
> >
> > WANTED, something like:
> > "glm(Freq ~ M.user * Temp * Softne + Brand * M.user * Temp)"
> >
> > > mod.call(detg.m2a)
> > [1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M ..."
> > >
> > > mod.call(detg.m2a, max.len=200)
> > [1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M.user:Softness + M.user:Brand + Temperature:Softness + Temperature:Brand + Softness:Brand)"
> > >
> >
> > WANTED, something closer to
> > "glm(Freq ~ M + Tmp + Sft + Brnd + M:Tmp + M.:Sft + M.us:Brnd + Tmp:Sft + Tmp:Brnd + Sft:Brnd)"
> >
> > TIA
> > -Michael
> >
> >
> >
> > --
> > Michael Friendly Email: friendly AT yorku DOT ca
> > Professor, Psychology Dept. & Chair, Quantitative Methods
> > York University Voice: 416 736-2100 x66249 Fax: 416 736-5814
> > 4700 Keele Street Web: http://www.datavis.ca
> > Toronto, ONT M3J 1P3 CANADA
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>