[Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments

Danny Smith d@nny @end|ng |rom gorch@@org
Mon May 20 05:23:39 CEST 2019


Hi Abs,

Re: your last point:

> You made an interesting comment.
>
> > This is not
> > always the desired behavior, because formulas are increasingly used
> > for purposes other than specifying linear models.
>
> Can I ask what these purposes are?



I'm not sure how relevant these are, or what Pavel was referring to
specifically, but there are a few alternative uses that I'm familiar with
in the tidyverse packages.

Since formulas store both an expression and an environment, they're really
useful for complex evaluation. rlang's "quosures" are a subclass of formula
<https://adv-r.hadley.nz/evaluation.html#quosure-impl>.
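
To illustrate the expression-plus-environment point, here is a minimal
base R sketch (no rlang needed). The names make_formula and scale are made
up for this example; it only shows that a formula remembers where it was
created, not how quosures are actually implemented.

```r
# Hypothetical helper: builds a one-sided formula inside a function call.
make_formula <- function() {
  scale <- 10      # local variable, only visible inside this call
  ~ x * scale      # the formula records this calling environment
}

f <- make_formula()

expr <- f[[2]]          # the unevaluated expression: x * scale
env  <- environment(f)  # the environment in which f was created

# `x` is supplied as data; `scale` is looked up in the formula's environment.
result <- eval(expr, list(x = 1:3), env)
print(result)
```

Even though scale is not defined at top level, the evaluation succeeds
because the lookup falls back to the environment stored with the formula.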

Otherwise, the main tidyverse use is as a shorthand for specifying anonymous
functions (this is used extensively, particularly in purrr). From
?dplyr::mutate_at:
# You can also pass formulas to create functions on the spot, purrr-style:
starwars %>% mutate_at(c("height", "mass"), ~scale2(., na.rm = TRUE))
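
As a rough idea of how such a shorthand can work, here is a base R sketch
that turns a one-sided formula into a function of `.`. The helper name
as_lambda is hypothetical and this is not purrr's or rlang's actual
implementation, just an approximation of the idea.

```r
# Hypothetical helper: convert a one-sided formula like ~ . * 2
# into a function whose single argument is named `.`.
as_lambda <- function(f) {
  stopifnot(inherits(f, "formula"), length(f) == 2L)
  fn <- function(.) NULL
  body(fn) <- f[[2]]                 # right-hand side becomes the body
  environment(fn) <- environment(f)  # keep the formula's environment
  fn
}

double_it <- as_lambda(~ . * 2)
print(sapply(1:3, double_it))
```

Note that the right-hand side is used verbatim as the function body, so
any rewriting of the formula would change what the function computes.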

Also see ?dplyr::case_when:
x <- 1:50
case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  TRUE ~ as.character(x)
)

And in base R, formulas are used in the plotting functions, e.g.:
## boxplot on a formula:
boxplot(count ~ spray, data = InsectSprays, col = "lightgray")

Cheers,
Danny

On Mon, May 20, 2019 at 12:12 PM Abby Spurdle <spurdle.a using gmail.com> wrote:

> Hi Pavel
> (Back On List)
>
> And my two cents...
>
> > At this time, the update.formula() method always performs a number of
> > transformations on the results, eliminating redundant variables and
> > reordering interactions to be after the main effects.
> > The proposal is to add an option simplify= (defaulting to TRUE,
> > for backwards compatibility) that if FALSE will skip the simplification
> > step.
> > Any thoughts? One particular question that Martin raised is whether the
> > UI should be just a single logical argument, or something else.
>
> Firstly, note that the constructor for formula objects behaves differently
> to the update method, so I think any changes should be consistent between
> the two functions.
> > #constructor - doesn't simplify
> > y ~ x + x
> y ~ x + x
> > #update method - does simplify
> > update (y ~ x, ~. + x)
> y ~ x
>
> Interestingly, this doesn't simplify.
> > update (y ~ I (x), ~. + x)
> y ~ I(x) + x
>
> I think that simplification could mean different things.
> So, there could be something like:
> > update (y ~ x, ~. + x, strip=FALSE)
> y ~ I (2 * x)
>
> I don't know how easy that would be to implement.
> (Symbolic computation on par with computer algebra systems is a discussion
> in itself...).
> And you could have one argument (say, method="simplify") rather than two or
> more logical arguments.
>
> It would also be possible to allow partial forms of simplification, by
> specifying which terms should be collapsed. However, I doubt that any
> usefulness this might have would justify the complexity.
> Feel free to disagree, though.
>
> You made an interesting comment.
>
> > This is not
> > always the desired behavior, because formulas are increasingly used
> > for purposes other than specifying linear models.
>
> Can I ask what these purposes are?
>
>
> kind regards
> Abs
