[R] [External] Creating model formulas programmatically

Ebert,Timothy Aaron tebert using ufl.edu
Sun Mar 30 15:04:22 CEST 2025


Option 1)
vec <- c("a", "b")
# For each order k >= 2, join every k-way combination of names with ":"
combinations <- lapply(seq_along(vec)[-1], function(k) combn(vec, k, FUN = paste, collapse = ":"))
vec <- c(vec, unlist(combinations))
vec

Option 2) Generate the interaction terms in the data frame that holds the data, then read the column names to get the new vector of variables. For numeric variables, an interaction term is just the product of the values of the variables involved. The analysis can compute that, or you can. You can do it with base R or with dplyr.
Base R only)
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
combinations <- combn(names(df), 2)
for (i in 1:ncol(combinations)) {
  col1 <- combinations[1, i]
  col2 <- combinations[2, i]
  df[[paste(col1, col2, sep = ":")]] <- df[[col1]] * df[[col2]]
}
df
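
As a quick check on the claim that a numeric interaction term is just the product of the variables' values, the sketch below compares the a:b column that model.matrix() builds with a manual product. The data frame dat and its values are made up for illustration. The equality holds for numeric variables only; factors are expanded into dummy columns first.

```r
# Hypothetical toy data (not from the thread)
set.seed(1)
dat <- data.frame(a = rnorm(6), b = rnorm(6))

# Let R build the design matrix, including the a:b interaction column
mm <- model.matrix(~ a + b + a:b, data = dat)

# Compare the interaction column with the manual product
same <- all.equal(unname(mm[, "a:b"]), dat$a * dat$b)
print(same)
```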

I can avoid paste in this alternative. I have tried to get closer to the programming goal, but sacrificed ease of interpretation in the output:
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
combinations <- combn(names(df), 2)
for (i in seq_len(ncol(combinations))) {
  df[[sprintf("col%d", i)]] <- df[[combinations[1, i]]] * df[[combinations[2, i]]]
}
df

Using dplyr)
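library(dplyr)
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
combinations <- combn(names(df), 2, simplify = FALSE)
# across() handles one column at a time, so compute each pairwise product
# separately and splice the named list into mutate()
products <- lapply(combinations, function(p) df[[p[1]]] * df[[p[2]]])
names(products) <- vapply(combinations, paste, character(1), collapse = ":")
df <- df %>% mutate(!!!products)
df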

At the end, use colnames() for the vector of independent variables. Rejoin the dependent variable to the data frame. Pass that to reformulate.
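
The last step could look roughly like the sketch below. The data frame dat, the response y, and the random values are assumptions made up for illustration; the product columns use the col1, col2, ... naming from the sprintf alternative above.

```r
# Hypothetical toy data and response (not from the thread)
set.seed(42)
dat <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20))
y <- rnorm(20)

# Add one product column per pair of predictors, named col1, col2, ...
pairs <- combn(names(dat), 2)
for (i in seq_len(ncol(pairs))) {
  dat[[sprintf("col%d", i)]] <- dat[[pairs[1, i]]] * dat[[pairs[2, i]]]
}

# The column names are the vector of independent variables
independent_vars <- colnames(dat)

# Rejoin the response, then build the formula and fit the model
dat$y <- y
f <- reformulate(independent_vars, response = "y")
fit <- lm(f, data = dat)
print(f)
```

One caveat, as far as I know: reformulate() parses its term labels, so a name such as "a:b" is read as the interaction operator rather than as a column called a:b. For numeric predictors the fitted model is the same either way.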

Tim
-----Original Message-----
From: Rui Barradas <ruipbarradas using sapo.pt>
Sent: Sunday, March 30, 2025 4:05 AM
To: Ebert,Timothy Aaron <tebert using ufl.edu>; Bert Gunter <bgunter.4567 using gmail.com>; Richard M. Heiberger <rmh using temple.edu>; R-help <R-help using r-project.org>
Subject: Re: [R] [External] Creating model formulas programmatically

Hello,

I thought of answering "reformulate can solve the problem" but how do you create quadratic terms with reformulate?

~(Heigh + Ho + Silver + Away)^2

is still a problem with no solution that I know of but paste/as.formula.
Or Bert's bquote or substitute.
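
For reference, a minimal sketch of how the quadratic formula might be built without pasting strings, along the lines of the bquote idea. This is an illustration only and not necessarily Bert's version; the helper names syms, rhs, f1 and f2 are made up here. The second variant still relies on reformulate(), which works from character labels internally, so only the first is string-free in spirit.

```r
somenames <- c("Heigh", "Ho", "Silver", "Away")

# Turn each name into a symbol and chain the symbols with `+`
syms <- lapply(somenames, as.name)
rhs <- Reduce(function(x, y) call("+", x, y), syms)

# Splice the sum into a one-sided formula template and evaluate it
f1 <- eval(bquote(~ (.(rhs))^2))

# Alternative: start from reformulate() and let update() square the right-hand side
f2 <- update(reformulate(somenames), ~ (.)^2)

print(f1)
print(f2)
```

Both formulas should expand to the same set of terms: the four main effects plus the six two-way interactions, which can be checked with attr(terms(f1), "term.labels").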

Rui Barradas

At 23:18 on 29/03/2025, Ebert,Timothy Aaron wrote:
> The general formula is y ~ a + b + c + ...
>
> There is this approach:
> formula <- reformulate(independent_vars, response = "y")
> model <- lm(formula, data = mydata)
> summary(model)
>
> It does not generate a string object, but the formula is still a string even if it is of class formula. Also, this approach only gives you +; if you want interactions or similar terms, you will need to code them into independent_vars.
>
> This technically satisfies the parameters as I understand them, but it is unsatisfying to me because it is playing with semantics. If I cannot generate a string (no matter what you call it), then I cannot get to y ~ a + b + c + ..., and without that I do not have a model.
>
> An alternative: Someone could write a function. Say I am using lm() so the function will take a vector, a data frame, and a symbol. It will put them together (as a black box) and spit out the answer. You will never see the string, so it satisfies that requirement. However, it will generate a string internally to make everything work. Again, playing with semantics.
>
> I suggest that as stated the problem does not have a solution in a meaningful way. The best I can do is to try to hide the string from you.
>
>
> Tim
>
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of Bert Gunter
> Sent: Saturday, March 29, 2025 6:40 PM
> To: Richard M. Heiberger <rmh using temple.edu>; R-help
> <R-help using r-project.org>
> Subject: Re: [R] [External] Creating model formulas programmatically
>
> Thanks, Rich.
>
> I thought of that, too, but it violates the spirit of my constraints (to avoid character strings), which I unfortunately did not clearly articulate.
> So my apologies for that failure. My concern is that with more complex model formulas, using as.formula, etc. to parse/convert character strings can get a bit hairy. But in most cases, as here maybe, it may be perfectly fine. So think of my post as mostly my attempt to learn some new tricks rather than to solve a useful problem. I hope this is not unfair to the list.
>
> Cheers,
> Bert
>
>
>
>
>
> On Sat, Mar 29, 2025 at 3:12 PM Richard M. Heiberger <rmh using temple.edu> wrote:
>
>>> somenames <- c("Heigh", "Ho", "Silver", "Away")
>>> as.formula(paste("~(",paste(somenames, collapse="+"),")^2"))
>> ~(Heigh + Ho + Silver + Away)^2
>>>
>>
>>> On Mar 29, 2025, at 14:30, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>>>
>>> somenames <- c("Heigh", "Ho", "Silver", "Away")
>>
>>
>>
>
>          [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide https://www.r-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

