# [R] variable names in lm formula ~.

Ista Zahn istazahn at gmail.com
Wed Jan 31 18:47:30 CET 2018

```
I poked at this a little bit and found that the issue exists in
stats:::C_termsform (which is called by terms.formula).

Here is a variation on the demonstrations provided by Vito and Bert earlier:

d <- data.frame(y = rnorm(10, 5, .5),
                age = rnorm(10),
                exp = rnorm(10),
                log = runif(10))

fs <- list(y ~ .,
           exp(y) ~ .,
           log(y) ~ .)

lapply(fs, function(x) terms(x, data = d)[[3]])
## [[1]]
## age + exp + log

## [[2]]
## age + log

## [[3]]
## age + exp

lapply(fs,
       function(x)
         .External(stats:::C_termsform,
                   x,
                   NULL,
                   d,
                   FALSE,
                   FALSE)[[3]])
## [[1]]
## age + exp + log

## [[2]]
## age + log

## [[3]]
## age + exp

I don't speak C so I stopped there.

Best,
Ista

On Tue, Jan 30, 2018 at 11:12 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> Well...
>
> ?terms.formula says:
>
> "data: a data frame from which the meaning of the special symbol . can
> be inferred. It is unused if there is no . in the formula."
>
> So this seems to me to be an obscure bug, as I have found no warning
> against this admittedly confusing but still, I think, legal syntax.
> Note:
>
>> d <- data.frame(log = runif(10), x = 1:10)
>> y <- rnorm(10,5)
>
>> m1 <- lm(y ~ ., data = d)
>> formula(m1)
> y ~ log + x
>
>> m2 <- update(m1, formula =log(y) ~.)
>> formula(m2)
> log(y) ~ log + x
>
>> m3 = lm(log(y) ~., data =d)
>> formula(m3)
> log(y) ~ x
>
> As always, correction appreciated if I'm wrong.
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Tue, Jan 30, 2018 at 6:23 AM, Jeff Newmiller
> <jdnewmil at dcn.davis.ca.us> wrote:
>>
>> Functions are first class objects, so some kind of collision is bound to happen if you do this... so don't.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On January 30, 2018 3:11:56 AM PST, "Vito M. R. Muggeo" <vito.muggeo at unipa.it> wrote:
>> >dear all,
>> >Is the following intentional? Am I missing anything in documentation?
>> >
>> >d<-data.frame(y=rnorm(10,5,.5),exp=rnorm(10), age=rnorm(10))
>> >formula(lm(exp(y)~exp+age, data=d))
>> >#--> exp(y) ~ exp + age
>> >
>> >formula(lm(exp(y)~., data=d))
>> >#--> exp(y) ~ age
>> >
>> >variable 'exp' (maybe indicating "experience") is not included in the
>> >model. The same happens with 'log' (and other function names, I
>> >suppose..)
>> >
>> >best,
>> >vito
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> and provide commented, minimal, self-contained, reproducible code.

```
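Until the behaviour in `C_termsform` is tracked down, one possible workaround is to avoid `.` and spell out the right-hand side explicitly. The following is only a minimal sketch, assuming the response column is called `y` as in the examples above. `all_predictors` is a hypothetical helper made up for illustration; it is not part of stats and was not proposed in the thread. It uses `reformulate()` with the left-hand side passed as a call, so `terms.formula` never has to expand the dot.

```r
# Same shape of data as in the thread: two predictors share their
# names with base functions (exp, log).
set.seed(1)
d <- data.frame(y   = rnorm(10, 5, .5),
                age = rnorm(10),
                exp = rnorm(10),
                log = runif(10))

# Hypothetical helper (an assumption, not from the thread): build the
# formula from every column except the response, instead of relying on ".".
all_predictors <- function(lhs, data, response = "y") {
  reformulate(setdiff(names(data), response), response = lhs)
}

f <- all_predictors(quote(exp(y)), d)   # exp(y) ~ age + exp + log
m <- lm(f, data = d)

# All three predictors are kept, including the one named 'exp'.
attr(terms(m), "term.labels")
## [1] "age" "exp" "log"
```

This matches Vito's first example, where writing `exp(y) ~ exp + age` by hand kept the `exp` column. Renaming the columns so they do not collide with function names, as Jeff suggests, avoids the problem altogether.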