[Rd] columnames changes behaviour of formula

Joshua Wiley jwiley.psych at gmail.com
Fri May 25 06:46:46 CEST 2012


Hi Robin,

This looks like the intended behavior to me.  From ?formula:
"There are two special interpretations of '.' in a formula.  The usual
one is in the context of a 'data' argument of model fitting functions
and means 'all columns not otherwise in the formula' "

d is in the formula, so the only column not in the formula is nd.  The
(.)^2 expands to the main effects plus all two-way interactions, but
with only one variable there are no interactions, which leaves just nd.
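
A quick way to check this is to expand the formula with terms() and look
at the term labels.  The sketch below reuses the objects from your
example; labels_before, labels_after and resp are just names I made up
for illustration.  It also shows something easy to miss: once x has a
column called d, the response is taken from that column (0:3) rather
than from the vector d in your workspace.

```r
d <- c(4, 7, 6, 4)
x <- data.frame(cbind(0:3, 5:2))

# With the default names, '.' expands to X1 + X2
labels_before <- attr(terms(d ~ -1 + (.)^2, data = x), "term.labels")
labels_before   # "X1" "X2" "X1:X2"

colnames(x) <- c("d", "nd")

# Now 'd' is already in the formula, so '.' expands to nd only
labels_after <- attr(terms(d ~ -1 + (.)^2, data = x), "term.labels")
labels_after    # "nd"

# The response is looked up in 'x' first, so it is the column 0:3,
# not the global vector c(4, 7, 6, 4)
resp <- model.response(model.frame(d ~ -1 + (.)^2, data = x))
as.vector(resp) # 0 1 2 3
```

That is consistent with the 0.2962963 you got: regressing 0:3 on 5:2
without an intercept gives 16/54.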

What were you expecting?

Josh

On Thu, May 24, 2012 at 9:25 PM, robin hankin <hankin.robin at gmail.com> wrote:
> Hello. precompiled R-2.15.0, svn58871, macosx 10.7.4.
>
>
> I have discovered that defining column names of a dataframe can alter the
> behaviour of lm():
>
>
> d <- c(4,7,6,4)
> x <- data.frame(cbind(0:3,5:2))
> coef(lm(d~ -1 + (.)^2,data=x))
>   X1    X2 X1:X2
> -1.77  0.83  1.25
>
>
> OK, so far so good.  But change the column names of 'x' and the behaviour
> changes:
>
>
> colnames(x) <- c("d","nd")   # 'd' == 'death' and 'nd' == 'no death'
> coef(lm(d~ -1 + (.)^2,data=x))
>       nd
> 0.2962963
>
>
>
> I am not sure if this is consistent with the special meaning of '.'
> described under ?formula.
>
> Is this the intended behaviour?
>
>
> --
> Robin Hankin
> Uncertainty Analyst
> hankin.robin at gmail.com
>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


