[R-sig-ME] [R] understanding I() in lmer formula
David Winsemius
dwinsemius at comcast.net
Wed Jun 14 18:31:26 CEST 2017
> On Jun 14, 2017, at 7:40 AM, Ben Bolker <bbolker at gmail.com> wrote:
>
>
>
> On 17-06-14 08:59 AM, Don Cohen wrote:
>>
>>> The expression `~z.n.males` is a unary formula, i.e. a language
>>> object, so it's not a data object much less a numeric data
>>> object. The `*` operator does not have a method for such an object
>>> as the second argument.
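A minimal base-R sketch of that point, for anyone who wants to see it in isolation. The variable name is only a symbol inside the formula, so no data are needed, and the object `f` is introduced purely for illustration:

```r
# A one-sided formula is a language object with class "formula", not numeric data.
f <- ~z.n.males
class(f)        # "formula"
is.numeric(f)   # FALSE

# Arithmetic `*` has no method for a formula, so multiplying by one fails.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(2 * f, error = function(e) conditionMessage(e))
res
```

In an English locale the captured message should be the same "non-numeric argument to binary operator" text that appears in the reported error.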
>>
>> ok, so the next question is where this comes from
>
> Something is getting mangled internally. One of the reasons you're
> not getting much useful advice here is that this is new to us -- I've
> never seen anything like this happen before. I keep making guesses
> based on the differences between your example and the stuff people
> normally try to do (I() inside random effects term, long random-effects
> term), but so far my guesses haven't panned out. This is the reason we
> keep asking for a **reproducible** example; if I could run this example
> myself I could almost certainly figure out what's going on, but remote
> debugging is really hard.
>
>>
>>> It appears that trying to use I() on an expression with a
>>> multiplication operator was not something that was anticipated as
>>> having a sensible meaning.
>
> It is perfectly sensible: it just means "multiply these two terms
> together, using the normal arithmetic meaning of *, rather than
> composing their interaction". However, if this term is indeed causing a
> problem, it can be worked around by defining a new variable rather than
> constructing it on the fly (this is what I suggested in my previous message).
I didn't say it wasn't sensible, just that it might not have been anticipated. It appears to be at least superfluous if these are both numeric.
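To make the numeric case concrete, here is a small sketch using only base R and `model.matrix()`. The data frame `d` and its columns `x` and `y` are made up for illustration and are not from Don's data:

```r
set.seed(1)
d <- data.frame(x = rnorm(5), y = rnorm(5))

# Formula meaning of `*`: main effects plus their interaction.
mm1 <- model.matrix(~ x * y, d)
colnames(mm1)   # "(Intercept)" "x" "y" "x:y"

# Arithmetic meaning, protected by I(): a single product column.
mm2 <- model.matrix(~ I(x * y), d)
colnames(mm2)   # "(Intercept)" "I(x * y)"

# For two numeric variables the interaction column is the elementwise product,
# so the "x:y" column and the "I(x * y)" column hold the same values.
same <- all.equal(unname(mm1[, "x:y"]), unname(mm2[, "I(x * y)"]))
same
```

So for numeric predictors the two spellings produce the same product column; what differs is whether the main effects are added along with it.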
It _might_ be sensible, but it might also be confusing at the interpretive level. If both were factors coerced to their integer codes, the resulting numeric value would be some sort of ordinal-by-ordinal interaction term. The combination of the first level of variable-1 with the third level of variable-2 would then be confounded with the combination of the third level of variable-1 with the first level of variable-2, since the coercion to numeric would map both to 3.
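A short sketch of that confounding, with two made-up three-level factors (`f1` and `f2` are illustrative names only). It assumes the factors are explicitly coerced to their integer codes:

```r
f1 <- factor(c("a", "c"), levels = c("a", "b", "c"))
f2 <- factor(c("c", "a"), levels = c("a", "b", "c"))

# Integer codes: f1 is (1, 3) and f2 is (3, 1).
# Both level combinations collapse to the same product.
codes <- as.integer(f1) * as.integer(f2)
codes   # 3 3

# Without explicit coercion, `*` is not meaningful for factors:
# R returns NA (with a warning, suppressed here).
direct <- suppressWarnings(f1 * f2)
direct  # NA NA
```

The second part is worth keeping in mind: multiplying factors directly does not silently use the codes, so the confounding only arises when the coercion happens somewhere along the way.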
>>
>> That would cause it to get the error in all those other cases where
>> there is no error.
>> Notice that the difference between getting the error and not getting
>> it is not in the I() term - that's the same in both cases.
>>
>>> I'm not sure what this was supposed to be doing but perhaps you
>>> wanted the interaction()-function rather than the as.is function?
>>
>> What all this is supposed to mean is another topic I'd also like to
>> discuss. I did not write the original formula. I'm just trying to
>> make small changes to it to see the effects. Or at least I WAS trying
>> to make small changes. I had already made rather large changes to
>> simplify the example in my first post.
>
> OK, I've been able to reproduce this (code below), will dig in and let
> you know what I find.
I'm not understanding how the example below illustrates the difference that Don observed, where reversing the variable names in the argument to `I` caused an error in one case and no error in the other. Was the problem occurring because the formula expanded to three-way terms of the form `var1*(var1*var2||var3)` or `var1*(var1*var2|var3)` in the case causing problems, with no such problem for `var1*(var2*var1||var3)`?
>
> ----
>
> form <- log.corti~z.n.fert.females*z.n.males+
>
> is.alpha2*(z.infanticide.susceptibility+z.min.co.res+z.co.res+z.log.tenure)+
> z.xtime+z.age.at.sample+sin.season+cos.season+
> (1 #+z.n.fert.females
> +z.n.males
> +is.alpha2.subordinate
> +z.infanticide.susceptibility
> +z.min.co.res
> +z.log.tenure
> +z.co.res
> ## +z.xtime
> +z.age.at.sample
> +sin.season
> +cos.season+
> I(z.n.fert.females*z.n.males)+
> I(is.alpha2.subordinate*z.min.co.res)+
> ## I(z.co.res*is.alpha2.subordinate)
> I(is.alpha2.subordinate*z.co.res)
> ## +int.is.a.log.ten
> ||monkeyid)
>
> xvars <- setdiff(all.vars(form),"monkeyid")
> dd <- data.frame(matrix(rnorm(1000*length(xvars)),ncol=length(xvars)))
> names(dd) <- xvars
> dd$monkeyid <- factor(rep(1:20,50))
> library(lme4)
> parse <- lFormula(form, data=dd)
>
>
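For completeness, the workaround Ben mentions (precomputing the product as its own variable instead of building it with I() inside the formula) might look roughly like this. The simulated data and the column name `alpha.sub.x.co.res` are hypothetical, chosen only for illustration:

```r
set.seed(42)
# Simulated stand-ins for two of the predictors; not the real data.
dd <- data.frame(
  is.alpha2.subordinate = rbinom(10, 1, 0.5),
  z.co.res              = rnorm(10)
)

# Hypothetical name for the precomputed product column.
dd$alpha.sub.x.co.res <- dd$is.alpha2.subordinate * dd$z.co.res

# The formula can then refer to the plain column name rather than I(a * b).
mm <- model.matrix(~ 0 + alpha.sub.x.co.res, dd)
colnames(mm)   # "alpha.sub.x.co.res"
```

The same plain name could then be used in the random-effects part of the model formula, which avoids any I() term there.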
>>
>> I think this shows that the problem is in parsing:
>> parse <- lFormula(formula = log.corti~z.n.fert.females*z.n.males+
>> is.alpha2*(z.infanticide.susceptibility+z.min.co.res+z.co.res+z.log.tenure)+
>> z.xtime+z.age.at.sample+sin.season+cos.season+
>> (1 #+z.n.fert.females
>> +z.n.males
>> +is.alpha2.subordinate
>> +z.infanticide.susceptibility
>> +z.min.co.res
>> +z.log.tenure
>> +z.co.res
>> # +z.xtime
>> +z.age.at.sample
>> +sin.season
>> +cos.season+
>> I(z.n.fert.females*z.n.males)+
>> I(is.alpha2.subordinate*z.min.co.res)+
>> # I(z.co.res*is.alpha2.subordinate)
>> I(is.alpha2.subordinate*z.co.res)
>> # +int.is.a.log.ten
>> ||monkeyid), data=fe.re.xx$data)
>> Error in is.alpha2.subordinate * ~z.co.res :
>> non-numeric argument to binary operator
>>
>> whereas switching the order of the * arguments
>> parse <- lFormula(formula = log.corti~z.n.fert.females*z.n.males+
>> is.alpha2*(z.infanticide.susceptibility+z.min.co.res+z.co.res+z.log.tenure)+
>> z.xtime+z.age.at.sample+sin.season+cos.season+
>> (1 #+z.n.fert.females
>> +z.n.males
>> +is.alpha2.subordinate
>> +z.infanticide.susceptibility
>> +z.min.co.res
>> +z.log.tenure
>> +z.co.res
>> # +z.xtime
>> +z.age.at.sample
>> +sin.season
>> +cos.season+
>> I(z.n.fert.females*z.n.males)+
>> I(is.alpha2.subordinate*z.min.co.res)+
>> I(z.co.res*is.alpha2.subordinate)
>> # I(is.alpha2.subordinate*z.co.res)
>> # +int.is.a.log.ten
>> ||monkeyid), data=fe.re.xx$data)
>> > parse$formula
>> log.corti ~ z.n.fert.females * z.n.males + is.alpha2 * (z.infanticide.susceptibility +
>> z.min.co.res + z.co.res + z.log.tenure) + z.xtime + z.age.at.sample +
>> sin.season + cos.season + ((1 | monkeyid) + (0 + z.n.males |
>> monkeyid) + (0 + is.alpha2.subordinate | monkeyid) + (0 +
>> z.infanticide.susceptibility | monkeyid) + (0 + z.min.co.res |
>> monkeyid) + (0 + z.log.tenure | monkeyid) + (0 + z.co.res |
>> monkeyid) + (0 + z.age.at.sample | monkeyid) + (0 + sin.season |
>> monkeyid) + (0 + cos.season | monkeyid) + (0 + I(z.n.fert.females *
>> z.n.males) | monkeyid) + (0 + I(is.alpha2.subordinate * z.min.co.res) |
>> monkeyid) + (0 + I(z.co.res * is.alpha2.subordinate) | monkeyid))
>>
>> BTW, passing that result to lFormula with the
>> I(z.co.res * is.alpha2.subordinate) changed to
>> I(is.alpha2.subordinate * z.co.res )
>> also works.
>>
>> So as a work around perhaps the solution is to start from the result
>> of lFormula on the original formula and make my incremental changes
>> to that.
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
David Winsemius
Alameda, CA, USA