[R-sig-ME] Assessing linearity

Mike Lawrence Mike.Lawrence at dal.ca
Mon Oct 25 20:15:40 CEST 2010


Thanks Andrew. Jarrod mentioned using splines as well, and I do think
they sound promising since I indeed do not have strong a priori
reasons to expect polynomial shaped trends.

I'm uncertain, however, how to choose values for the "degree" and "df"
arguments to bs(). I see that df has to be equal to or greater than
degree, and less than the number of unique values in grade minus 2
(at minus one it becomes equivalent to factorizing grade, which I'm
doing already). I also see that degree=1 and df=1 is equivalent to the
linear case, which I'm already estimating in the original lmer fit.

However, that still leaves a large parameter space (df at each degree:
2, 3, 4 at degree 1; 2, 3, 4 at degree 2; 3, 4 at degree 3; 4 at
degree 4), and I'm not sure it's
appropriate to compare the linear case to fits from every point in
this space. Thoughts?
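For concreteness, here is a sketch (on made-up data, with hypothetical names) of what I have in mind: fit every admissible df/degree candidate once and rank them all by a single criterion such as AIC, rather than testing each one against the linear fit separately.

```r
library(splines)

# Made-up stand-in data: six grades, a roughly linear trend
set.seed(1)
grade    <- rep(1:6, each = 20)
response <- 0.5 * grade + rnorm(length(grade))

# All (degree, df) candidates with df >= degree, up to cubic splines
candidates <- expand.grid(degree = 1:3, df = 2:4)
candidates <- candidates[candidates$df >= candidates$degree, ]

# Fit each candidate once and record its AIC
candidates$AIC <- apply(candidates, 1, function(p) {
  AIC(lm(response ~ bs(grade, df = p["df"], degree = p["degree"])))
})
candidates$deltaAIC <- candidates$AIC - min(candidates$AIC)
candidates[order(candidates$AIC), ]
```

The idea would then be to read off deltaAIC across the whole candidate set instead of doing every pairwise comparison.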

Mike


On Mon, Oct 25, 2010 at 12:59 PM, Andrew Dolman <andydolman at gmail.com> wrote:
> Hi Mike,
>
> Couple of things.
>
> poly() is used like this
>
> a <- rnorm(100)
> b <- rnorm(100)
>
> m2 <- lm(a~poly(b,2))
> summary(m2)
>
> To compare a model with a linear predictor and a quadratic polynomial
> you can do this:
> m0 <- lm(a~1)
> m1 <- lm(a~poly(b,1))
> m2 <- lm(a~poly(b,2))
> anova(m1,m2)
>
> In the case of likelihood fitted models you'll get a likelihood ratio test.
>
>
> You wouldn't normally fit a quadratic term instead of the linear term,
> but rather in addition to it, so m2 has 1 extra parameter. You can compare
> multiple models at once with anova(m0,m1,m2), or you can go the AIC
> route and calculate deltaAIC for your set of candidate models.
>
> Unless you expect the non-linearity to be polynomial shaped you might
> be better off fitting splines with varying number of knots/df.
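For instance, a minimal sketch on simulated data (all names made up) comparing a straight-line fit against a B-spline with a few more df:

```r
library(splines)

# Simulated example: a curved relationship that a straight line misses
set.seed(1)
b <- rnorm(100)
a <- sin(b) + rnorm(100, sd = 0.5)

m_lin <- lm(a ~ b)
m_spl <- lm(a ~ bs(b, df = 4))  # cubic B-spline, one interior knot

anova(m_lin, m_spl)  # F test of the extra spline flexibility
AIC(m_lin, m_spl)    # or compare on AIC instead
```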
>
>
>
> andydolman at gmail.com
>
>
>
> On 24 October 2010 18:43, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
>> Ah, this looks promising! So how does this sound:
>>
>> I typically assess the evidence for a relationship between the
>> predictor and response variables by comparing the AIC values for a
>> model including the predictor to a model without it. In the case of
>> grade_as_numeric, I'd do:
>>
>> fit_null = lmer(
>>    formula = response ~ (1|individual)
>>    , data = my_data
>> )
>> fit_null_AIC = AIC(fit_null)
>>
>> fit_alt = lmer(
>>    formula = response ~ (1|individual) + grade_as_numeric
>>    , data = my_data
>> )
>> fit_alt_AIC = AIC(fit_alt)
>>
>> grade_loglikratio = fit_null_AIC - fit_alt_AIC
>>
>> Now, if I wanted to check whether there is a quadratic component to
>> the grade effect, I'd first compute an analogous likelihood ratio for
>> the quadratic fit compared to the null:
>> fit_alt_quad = lmer(
>>    formula = response ~ (1|individual) + poly(grade_as_numeric, 2)
>>    , data = my_data
>> )
>> fit_alt_quad_AIC = AIC(fit_alt_quad)
>> grade_quad_loglikratio = fit_null_AIC - fit_alt_quad_AIC
>>
>> Then compute a final log likelihood ratio between the improvement over
>> the null caused by grade versus the improvement over the null caused
>> by grade as a quadratic:
>>
>> grade_lin_vs_quad_LLR = grade_loglikratio - grade_quad_loglikratio
>>
>> I could repeat this for higher-order polynomials of grade, each
>> compared to the order directly below it, to develop a series of
>> likelihood ratios that describe the relative improvement of the fit.
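As a sanity check, the same sequence of nested comparisons can be run in one call with anova(), sketched here on simulated stand-in data (all variable names hypothetical). Note that an AIC difference equals twice a log-likelihood difference only up to the parameter-count penalty, so the two summaries are related but not identical.

```r
library(lme4)

# Simulated stand-in for my_data; the real analysis would use the actual data
set.seed(1)
my_data <- data.frame(
  individual       = factor(rep(1:30, each = 6)),
  grade_as_numeric = rep(1:6, times = 30)
)
my_data$response <- 0.4 * my_data$grade_as_numeric +
  rnorm(30)[my_data$individual] +        # random intercept per individual
  rnorm(nrow(my_data), sd = 0.5)

# ML (not REML) fits so likelihoods are comparable across fixed effects
fit_null <- lmer(response ~ (1 | individual), data = my_data, REML = FALSE)
fit_lin  <- lmer(response ~ (1 | individual) + poly(grade_as_numeric, 1),
                 data = my_data, REML = FALSE)
fit_quad <- lmer(response ~ (1 | individual) + poly(grade_as_numeric, 2),
                 data = my_data, REML = FALSE)

anova(fit_null, fit_lin, fit_quad)  # sequential likelihood-ratio tests
```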
>>
>> Does this sound appropriate?
>>
>> Cheers,
>>
>> Mike
>>
>> --
>> Mike Lawrence
>> Graduate Student
>> Department of Psychology
>> Dalhousie University
>>
>> Looking to arrange a meeting? Check my public calendar:
>> http://tr.im/mikes_public_calendar
>>
>> ~ Certainty is folly... I think. ~
>>
>>
>>
>> On Sun, Oct 24, 2010 at 6:41 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
>>> Hi Mike,
>>>
>>> You would be better off trying out something like polynomials or splines.
>>> For example:
>>>
>>>  fit1 = lmer(
>>>     formula = response ~ (1|individual) + poly(grade_as_numeric, n)
>>>     , data = my_data
>>>     , family = gaussian
>>>  )
>>>
>>> where n is the order of the polynomial. n=1 would fit the same model as your
>>> original fit1, although the covariate (and the regression parameter) would
>>> be scaled by some number. When n=6 the model would be a reparameterised
>>> version of your model fit2. When 1<n<6 you would be working with a
>>> non-linear relationship in between these two extremes, although the model is
>>> still linear in the parameters.
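A quick way to see the two endpoints on toy data (names made up): poly(x, 1) gives the same fitted values as the raw linear predictor, and the full-order polynomial matches the factor coding.

```r
set.seed(1)
x <- rep(1:6, each = 10)          # six "grades"
y <- 0.3 * x + rnorm(length(x))

# n = 1: same model as the raw linear fit, covariate merely rescaled
f_raw  <- lm(y ~ x)
f_poly <- lm(y ~ poly(x, 1))
all.equal(fitted(f_raw), fitted(f_poly))    # TRUE

# n = 5 (unique values minus 1): a reparameterisation of the factor model
f_poly5 <- lm(y ~ poly(x, 5))
f_fact  <- lm(y ~ factor(x))
all.equal(fitted(f_poly5), fitted(f_fact))  # TRUE
```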
>>>
>>> Cheers,
>>>
>>> Jarrod
>>>
>>> Quoting Mike Lawrence <Mike.Lawrence at dal.ca>:
>>>
>>>> Hi folks,
>>>>
>>>> I have developmental data collected across several grades (1-6). I
>>>> would like to be able to assess whether there are any linear or
>>>> non-linear trends across grade. Does it make sense to run a first lmer
>>>> treating grade as continuous, obtain the residuals, then run a second
>>>> lmer treating grade as a factor? That is:
>>>>
>>>> fit1 = lmer(
>>>>    formula = response ~ (1|individual)+grade_as_numeric
>>>>    , data = my_data
>>>>    , family = gaussian
>>>> )
>>>> my_data$resid = residuals(fit1)
>>>> fit2 = lmer(
>>>>    formula = resid ~ (1|individual)+grade_as_factor
>>>>    , data = my_data
>>>>    , family = gaussian
>>>> )
>>>>
>>>>
>>>> As I understand it, fit1 will tell me if there are any linear trends
>>>> in the data, while fit2 will tell me if there are any non-linear
>>>> trends in the data in addition to the linear trends obtained in fit1.
>>>>
>>>> If this is sensible, how might I apply it to a second binomial
>>>> response variable given that the residuals from a binomial model are
>>>> not 0/1?
>>>>
>>>> Cheers,
>>>>
>>>> Mike
>>>>
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>>
>>
>>
>
>



