[R-sig-ME] Problem with fitted function
Doran, Harold
HDoran at air.org
Mon Oct 31 17:00:09 CET 2011
You're email is a bit confusing. Let's keep this simple and draw with a distinction between lm and lmer in terms of what is a fitted value.
In least squares regression, your model is
Y = XB + e
Where Y is your outcome, X is the model matrix for the fixed effects, B is the vector of coefficients, and e is the error term.
And the fitted values are then
\hat{Y} = X\hat{B}
Where \hat{B} is the vector of estimated fixed effects.
In mixed models, your model is
Y = XB + Zu + e
Where Y is your outcome, X is the model matrix for the fixed effects, B is the vector of coefficients, Z is the model matrix for the random effects, u are the BLUP, and e is the error term.
Now, the fitted function in lmer yields
\hat{Y} = X\hat{B} + Z\hat{u}
Where \hat{B} is the vector of estimated fixed effects and now \hat{u} are the predictions of the random effects.
> -----Original Message-----
> From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-
> bounces at r-project.org] On Behalf Of Ira Sharenow
> Sent: Saturday, October 29, 2011 7:39 PM
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Problem with fitted function
>
> I am new to mixed effects and library(lme4). But I have spent quite a bit of
> time trying to understand Prof. Bates's online book.
>
>
>
> I have data from the California Department of Education. I have Academic
> Performance Index (API) scores for 677 schools and I have data from 2002
> through 2009. The Years are coded 0:7.
>
>
>
> API response variable, 200 <=API <= 1000
>
> Years, 0:7
>
> CDS, ID numbers, coded as factors
>
> AVGED, average parental education 1 <=AVGED<=5
>
>
>
> For each year and for each school, I would like to predict API scores. I
> cannot figure out a model that works. My main concern is with fitted().
>
>
>
> For each year, I did standard lm() analyses. The data is not perfectly normal
> but various adjustments made little difference for my purposes. I am satisfied
> with my lm() analyses.
>
>
>
> Specifically, I am attempting to predict the API score for a school, AHS.
>
>
>
> In order to get started I tried,
>
>
>
> model4 = lmer(API ~ 1 + (1 | Years) + (1 | CDS), data = AVGEDAPI0209,
> na.action = na.omit)
>
>
>
> > fixef(model4)
>
> (Intercept)
>
> 701
>
>
>
> $Years
>
> 0 1 2 3 4 5 6 7
>
> -55 -30 -18 5 12 17 29 40
>
>
>
> $AHS # actually hundreds of numbers were produced and I picked out the AHS ID
> and the associated value
>
> 124
>
>
>
> So I performed a simple calculation
>
>
>
> # intercept + AHS effect + Years effect
>
> expAHSScoreBase = numeric(8)
>
> expAHSScoreBase = 701 + 124 + c(-55, -30,-18,5,
>
> 12, 17,29, 40)
>
>
>
> > expAHSScoreBase
>
> [1] 770 795 807 831 838 843 854 866
>
>
>
> I then computed the predicted scores.
>
> > fittedAHSScore = fitted(model4)[AVGEDAPI0209$CDS == AHSCDS]
>
> > fittedAHSScore
>
> [1] 770 795 807 831 838 798 933 761
>
>
>
>
>
>
>
> The first five scores match, but the last three differ by a significant
> amount. Can someone please offer some advice? Why are those final three
> numbers so different? And my real goal is to predict API scores from AVGED.
>
>
>
> I then tried the model I think I want,
>
>
>
> model5 = lmer(API ~ 1 + AVGED + (1 | Years) + (1 | CDS), data = AVGEDAPI0209,
> na.action = na.omit)
>
>
>
>
>
> Unfortunately, when I tried to fit, I again obtained unexpected results
>
> fittedAHSScore5 = fitted(model5)[AVGEDAPI0209$CDS == AHSCDS]
>
> > fittedAHSScore5
>
> [1] 768 794 806 832 837 804 931 763
>
>
>
> However when I added up the pieces, intercept + (AVGED parameter * actual
> value) + Years effect + CDS effect
>
>
>
> I think I got what I wanted.
>
>
>
> > expAHSScoreBase5 = 574 + AVGED5 + 69 + Years5
>
> > expAHSScoreBase5
>
> [1] 768 794 806 832 837 842 855 868
>
>
>
> By doing separate lm() regressions for each year, I obtained the following
> predicted scores,
>
>
>
> 800 813 821 838 842 844 854 868
>
>
>
> Finally, I tried the following models and in each case the fitted values
> produced the same pattern of values.
>
>
>
> model6 = lmer(API ~ 1 + Years + (1 + Years| CDS), data = AVGEDAPI0209,
> na.action = na.omit, REML = 0)
>
>
>
> model6AVGED = lmer(API ~ 1 + AVGED + Years + (1 + Years| CDS), data =
> AVGEDAPI0209, na.action = na.omit, REML = 0)
>
>
>
> model7 = lmer(API ~ 1 + Years + (1 | CDS) + (-1 + Years| CDS), data =
> AVGEDAPI0209, na.action = na.omit, REML = 0)
>
>
>
>
>
> model7AVGED = lmer(API ~ 1 + AVGED + Years + (1 | CDS) + (-1 + Years| CDS),
> data = AVGEDAPI0209, na.action = na.omit, REML = 0)
>
>
>
>
>
> Thanks.
>
>
>
> Ira Sharenow
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
More information about the R-sig-mixed-models
mailing list