[R] Calculating Pseudo R-squared from nlme

Thu Feb 23 18:42:36 CET 2012

I saw no reply to this yet, so herewith a few comments.

1. Best recommendation: Post to r-sig-mixed-models instead.

Miscellaneous comments:

R-squared as "an overall summary of the total outcome variability
explained" is practically useless and generally misleading for
nonlinear models. Why? Short answer: Because nonlinear models are
fundamentally different (mathematically) from linear models. For
example, the basic linear models concept of "degrees of freedom" (df)
is not immediately applicable and is certainly NOT simply related to
the number of parameters in the model.

Long Answer: Ask on mixed models list.

I am well aware that much software and lots of non-statistical
literature quote R-squares and "pseudo"-R-squares (at least there is a
qualification) as informative measures of fit for nonlinear models.

As I am now 65, I can be a bit impolite and expect forbearance when I
say: it's all crap. I say this in the same spirit that one would speak
of papers on perpetual motion machines or creationism: it's contrary
to the underlying reality.

And now for the disclaimer: As no one has proclaimed me expert in
chief of anything, others who actually are may point out my egregious
errors (or I certainly hope they will). So long as they have the
mathematics on their side (opinion counts for nothing in the world of
thermodynamics or evolution: entropy always increases and bacteria
eventually evolve resistance), pay attention to them and ignore me.
Age does not guarantee wisdom.
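
On the purely mechanical part of the question (which fitted values are
which), here is a minimal sketch. It assumes a simple linear growth
model fitted with lme() to the Orthodont data that ships with nlme,
used only as a stand-in for the poster's data; the model formula and
the variable names pop, ind, obs, r2_pop and r2_ind are illustrative,
not taken from Singer and Willett. Level 0 fitted values use the fixed
effects alone; level 1 fitted values add each subject's estimated
random effects. Computing the squared correlation either way does not
address the objections above about what the number means.

```r
library(nlme)  # recommended package distributed with R

# Orthodont: repeated distance measurements nested within Subject.
# Random-intercept growth model, chosen purely for illustration.
fit <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject)

# Level 0: population predictions (fixed effects only).
pop <- fitted(fit, level = 0)

# Level 1: subject-specific predictions (fixed effects plus the
# estimated random effects for each Subject).
ind <- fitted(fit, level = 1)

# Observed response, in the same row order as the fitted values.
obs <- getResponse(fit)

# Squared sample correlation between observed and predicted values,
# once per level of fitted values.
r2_pop <- cor(obs, pop)^2
r2_ind <- cor(obs, ind)^2
print(c(population = r2_pop, individual = r2_ind))
```

The individual-level value is typically the larger of the two, simply
because the level 1 predictions are allowed to track each subject.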

Best,
Bert

On Thu, Feb 23, 2012 at 5:18 AM, dadrivr <dadrivr at gmail.com> wrote:
> I am fitting individual growth models using nlme (multilevel models with
> repeated measurements nested within the individual), and I am trying to
> calculate the Pseudo R-squared for the models (an overall summary of the
> total outcome variability explained).  Singer and Willett (2003) recommend
> calculating Pseudo R-squared in multilevel modeling by squaring the sample
> correlation between observed and predicted values (across the sample for
> each person on each occasion of measurement).
>
> My question is which set of predicted values should I use from nlme in that
> calculation?  From my models in nlme, I receive two sets of fitted values.
> Reading the description of the fitted lme values
> (http://stat.ethz.ch/R-manual/R-patched/library/nlme/html/fitted.lme.html),
> there appear to be two sets of fitted values that correspond to levels of
> grouping, where the first set of fitted values (Level 0) corresponds to the
> population fitted values and the sets move to inner groupings as the
> levels increase (e.g., I suppose Level 1 corresponds to the individual-level
> fitted values in my data).
>
> I'm not sure I understand the distinction between population fitted values
> and individual-level fitted values because each individual and each
> measurement occasion has an estimate for both (population and individual
> fitted estimates).  Could you please explain the distinction and which one I
> should be using to calculate the Pseudo R-squared as suggested by Singer and
> Willett (2003)?
>
> Thanks so much for your help!

-- 

Bert Gunter
Genentech Nonclinical Biostatistics