[R-sig-ME] Question about what is "shrinkage"...

i white i.m.s.white at ed.ac.uk
Thu Sep 19 12:03:55 CEST 2013


Emmanuel,

A very simple example of shrinkage: data take the form

Y = m + U + e

for each individual, where m is the overall mean. There are two error 
terms, U and e. We may want to predict m + U, and if we know the ratio 
of the variances of U and e, we can add the appropriate fraction (<1) 
of the residual Y - m to m; that fraction is var(U) / (var(U) + var(e)). 
For example, Y might be a student's test score, U a measure of his 
innate ability, and e the temporary effects, such as whether he had a 
cold on the day of the exam or got lucky in his choice of revision 
topics.
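
As a rough illustration of the example above, here is a minimal Python 
sketch. It assumes that m and both variances are known exactly (in a 
real mixed model they would be estimated), and the function name and 
the numerical values are made up for the illustration. It compares 
using Y itself as the prediction of m + U with the shrunken prediction 
m + k * (Y - m), where k = var(U) / (var(U) + var(e)).

```python
import random
import statistics


def simulate_shrinkage(n=20000, m=50.0, sd_u=10.0, sd_e=5.0, seed=1):
    """Simulate Y = m + U + e and compare two predictors of m + U.

    All parameter values are illustrative assumptions, not taken from data.
    """
    rng = random.Random(seed)
    # Shrinkage fraction: var(U) / (var(U) + var(e)); below 1 whenever var(e) > 0.
    k = sd_u ** 2 / (sd_u ** 2 + sd_e ** 2)

    sq_err_raw = []
    sq_err_shrunk = []
    for _ in range(n):
        u = rng.gauss(0.0, sd_u)   # persistent individual effect (e.g. ability)
        e = rng.gauss(0.0, sd_e)   # temporary effect (e.g. a cold on exam day)
        y = m + u + e              # observed score
        target = m + u             # quantity we want to predict

        raw = y                    # no shrinkage: take the observation at face value
        shrunk = m + k * (y - m)   # add only a fraction k of the residual to m

        sq_err_raw.append((raw - target) ** 2)
        sq_err_shrunk.append((shrunk - target) ** 2)

    return k, statistics.fmean(sq_err_raw), statistics.fmean(sq_err_shrunk)


k, mse_raw, mse_shrunk = simulate_shrinkage()
print(f"shrinkage fraction k = {k:.2f}")
print(f"mean squared error, raw Y:      {mse_raw:.2f}")
print(f"mean squared error, shrunken Y: {mse_shrunk:.2f}")
```

With these assumed variances the raw prediction has expected squared 
error var(e) = 25, while the shrunken one has var(U) * var(e) / 
(var(U) + var(e)) = 20, so the simulated values should land near those 
figures. The shrunken prediction is pulled towards m, and is more 
accurate on average for that reason.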

On 09/18/2013 09:20 PM, Emmanuel Curis wrote:
> Hello,
>
> I've read the term "shrinkage" several times, either on this list or,
> even more often, when dealing with population pharmacokinetics, and I
> am not quite sure what it means and what is its usage... Could it be
> possible to have either some references or some explanations? I give
> below a longer version, with how far I could get and where I am
> stopped... Thanks in advance for any help!
>
> I've searched a little bit on the net; shrinkage seems related to the
> fact that after regression, it is possible to obtain more precise, but
> slightly biased, estimators of the coefficients, by making them a
> little bit smaller than the actual value (hence « shrinkage »).
> However, in the discussions especially about PK-pop models, the usage
> of "shrinkage" does not seem to me coherent with this meaning...
> Instead, it seems to be a property of mixed-models, linked to
> variance estimation, and used to check the model quality or validity
> in some way, with sentences like "this model increased the shrinkage"
> and mentions of something like "random effects parameters shrinkage"
> and "residuals shrinkage" (eta-shrinkage and epsilon-shrinkage)...
>
> My other idea was related to the fact that when modeling a set of
> repeated measures on several patients, with a straight line, the set
> of slopes shows less variability when using a mixed model on the whole
> set, than using separate lines for each patient --- as exemplified for
> instance in Douglas Bates's book. Hence, variance of slopes is shrunk
> in the mixed model approach compared to the variance obtained from the
> sample of all individual slopes. This idea seems closer to the use and
> terminology above, but I can't see if shrinkage is a good or bad
> thing...
>
> I mean, since one imposes a given distribution, hence a constraint, on
> slopes, the fact that variance is smaller is not a surprise and it
> could be a drawback of the estimation, leading to underestimation.
> Conversely, variance of individual slopes also includes the residual
> variability, hence is expected to be higher. Is it true then that the
> mixed-model estimation is better? But in that case, how can shrinkage
> be used to quantify the correctness of a model?
>
> Thanks in advance,
> Best regards,
>
