[R-sig-ME] Query on predicting from GLMM to new data with a different range of values of the random term

Tue Apr 8 13:48:54 CEST 2014

Dear Alice,

Have a look at the allow.new.levels argument of predict.merMod(). Read the help file carefully to understand what this option does (and what it does not do).
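
As a rough illustration of that argument, here is a minimal sketch on simulated data. The data frames `train` and `valid`, the covariate `x`, and the response `y` are hypothetical stand-ins, not the poster's actual data or model. It assumes the lme4 package is installed and that the validation years do not occur in the training data.

```r
library(lme4)

# Hypothetical training data: 20 years (1982-2001), 10 observations per year
set.seed(1)
train <- data.frame(year = factor(rep(1982:2001, each = 10)),
                    x    = rnorm(200))
year_eff <- rnorm(20, sd = 0.5)  # simulated year-level deviations
train$y <- 1 + 2 * train$x + year_eff[as.integer(train$year)] + rnorm(200)

# Gaussian mixed model with a random intercept for year
fit <- lmer(y ~ x + (1 | year), data = train)

# Hypothetical validation data with years the model has never seen
valid <- data.frame(year = factor(rep(2002:2011, each = 5)),
                    x    = rnorm(50))

# With the default allow.new.levels = FALSE, unseen years raise an error
res <- try(predict(fit, newdata = valid), silent = TRUE)
inherits(res, "try-error")

# allow.new.levels = TRUE: unseen levels get population-level predictions,
# i.e. their random effect is taken to be zero
p_new <- predict(fit, newdata = valid, allow.new.levels = TRUE)

# re.form = NA drops the random effects entirely; since every year in
# 'valid' is new, the two sets of predictions should agree
p_fixed <- predict(fit, newdata = valid, re.form = NA)
all.equal(unname(p_new), unname(p_fixed))
```

Note that in this sketch the predictions for new years rest on the fixed effects only. The estimated between-year variance does not enter the point predictions, so it has to be accounted for separately when judging predictive uncertainty.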

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On behalf of Alice Jones
Sent: Tuesday 8 April 2014 10:23
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Query on predicting from GLMM to new data with a different range of values of the random term

Hi All,

I am fitting a Gaussian lmer model to a set of training data with a number of fixed effects and 'year' as a random effect.

I want to use the model that I have fitted to the training data (which spans the period 1982 - 2001) to predict to another data set which covers the period 2001 - 2011.  I am doing this as a form of external model validation (how good is my model at predicting to a different dataset that does not span the same temporal extent as the training data?).  However, I am getting an error message: "Error in (function (x, n)  : new levels detected in newdata".  This has led me to question whether it is even possible to use an lmer model to predict to data with values outside the range of the random effect in the training data.  I assume this must be possible, because many studies have used mixed effects models to predict into the future...

I have checked that the levels of all the fixed effect factors are the same between the two datasets (training and validation).  Additionally, I have tried specifying the random effect (year) with as.numeric and as.integer instead of as.factor (in order to work out if this was a factor-specific problem), but this has made no difference.

Any advice on this would be much appreciated.
Thanks,
Alice

Dr Alice Jones
ARC Research Associate
Global Ecology Lab &
School of Earth and Environmental Sciences

G44, Mawson Laboratory,
North Terrace Campus,
The University of Adelaide,
Australia 5005

alice.jones01 at adelaide.edu.au
Web profile: http://www.adelaide.edu.au/directory/alice.jones01
Ph:  +61 8 8313 2243

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.