[R-sig-ME] interactions

Iasonas Lamprianou lamprianou at yahoo.com
Mon Mar 29 23:54:09 CEST 2010


Dear colleagues,
Apologies for the long question, but I really need some help. This is not a case of academic laziness: I have done my homework, but now I need some advice.

I have 10,000 students nested in 50 schools, each student taking a test. The tests are randomly grouped into batches of 20 scripts (around 500 batches). Two randomly chosen markers (from a pool of 57) then mark each batch blindly. Each marker marks around 17 random batches of 20 scripts. If the two markers disagree by more than 10% on the total score for a particular student, that student's test is given to a third, randomly chosen marker for a blind re-marking. So I have students nested within schools, which are nested within areas. Scripts (the students' tests) are nested within batches, and batches are crossed with markers. However, the actual marking lasted around 15 days: one batch may be marked by the first marker on day 1 and by the second marker on the next day (or the day after, etc.). So the batches are also crossed with days.
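To make the crossed and nested structure concrete, here is a quick sketch of how I check it in R (the column names are those in my data frame mg2007_sub and should be treated as illustrative):

## Sketch only: the data are in long format, one row per marking of a script
str(mg2007_sub[, c("candidate", "school", "batch", "marker", "day", "gender", "score")])

## each batch should have been seen by two (sometimes three) different markers,
## and each marker should appear on several different days
with(mg2007_sub, table(tapply(marker, batch, function(x) length(unique(x)))))
with(mg2007_sub, table(tapply(day, marker, function(x) length(unique(x)))))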

My research question is: are there markers who are differentially severe or lenient on different days?

I use this model:

lmer(score ~ 1 + day + gender + school + (1 + day | marker) + (1 | candidate) + (1 | batch), data = mg2007_sub)

I assume this model is appropriate because it treats candidates as random effects (each candidate gets an ability estimate based on the scores he/she received from the two markers, or three if his/her script was re-marked). Every batch is also modelled as a random effect. Every marker is modelled as a random effect, but I allow his/her estimate to vary by day, which I assume lets me see whether a marker behaves differently on different days. So this should answer my research question.
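To make this concrete, this is roughly how I intend to inspect the marker-by-day effects once the model fits (a sketch only; I am assuming the fitted object is stored as fit.day):

## one row per marker, one column per random term: (Intercept), day2, ..., day15
re.marker <- ranef(fit.day)$marker
head(re.marker)

## crude ranking of markers by how much their day effects vary across days
day.cols <- grep("^day", colnames(re.marker))
sort(apply(re.marker[, day.cols, drop = FALSE], 1, sd), decreasing = TRUE)[1:10]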

Question 1: does this model make sense statistically (as I formulate it in lme4)?

So far so good, but lme4 runs for about 48 hours and then fails because it reaches the maximum number of iterations. I assume that even if I allowed more iterations, it would go on forever.

Question 2: Why is it so slow? Why does it not converge?
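For what it is worth, this is the kind of call I have been trying in order to watch the optimiser and give it more room (a sketch only; I am assuming that my version of lme4 honours maxIter/maxFN passed through the control list):

## watch the optimiser and raise the iteration limits
fit.day <- lmer(score ~ 1 + day + gender + school +
                (1 + day | marker) + (1 | candidate) + (1 | batch),
                data = mg2007_sub,
                verbose = TRUE,
                control = list(maxIter = 1000, maxFN = 1000))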

Then I decided to simplify the problem. Instead of modelling day as a factor, I use 'd', the day number (1, 2, 3, ...) counting from the start of marking. I assume this is an interval scale. If a marker becomes (linearly) more lenient over time, he/she would 'like' this model. So I run

lmer(score ~ 1 + day + gender + school + (1 + d | marker) + (1 | candidate) + (1 | batch), data = mg2007_sub)

Please notice that (1+day|marker), where day is a factor with 15 levels, becomes (1+d|marker), where d is a numeric variable.
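For completeness, this is roughly how I construct d (a sketch only; 'mark_date' is a placeholder for whatever the marking-date column is actually called in my data):

## numeric day since the start of marking: 1, 2, 3, ...
mg2007_sub$mark_date <- as.Date(mg2007_sub$mark_date)
mg2007_sub$d <- as.numeric(mg2007_sub$mark_date - min(mg2007_sub$mark_date)) + 1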

Fitting this simpler model gives:

Error terms:
 Groups    Name        Std.Dev. Corr  
 candidate (Intercept) 16.85          
 batch     (Intercept)  3.49          
 marker    (Intercept)  4.68          
           d            0.38    -0.77 
 Residual               4.83          
---
number of obs: 18001, groups: candidate, 9402; batch, 470; marker, 59
AIC = 138232, DIC = 138388.9
deviance = 138230.4 


Question 3: How do I know whether the variance of 'd' is statistically significant? MLwiN reports standard errors for its variance estimates.

Question 4: Is the correlation of -0.77 a problem?
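In case it clarifies what I am after, this is the kind of comparison I had in mind for Questions 3 and 4 (a sketch only; the model names are mine, and the usual caveats about testing a variance on its boundary apply):

## Q3: likelihood-ratio comparison with and without the random slope for d
## (the null puts the variance on the boundary, so the test tends to be conservative)
fit.d <- lmer(score ~ 1 + day + gender + school +
              (1 + d | marker) + (1 | candidate) + (1 | batch), data = mg2007_sub)
fit.0 <- lmer(score ~ 1 + day + gender + school +
              (1 | marker) + (1 | candidate) + (1 | batch), data = mg2007_sub)
anova(fit.d, fit.0)

## Q4: the same model with the intercept-slope correlation removed,
## to see how much the -0.77 correlation actually matters
fit.du <- lmer(score ~ 1 + day + gender + school +
               (1 | marker) + (0 + d | marker) + (1 | candidate) + (1 | batch),
               data = mg2007_sub)
anova(fit.d, fit.du)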

I hope somebody can help me; this has taken me three days so far...

Thank you 

jason



Dr. Iasonas Lamprianou


Assistant Professor (Educational Research and Evaluation)
Department of Education Sciences
European University-Cyprus
P.O. Box 22006
1516 Nicosia
Cyprus 
Tel.: +357-22-713178
Fax: +357-22-590539


Honorary Research Fellow
Department of Education
The University of Manchester
Oxford Road, Manchester M13 9PL, UK
Tel. 0044  161 275 3485
iasonas.lamprianou at manchester.ac.uk

