[R-sig-ME] interactions

Tue Mar 30 10:29:04 CEST 2010

Thank you for the email. 
I want to see whether a marker is differentially severe or lenient in different days. Therefore, I need to get an overall severity for each marker (probably a fixed effect?), but also an additional estimate for every marker on each day (random effect?). 
I am not sure if I need a varying intercept per marker per day or if the correct term for this is 'slope' as you call it. (Since 'day' is categorical, should we use the term 'intercept' or 'slope'?)
In any case, now that I explained to you what I want to do, do you still think that (1|marker/day) is an option? I am not sure how different this is compared to (day|marker). 

I know that we all have limited time, but any help is welcome

thanks

jason

Dr. Iasonas Lamprianou

Assistant Professor (Educational Research and Evaluation)
Department of Education Sciences
European University-Cyprus
P.O. Box 22006
1516 Nicosia
Cyprus 
Tel.: +357-22-713178
Fax: +357-22-590539

Honorary Research Fellow
Department of Education
The University of Manchester
Oxford Road, Manchester M13 9PL, UK
Tel. 0044  161 275 3485
iasonas.lamprianou at manchester.ac.uk

--- On Tue, 30/3/10, ONKELINX, Thierry <Thierry.ONKELINX at inbo.be> wrote:

> From: ONKELINX, Thierry <Thierry.ONKELINX at inbo.be>
> Subject: RE: [R-sig-ME] interactions
> To: "Iasonas Lamprianou" <lamprianou at yahoo.com>, r-sig-mixed-models at r-project.org
> Date: Tuesday, 30 March, 2010, 9:08
> Dear Jason,
> 
> Are you interested in the variability of the random
> intercept for each day along marker? If you just want to
> allow a different random intercept per day and per marker,
> you could consider (1|marker/day) instead of (day|marker).
> If day has 15 levels then (day|marker) has to estimate 15
> variances and 105 covariances! This is probably the reason
> why the model doesn't converge. (1|marker/day) requires only
> 2 variances to be estimated.
> 
> HTH,
> 
> Thierry
> 
> 
> ----------------------------------------------------------------------------
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek
> team Biometrie & Kwaliteitszorg
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> 
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> 
> tel. + 32 54/436 185
> Thierry.Onkelinx at inbo.be
> www.inbo.be
> 
> To call in the statistician after the experiment is done
> may be no more than asking him to perform a post-mortem
> examination: he may be able to say what the experiment died
> of.
> ~ Sir Ronald Aylmer Fisher
> 
> The plural of anecdote is not data.
> ~ Roger Brinner
> 
> The combination of some data and an aching desire for an
> answer does not ensure that a reasonable answer can be
> extracted from a given body of data.
> ~ John Tukey
>   
> 
> > -----Oorspronkelijk bericht-----
> > Van: r-sig-mixed-models-bounces at r-project.org
> 
> > [mailto:r-sig-mixed-models-bounces at r-project.org]
> Namens 
> > Iasonas Lamprianou
> > Verzonden: maandag 29 maart 2010 23:54
> > Aan: r-sig-mixed-models at r-project.org
> > Onderwerp: Re: [R-sig-ME] interactions
> > 
> > 
> > Dear colleagues,
> > apologies for the long question, but I really need
> some help. 
> > It is not a case of academic laziness, I have done my
> 
> > homework but now I need some advice.
> > 
> > I have 10000 students nested in 50 schools, each
> student 
> > taking a test. The tests are randomly grouped in
> batches of 
> > 20 scripts (around 500 batches). Then, two random
> markers 
> > (out of a pool of 57 markers) mark each batch blindly.
> Each 
> > marker marks around 17 random batches of 20 scripts.
> If the 
> > two random markers disagree by more than 10% on the
> total 
> > score for a specific student, then the test of the
> student is 
> > being given to another marker randomly for a third
> blind 
> > marking. So, I have students nested within schools
> which are 
> > nested in areas. Then, scripts (the tests of the
> students) 
> > nested into batches. Then, batches crossed with
> markers. 
> > However, the actual marking lasted around 15 days. One
> batch 
> > may be marked by the first marker on day 1 and then
> marked by 
> > the second marker on the next day (or the day after
> etc). So 
> > the batches are also crossed by days. 
> > 
> > My Research question is: are there markers who are 
> > differentially severe or lenient on different days? 
> > 
> > I use this model:
> > 
> > lmer(score ~ 
> >
> 1+day+gender+school+(1+day|marker)+(1|candidate)+(1|batch),
> 
> > mg2007_sub)
> > 
> > I assume that this model is appropriate, because it
> models 
> > candidates as random effects (each candidate has an
> ability 
> > estimate based on the scores he/she received from the
> two or 
> > the three markers if his/her script was remarked by a
> third 
> > marker). Also, every batch is modelled as a random
> effect. 
> > Every marker is modelled as a random effect but I
> allow 
> > his/her estimate to vary by day, so I assume that this
> would 
> > allow me to see if a marker behaves in a different way
> on 
> > different days. So this should answer my research
> question.
> > 
> > Question 1: does this model make sense statistically
> (as i 
> > formulate it in lme4?)
> > 
> > So far so good, but lme4 will run for 48 hours and
> then fail 
> > because it reached maximum iterations. I assume that
> even if 
> > I allow for more iterations, it will go on for ever.
> > 
> > Question 2: Why is it so slow? Why it does not
> converge?
> > 
> > Then, I decided to simplify the problem. Instead of
> modelling 
> > day as a factor, I use 'd' which is the cardinal
> number of 
> > day e.g. 1, 2, 3, days that passed since we started
> marking. 
> > I assume that this is an interval scale. If a marker
> becomes 
> > (linearly) more lenient, he/she would 'like' this
> model. So I run
> > 
> > lmer(score ~ 
> >
> 1+day+gender+dvhool+(1+d|marker)+(1|candidate)+(1|batch),
> mg2007_sub)
> > 
> > please notice that (1+day|marker) where day is a
> factor with 
> > 15 levels, becomes (1+d|marker) which is numeric
> variable
> > 
> > and this gives
> > 
> > Error terms:
> >  Groups    Name     
>   Std.Dev. Corr  
> >  candidate (Intercept) 16.85     
>     
> >  batch     (Intercept) 
> 3.49          
> >  marker    (Intercept)  4.68 
>         
> >            d 
>           0.38    -0.77 
> >  Residual         
>      4.83       
>   
> > ---
> > number of obs: 18001, groups: candidate, 9402; batch,
> 470; 
> > marker, 59 AIC = 138232, DIC = 138388.9 deviance =
> 138230.4 
> > 
> > 
> > Question 3: How do I know that the variance of 'd' is
> 
> > statistically significant? MlWin gives some error
> estimates.
> > 
> > Question 4: The correlation of -0.77 is a problem?
> > 
> > I hope somebody can help me, this has taken me three
> days up to now...
> > 
> > Thank you 
> > 
> > jason
> > 
> > 
> > 
> > Dr. Iasonas Lamprianou
> > 
> > 
> > Assistant Professor (Educational Research and
> Evaluation) 
> > Department of Education Sciences European
> University-Cyprus 
> > P.O. Box 22006
> > 1516 Nicosia
> > Cyprus
> > Tel.: +357-22-713178
> > Fax: +357-22-590539
> > 
> > 
> > Honorary Research Fellow
> > Department of Education
> > The University of Manchester
> > Oxford Road, Manchester M13 9PL, UK
> > Tel. 0044  161 275 3485
> > iasonas.lamprianou at manchester.ac.uk
> > 
> > 
> > --- On Mon, 29/3/10, r-sig-mixed-models-request at r-project.org
> 
> > <r-sig-mixed-models-request at r-project.org>
> wrote:
> > 
> > > From: r-sig-mixed-models-request at r-project.org
> 
> > <r-sig-mixed-models-request at r-project.org>
> > > Subject: R-sig-mixed-models Digest, Vol 39, Issue
> 47
> > > To: r-sig-mixed-models at r-project.org
> > > Date: Monday, 29 March, 2010, 21:25
> > > Send R-sig-mixed-models mailing list
> > > submissions to
> > >     r-sig-mixed-models at r-project.org
> > > 
> > > To subscribe or unsubscribe via the World Wide
> Web, visit
> > >     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > or, via email, send a message with subject or
> body 'help'
> > > to
> > >     r-sig-mixed-models-request at r-project.org
> > > 
> > > You can reach the person managing the list at
> > >     r-sig-mixed-models-owner at r-project.org
> > > 
> > > When replying, please edit your Subject line so
> it is more
> > > specific
> > > than "Re: Contents of R-sig-mixed-models
> digest..."
> > > 
> > > 
> > > Today's Topics:
> > > 
> > >    1. Re: The issue of increasing maximum
> > > number of iterations
> > >       (Ben Bolker)
> > >    2. poisson GLMER with identity link (Tim
> > > Carnus)
> > >    3. Re: unbalanced data in nested lmer
> > > model (Jana B?rger)
> > >    4. Re: Correlation of -1: is it a
> > > problem? (Eric Castet)
> > >    5. Re: unbalanced data in nested lmer
> > > model (Luca Borger)
> > >    6. Re: unbalanced data in nested lmer
> > > model (Ben Bolker)
> > > 
> > > 
> > > 
> >
> ----------------------------------------------------------------------
> > > 
> > > Message: 1
> > > Date: Mon, 29 Mar 2010 15:50:30 +0000 (UTC)
> > > From: Ben Bolker <bolker at ufl.edu>
> > > To: r-sig-mixed-models at r-project.org
> > > Subject: Re: [R-sig-ME] The issue of increasing
> maximum
> > > number of
> > >     iterations
> > > Message-ID: <loom.20100329T171254-536 at post.gmane.org>
> > > Content-Type: text/plain; charset=us-ascii
> > > 
> > > Martin Stjernman <Martin.Stjernman at ...>
> writes:
> > > 
> > > > 
> > > > Dear listmembers!
> > > > 
> > > > I have also had trouble with the maxIter
> specification
> > > in (g)lmer. 
> > > > I've tried to increase the number of
> > > > iterations (to 1000) via the control
> statement in lmer
> > > 
> > > > but it still stops at 300 (the default).
> > > > We are now a number of users having problem
> with this
> > > but 
> > > > I haven't seen any solution to the problem
> on this
> > > > list and I would really appreciate any hint
> on what I
> > > am doing wrong.
> > > > I am running on a Vista 64-bit machine and
> my model
> > > specification and
> > > >  session info follows below. Please let
> > > > me know if any other information needs to
> be
> > > attached.
> > > 
> > >   Looking at the source code in lmer.R, this
> seems
> > > like
> > > a moderately obvious hole: on or about line 714
> (I may be
> > > off by a couple of lines) we have
> > > 
> > >     if (missing(verbose)) verbose <-
> > > cv$msVerbose
> > > ### FIXME: issue a warning if the model argument
> is
> > > FALSE.  It is ignored. 
> > > 
> > >  adding the lines
> > > 
> > >     FL$dims["mxit"] <- cv$maxIter
> > >     FL$dims["mxfn"] <- cv$maxFN
> > > 
> > >   appears to work: running
> > > 
> > >  (gm1 <- glmer(cbind(incidence, size -
> incidence) ~
> > > period + 
> > >    (1 | herd), family=binomial, data=cbpp,
> > > control=list(maxIter=1000)))
> > > 
> > > produces a model with the correct iteration
> number listed
> > > in its
> > > guts.
> > > 
> > >   The next question is whether you or someone at
> your
> > > institution
> > > is capable of modifying lmer.R appropriately and
> rebuilding
> > > the
> > > package ...
> > > 
> > > 
> > > 
> > > ------------------------------
> > > 
> > > Message: 2
> > > Date: Mon, 29 Mar 2010 17:45:08 +0100
> > > From: Tim Carnus <Tim.Carnus at ucd.ie>
> > > To: r-sig-mixed-models at r-project.org
> > > Subject: [R-sig-ME] poisson GLMER with identity
> link
> > > Message-ID:
> <1269881108.3596.24.camel at tim-laptop>
> > > Content-Type: text/plain; charset=UTF-8
> > > 
> > > Dear list,
> > > 
> > > I am trying to fit a number of GLMERs to count
> data with an
> > > additive
> > > model (in the predictors) that requires the use
> of the
> > > identity link
> > > function. For about half of my response variables
> this
> > > causes no
> > > problems. However in a number of cases the model
> fitting
> > > runs into
> > > problems with regards estimation of negative mean
> (for e.g.
> > > the error
> > > message in mer_finalize: mu[i] must be positive:
> mu =
> > > 1.76267e-312, i =
> > > 13075456). As far as I understand this is well
> known and
> > > documented, and
> > > guarding against that possibility is necessary,
> and built
> > > in to say the
> > > glm() function.
> > > 
> > > My question then is, how can I do this with lmer?
> (ie how
> > > can I specify
> > > the constraints necessary to fit these types of
> models, if
> > > at all
> > > possible)
> > > 
> > > Best regards,
> > > 
> > > Tim Carnus
> > > 
> > > 
> > > 
> > > ------------------------------
> > > 
> > > Message: 3
> > > Date: Mon, 29 Mar 2010 16:17:41 +0200
> > > From: Jana B?rger <jana.buerger at uni-rostock.de>
> > > To: Andrew Dolman <andydolman at gmail.com>
> > > Cc: "r-sig-mixed-models at r-project.org"
> > >     <r-sig-mixed-models at r-project.org>
> > > Subject: Re: [R-sig-ME] unbalanced data in nested
> lmer
> > > model
> > > Message-ID: <4BB0B685.70506 at uni-rostock.de>
> > > Content-Type: text/plain; charset="UTF-8";
> format=flowed
> > > 
> > > Dear Andrew and other list members,
> > > As I described in an earlier 
> > > 
> > post(https://stat.ethz.ch/pipermail/r-sig-mixed-models/2010q1/
> > 003503.html)
> > > my data is actually hierarchical down to the
> level of
> > > fields within farms.
> > > 
> > > There is more than 500 cases on 8 farms in 6
> regions.
> > > Would you not think that gives enough power to
> distinguish
> > > within region 
> > > variability vs. between regions?
> > > 
> > > Moreover I don't understand your argument that
> fitting
> > > random efects 
> > > with less than 5 levels was dodgy, as often
> examples in the
> > > books have 3 
> > > samples from one beach, or 3 laboratory workers
> within one
> > > laboratory. 
> > > These are less than 5 levels, are they not?
> > > 
> > > Regards, Jana
> > > 
> > > Andrew Dolman schrieb:
> > > > Dear Jana,
> > > > 
> > > >  >An anova(lm1, lm2) 
> > >
> lm1<-lmer(y~x1+x2+...+(1|region)+(1|region:farm)); 
> > > > lm2<-lmer(y+x1+x2+...+(1|farm)) said
> models did not
> > > differ significantly 
> > > > and AIC was about the same. So I know there
> is no
> > > additional explanatory 
> > > > power including the region term.
> > > > 
> > > >  >Yet, I would like to keep the region
> effect
> > > in the model to separate 
> > > > and compare the effect size of region vs.
> farm. Is it
> > > valid to do so 
> > > > even if  some of the regions are only
> represented
> > > by one farm?
> > > > 
> > > > I don't think you have the data to ask
> questions about
> > > differences 
> > > > between regions as distinct from differences
> between
> > > farms. Look at it 
> > > > this way. If you were just doing a normal
> comparison
> > > between regions and 
> > > > you only looked at 1 or 2 farms per region,
> would you
> > > have the 
> > > > statistical power to say that differences
> were due to
> > > region rather than 
> > > > farm? Answer = No.
> > > > 
> > > > Similarly, are the differences between the
> farms
> > > because they are in 
> > > > different regions or just normal variation
> between
> > > farms? Well you only 
> > > > have 2 farms per region so it's hard to
> tell. Maybe
> > > you just have enough 
> > > > data if pairs of farms within regions are
> always very
> > > similar and 
> > > > differences between regions large.
> > > > 
> > > > Also. Fitting random effects with fewer than
> 5 levels
> > > is dodgy, and you 
> > > > only have 2 levels of farm per region,
> sometimes 1.
> > > > 
> > > > Perhaps you could look at it this way.
> > > > 
> > > > compare
> > > > 
> > > > m1 <- lmer (y~(1|region))
> > > > m2 <- lmer (y~(1|farm))
> > > > 
> > > > If m2 is better then there is variation
> between farms
> > > within regions, if 
> > > > there's no difference then region accounts
> for most of
> > > the variation. 
> > > > BUT you've not got much power to detect farm
> effects
> > > within regions, so 
> > > > a null result is not strong evidence for the
> absence
> > > of farm variation 
> > > > within regions.
> > > > 
> > > > 
> > > > Andy.
> > > >  
> > > > 
> > > > 
> > > > andydolman at gmail.com
> > > <mailto:andydolman at gmail.com>
> > > > 
> > > > 
> > > > 
> > > 
> > > -- 
> > > Jana B?rger
> > > 
> > > Universit?t Rostock
> > > Agrar-  und Umweltwissenschaftliche Fakult?t
> > > FG Phytomedizin
> > > Satower Stra?e 48
> > > 18059 Rostock
> > > 
> > > Tel. 0381-498 31 71
> > > Fax.0381-498 31 62
> > > 
> > > 
> > > 
> > > ------------------------------
> > > 
> > > Message: 4
> > > Date: Mon, 29 Mar 2010 19:45:15 +0200
> > > From: Eric Castet <Eric.Castet at incm.cnrs-mrs.fr>
> > > To: Douglas Bates <bates at stat.wisc.edu>
> > > Cc: r-sig-mixed-models at r-project.org
> > > Subject: Re: [R-sig-ME] Correlation of -1: is it
> a
> > > problem?
> > > Message-ID: <4BB0E72B.2070404 at incm.cnrs-mrs.fr>
> > > Content-Type: text/plain
> > > 
> > > Dear Doug,
> > > 
> > > Thanks for your reply.
> > > 
> > > I've done as you suggested (your point b/), i.e.
> I've fit
> > > another model 
> > > that considers 'couleurs' within 'nom'
> > > However, after running anova() (likelihood ratio
> test), I
> > > find that I 
> > > should keep the initial model that contains the
> -1
> > > correlation:
> > > 

> > >  > anova (jb.lmer1, jb.lmer2)
> > > Data: jb
> > > Models:
> > > jb.lmer2: lRT ~ couleurs + (1 | nom) + (1 |
> nom:couleurs)
> > > jb.lmer1: lRT ~ couleurs + (1 + couleurs | nom)
> > >          
> > > Df   AIC   BIC 
> > > logLik  Chisq Chi Df Pr(>Chisq)
> > > jb.lmer2  5 17260 17295 -8625.2
> > > jb.lmer1  6 17251 17293 -8619.4 11.584   
> > >   1  0.0006654 ***
> > > 
> > > So, the question is now:
> > > 
> > > a/ How can I justify to refuse the initial model
> that
> > > contains the -1 
> > > correlation?
> > > 
> > > b/ And a parallel question is: what is wrong with
> the -1
> > > correlation? Is 
> > > it because it is exactly -1 ? Would it still be a
> problem
> > > if it were -0.99 ?
> > > 
> > > c/ Ultimately, how can I report what appears to
> me as an
> > > important 
> > > result: namely the high correlation between the
> intercept
> > > and the 
> > > 'couleurs' effect for each subject?
> > > 
> > > 
> > > Thanks,
> > > Eric
> > > 
> > > 
> > > 
> > > 
> > > 
> > > >> a/ is it really a statistical (or
> numerical)
> > > problem to have a -1
> > > >> correlation in the model that I should
> keep?
> > > >>      
> > > >
> > > > Yes, it is.  The fitted model is has a
> singular
> > > variance-covariance
> > > > matrix for the random effects and that is
> not
> > > good.  In fact, it is no
> > > > longer a linear mixed model.
> > > >
> > > >    
> > > >> b/ is it possible to remove the
> correlation
> > > between Intercept and
> > > >> Couleurs, as I would do if Couleurs were
> not a
> > > categorical factor?
> > > >>      
> > > >
> > > > I would fit another model of
> > > >
> > > > IRT ~ couleurs + (1|nom:couleurs) + (1|nom)
> > > >
> > > > and see how that works.  This model is, in
> some
> > > sense, intermediate to
> > > > the models that you have fit above.
> > > >
> > > >    
> > > >> Thanks in advance,
> > > >>
> > > >> Eric Castet
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >>
> > > >> Eric Castet
> > > >>
> > > >> Institut de Neurosciences Cognitives de
> la
> > > Méditerranée -- INCM CNRS
> > > >>
> > > >> 31 chemin Joseph Aiguier
> > > >>
> > > >> 13402 Marseille cedex 20 (France)
> > > >>
> > > >> tel : (+33)(0)4-91-16-43-34
> > > >>
> > > >> fax : (+33) (0)4-91-16-44-98
> > > >>
> > > >> UMR 6193 du CNRS
> > > >>
> > > >> Université Aix-Marseille II
> > > >>
> > > >> http://www.incm.cnrs-mrs.fr/equipedyva.php
> > > >>
> > > >> http://www.incm.cnrs-mrs.fr/pperso/ecastet.php
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>     
> > >    [[alternative HTML version deleted]]
> > > >>
> > > >>
> > > >>
> _______________________________________________
> > > >> R-sig-mixed-models at r-project.org 
> > > mailing list
> > > >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > >>
> > > >>
> > > >>      
> > > >
> > > >    
> > > 
> > > -- 
> > > 
> > > Eric Castet
> > > 
> > > Institut de Neurosciences Cognitives de la
> Méditerranée
> > > -- INCM CNRS
> > > 
> > > 31 chemin Joseph Aiguier
> > > 
> > > 13402 Marseille cedex 20 (France)
> > > 
> > > tel : (+33)(0)4-91-16-43-34
> > > 
> > > fax : (+33) (0)4-91-16-44-98
> > > 
> > > UMR 6193 du CNRS
> > > 
> > > Université Aix-Marseille II
> > > 
> > > http://www.incm.cnrs-mrs.fr/equipedyva.php
> > > 
> > > http://www.incm.cnrs-mrs.fr/pperso/ecastet.php
> > > 
> > > 
> > >     [[alternative HTML version deleted]]
> > > 
> > > 
> > > 
> > > ------------------------------
> > > 
> > > Message: 5
> > > Date: Mon, 29 Mar 2010 13:55:01 -0400
> > > From: "Luca Borger" <lborger at uoguelph.ca>
> > > To: Jana B?rger <jana.buerger at uni-rostock.de>,   
> > > "Andrew Dolman"
> > >     <andydolman at gmail.com>
> > > Cc: r-sig-mixed-models at r-project.org
> > > Subject: Re: [R-sig-ME] unbalanced data in nested
> lmer
> > > model
> > > Message-ID:
> > > <D63BEA7FF21244B88504AA33F4E7C14D at lborger>
> > > Content-Type: text/plain; format=flowed;
> charset="utf-8";
> > >     reply-type=response
> > > 
> > > Hello,
> > > 
> > > as Andrew and others already explained:
> > > 
> > > > There is more than 500 cases
> > > 
> > > Fine, this might give you reasonable estimates
> about how
> > > your y is affected 
> > > by your fixed effects covariates (x1,x2,...)
> > > 
> > > > (...) on 8 farms in 6 regions.
> > > #and, from your previos post
> > > >For 2 of 8 regions there is only 1 farm, the
> other
> > > regions have 2 farms.
> > > 
> > > thus no way to estimate a difference between
> region or farm
> > > effects for 2 
> > > regions, and very, very limited power for the
> other 6 (just
> > > 2 farms per 
> > > region). To make things worse your data are also
> quite
> > > unbalanced:
> > > 
> > > >unbalance of case numbers in cells? Or would
> it be no
> > > problem if cell sizes 
> > > >vary between 0 and 53?
> > > 
> > > which I think means for some farms you got only
> one record?
> > > Anyway, to 
> > > recap, probably OK data for understanding y~x1+x2
> etc.,
> > > insufficient data 
> > > otherwise (should invest in getting data for more
> farms
> > > within regions, not 
> > > more data for the farms you have already
> sampled).
> > > 
> > > > Moreover I don't understand your argument
> that fitting
> > > random efects with 
> > > > less than 5 levels was dodgy, as often
> examples in the
> > > books have 3 
> > > > samples from one beach, or 3 laboratory
> workers within
> > > one laboratory. 
> > > > These are less than 5 levels, are they not?
> > > 
> > > These are usually toy datasets to exemplify how
> the
> > > approach works, I do not 
> > > think they make a claim that the resulting
> variance
> > > estimates are very 
> > > reliable (think in the Zuur etal. mixed effects
> book you
> > > can find more 
> > > realistic examples, if I remember well). Plus,
> "level"
> > > refers to the number 
> > > of beaches or the number of labs etc. and the
> resulting
> > > variance estimates - 
> > > if less than say 5 it appears that you might be
> better off
> > > fitting it as a 
> > > fixed effect and not trying to decompose the
> variance into
> > > between labs and 
> > > within labs etc. Anyway, just my 2 cents and hope
> I
> > > explained this 
> > > correctly...
> > > 
> > > 
> > > See also the wiki page set up by Ben Bolker:
> > > http://glmm.wikidot.com/faq
> > > 
> > > e.g. you might be interested in this entry
> therein:
> > > 
> > > Zero or very small random effects variance
> estimates;
> > > (...)
> > > Very small variance estimates, or very large
> correlation
> > > estimates, often 
> > > indicates unidentifiability/lack of data (either
> due to
> > > exact 
> > > identifiability [e.g. designs that are not
> replicated at an
> > > important level] 
> > > or weak identifiable (designs that would be
> workable with
> > > more data of the 
> > > same type)
> > > 
> > > 
> > > HTH
> > > 
> > > 
> > > Cheers,
> > > 
> > > Luca
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > ----- Original Message ----- 
> > > From: "Jana B?rger" <jana.buerger at uni-rostock.de>
> > > To: "Andrew Dolman" <andydolman at gmail.com>
> > > Cc: <r-sig-mixed-models at r-project.org>
> > > Sent: Monday, March 29, 2010 10:17 AM
> > > Subject: Re: [R-sig-ME] unbalanced data in nested
> lmer
> > > model
> > > 
> > > 
> > > > Dear Andrew and other list members,
> > > > As I described in an earlier 
> > > > 
> > post(https://stat.ethz.ch/pipermail/r-sig-mixed-models/2010q1/
> > 003503.html)
> > > > my data is actually hierarchical down to the
> level of
> > > fields within farms.
> > > >
> > > > There is more than 500 cases on 8 farms in 6
> regions.
> > > > Would you not think that gives enough power
> to
> > > distinguish within region 
> > > > variability vs. between regions?
> > > >
> > > > Moreover I don't understand your argument
> that fitting
> > > random efects with 
> > > > less than 5 levels was dodgy, as often
> examples in the
> > > books have 3 
> > > > samples from one beach, or 3 laboratory
> workers within
> > > one laboratory. 
> > > > These are less than 5 levels, are they not?
> > > >
> > > > Regards, Jana
> > > >
> > > > Andrew Dolman schrieb:
> > > >> Dear Jana,
> > > >>
> > > >>  >An anova(lm1, lm2) 
> > >
> lm1<-lmer(y~x1+x2+...+(1|region)+(1|region:farm)); 
> > > >> lm2<-lmer(y+x1+x2+...+(1|farm)) said
> models did
> > > not differ significantly 
> > > >> and AIC was about the same. So I know
> there is no
> > > additional explanatory 
> > > >> power including the region term.
> > > >>
> > > >>  >Yet, I would like to keep the
> region
> > > effect in the model to separate 
> > > >> and compare the effect size of region
> vs. farm. Is
> > > it valid to do so even 
> > > >> if  some of the regions are only
> represented
> > > by one farm?
> > > >>
> > > >> I don't think you have the data to ask
> questions
> > > about differences 
> > > >> between regions as distinct from
> differences
> > > between farms. Look at it 
> > > >> this way. If you were just doing a
> normal
> > > comparison between regions and 
> > > >> you only looked at 1 or 2 farms per
> region, would
> > > you have the 
> > > >> statistical power to say that
> differences were due
> > > to region rather than 
> > > >> farm? Answer = No.
> > > >>
> > > >> Similarly, are the differences between
> the farms
> > > because they are in 
> > > >> different regions or just normal
> variation between
> > > farms? Well you only 
> > > >> have 2 farms per region so it's hard to
> tell.
> > > Maybe you just have enough 
> > > >> data if pairs of farms within regions
> are always
> > > very similar and 
> > > >> differences between regions large.
> > > >>
> > > >> Also. Fitting random effects with fewer
> than 5
> > > levels is dodgy, and you 
> > > >> only have 2 levels of farm per region,
> sometimes
> > > 1.
> > > >>
> > > >> Perhaps you could look at it this way.
> > > >>
> > > >> compare
> > > >>
> > > >> m1 <- lmer (y~(1|region))
> > > >> m2 <- lmer (y~(1|farm))
> > > >>
> > > >> If m2 is better then there is variation
> between
> > > farms within regions, if 
> > > >> there's no difference then region
> accounts for
> > > most of the variation. BUT 
> > > >> you've not got much power to detect farm
> effects
> > > within regions, so a 
> > > >> null result is not strong evidence for
> the absence
> > > of farm variation 
> > > >> within regions.
> > > >>
> > > >>
> > > >> Andy.
> > > >>  andydolman at gmail.com
> > > <mailto:andydolman at gmail.com>
> > > >>
> > > >>
> > > >>
> > > >
> > > > -- 
> > > > Jana B?rger
> > > >
> > > > Universit?t Rostock
> > > > Agrar-  und Umweltwissenschaftliche
> Fakult?t
> > > > FG Phytomedizin
> > > > Satower Stra?e 48
> > > > 18059 Rostock
> > > >
> > > > Tel. 0381-498 31 71
> > > > Fax.0381-498 31 62
> > > >
> > > >
> _______________________________________________
> > > > R-sig-mixed-models at r-project.org
> > > mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > >
> > > 
> > > 
> > > 
> > > ------------------------------
> > > 
> > > Message: 6
> > > Date: Mon, 29 Mar 2010 16:25:12 -0400
> > > From: Ben Bolker <bolker at ufl.edu>
> > > To: Luca Borger <lborger at uoguelph.ca>
> > > Cc: Andrew Dolman <andydolman at gmail.com>,
> > >     "r-sig-mixed-models at r-project.org"
> > > <r-sig-mixed-models at r-project.org>,
> > >     Jana B?rger <jana.buerger at uni-rostock.de>
> > > Subject: Re: [R-sig-ME] unbalanced data in nested
> lmer
> > > model
> > > Message-ID: <4BB10CA8.6030803 at ufl.edu>
> > > Content-Type: text/plain; charset=UTF-8
> > > 
> > > Luca Borger wrote:
> > > 
> > > [Jana B?rger:]
> > > >> Moreover I don't understand your
> argument that
> > > fitting random efects with 
> > > >> less than 5 levels was dodgy, as often
> examples in
> > > the books have 3 
> > > >> samples from one beach, or 3 laboratory
> workers
> > > within one laboratory. 
> > > >> These are less than 5 levels, are they
> not?
> > > > 
> > > > These are usually toy datasets to exemplify
> how the
> > > approach works, I do not 
> > > > think they make a claim that the resulting
> variance
> > > estimates are very 
> > > > reliable (think in the Zuur etal. mixed
> effects book
> > > you can find more 
> > > > realistic examples, if I remember well).
> Plus, "level"
> > > refers to the number 
> > > > of beaches or the number of labs etc. and
> the
> > > resulting variance estimates - 
> > > > if less than say 5 it appears that you might
> be better
> > > off fitting it as a 
> > > > fixed effect and not trying to decompose the
> variance
> > > into between labs and 
> > > > within labs etc. Anyway, just my 2 cents and
> hope I
> > > explained this 
> > > > correctly... 
> > > > 
> > > > See also the wiki page set up by Ben
> Bolker:
> > > > http://glmm.wikidot.com/faq
> > > > 
> > > > e.g. you might be interested in this entry
> therein:
> > > > 
> > > > Zero or very small random effects variance
> estimates;
> > > > (...)
> > > > Very small variance estimates, or very
> large
> > > correlation estimates, often 
> > > > indicates unidentifiability/lack of data
> (either due
> > > to exact 
> > > > identifiability [e.g. designs that are not
> replicated
> > > at an important level] 
> > > > or weak identifiable (designs that would be
> workable
> > > with more data of the 
> > > > same type)
> > > 
> > >   I just added this to the FAQ:
> > > 
> > > Should I treat factor xxx as fixed or random?
> > > 
> > > This is in general a far more difficult question
> than it
> > > seems on the
> > > surface. There are many competing philosophies
> and
> > > definitions (see
> > > Gelman 2xxx). One point of particular relevance
> to 'modern'
> > > mixed model
> > > estimation (rather than 'classical'
> method-of-moments
> > > estimation) is
> > > that, for practical purposes, there must be a
> reasonable
> > > number of
> > > random-effects levels (e.g. blocks) ? more than 5
> or 6 at a
> > > minimum.
> > > 
> > >     e.g., from Crawley (2002) p. 670: "Are
> there
> > > enough levels of the
> > > factor in the data on which to base an estimate
> of the
> > > variance of the
> > > population of effects? No, means [you should
> probably treat
> > > the variable
> > > as] fixed effects."
> > > 
> > > Some researchers (who treat fixed vs random as a
> > > philosophical rather
> > > than a pragmatic decision) object to this
> approach.
> > > 
> > > Treating factors with small numbers of levels as
> random
> > > will in the best
> > > case lead to very small estimates of random
> effects; in the
> > > worst case
> > > it will lead to various numerical difficulties
> such as lack
> > > of
> > > convergence, zero variance estimates, etc.. In
> the
> > > classical
> > > method-of-moments approach these problems do not
> arise
> > > (because the sums
> > > of squares are always well defined as long as
> there are at
> > > least two
> > > units), but the underlying problems of lack of
> power are
> > > there nevertheless.
> > > 
> > >    (Contributions welcome!)
> > > 
> > > -- 
> > > Ben Bolker
> > > Associate professor, Biology Dep't, Univ. of
> Florida
> > > bolker at ufl.edu /
> > > people.biology.ufl.edu/bolker
> > > GPG key:
> > >
> people.biology.ufl.edu/bolker/benbolker-publickey.asc
> > > 
> > > 
> > > 
> > > ------------------------------
> > > 
> > > _______________________________________________
> > > R-sig-mixed-models mailing list
> > > R-sig-mixed-models at r-project.org
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > 
> > > 
> > > End of R-sig-mixed-models Digest, Vol 39, Issue
> 47
> > >
> **************************************************
> > > 
> > 
> > 
> >     
> > _______________________________________________
> > R-sig-mixed-models at r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > 
> 
> Druk dit bericht a.u.b. niet onnodig af.
> Please do not print this message unnecessarily.
> 
> Dit bericht en eventuele bijlagen geven enkel de visie van
> de schrijver weer 
> en binden het INBO onder geen enkel beding, zolang dit
> bericht niet bevestigd is
> door een geldig ondertekend document. The views expressed
> in  this message 
> and any annex are purely those of the writer and may not be
> regarded as stating 
> an official position of INBO, as long as the message is not
> confirmed by a duly 
> signed document.
>