[R-sig-ME] Unbalanced nested design

Thu Apr 23 01:13:15 CEST 2009

Hi Stephen -
You'd get an exactly equivalent result if you calculated averages over
site and based your analysis on those.  The analysis can then be a
one-way anova on the site means, with region as the fixed effect and
with just one source of random variation.  So the unbalance does not
matter, except that the F-statistic is an approximate F-statistic.
(For comparisons between the region with 3 sites and the other
regions, you need the Welch or Satterthwaite approximation; see
help(t.test).  The unbalance is however relatively mild, as such
things go!)
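
As a rough illustration of the site-means equivalence, here is a minimal sketch on simulated data.  The region labels A to D, the effect sizes and the object names sim, site_info and site_means are all made up for the example; only the layout (6, 6, 6 and 3 sites, 20 quadrats per site) follows the post.

```r
library(nlme)  # recommended package, shipped with R

set.seed(1)
# Hypothetical layout mirroring the post: three regions with 6 sites,
# one region with 3 sites, 20 quadrats per site.  All values are simulated.
sites_per_region <- c(A = 6, B = 6, C = 6, D = 3)
reg <- rep(names(sites_per_region), times = sites_per_region)
site_info <- data.frame(
  region = factor(reg),
  site3  = factor(paste(reg, sequence(sites_per_region), sep = ":"))
)
sim <- site_info[rep(seq_len(nrow(site_info)), each = 20), ]

region_eff <- c(A = 0, B = 200, C = 400, D = 100)    # assumed region effects
site_eff   <- rnorm(nrow(site_info), sd = 300)       # assumed between-site sd
names(site_eff) <- as.character(site_info$site3)
sim$recruits <- unname(500 + region_eff[as.character(sim$region)] +
                       site_eff[as.character(sim$site3)] +
                       rnorm(nrow(sim), sd = 170))   # assumed within-site sd

# Mixed model on the raw quadrat counts
fit_lme <- lme(recruits ~ region, random = ~ 1 | site3, data = sim)
F_lme   <- anova(fit_lme)[["F-value"]][2]            # row 2 is region

# One-way anova on the 21 site means
site_means <- aggregate(recruits ~ region + site3, data = sim, FUN = mean)
fit_means  <- lm(recruits ~ region, data = site_means)
F_means    <- anova(fit_means)[["F value"]][1]       # row 1 is region

print(c(lme = F_lme, site_means = F_means))
```

With equal numbers of quadrats per site and a positive estimate of the site variance, the two F-values should agree up to the convergence tolerance of lme, both on 3 and 17 degrees of freedom.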

The F-statistic that tests for fixed site differences in the anova is
the same statistic that tests for a non-zero between-site component of
variance in the nlme analysis, if that is what you had.  (The multiple
observations within sites do not affect the argument.)
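
To make the site-level F-statistic concrete, the following sketch works it out from the mean squares in the aov output quoted below.  The variable names are mine, and the variance-component line uses the usual balanced-data moment estimator, which is an assumption about what you would want rather than something lme reports in this form.

```r
# Mean squares copied from the quoted aov output
ms_site   <- 555470   # Residuals in the "Error: site3" stratum, 17 df
ms_within <- 29674    # Residuals in the "Error: Within" stratum, 399 df
n_per_site <- 20      # quadrats per site, as stated in the post

# F-statistic for site differences within regions
F_site <- ms_site / ms_within
p_site <- pf(F_site, df1 = 17, df2 = 399, lower.tail = FALSE)

# Moment estimate of the between-site variance component
# (balanced case: E[MS_site] = sigma2_within + n_per_site * sigma2_site)
sigma2_site <- (ms_site - ms_within) / n_per_site

print(c(F_site = F_site, p_site = p_site, sigma2_site = sigma2_site))
```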

(Even if there had been unbalance in the number of observations per
site, the anova and nlme would still come up with the same number for
the F-test for comparing regions, but now all comparisons between
regions might involve use of a Satterthwaite-type approximation, if you
use that ancient technology!)
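
For the Welch/Satterthwaite comparison between a 6-site region and the 3-site region, a minimal sketch on simulated site means follows.  The region labels A and D, the means and the sd are invented for illustration.

```r
set.seed(2)
# Simulated site means (hypothetical values), one row per site
two_regions <- data.frame(
  region = rep(c("A", "D"), times = c(6, 3)),
  mean_recruits = c(rnorm(6, mean = 900, sd = 250),
                    rnorm(3, mean = 500, sd = 250))
)

# Welch test: var.equal = FALSE is the default in t.test()
welch  <- t.test(mean_recruits ~ region, data = two_regions)
# Pooled-variance test for comparison
pooled <- t.test(mean_recruits ~ region, data = two_regions, var.equal = TRUE)

welch$parameter    # Satterthwaite df, generally not an integer
pooled$parameter   # 6 + 3 - 2 = 7
```

The Welch df always lies between 2 (the smaller group size minus 1) and 7 here, so with only 3 sites in one region the approximation can cost a noticeable amount of power.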

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 21/04/2009, at 11:32 PM, Stephen Cole wrote:

> Hello All - I would like to run a 2 factor nested ANOVA.  The design
> is unbalanced as I have 6 sites in each of 3 regions and 3 sites in 1
> other region.  Site is nested in region.  I am interested in the
> differences in mean recruit density among the 4 regions.  I have used
> the lme function in the nlme library and am confused about the output.
> I have a copy of P/B and I quote p. 25: "The lme function does produce
> sensible maximum likelihood estimates or restricted maximum likelihood
> estimates from the unbalanced data."  Thus, as I understand it, lme can
> handle this unbalanced data set that I have.  However, when I compared
> the lme model to an aov model, the fixed effects results are
> identical (F-ratio and p-value).  How does lme handle unbalanced data
> if the result is the same as aov, which cannot handle unbalanced data?
> Thank you for any help provided.
>
> I have attached a subsection of my data.  The total number of records
> is 420, with a sample of 20 quadrat counts from each site (21 sites x
> 20 = 420)
>
> Data:
>    adults  recruits region site site2 site3
> 1     138 1268.3300    ANS    1     1 ANS:1
> 2     131  608.3300    ANS    1     1 ANS:1
> 3      13  696.8800    ANS    1     1 ANS:1
> 4      12  412.5000    ANS    1     1 ANS:1
> 5       2  355.5600    ANS    1     1 ANS:1
> 6       0  528.0000    ANS    1     1 ANS:1
> 7       4  421.2100    ANS    1     1 ANS:1
> 8       0  378.0000    ANS    1     1 ANS:1
> 9      92  893.3300    ANS    1     1 ANS:1
> 10     77 1184.3100    ANS    1     1 ANS:1
> 11     92  961.4200    ANS    1     1 ANS:1
> 12      0 1029.0000    ANS    1     1 ANS:1
> 13     19 1144.6800    ANS    1     1 ANS:1
>
> Region (4 levels, fixed)
> Site (6 levels and 3 levels, random)
>
> library(nlme)
> data$site <- as.factor(data$site)
> data$site3 <- factor(data$region:data$site)
>
> mod.lme <- lme(recruits ~ region, data=data, random=~1|site3)
>
> anova(mod.lme)
>
>             numDF denDF  F-value p-value
> (Intercept)     1   399 93.58730  <.0001
> region          3    17 19.21751  <.0001
>
> mod.aov <- aov(recruits ~ region + Error(site3), data=data)
> summary(mod.aov)
>
> Error: site3
>           Df   Sum Sq  Mean Sq F value    Pr(>F)
> region     3 32024226 10674742  19.218 1.057e-05 ***
> Residuals 17  9442984   555470
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Error: Within
>            Df   Sum Sq Mean Sq F value Pr(>F)
> Residuals 399 11839881   29674
>
> Now, both F-ratios are 19.21.  I am not sure what I am doing
> incorrectly, but I would appreciate any advice on my mistake.  Thanks
> very much.
>
> Stephen Cole
> Graduate student
> Marine Ecology Lab
> Saint Francis Xavier University