[R-sig-ME] Should blocking factors be modeled as random effects?

Juan Pedro Steibel steibelj at msu.edu
Sat Jan 31 01:38:33 CET 2009


Thanks for the comment John,
I should have written that better. I had in mind a very simple CRBD with 
one grouping factor (treatment) and complete blocks with only one plot 
per treatment. You are perfectly right: when there are between- and
within-block (or plot) treatments (for example, split-plot, strip-plot, or
split-block designs), the way to go is to consider the blocks and plots
as random effects.

I meant to say that treating the block as random produces the same
inferences (SEs and all) only in (very) simple designs, while in more
complex designs the random block effect leads to better inferences.
That is the reason I treat block as random by default.
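
For the simple balanced case, the point can be checked numerically. The
sketch below is in Python rather than R and uses made-up data; the function
name rcbd_standard_errors and the numbers in y are purely illustrative. It
assumes method-of-moments (ANOVA) variance component estimates, which should
agree with REML in a balanced design whenever the block variance estimate is
positive. It shows the block variance cancelling from the SE of a treatment
difference but not from the SE of a treatment mean, which is John's point.

```python
import math


def rcbd_standard_errors(y):
    """Standard errors for a balanced complete block design.

    y[i][j] is the response in block i under treatment j, one plot per cell.
    """
    b, t = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (b * t)
    block_means = [sum(row) / t for row in y]
    trt_means = [sum(row[j] for row in y) / b for j in range(t)]

    # Two-way ANOVA mean squares (no replication within cells).
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_resid = sum(
        (y[i][j] - block_means[i] - trt_means[j] + grand) ** 2
        for i in range(b)
        for j in range(t)
    )
    ms_block = ss_block / (b - 1)
    ms_resid = ss_resid / ((b - 1) * (t - 1))

    # Method-of-moments estimate of the block variance, truncated at zero.
    var_block = max(0.0, (ms_block - ms_resid) / t)

    # Random blocks: every treatment mean contains the same average block
    # effect, so any two treatment means have covariance var_block / b.
    var_mean = (var_block + ms_resid) / b
    cov_means = var_block / b

    return {
        # Fixed blocks: inference is restricted to the blocks in the experiment.
        "se_diff_fixed": math.sqrt(2 * ms_resid / b),
        "se_mean_fixed": math.sqrt(ms_resid / b),
        # Random blocks: the block variance cancels in a difference only.
        "se_diff_random": math.sqrt(2 * var_mean - 2 * cov_means),
        "se_mean_random": math.sqrt(var_mean),
    }


# Hypothetical data: 4 blocks (rows) by 3 treatments (columns).
y = [
    [10.1, 12.0, 11.2],
    [14.3, 16.1, 15.0],
    [8.9, 11.2, 9.8],
    [12.5, 14.0, 13.6],
]

se = rcbd_standard_errors(y)
for name, value in se.items():
    print(f"{name}: {value:.3f}")
```

With only four blocks, the block variance here is estimated on 3 df, so the
SE of a treatment mean under the random-block model is much less reliable
than the SE of a difference.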
Thanks again.
JP

John Maindonald wrote:
> "In a complete randomized block design (CRBD), treating blocks as 
> fixed or random should yield identical results."
>
> It depends what you mean by "results".  SEs of effects, for treatments 
> that are estimated "within blocks", will be the same.  The between 
> block variance does not contribute to this SE.
>
> Estimates of SEs of treatment means may be very different.  The 
> between block variance does contribute to this SE.  This is where it 
> does matter if there are very few blocks.  The SE will be estimated 
> with very poor accuracy (low df).
>
> Of course, the SEs of effects assume that there is no systematic 
> change in treatment effect from one block to another.  Unless there 
> are super-blocks (sites?), there is no way to estimate the SE of any 
> block-treatment interaction.  Look at the kiwishade data in the DAAG 
> package for an example where there might well be differences between 
> blocks that are affected by the direction in which the blocks face.
>
> John Maindonald             email: john.maindonald at anu.edu.au
> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
> Centre for Mathematics & Its Applications, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
>
>
> On 27/01/2009, at 8:51 AM, Juan Pedro Steibel wrote:
>
>> Hello,
>> Treating a block effect as random allows recovering inter-block 
>> information. In a complete randomized block design (CRBD), treating 
>> blocks as fixed or random should yield identical results. In an 
>> incomplete block design (incomplete by design or because some
>> observations are missing at random), the results will differ.
>>
>> If the Gaussian assumption regarding the block effects is sound, I
>> would expect that treating the block as random will be more efficient
>> than fitting block as fixed. Moreover, one could compute the relative
>> efficiency of both analyses by comparing the variances of a 
>> particular treatment difference when block is treated as fixed versus 
>> when it is treated as a random effect.
>>
>> The catch is that the relative efficiency depends on the actual 
>> variance ratios (unknown) and on the assumptions regarding the random 
>> effects (commonly, Gaussian distribution).
>>
>> In practice, when analyzing field or lab experiments, I tend to 
>> specify the block as a random effect. Always.
>> In some cases there are very few levels, though. In those cases, if 
>> someone asks "how can you reliably estimate a variance component for 
>> a (blocking) factor with only (say) 4 or 5 levels?", I just shrug. 8^D
>>
>> JP
>>
>>
>>
>> Prew, Paul wrote:
>>> I have been following your R discussion list on mixed modeling for a few
>>> weeks, in hopes of understanding mixed modeling better.  And it has
>>> helped.  I was not aware of the controversy surrounding degrees of
>>> freedom and the distribution of test statistics.  I have just been
>>> trusting the ANOVA output from software (Minitab, JMP) that reported F
>>> tests.  JMP uses Kenward-Roger, Minitab's ANOVA reports an F-statistic,
>>> followed by "F-test not exact for this term".
>>> A recent mention by Douglas Bates of George Box, though, hit upon an
>>> aspect of mixed models that has confused me.  I'm an industrial
>>> statistician, and studied statistics at Iowa State and the University of
>>> Minnesota.  I have had 3 courses in DOE, 2 at the graduate level, and
>>> none of them mentioned blocking factors could (should?) be modeled as
>>> random effects.  **Exception: the whole plots in a split plot design
>>> were taught as random effects.**
>>>
>>> The 2005 update to Box Hunter Hunter discusses blocking as does Wu &
>>> Hamada (2000).  Both texts model blocking factors such as Days and
>>> Batches as fixed effects.  Montgomery's DOE text, 2009 rev., pretty
>>> consistently states that blocks can be either random or fixed.  Don't
>>> have a consensus from that small sample.
>>> I'm trying to understand the implications if I consistently used random
>>> effects for DOE analysis.
>>>
>>> I'm quite willing to use R for mixed models, seeing as Minitab, JMP etc.
>>> appear to use degrees of freedom calculations that are questionable.
>>> But as Douglas points out --- Box said, "all models are wrong, some are
>>> useful" => Box's latest text doesn't bother with random effects for DOE
>>> =>  does it follow that for practical purposes it's OK to consider
>>> blocks as fixed?  There are certainly several advantages to keeping it
>>> simple (i.e. fixed only):
>>> * The analyses we (my statistics group) provide to our chemists and
>>> engineers are more easily understood
>>> * The 2-day short courses we teach in DOE to these same coworkers
>>> couldn't realistically get across the idea of mixed model analysis ---
>>> they would become less self-sufficient, where we're trying to make them
>>> more self-sufficient
>>> * We have a handful of software packages (Minitab, JMP, Design Expert) that can
>>> perform DOE and augment the results in a number of ways:
>>> *** fold-over the design to resolve aliasing in fractional designs
>>> *** add axial runs to enable Response Surface methods
>>> *** add distributions to the input factors, enabling
>>> Robustness/Sensitivity analyses
>>> *** running optimization algorithms to suggest the factor settings
>>> that simultaneously consider multiple objectives
>>> *****  Not to mention the loss of Sample Size Calculations, far and
>>> away my most frequent request
>>> None of these packages recognizes random factors when performing these
>>> augmentations.
>>>
>>> Replacing this functionality with R is going to be a high learning
>>> curve, and probably not entirely possible.  My coding skills in R
>>> consist of cutting and pasting what others have done.
>>>
>>> I don't really expect that there's a "right" answer to the question of
>>> random effects in DOE.  But I do believe that beyond the loss of
>>> p-values, there are other ramifications for advising experimenters,
>>> "You can't trust results from your blocking on Days (or Shifts or RM
>>> Lots or Batches, etc) unless they are modeled as random effects."
>>>
>>> There's statistical significance, and practical significance.  My hope
>>> is that while blocks as random effects are statistically "truer", their
>>> marginal worth over fixed effects in DOE is ignorable. Again, I don't
>>> want this to come across as shooting the messenger, you are only laying
>>> out the current state of art and the work that remains to be done.  But
>>> any insight you can provide into what's practical right now would be
>>> highly interesting.
>>>
>>> Thank you for your time and consideration,
>>> Paul Prew
>>>
>>> 651-795-5942     fax 651-204-7504
>>> Ecolab Research Center
>>> Mail Stop ESC-F4412
>>> Lone Oak Drive
>>> Eagan, MN 55121-1560
>>>
>>>
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>>
>>
>>
>> -- 
>> =============================
>> Juan Pedro Steibel
>>
>> Assistant Professor
>> Statistical Genetics and Genomics
>>
>> Department of Animal Science & Department of Fisheries and Wildlife
>>
>> Michigan State University
>> 1205-I Anthony Hall
>> East Lansing, MI
>> 48824 USA
>> Phone: 1-517-353-5102
>> E-mail: steibelj at msu.edu
>>





