[R-sig-ME] gls error

Ben Bolker bbolker at gmail.com
Wed Mar 13 16:37:35 CET 2013


Benjamin Gillespie <gybrg at ...> writes:

> 
> Morning all,
 
> I wonder if anyone could shed some light on a problem I am receiving
>  in R when I try to fit a model:
 
> I'm attempting to follow the 'Protocol' as in Chapters 4 & 5 of Zuur
> et al 2009 for some data I have for a number of river sites, sampled
> once for macroinvertebrates. Each site has been graded into 1 of 3
> groups dependent on its characteristics. I want to find out whether
> the factor: "group" is significant.
 
> I have a response variable: "simp" (Simpson's diversity index) and a
> number of fixed factors that I would like to include in my model.
 
> In R, this is the code I use:
 
> f1 = formula(simp ~ group + date + altitude + data_source + catchment_size +
>              g1 + g2 + g3 + g4 + g5 + g6 + lc1 + lc2 + lc3 + lc4 + lc5)
>
> s1.gls = gls(f1, data = env.sp)
 
> Please note: g1...gX are % cover of geology types for each site and
> lc1...lcX are % land cover types for each site.
> 
> The following is the error I receive:
 
> Error in glsEstimate(glsSt, control = glsEstControl) :
>   computed "gls" fit is singular, rank 16
 
> From what I've read, it looks like I'm using too many explanatory
> factors. I have tested for collinearity between all factors, but none
> look like obvious candidates for removal.

  If you use a full set of compositional data as predictors (i.e. A,
B, C, D such that A+B+C+D=1) then you will necessarily have a
multicollinearity problem, even if the pairwise correlations between
the components aren't that high.  The correlation between any
component and the sum of all of the other components is exactly -1 (as
A+B+C increases, D must decrease).  You should leave one out
(preferably not a rare component, because if you leave out a rare
component the remaining components will still be pretty strongly
multicollinear).  Alternatively, you could use something like
a log-ratio transform (see Aitchison and others) to transform
the n-dimensional compositional predictor to an (n-1)-dimensional
set of variables.
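
  A minimal sketch of both points, using simulated data rather than
the poster's env.sp.  All names here (comp, A..D, y, alr_*) are made up
for illustration; the same logic would presumably apply to the g* and
lc* columns if each set sums to 100%.  It uses base R only and checks
the rank of the design matrix instead of calling gls(), on the
assumption that the rank deficiency is what triggers the "singular"
error.  The additive log-ratio shown is just one of the log-ratio
transforms, and it needs strictly positive components (zeros would
give infinite values).

```r
# Simulated compositional predictors: four cover fractions per site
# that sum to 1.  Hypothetical data, not from the original post.
set.seed(1)
n <- 50
raw <- matrix(rexp(n * 4), ncol = 4)
comp <- as.data.frame(raw / rowSums(raw))
names(comp) <- c("A", "B", "C", "D")
comp$y <- rnorm(n)

# Pairwise correlations between components need not look alarming ...
cors <- cor(comp[, c("A", "B", "C", "D")])
max_pairwise <- max(abs(cors[upper.tri(cors)]))

# ... but D is an exact linear function of the other three components,
# so its correlation with their sum is -1 (up to rounding error).
r_sum <- cor(comp$D, comp$A + comp$B + comp$C)

# With an intercept, the full set of components gives a design matrix
# whose rank is one less than its number of columns.
X_full <- model.matrix(y ~ A + B + C + D, data = comp)
rank_full <- qr(X_full)$rank

# Option 1: leave one component out; the design matrix is full rank.
X_drop <- model.matrix(y ~ A + B + C, data = comp)
rank_drop <- qr(X_drop)$rank

# Option 2: additive log-ratio transform with D as the reference part,
# turning 4 compositional columns into 3 unconstrained ones.
alr <- data.frame(
  alr_A = log(comp$A / comp$D),
  alr_B = log(comp$B / comp$D),
  alr_C = log(comp$C / comp$D)
)
X_alr <- model.matrix(~ alr_A + alr_B + alr_C, data = alr)
rank_alr <- qr(X_alr)$rank

# Compare number of columns with rank for each design matrix.
print(c(cols_full = ncol(X_full), rank_full = rank_full,
        cols_drop = ncol(X_drop), rank_drop = rank_drop,
        cols_alr  = ncol(X_alr),  rank_alr  = rank_alr))
```

  Either the reduced set (A, B, C) or the alr_* columns could then go
into the model formula in place of the full set of components.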

  (This isn't really a mixed model question ...)


