[R-sig-ME] glmmADMB definition of levels for random component

Ben Bolker bbolker at gmail.com
Tue Dec 27 19:53:12 CET 2011


Juan Santos <juan.santos at ...> writes:

> As a reminder, say that I'm working with 2003-2010 fishing data
> (clustered data, Primary sampling unit [PSU]=trip, secondary
> sampling unit [SSU]=haul); Specifically, I am trying to model the
> numbers of fishes of a given species thrown back to sea (discarded
> individuals as the Ultimate sampling Unit [USU]) with a set of
> available covariates. The main aim of the model is to assess if
> yearly increase in discards found in previous EDA, is also found
> significant using a formal and well fitted model.  I fitted an
> starting model using (1| trip/ haul) as random component (fishing
> hauls nested on trips, following the hierarchicalsamplingdesign):

> mod.admb.1<-glmmadmb(ndiscarded~year + quarter+deep+duration+ 
> (1|trip/haul), zeroInflation=TRUE,family="nbinom",data=dmod.1)
> 
> (I used zero inflation link  because previous study indicates that a 
> model with zeroInflation=FALSE  does not account for the excess of zeros 
> in the data).
> 
> "Trip" coding is
> 
> dat<-ddply(dat,.(year),function(x){
>          levels(x$trip) <-1:length(levels(x$trip))
>          return(data.frame(x))})
> 
> Meaning that trip levels restarts in every sampled year. Similar 
> approach is used for hauls within trips (hauls coding restarts for every 
> sampled trip).
> 
> But I am not sure of this codification, because using this coding means 
> that I have same trips ID in every year, but obviously  they are not the 
> same trips from one year to another (Perhaps the only similarity is that 
> they are sampled in  same temporal interval each year).  As expected, 
> random coefficients [ranef ()] from mod.admb.1does not  take into 
> consideration that trip number "n" differs between years. I have some 
> ideas to solve this problem:
> 
> a) To recode trip level with not regard of the sampled  year  (using 
> factor(1: N) as levels, being N=total number of trips sampled across the 
> years)
> b) Fit models for each year separately and compare intercepts
> c) use (1|year/trip/haul) as random component,
> 
> Can someone indicate a right practice to deal with this model 
> specification problem ?
 
  Did anyone ever answer you?  It seems as though this may have
slipped through the cracks.

  As it turns out you're right to be concerned in this case,
the model specification you have (~year + (1|trip/haul), leaving
out all the other stuff) does assume that trips are consistent
across years.  Your option (a) [recode to an "implicitly nested"
rather than "explicitly nested" coding] would work.  I don't
like option (b) because it would require fitting a separate parameter
for all of your other predictor variables in each year [i.e.
it would be equivalent to introducing a (quarter+deep+duration):year
interaction], which you may not want to do.  Option (c) doesn't
work so well either, because it introduces year as a random effect,
which (unless you have coded year as numeric in order to look for
a log-linear trend in by-catch with year) will be completely
redundant with the fixed effect of year in your model.  If you
don't want to use option (a), then I would recommend the following
option (d):

  ndiscarded~year+ [other fixed effects] + (1|year:trip)+(1|year:trip:haul)

which expands the trip/haul nesting (which would otherwise expand
to trip+trip:haul) to include an interaction with year.  If this gives
you trouble you could also do this even more explicitly by adding
a 'yeartrip' variable to your data set which was defined as 
interaction(year,trip) and similarly for 'yeartriphaul' ...




More information about the R-sig-mixed-models mailing list