[R-sig-ME] GAMM- Missing/uncertain group id's in random effect

Mon Jul 3 21:49:40 CEST 2023

    Unfortunately, I don't think there's an easy way to deal with this. 
You *could* build a fancy Bayesian engine that would try to 
estimate/impute uncertain boat IDs concurrently with the statistical 
analysis (I think people *may* have done this for IDing individual 
animals from camera trap or sighting data, but I don't remember).

   However, my recommendation would be to try to come up with some 
reasonable, objective heuristics for lumping IDs together.  Can you use 
information about co-occurrence of observations in space and time (or 
not) to come up with a set of rules?  (Basically, think about how you 
would try to pick out suspected duplicates by eye, then try to implement 
those rules in code.) For small, noisy data sets, if you can't make 
reasonable guesses about duplicates, it's unlikely that a computer will 
be able to do better.
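
As a concrete starting point, here is a minimal sketch of one such rule, assuming records with the same name (ignoring case and surrounding spaces) and reported lengths within some tolerance belong to the same boat. The function name `lump_boat_ids`, the 0.5 m tolerance, and the example records are all hypothetical, and the sketch assumes no lengths are missing. Space/time rules (for example, two records on the same day at distant sites cannot be the same boat) would be layered on top in the same way.

```python
from collections import defaultdict


def lump_boat_ids(records, length_tol=0.5):
    """Assign one ID per (normalized name, length cluster).

    records: list of (boat_name, boat_length_m) tuples, lengths non-missing.
    Returns a list of string IDs in the same order as `records`.
    """
    # Group record indices by a normalized name (case and outer spaces ignored).
    by_name = defaultdict(list)
    for i, (name, length) in enumerate(records):
        by_name[name.strip().lower()].append((length, i))

    ids = [None] * len(records)
    for name, items in by_name.items():
        items.sort()  # order records of this name by reported length
        cluster = 0
        prev_length = None
        for length, i in items:
            # Start a new cluster when the gap to the previous (sorted)
            # length exceeds the tolerance.
            if prev_length is not None and length - prev_length > length_tol:
                cluster += 1
            ids[i] = f"{name}_{cluster}"
            prev_length = length
    return ids


# Hypothetical example records: (name, length in metres).
records = [("Boat 1", 17.0), ("boat 1", 17.2), ("Boat 1", 25.0), ("Boat 2", 17.1)]
boat_ids = lump_boat_ids(records)
print(boat_ids)
```

Because consecutive lengths are chained together, a long run of slightly different lengths can end up in one cluster, so it is worth inspecting the groups by eye. The resulting IDs would then serve as the grouping factor for the random effect (in mgcv, something like `s(boat_id, bs = "re")`).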

  Misclassification in this way (either incorrectly lumping or 
splitting) might not make a huge difference to your results, as it will 
affect which observations are treated as correlated (i.e., as sharing a 
boat-level random effect), not the observed 
gear/effort/habitat/CPUE relationships directly ...
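
One way to see this, as a sketch assuming a Gaussian response with identity link and a single boat-level random intercept (the notation is illustrative, not taken from the model in the original question):

```latex
\begin{align*}
y_i &= f(x_i) + b_{g(i)} + \varepsilon_i,
  \qquad b_j \sim \mathcal{N}(0, \sigma_b^2),
  \quad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \\
\operatorname{E}[y_i] &= f(x_i), \\
\operatorname{Cov}(y_i, y_k) &= \sigma_b^2 \, \mathbf{1}[g(i) = g(k)]
  \qquad (i \neq k).
\end{align*}
```

Here g(i) is the boat ID assigned to record i and f collects the gear/effort/habitat terms. Lumping or splitting IDs changes g, and hence which pairs of records are modelled as correlated, but g does not appear in the mean f(x_i).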

   good luck,
    Ben Bolker

On 2023-06-30 12:23 p.m., Meaghan Rupprecht wrote:
> I am currently modelling fish catch in the Amazon River in response to variables such as habitat, effort, gear type, and spatial locations. The model we have selected to accomplish this task is a GAMM, and we are currently using the mgcv and brms packages in R. We are facing an issue with random effects in our model, and I was hoping to get some insight about possible solutions. I've tried to find solutions online without much luck.
> 
> For each record of fish catch, information such as boat name and boat length is recorded. We aimed to use this information to generate a unique boat ID, which would serve as the grouping variable for a random effect in our model. The problem is that boat names may not be unique, and boat lengths are somewhat unreliable and may be reported differently from record to record. This creates records where the boat name is the same but the length varies slightly, so multiple IDs are generated for what might actually be the same boat (e.g., boat 1 with a length of 17 m and boat 1 with a length of 17.2 m). This greatly complicates our attempts at identifying unique boats and generates uncertainty in our classifications.
> Is there a method for dealing with uncertain or missing group IDs in random effects? I'd be happy to elaborate or provide additional information if anything above was unclear.
> Thanks for your time.