[R-sig-ME] spatial auto-correlation or more complicated pseudo-replication?
Thierry Onkelinx
thierry.onkelinx using inbo.be
Wed Apr 22 19:45:49 CEST 2020
Dear Thomas,
Have a look at the data.frame in the variogram() output. Given your
variogram I expect a high number of pairs (np variable) at short range and
a low number (< 100) at large ranges. Note the width and cutoff arguments of
variogram(). The defaults are 1/3 of the diagonal of the bounding box for
cutoff and cutoff/15 for width. These are likely suboptimal for your data.
I'd set width to slightly larger than the distance between two adjacent
nests. Increase the width if the variogram is unstable.
If you still get a picture similar to the ones you sent, then the
residuals are iid and thus you don't need to correct for spatial
autocorrelation.
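
A minimal sketch of this check, using simulated data rather than the real
nests. The grid layout, the nest spacing of 1 and the values width = 1.1 and
cutoff = 8 are assumptions for illustration only; resmod, x and y follow the
names used in the quoted code.

```r
library(gstat)

set.seed(1)
# Simulated stand-in for the real data: nests on a regular grid with
# iid "residuals", so no spatial structure is expected
nests <- expand.grid(x = 1:15, y = 1:8)
nests$resmod <- rnorm(nrow(nests))

# Default binning: cutoff = bounding-box diagonal / 3, width = cutoff / 15
v_default <- variogram(resmod ~ 1, locations = ~ x + y, data = nests)
# np = number of point pairs that contribute to each distance bin
as.data.frame(v_default)[, c("np", "dist", "gamma")]

# Bins slightly wider than the spacing between adjacent nests (1 here)
v_wide <- variogram(resmod ~ 1, locations = ~ x + y, data = nests,
                    width = 1.1, cutoff = 8)
as.data.frame(v_wide)[, c("np", "dist", "gamma")]
plot(v_wide)
```

With iid residuals the gamma column should fluctuate around a constant level
at all distances; bins with a low np are the ones most likely to look erratic.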
Given the strong correlation between pair and location, the pair random
effect will take up some of the spatial autocorrelation. You could make a
variogram of the random intercepts. It should show a pure nugget effect
too.
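
A rough sketch of a variogram of the random intercepts, again on simulated
data. It assumes that every pair has a single nest location, which will not
hold exactly when pairs move between years; the data set, the 20 x 5 grid and
width = 1.1 are hypothetical.

```r
library(lme4)
library(gstat)

set.seed(42)
# Simulated stand-in: 100 pairs, one location each, observed in 2 years
pairs <- expand.grid(x = 1:20, y = 1:5)
pairs$PairID <- factor(seq_len(nrow(pairs)))
pair_effect <- rnorm(nrow(pairs), sd = 2)   # iid, no spatial structure

# Cross join: every pair appears once per year
dat <- merge(pairs, data.frame(Year = factor(c(2011, 2012))))
dat$Laydate <- 140 + pair_effect[as.integer(dat$PairID)] + rnorm(nrow(dat))

mod <- lmer(Laydate ~ Year + (1 | PairID), data = dat, REML = FALSE)

# One estimated random intercept per pair, matched to the pair's coordinates
re <- ranef(mod)$PairID
pairs$intercept <- re[as.character(pairs$PairID), "(Intercept)"]

# Empirical variogram of the random intercepts
v_re <- variogram(intercept ~ 1, locations = ~ x + y, data = pairs,
                  width = 1.1)
plot(v_re)
```

A flat variogram here corresponds to the pure nugget effect; a clear rise
with distance would suggest that the pair effect is absorbing spatial
structure.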
Best regards,
ir. Thierry Onkelinx
Statisticus / Statistician
Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be
///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////
On Wed 22 Apr 2020 at 18:19, Thomas Merkling <
thomasmerkling00 using gmail.com> wrote:
> Dear Thierry,
>
> Thanks for your answer.
> Below is the piece of code I ran:
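> library(lme4)
> # Mixed model with a random intercept per pair
> mod <- lmer(Laydate ~ Treatment + Year + (1 | PairID), REML = FALSE,
>             data = CRlF)
> # Pearson residuals, stored next to the coordinates
> CRlF$resmod <- residuals(mod, type = "pearson")
> # Empirical variogram; locations must be a one-sided formula
> plot(gstat::variogram(resmod ~ 1, locations = ~ x + y, data = CRlF))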
>
> It seems like the variance is quite stable for distances up to 25 and then
> drops a bit. I did the same analysis with another response variable (egg
> weight) and got a similar pattern. (links to plot for laydate
> <https://drive.google.com/open?id=1T2n41DeSh0BlkZVX5t-E1IKdMuudujBy> and
> for eggweight
> <https://drive.google.com/open?id=10_ou4yQQkF-kVVnf7zbgHyMmUEABvnPU>)
> So does it mean that there is no spatial auto-correlation then?
>
> This would match the fact that our results don't change much if we add the
> Matern correlation random effects or not.
> A reviewer suggested that spatial-autocorrelation isn't sufficient to
> account for the pseudo-replication in our data, and that we still have an
> issue of inflation of the degrees of freedom and suggested permutation
> tests to account for that, but is that really necessary?
>
> Kind regards,
> Thomas
> On 22/04/2020 16:44, Thierry Onkelinx wrote:
>
> Dear Thomas,
>
> Extract the residuals from the model. Then use gstat::variogram() to
> calculate the empirical variogram of the residuals. If there is spatial
> autocorrelation, you'll see an increase in the variance as the distance
> between observations increases.
>
> I would expect that the birds have a stronger effect than the nests. Hence
> I'd use Pair ID. If the dataset spanned more than 2 years you could try
> both a Pair and a Nest random effect.
>
> Best regards,
>
> ir. Thierry Onkelinx
>
> On Wed 22 Apr 2020 at 14:58, Thomas Merkling <
> thomasmerkling00 using gmail.com> wrote:
>
>> Dear Thierry,
>>
>> Thanks for your reply. We used a sample of the population for our
>> experiment, but for this sample we have information (treatment and Prop
>> variable at each scale) for all the nests.
>> How would you suggest testing whether there is spatial autocorrelation?
>> I tried the DHARMa package (which provides a Moran's I test adapted to
>> mixed models), but it doesn't show whether autocorrelation changes with
>> distance, it just gives a p-value. I tried a model with PairID as
>> random effect (p = 0.33), but if I include nest as a random effect (some
>> pairs changed between the 2 years of the experiment, so there are fewer
>> Nest IDs than Pair IDs) the p-value becomes 0.054 ...
>>
>> Kind regards,
>> Thomas
>> On 22/04/2020 13:57, Thierry Onkelinx wrote:
>>
>> Dear Thomas,
>>
>> Do you have information on all the nests or only on a sample of the
>> nests? In case you have data on every nest, then I would look at a simple
>> model with only treatment and an iid nest effect. Then see if there is
>> spatial autocorrelation. Variation at small ranges would indicate an effect
>> of the treatment of the neighbouring nests.
>>
>> Best regards,
>>
>> ir. Thierry Onkelinx
>>
>> On Wed 22 Apr 2020 at 11:44, Thomas Merkling <
>> thomasmerkling00 using gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I'm wondering how to best model data from an experimental design
>>> involving a spatial component. This is a study on seabirds nesting on
>>> artificial cliffs: each nest has been attributed an experimental
>>> treatment (supplemented or not), while making sure that there was a
>>> variable proportion of surrounding nests of the opposite treatment. Our
>>> main goal was to investigate if laying date of a focal pair was
>>> influenced by its treatment and/or by the proportion of surrounding
>>> nests of the opposite treatment (hereafter, "Prop"), which we calculated
>>> at 3 different spatial scales (local, panel and global, see
>>> https://drive.google.com/open?id=1OrJQCkNfBO6KOBHSlkOoQyAdTrqtIdY8 for a
>>> visual representation).
>>>
>>> Hence, the treatment information of a focal pair is used in the
>>> "Treatment" predictor variable, but also in the calculation of "Prop"
>>> for the surrounding pairs (the number of pairs affected depending on the
>>> spatial scale considered), thereby leading to some pseudo-replication.
>>> Since this is dependent on the distance (i.e. "Prop" of pairs closer to
>>> a focal one are more influenced than pairs further away), we thought
>>> that accounting for spatial auto-correlation would be sufficient. We used
>>> the spaMM package to do so, and our models look something like:
>>>
>>> Laying ~ Treatment * Prop + Year + (1|PairID) + Matern(Y2011|x + y) +
>>> Matern(Y2012|x + y)
>>>
>>> with two Matérn correlation random effects (one for each year of the
>>> study) being included (x and y being the spatial coordinates of the
>>> nests).
>>>
>>> My question is: does this random effect structure take into account the
>>> fact that "Prop" of a focal pair depends on the "Treatment" of the
>>> surrounding pairs? If not, how can we account for that?
>>>
>>> Thanks in advance for your help!
>>> Thomas
>>>
>>>
>>> _______________________________________________
>>> R-sig-mixed-models using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>