[R] Assessing temporal correlation in GAM with irregular time steps

Worthington, Thomas A thomas.worthington at okstate.edu
Fri Sep 6 22:17:43 CEST 2013


Dear Gavin 

I got the code to work by setting the 'bar' to DOY as this was the measured time step. I stumbled across another problem, on some days I have multiple measurements which isn't allowed in corCAR1, therefore I've had to take the average of the replicates as suggested in another R help post.

Taking the average has meant adding the autocorrelation structure now doesn't improve the model, although I now appear to have heterogeneity in the residuals (as you suggested in your initial post). I will look at variance structures or moving to the GLM GAM 

Thanks again,     

Tom

-----Original Message-----
From: Gavin Simpson [mailto:ucfagls at gmail.com] 
Sent: Thursday, September 05, 2013 6:35 PM
To: Worthington, Thomas A
Cc: r-help at r-project.org
Subject: Re: [R] Assessing temporal correlation in GAM with irregular time steps

On 3 September 2013 16:10, Worthington, Thomas A <thomas.worthington at okstate.edu> wrote:
> Dear Gavin
>
> Thank you for the very detailed response. I had started to go down the route of fitting a correlation structure via gamm.
>
> I tried applying your code to my data but returned the error "Error in 
> corCAR1(~ID | SiteCode1971) : parameter in CAR(1) structure must be between 0 and 1"

Sorry, that is my fault, I keep forgetting that you need to specify the formula argument, the first argument of corCAR1() is the value of the correlation parameter if you want to specify it. So try:

corCAR1(form = ~ID | SiteCode1971)

I do this (get that error) all the time myself.
> I set the 'bar' in your code to the sample ID (basically a number between 1 and 192) but I wasn't sure if this was what you meant in relation to 'ordering of the samples'

That is not that useful as you need to give the software something about when the samples occur in time, otherwise it doesn't have the information needed to properly model the decay in correlation with time.

You need to give it the observation time, however you measured it.

HTH

G

> Best wishes
> Tom
>
> -----Original Message-----
> From: Gavin Simpson [mailto:ucfagls at gmail.com]
> Sent: Tuesday, September 03, 2013 3:17 PM
> To: Worthington, Thomas A
> Cc: r-help at r-project.org
> Subject: Re: [R] Assessing temporal correlation in GAM with irregular 
> time steps
>
> It is possible, but you can't use the discrete time or classical stochastic trend models (or evaluate using the ACF). Also, why do you care to do this with regard to DoY? The assumption of the model relates to the residuals, so you should check those for residual autocorrelation.
>
> As you are using `mgcv::gam` you could also use `mgcv::gamm` which can then leverage the correlation structures from the nlme package, which has spatial correlation structures (and you can think of time as a 1-d spatial direction). The package also has a `corCAR1()` correlation structure which is the continuous-time analogue of the AR(1). Fitting via `gamm()` will also allow you to use the `Variogram()` function from the nlme package to assess the model residuals for residual autocorrelation.
>
> For example you could compare the two fits
>
> m0 <- gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo, 
> method = "REML")
> m1 <- gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo, method = "REML",
>                     correlation = corCAR1( ~ bar | SiteCode))
>
> where `foo` is the object that contains the variables mentioned in the call, and `bar` is the variable (in `foo)` that indicates the ordering of the samples. Notice that I nest the CAR(1) within the two respective Sites, but do note IIRC that this fits the same residual correlation structure to both sites' residuals (i.e. there is 1 CAR(1) process, not two separate ones).
>
> require(nlme)
> anova(m0$lme, m1$lme)
>
> will perform a likelihood ratio test on the two models.
>
> If you have residual autocorrelation, do note that the smooth for DoY may be chosen to be more complex than is appropriate (it might be fitting the autocorrleated noise), so you may want to fix the degrees of freedom for the smoother at some a priori chosen value and use this same value when fitting both m0 and m1, or at the very least set an upper limit on the complexity of the DoY smooth, say via s(DoY, by = SiteCode, k = 5).
>
> Finally, as a length <= 0 insect makes no sense, the assumption of Gaussian (Normal) errors may be in trouble with your data; apart from their strictly positive nature, the mean-variance relationship of the data may not follow that of the assumptions for the errors. You can move to a GLM (GAM) to account for this but things get very tricky with the correlation structures (you can use gamm() still but fitting then goes via glmmPQL() in the MASS package a thence to lme()).
>
> If you just want to fit a variogram to something, there are a large number of spatial packages available for R, several of which can fit variograms to data, though you will need to study their respective help files for how to use them. As for the input data, often the time/date of sampling encoded as a numeric will be sufficient input, but you will need to check individual functions for what they require.
> I would check out the Spatial Task View on CRAN.
>
> HTH
>
> G
>
> On 28 August 2013 14:26, Worthington, Thomas A <thomas.worthington at okstate.edu> wrote:
>> I have constructed a GAM using the package mgcv to test whether the 
>> lengths of an emerging insect (Length) varies with day of the year
>> (DOY) and between two sites (SiteCode). The data are collected at 
>> irregular time steps ranging from 2 days to 20 days between samples.
>> The GAM takes the form
>>
>> M3 <- gam(Length ~s(DOY, by = SiteCode) + SiteCode)
>>
>> As the data are a time series I would like to test for temporal autocorrelation. I have read that it is not possible to use the autocorrelation function (ACF) due to the irregular spacing and that producing a variogram in relation to DOY would be an option.
>>
>> Is this a correct method to test for temporal autocorrelation?
>>
>> And could someone suggest the code to produce the variogram as I'm 
>> getting an error related to the 'distance' argument
>>
>> Best wishes
>> Tom
>>
>>
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Gavin Simpson, PhD



--
Gavin Simpson, PhD


More information about the R-help mailing list