[R-sig-Geo] Advice on GLS on residual variogram?

Lyndon Estes lestes at princeton.edu
Wed Jun 16 19:40:15 CEST 2010


Dear Thierry,

Many thanks again for your help and clarifications.

I ran the GLS again last night, this time using the following code (I
post here in case it is of use to anyone else), run on the Mac, Linux,
and a Windows machine for good measure:

library(gstat)
library(foreign)
library(nlme)
library(sp)
library(MASS)

oc<-read.dbf('~/ocdata.dbf')	                                      # Import file
oc2new<-as.data.frame(oc[,c(2,5:6,21:27,207:225)])   # Subset relevant columns
coordinates(oc2new)=~X+Y                                          #
Define coordinates

#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++#
#GLS procedure

########################
# 1. Start with OLS using GLS
########################

gls.new<-gls(log(CTOP)~PCB1+PCB2+PCB3+PCB4+PCB5+PCB6+PCB8,
data=oc2new)

#######################################################
# 2. Find autocorrelation structure of the residuals, using a variogram
#######################################################

v.gls.new<-variogram(residuals(gls.new)~1,oc2new)
plot(v.gls.new)

#____________________________
# Starting values for fitting variogram

# Nugget
v.gls.nug1<-min(v.gls.new$gamma)

# Gives major range, but then divided by two because had singularity problems.
v.gls.range1<-(sqrt(diff(oc2new at bbox["X",])^2 + diff(oc2new at bbox["Y",])^2)/4)/2	

# Partial sill estimate
v.gls.psill<-var(residuals(gls.new))-v.gls.nug1					

v.gls.new.vgm<-vgm(nugget=v.gls.nug1,model="Sph",psill=v.gls.psill,range=v.gls.range1)

# Fit variogram			
v.gls.new.fit<-fit.variogram(v.gls.new,model=v.gls.new.vgm)

# Check variogram fit
# pl.vgm.gls.new.fit<-plot(v.gls.new,v.gls.new.fit,col="black",main="Log
OC residuals")
																
# Capture output to feed to GLS corStruct
gls.fitted.nug<-v.gls.new.fit$psill[1]							# Capture nugget value
gls.fitted.range<-v.gls.new.fit$range[2]						# Capture range value

################################################
# 3. Update the OLS model with defined correlation structure
################################################
ding<-Sys.time()
gls.new.up<-update(gls.new,correlation=corSpher(c(gls.fitted.range,gls.fitted.nug),form=~X+Y,nugget=TRUE))
dong<-Sys.time()

summary(gls.new.up)

print("Spherical GLS completed")
print(dong-ding)

The results this time around were pretty much identical for all three
machines, with Windows and Linux returning the same results, and Mac
just a shade different, e.g. two of the coefficients showed
differences in the 6th decimal.

I guess the earlier problem with the larger between-platform
differences was caused by GLS not having any initial values for
correlation structure specified.  So, that approach helps. Also, and
perhaps of greatest interest, providing starting values for GLS
dramatically improved the processing time (by reducing the number of
iterations needed).

Time to run this on Mac:

gls.2.sph<-update(gls.2,correlation=corSpher(form=~X+Y,nugget=TRUE))

4.7 hours

Versus 1.8 hours for the newer version:

gls.new.up<-update(gls.new,correlation=corSpher(c(gls.fitted.range,gls.fitted.nug),form=~X+Y,nugget=TRUE))

Thanks again for your assistance!

Cheers, Lyndon









On Wed, Jun 16, 2010 at 5:06 AM, ONKELINX, Thierry
<Thierry.ONKELINX at inbo.be> wrote:
>
> Hi Lyndon,
>
> The results seems sensible to me. The difference in range and nugget
> between the OLS variogram and the GLS correlation structure is IMHO not
> a problem. They both are some the same magnitude. The GLS takes the
> spatial structure directly into account. Hence the model fit
> (parameters) will change and so will the residuals.
>
> As far as a know, GLS uses an iterative algorithm to estimate the
> correlation structure. That might explain the differences you observe
> between the two machines. Rerunning the same model on the same dataset
> and the same machine will also yield somewhat different results.
>
> Best regards,
>
> Thierry
>
> ------------------------------------------------------------------------
> ----
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek
> team Biometrie & Kwaliteitszorg
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> tel. + 32 54/436 185
> Thierry.Onkelinx at inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to
> say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of
> data.
> ~ John Tukey
>
>
>
> Druk dit bericht a.u.b. niet onnodig af.
> Please do not print this message unnecessarily.
>
> Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer
> en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
> door een geldig ondertekend document. The views expressed in  this message
> and any annex are purely those of the writer and may not be regarded as stating
> an official position of INBO, as long as the message is not confirmed by a duly
> signed document.



More information about the R-sig-Geo mailing list