[R-sig-Geo] geoR likfit: confusing AIC results?

Timothy W. Hilton hilton at meteo.psu.edu
Sat Aug 14 19:51:25 CEST 2010


Dear R-sig-geo readers,

I am trying to use geoR's likfit to determine whether some spatial
fields are better described by an exponential semivariogram or a pure
nugget semivariogram.  To investigate the frequency of a 'false
positive', where my test picks the exponential semivariogram when
there is in fact no spatial correlation in the field, I ran the
following experiment:

I used geoR's grf function to create ten different Gaussian random
fields (GRFs) with phi = 0.0, sigmasq = 0.001, and nugget = 3.0.  For
each GRF I use geoR's likfit to fit 50 exponential covariance models
and 50 pure nugget covariance models.  Within each set of 50 models, I
chose the lowest AIC (as reported by likfit) as the "best" fit.

Because the GRFs were generated with no spatial correlation, I
expected the best-fit pure nugget AIC to be lower than the best-fit
exponential AIC in the majority of cases.  Across my ten experiements,
though, the best-fit exponential AIC was lower eight times, and the
best-fit pure nugget AIC was lower only once (the other was a tie).

This concerns me, as it suggests than many of my fields that I had
concluded *do* have spatial structure in fact may not.

Am I misusing the AIC result from likfit to choose a covariance
structure?  I will greatly appreciate any advice the list can offer.

A more detailed description of my experiment, code, and results follow
below.

many thanks,
Tim

--

Timothy W. Hilton
PhD Candidate, Department of Meteorology
The Pennsylvania State University
503 Walker Building, University Park, PA   16802
hilton at meteo.psu.edu

==================================================

For each of the ten GRFs, I used likfit to estimate parameters for
exponential covariance model 50 times.  Each likfit began with initial
covariance parameters drawn randomly from uniform distributions within
(0,10) for sigmasq, (0,1000) for phi, and (0,10) for the nugget.  To
estimate the pure nugget parameters, I again ran likfit 50 times with
initial covariance parameters of phi = 0.0, sigmasq = 0.001, and
nugget = 3.0.

==================================================

Resulting AIC for best-fit model for each of the ten GRFs.
Exponential (exp) is less than pure nugget (PN) in eight of ten cases.

         PN      exp
1  251.9389 251.6512
2  276.7283 276.3321
3  252.2566 252.2566
4  227.2106 227.2106
5  251.4253 251.4252
6  268.3910 268.3910
7  259.6977 257.8298
8  254.6102 247.9796
9  262.4842 262.0296
10 258.1551 257.4813

==================================================

The R code I used.  It took about ten minutes to complete on my
notebook computer.

#----------------------------------------------------------------------
getGRFRealizations <- function(nsim=1) {
                               
  phi <- 0.0
  sigmasq <- 0.001
  nug <- 3.0
  
  realizations <- grf(n=65,
                      xlims=c(0, 6000),
                      ylims=c(0, 6000),
                      nsim=nsim,
                      cov.model='pure.nugget',
                      cov.pars=c(sigmasq, phi),
                      nugget=nug)
  return(realizations)
}

#----------------------------------------------------------------------
# performs likfit for a Gaussian Random Field (fld) multiple times
# with initial covariance parameters drawn from a uniform distribution
# within a window.  Returns the fit with the lowest AIC.
multi.likfit <- function(nfits=50, data, coords, cov.model) {

  if (cov.model == 'exponential')
    pars <- c(runif(n=1, min=0, max=10),
              runif(n=1, min=0, max=1000))
  else
    pars <- c(0.001, 0.0)  ## for pure nugget

  # estimate one fit
  best.so.far <- likfit(data=data,
                        coords=coords,
                        cov.model=cov.model,
                        ini.cov.pars=pars,
                        nugget=runif(n=1, min=0, max=10),
                        messages=FALSE)
  # do the rest of the fits, keep the lowest AIC
  for (i in 2:nfits) {
    this.fit <- likfit(data=data,
                       coords=coords,
                       cov.model=cov.model,
                       ini.cov.pars=pars,
                       nugget=runif(n=1, min=0, max=10),
                       messages=FALSE)
    if (this.fit$AIC < best.so.far$AIC)
      best.so.far <- this.fit
  }
  return(best.so.far)
}
    
#----------------------------------------------------------------------
fitGRFRealizations <- function(nsim=5, nfits=50) {

  #allocate lists to hold the covariance model fits
  fits.list.exp <- vector(mode='list', length=nsim) #for exponential fits
  fits.list.PN <- vector(mode='list', length=nsim)  #for pure nugget fits
  #generate random field realizations
  flds <- getGRFRealizations(nsim=nsim)

  #fit covariance model to each of the random fields
  for (i in 1:nsim) {
    fits.list.exp[[i]] <- multi.likfit(nfits=50,
                                       data=flds$data[, i],
                                       coords=flds$coords,
                                       cov.model='exponential')
    fits.list.PN[[i]] <- multi.likfit(nfits=50,
                                      data=flds$data[, i],
                                      coords=flds$coords,
                                      cov.model='pure.nugget')
    cat('.')  # crude progress bar
  }
  cat('\n')
  return(list(exponential=fits.list.exp, pure.nugget=fits.list.PN))
}

#----------------------------------------------------------------------
gatherFits <- function(fits.list) {
  phi <- unlist(lapply(fits.list, function(x) x[['phi']]))
  sigmasq <- unlist(lapply(fits.list, function(x) x[['sigmasq']]))
  nugget <- unlist(lapply(fits.list, function(x) x[['nugget']]))
  pRange <- unlist(lapply(fits.list, function(x) x[['practicalRange']]))
  AIC <- unlist(lapply(fits.list, function(x) x[['AIC']]))
  df <- data.frame(phi=phi, sigmasq=sigmasq, nugget=nugget, pRange=pRange, AIC=AIC)
  return(df)
}

result <- fitGRFRealizations(nsim=10)
df <- data.frame(PN.AIC=unlist(lapply(result[['pure.nugget']], function(x) x$AIC)),
                 exp.AIC=unlist(lapply(result[['exponential']], function(x) x$AIC)))



More information about the R-sig-Geo mailing list