[R-sig-Geo] negative range from fit.variogram through autofitVariogram

Paul Hiemstra p.hiemstra at geo.uu.nl
Tue May 4 11:05:12 CEST 2010


Hi Mark,

Thanks for the reproducible example. The problem is that when I look at 
the sample variogram, the semivariance values start high and end low. 
This is best illustrated by:

plot(variogram(theta_percent~1, sparse))

You see that there are outliers in the data that cause high 
semi-variance at a short distance. I would say that is 'strange' ;). You 
can try and identify which point causes this by:

plot(variogram(theta_percent~1, sparse, cloud = TRUE), identify = TRUE)
# Click on the plot to identify the point pairs

There is not really one value I think that causes this. It might be also 
attributable to the fact that your dataset is somewhat sparse in the 
distance range from 20-40 m.

As a quick fix you can restrict the model selection to "Sph", this 
works. And I would not be enthusiastic about using Ste. This is because 
the main difference between different kappa values if the behavior at 
short distances, but you don't have a lot of data on the short distance 
to fit this value on in a meaningful way.

At this stage I don't see a (relatively quick) fix that could solve this 
problem in an automatic way and on a more fundamental level. Do you have 
any suggestions? From an implementation point of view I can let automap 
discard any model that has negative values in it, this would ensure that 
the user gets at least the Sph model back.

cheers,
Paul

Mark Connolly wrote:
> I am using autofitVariogram during the process of interpolating a 
> large set of daily observations through a volume.  Each volume is 
> decomposed into 2D layers prior to selecting a model to use for 
> interpolation.  I made it through 2010 interpolations and then ran 
> into a failed interpolation when the best model selected by 
> autofitVariogram had a negative range.  This was rejected by the krige 
> function.  I see mention of negative sills but not of negative ranges.
>
> It appears that autofitVariogram is having some issues with the trial 
> arguments sent to fit.variogram.  This is repeatable.  Not sure if 
> this is a bug for some package or a data issue.  The data values do 
> not look overly strange.
>
>
>
> # data
> sparse =
> structure(list(x = c(740381.862, 740456.052, 740503.958, 740551.752,
> 740559.502, 740502.995, 740446, 740389.229, 740371.693, 740428.25,
> 740484.918, 740541.356, 740549.277, 740474.724, 740418.118, 740370.187,
> 740354.321, 740410.53, 740467.451, 740523.772, 740522.433, 740474.797,
> 740400.293, 740343.175, 740336.067, 740392.917, 740449.622, 740506.162,
> 740495.664, 740448.693, 740382.062, 740325.464, 740318.174, 740430.337,
> 740488.37, 740477.578, 740429.695, 740373.133, 740325.408, 740631.842,
> 740688.362, 740744.857, 740726.149, 740695.778, 740621.663, 740613.553,
> 740670.205, 740726.566, 740733.965, 740660.272, 740620.315, 740594.82,
> 740651.714, 740708.217, 740690.056, 740659.603, 740575.902, 740576.796,
> 740558.179), y = c(181644.086, 181620.772, 181605.577, 181590.417,
> 181568.637, 181586.847, 181604.615, 181622.136, 181565.531, 181547.708,
> 181530.169, 181512.439, 181490.328, 181513.956, 181531.875, 181547.048,
> 181508.946, 181491.148, 181473.394, 181455.233, 181436.726, 181452.342,
> 181475.522, 181492.661, 181451.96, 181434.265, 181416.566, 181398.729,
> 181382.764, 181397.808, 181418.748, 181436.677, 181395.477, 181360.409,
> 181342.547, 181327.09, 181341.971, 181359.55, 181374.453, 181546.959,
> 181528.576, 181510.58, 181497.501, 181507.159, 181530.759, 181490.008,
> 181472.209, 181453.968, 181432.87, 181456.085, 181468.588, 181433.854,
> 181415.758, 181397.497, 181384.359, 181393.795, 181420.579, 181376.982,
> 181363.899), depth_cm = c(-8, -8, -8, -8, -8, -8, -8, -8, -8,
> -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8,
> -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8,
> -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8,
> -8, -8), theta_percent = c(23.63, 19.68, 23.81, 22.01, 23.98,
> 12.8, 14.92, 20.49, 22.59, 24.32, 20.24, 23.03, 12.97, 19.09,
> 39.2, 12.09, 24.52, 25.57, 25.5, 19.76, 19.17, 21.98, 7.5, 22.75,
> 17.85, 17.75, 17.95, 26.93, 18.84, 22.95, 23.71, 25.03, 40.69,
> 9.7, 24.66, 17.43, 16.3, 24.13, 19.98, 23.35, 12.16, 17.24, 14.29,
> 34.42, 21.84, 25.63, 20.51, 25.87, 24.44, 22.35, 8.57, 21.43,
> 25.63, 21.56, 21.49, 17.66, 25.61, 24.11, 28.31)), .Names = c("x",
> "y", "depth_cm", "theta_percent"), row.names = c("1", "5", "10",
> "15", "20", "25", "30", "35", "40", "45", "50", "55", "60", "65",
> "70", "75", "80", "85", "90", "95", "100", "105", "109", "114",
> "119", "124", "129", "134", "139", "144", "149", "154", "159",
> "164", "169", "174", "179", "184", "188", "193", "198", "203",
> "208", "213", "218", "223", "228", "233", "238", "243", "248",
> "253", "258", "263", "268", "273", "278", "283", "288"), class = 
> "data.frame")
>
>
> # the broken fit for best search
> require("automap")
> coordinates(sparse) = c("x", "y", "depth_cm")
> proj4string(sparse) = CRS("+init=epsg:32119")
> v.fit <- autofitVariogram(theta_percent~1, sparse)
>
>
> There were 50 or more warnings (use warnings() to see the first 50)
> > warnings()
> Warning messages:
> 1: In getModel(initial_sill - initial_nugget, m, initial_range,  ... :
>   An error has occured during variogram fitting. Used:
>         nugget: 34.1432533936652
>         model:  Exp
>         psill:  13.2004974623731
>         range:  53.1549477005646
>         kappa:  NA
>   as initial guess. This particular variogram fit is not taken into 
> account.
> Gstat error:
> Error in if (direct[direct$id == id, "is.direct"] && any(model$psill <  :
>   missing value where TRUE/FALSE needed
>
> 2: In fit.variogram(object, model, fit.sills = fit.sills,  ... :
>   value out of range in 'bessel_k'
> 3 ...
>
>
> # a quick visual of the data in the field
> rescale = function(x, to=c(1,10)) (x - min(x)) * ((max(to) - 
> min(to))/(max(x) - min(x)))
> require("rgl")
> sparse_df=as.data.frame(sparse)
> spheres3d(sparse_df$x, sparse_df$y, sparse_df$depth_cm, 
> radius=rescale(sparse_df$theta_percent))
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo


-- 
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770



More information about the R-sig-Geo mailing list