[R-sig-Geo] negative range from fit.variogram through autofitVariogram

Mark Connolly mark_connolly at acm.org
Tue May 4 13:31:10 CEST 2010


I took the expedient of detecting the negative range and recording the 
offending time and space slice.  Then I went back and checked each model 
type individually.

This is perfectly acceptable but I did want to alert you to the 
possibility that autofitVariogram and therefore autoKrige could have 
issues with some data.

Since you asked, my preference would be that autofitVariogram returned 
the best usable model and an error if none can be found.  You could keep 
the current behavior as the default with some selector parameter 
(best="sserror" (default), best="usable" (alternative).  Or maybe just 
some mention in the help that an unusable model might be returned.

During the process of iterating through the data, I record the model 
information including the sserror.  My expectation is that I can use 
this information to look for suspicious models after the processing is 
complete.  I might be fooling myself, but I am trying to not have to go 
through thousands of variograms assessing the fits by eye.

Thanks,
Mark



On 05/04/2010 05:05 AM, Paul Hiemstra wrote:
> Hi Mark,
>
> Thanks for the reproducible example. The problem is that when I look 
> at the sample variogram, the semivariance values start high and end 
> low. This is best illustrated by:
>
> plot(variogram(theta_percent~1, sparse))
>
> You see that there are outliers in the data that cause high 
> semi-variance at a short distance. I would say that is 'strange' ;). 
> You can try and identify which point causes this by:
>
> plot(variogram(theta_percent~1, sparse, cloud = TRUE), identify = TRUE)
> # Click on the plot to identify the point pairs
>
> There is not really one value I think that causes this. It might be 
> also attributable to the fact that your dataset is somewhat sparse in 
> the distance range from 20-40 m.
>
> As a quick fix you can restrict the model selection to "Sph", this 
> works. And I would not be enthusiastic about using Ste. This is 
> because the main difference between different kappa values if the 
> behavior at short distances, but you don't have a lot of data on the 
> short distance to fit this value on in a meaningful way.
>
> At this stage I don't see a (relatively quick) fix that could solve 
> this problem in an automatic way and on a more fundamental level. Do 
> you have any suggestions? From an implementation point of view I can 
> let automap discard any model that has negative values in it, this 
> would ensure that the user gets at least the Sph model back.
>
> cheers,
> Paul
>
> Mark Connolly wrote:
>> I am using autofitVariogram during the process of interpolating a 
>> large set of daily observations through a volume.  Each volume is 
>> decomposed into 2D layers prior to selecting a model to use for 
>> interpolation.  I made it through 2010 interpolations and then ran 
>> into a failed interpolation when the best model selected by 
>> autofitVariogram had a negative range.  This was rejected by the 
>> krige function.  I see mention of negative sills but not of negative 
>> ranges.
>>
>> It appears that autofitVariogram is having some issues with the trial 
>> arguments sent to fit.variogram.  This is repeatable.  Not sure if 
>> this is a bug for some package or a data issue.  The data values do 
>> not look overly strange.
>>
>>
>>
>> # data
>> sparse =
>> structure(list(x = c(740381.862, 740456.052, 740503.958, 740551.752,
>> 740559.502, 740502.995, 740446, 740389.229, 740371.693, 740428.25,
>> 740484.918, 740541.356, 740549.277, 740474.724, 740418.118, 740370.187,
>> 740354.321, 740410.53, 740467.451, 740523.772, 740522.433, 740474.797,
>> 740400.293, 740343.175, 740336.067, 740392.917, 740449.622, 740506.162,
>> 740495.664, 740448.693, 740382.062, 740325.464, 740318.174, 740430.337,
>> 740488.37, 740477.578, 740429.695, 740373.133, 740325.408, 740631.842,
>> 740688.362, 740744.857, 740726.149, 740695.778, 740621.663, 740613.553,
>> 740670.205, 740726.566, 740733.965, 740660.272, 740620.315, 740594.82,
>> 740651.714, 740708.217, 740690.056, 740659.603, 740575.902, 740576.796,
>> 740558.179), y = c(181644.086, 181620.772, 181605.577, 181590.417,
>> 181568.637, 181586.847, 181604.615, 181622.136, 181565.531, 181547.708,
>> 181530.169, 181512.439, 181490.328, 181513.956, 181531.875, 181547.048,
>> 181508.946, 181491.148, 181473.394, 181455.233, 181436.726, 181452.342,
>> 181475.522, 181492.661, 181451.96, 181434.265, 181416.566, 181398.729,
>> 181382.764, 181397.808, 181418.748, 181436.677, 181395.477, 181360.409,
>> 181342.547, 181327.09, 181341.971, 181359.55, 181374.453, 181546.959,
>> 181528.576, 181510.58, 181497.501, 181507.159, 181530.759, 181490.008,
>> 181472.209, 181453.968, 181432.87, 181456.085, 181468.588, 181433.854,
>> 181415.758, 181397.497, 181384.359, 181393.795, 181420.579, 181376.982,
>> 181363.899), depth_cm = c(-8, -8, -8, -8, -8, -8, -8, -8, -8,
>> -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8,
>> -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8,
>> -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8,
>> -8, -8), theta_percent = c(23.63, 19.68, 23.81, 22.01, 23.98,
>> 12.8, 14.92, 20.49, 22.59, 24.32, 20.24, 23.03, 12.97, 19.09,
>> 39.2, 12.09, 24.52, 25.57, 25.5, 19.76, 19.17, 21.98, 7.5, 22.75,
>> 17.85, 17.75, 17.95, 26.93, 18.84, 22.95, 23.71, 25.03, 40.69,
>> 9.7, 24.66, 17.43, 16.3, 24.13, 19.98, 23.35, 12.16, 17.24, 14.29,
>> 34.42, 21.84, 25.63, 20.51, 25.87, 24.44, 22.35, 8.57, 21.43,
>> 25.63, 21.56, 21.49, 17.66, 25.61, 24.11, 28.31)), .Names = c("x",
>> "y", "depth_cm", "theta_percent"), row.names = c("1", "5", "10",
>> "15", "20", "25", "30", "35", "40", "45", "50", "55", "60", "65",
>> "70", "75", "80", "85", "90", "95", "100", "105", "109", "114",
>> "119", "124", "129", "134", "139", "144", "149", "154", "159",
>> "164", "169", "174", "179", "184", "188", "193", "198", "203",
>> "208", "213", "218", "223", "228", "233", "238", "243", "248",
>> "253", "258", "263", "268", "273", "278", "283", "288"), class = 
>> "data.frame")
>>
>>
>> # the broken fit for best search
>> require("automap")
>> coordinates(sparse) = c("x", "y", "depth_cm")
>> proj4string(sparse) = CRS("+init=epsg:32119")
>> v.fit <- autofitVariogram(theta_percent~1, sparse)
>>
>>
>> There were 50 or more warnings (use warnings() to see the first 50)
>> > warnings()
>> Warning messages:
>> 1: In getModel(initial_sill - initial_nugget, m, initial_range,  ... :
>>   An error has occured during variogram fitting. Used:
>>         nugget: 34.1432533936652
>>         model:  Exp
>>         psill:  13.2004974623731
>>         range:  53.1549477005646
>>         kappa:  NA
>>   as initial guess. This particular variogram fit is not taken into 
>> account.
>> Gstat error:
>> Error in if (direct[direct$id == id, "is.direct"] && any(model$psill 
>> <  :
>>   missing value where TRUE/FALSE needed
>>
>> 2: In fit.variogram(object, model, fit.sills = fit.sills,  ... :
>>   value out of range in 'bessel_k'
>> 3 ...
>>
>>
>> # a quick visual of the data in the field
>> rescale = function(x, to=c(1,10)) (x - min(x)) * ((max(to) - 
>> min(to))/(max(x) - min(x)))
>> require("rgl")
>> sparse_df=as.data.frame(sparse)
>> spheres3d(sparse_df$x, sparse_df$y, sparse_df$depth_cm, 
>> radius=rescale(sparse_df$theta_percent))
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>



More information about the R-sig-Geo mailing list