[R-sig-Geo] R/gstat UK with indicators
David Rossiter
rossiter at itc.nl
Tue Feb 19 14:39:07 CET 2008
I am sure Edzer can answer this, but I post it to the list because it
may be of general interest.
If I compute an indicator:
> data(meuse)
> meuse$i.Pb.med <- meuse$lead < median(meuse$lead)
I can of course compute an indicator variogram and krige at new
locations:
> vi <- variogram(i.Pb.med ~1, meuse)
> plot(vi, pl=T)
> (vim <- fit.variogram(vi, vgm(0.17, "Sph", 600, 0.1)))
model psill range
1 Nug 0.08772438 0.0000
2 Sph 0.18157082 691.2488
> plot(vi, pl=T, model=vim)
> ki <- krige(i.Pb.med ~1, meuse, newdata=meuse.grid, model=vm)
[using ordinary kriging]
> summary(ki$var1.pred)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.03628 0.33830 0.64020 0.58080 0.81610 0.95040
I can also fit a logistic regression model to the indicator with a GLM:
> (gm <- glm(i.Pb.med ~ dist, family="binomial", data=meuse at data))
(Intercept) dist
-2.562 11.662
Degrees of Freedom: 154 Total (i.e. Null); 153 Residual
Null Deviance: 214.9 Residual Deviance: 131.2 AIC: 135.2
Here the ~ in the formula is interpreted according to the link function.
I can compute residuals from the GLM, get the residual variogram, and
model it:
> meuse$glm.r <- residuals(gm)
> v <- variogram(glm.r ~ 1, meuse, cutoff=1000)
> plot(v, pl=T)
> vm <- vgm(0.45, "Exp", 450, 0.6) # eyeball fit
But, what happens when I use the same formula in gstat?
First, just to compute the residual variogram:
> vr <- variogram(i.Pb.med ~ dist, meuse, cutoff=1000)
The two residual variograms are very different:
> max(v$gamma)
[1] 1.001015
> max(vr$gamma)
[1] 0.1741679
So clearly the formula (i.Pb.med ~ dist) is being interpreted as a
linear model, not a logit model.
Second, an attempt at kriging with external drift:
> ked <- krige(i.Pb.med ~ dist, loc=meuse, newdata=meuse.grid, model=vm)
[using universal kriging]
Since I could not specify a link function, it seems that a linear model
is being fit (not a logit model).
And we see very strange results for an indicator:
> summary(ked$var1.pred)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.1235 0.2719 0.6550 0.6056 0.9139 1.4990
So (after all that buildup), I just want to know if my interpretation is
correct, that gstat is not interpreting the formula in krige() or in
variogram() as a GLM, rather as a linear model, which doesn't make sense
with indicators.
Thank you,
D G Rossiter
Senior University Lecturer
Department of Earth Systems Analysis (DESA)
International Institute for Geo-Information Science and Earth
Observation (ITC)
Enschede, The Netherlands
Internet: http://www.itc.nl/personal/rossiter
More information about the R-sig-Geo
mailing list