[R-sig-Geo] Comparison of prediction performance (mapping accuracy) - how to test if a method B is significantly more accurate than method A?
Tomislav Hengl
hengl at spatial-analyst.net
Thu Aug 28 17:10:22 CEST 2014
Dear list,
I'm trying to standardize a procedure for comparing the performance of
competing spatial prediction methods. I know this has been discussed in
the literature and on various mailing lists, but I would be interested
in any opinions I could get.
I am comparing (see below) two spatial prediction methods
(regression-kriging and inverse distance interpolation) using 5-fold
cross-validation, and then testing whether the difference between the
two is significant. My conclusion is that there are two possible tests
on the cross-validation residuals:
1. F-test to compare variances,
2. t-test to compare mean values.
Both tests might be important; nevertheless, the F-test ("var.test")
seems the more relevant one for answering "is method B significantly
more accurate than method A?". The second test ("t.test") appears to
matter only if it shows a significant difference, which would mean that
one of the methods systematically over- or under-estimates the mean
value (the mean residual should be 0). Did I maybe miss some important
test?
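To make the two tests concrete, here is a minimal self-contained sketch
on simulated residuals. The vectors res.a and res.b and their standard
deviations are hypothetical and are not the meuse results; only base R
(stats) functions are used.

```r
## Hypothetical cross-validation residuals for two methods
## (simulated values, NOT the meuse data used further below)
set.seed(1)
res.a <- rnorm(150, mean = 0, sd = 0.40)  # method A (assumed less accurate)
res.b <- rnorm(150, mean = 0, sd = 0.30)  # method B (assumed more accurate)

## 1. F-test: is the residual variance of A greater than that of B?
ft <- var.test(res.a, res.b, alternative = "greater")

## 2. t-test: do the mean residuals (i.e. the bias) differ?
tt <- t.test(res.a, res.b)

## RMSE per method, for reference next to the test results
rmse <- c(A = sqrt(mean(res.a^2)), B = sqrt(mean(res.b^2)))

print(ft$p.value)
print(tt$p.value)
print(rmse)
```

The one-sided alternative in var.test matches the question "is B more
accurate than A?", while t.test is left two-sided because a bias in
either direction would be of interest.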
Thank you!
R> library(GSIF)
R> library(gstat)
R> library(sp)
R> set.seed(2419)
R> demo(meuse, echo=FALSE)
R> omm1 <- fit.gstatModel(meuse, log1p(om)~dist+soil, meuse.grid)
Fitting a linear model...
Fitting a 2D variogram...
Saving an object of class 'gstatModel'...
R> rk1 <- predict(omm1, meuse.grid)
R> meuse.s <- meuse[!is.na(meuse$om),]
R> ok1 <- krige.cv(log1p(om)~1, meuse.s, nfold=5)
R> var.test(ok1$residual, rk1@validation$residual, alternative = "greater")
F test to compare two variances
data: ok1$residual and rk1@validation$residual
F = 1.2283, num df = 152, denom df = 152, p-value = 0.103
alternative hypothesis: true ratio of variances is greater than 1
95 percent confidence interval:
0.9398662 Inf
sample estimates:
ratio of variances
1.228322
R> ## No significant difference
R> t.test(ok1$residual, rk1@validation$residual)
Welch Two Sample t-test
data: ok1$residual and rk1@validation$residual
t = -0.0204, df = 300.842, p-value = 0.9837
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.07084667 0.06939220
sample estimates:
mean of x mean of y
0.0004766718 0.0012039089
R> ## Again, no significant difference
R> sessionInfo()
R version 3.0.3 (2014-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
other attached packages:
[1] randomForest_4.6-7 nortest_1.0-2
[3] gstat_1.0-19 GSIF_0.4-2
[5] sp_1.0-15 gap_1.1-12