[R-sig-Geo] Comparison of prediction performance (mapping accuracy) - how to test if a method B is significantly more accurate than method A?
Jean-Daniel Sylvain
jeandaniel.sylvain at gmail.com
Sat Aug 30 16:30:56 CEST 2014
Dear Tom/list,
The subject could also be viewed as the same problem encountered in
ensemble forecasting (e.g. in meteorology).
If you could have more folds in your analysis (each fold can be seen
as a member of an ensemble), you could compare the two methods as is
done in ensemble forecasting in meteorology and hydrology. Both
disciplines provide tools that help to study the accuracy, uncertainty
and bias of a forecast. Based on this methodological framework, it
would be possible to compare both methods on several criteria.
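
As a rough sketch of this idea (using the meuse example from your
message below; the repetition count and the choice of indicators are
my own assumptions), each repetition of the cross-validation can play
the role of one ensemble member:

library(gstat)
library(sp)
demo(meuse, echo = FALSE)
meuse.s <- meuse[!is.na(meuse$om), ]

## Repeat the 5-fold cross-validation several times; fold assignment
## is random, so each repetition gives one "member" of indicators.
n.rep <- 20
rmse <- bias <- numeric(n.rep)
for (i in 1:n.rep) {
  cv <- krige.cv(log1p(om) ~ 1, meuse.s, nfold = 5)
  rmse[i] <- sqrt(mean(cv$residual^2))  # accuracy indicator
  bias[i] <- mean(cv$residual)          # systematic over/under-estimation
}
summary(rmse); summary(bias)

Running the same loop for the second method gives a second set of
indicators, and the two distributions can then be compared.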
For those who may be interested in the subject:
Brochero (2013) provides a good review of this subject in his Ph.D.
thesis, "Hydroinformatics and diversity in hydrological ensemble
prediction systems", and describes several indicators (Chapter 1).
http://theses.ulaval.ca/archimede/meta/29908
The site below also provides a quick and simple review of the typical
indicators used in meteorological forecasting:
http://www.eumetcal.org/resources/ukmeteocal/verification/www/english/courses/msgcrs/index.htm
However, in your case this approach seems limited by the small number
of folds used.
This is just an idea that is worth exploring. In my research, I intend
to explore this approach. Any comments/suggestions?
On 8/28/2014 11:28 AM, Tim Appelhans wrote:
> On 08/28/2014 05:10 PM, Tomislav Hengl wrote:
>> Dear list,
>>
>> I'm trying to standardize a procedure to compare performance of
>> competing spatial prediction methods. I know that this has been
>> discussed in various literature and on various mailing lists, but I
>> would be interested in any opinion I could get.
>>
>> I am comparing (see below) 2 spatial prediction methods
>> (regression-kriging and inverse distance interpolation) using 5-fold
>> cross-validation and then testing if the difference between the two is
>> significant. What I concluded is that there are two possible tests for
>> the final residuals:
>> 1. F-test to compare variances (cross-validation residuals),
>> 2. t-test to compare mean values,
> If you think in terms of accuracy vs. precision, I'd say both tests are
> equally important. Ideally you want your method to be precise (low
> variance) and accurate (mean residual close to zero, i.e. low bias).
> What I usually tend to do is repeated random sub-sampling with 100+
> runs, as in the sketch below.
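>
> A minimal sketch of that procedure (assuming the meuse example from
> further below; the 70/30 split, the run count and the spherical
> variogram with rough initial values are my own arbitrary choices):
>
> library(gstat)
> library(sp)
> demo(meuse, echo = FALSE)
> meuse.s <- meuse[!is.na(meuse$om), ]
>
> n.runs <- 100
> rmse.idw <- rmse.ok <- numeric(n.runs)
> for (i in 1:n.runs) {
>   ## random 70/30 train/test split
>   idx <- sample(nrow(meuse.s), round(0.7 * nrow(meuse.s)))
>   tr <- meuse.s[idx, ]
>   te <- meuse.s[-idx, ]
>   ## inverse distance interpolation: krige() without a variogram model
>   p.idw <- krige(log1p(om) ~ 1, tr, te, debug.level = 0)
>   ## ordinary kriging, refitting the variogram on each training set
>   vg <- fit.variogram(variogram(log1p(om) ~ 1, tr),
>                       vgm(0.5, "Sph", 900, 0.1))
>   p.ok <- krige(log1p(om) ~ 1, tr, te, model = vg, debug.level = 0)
>   obs <- log1p(te$om)
>   rmse.idw[i] <- sqrt(mean((p.idw$var1.pred - obs)^2))
>   rmse.ok[i]  <- sqrt(mean((p.ok$var1.pred - obs)^2))
> }
> ## paired comparison of the per-run accuracies
> t.test(rmse.ok, rmse.idw, paired = TRUE)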
>> Both tests might be important; nevertheless, the F-test ("var.test")
>> seems more useful for actually answering "is method B significantly
>> more accurate than method A?". The second test ("t.test") appears to
>> matter only if it fails, which would mean that one of the methods
>> systematically over- or under-estimates the mean value (which should
>> be 0). Did I maybe miss some important test?
>>
>> Thank you!
>>
>> R> library(GSIF)
>> R> library(gstat)
>> R> library(sp)
>> R> set.seed(2419)
>> R> demo(meuse, echo=FALSE)
>> R> omm1 <- fit.gstatModel(meuse, log1p(om)~dist+soil, meuse.grid)
>> Fitting a linear model...
>> Fitting a 2D variogram...
>> Saving an object of class 'gstatModel'...
>> R> rk1 <- predict(omm1, meuse.grid)
>> R> meuse.s <- meuse[!is.na(meuse$om),]
>> R> ok1 <- krige.cv(log1p(om)~1, meuse.s, nfold=5)
>> R> var.test(ok1$residual, rk1@validation$residual, alternative = "greater")
>>
>> F test to compare two variances
>>
>> data: ok1$residual and rk1@validation$residual
>> F = 1.2283, num df = 152, denom df = 152, p-value = 0.103
>> alternative hypothesis: true ratio of variances is greater than 1
>> 95 percent confidence interval:
>> 0.9398662 Inf
>> sample estimates:
>> ratio of variances
>> 1.228322
>> R> ## No significant difference
>> R> t.test(ok1$residual, rk1@validation$residual)
>>
>> Welch Two Sample t-test
>>
>> data: ok1$residual and rk1@validation$residual
>> t = -0.0204, df = 300.842, p-value = 0.9837
>> alternative hypothesis: true difference in means is not equal to 0
>> 95 percent confidence interval:
>> -0.07084667 0.06939220
>> sample estimates:
>> mean of x mean of y
>> 0.0004766718 0.0012039089
>> R> ## Again, no significant difference
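>> R> ## A possible additional check (a sketch, not part of the original
>> R> ## analysis): the two residual vectors are paired by location, so a
>> R> ## paired test on the squared residuals may be more appropriate than
>> R> ## the unpaired F-test above for "is B more accurate than A?":
>> R> t.test(ok1$residual^2, rk1@validation$residual^2, paired = TRUE)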
>>
>> R> sessionInfo()
>> R version 3.0.3 (2014-03-06)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> other attached packages:
>> [1] randomForest_4.6-7 nortest_1.0-2
>> [3] gstat_1.0-19 GSIF_0.4-2
>> [5] sp_1.0-15 gap_1.1-12
>>