[R-sig-Geo] negative r-squares

Edzer Pebesma edzer.pebesma at uni-muenster.de
Thu Sep 9 21:15:23 CEST 2010


Pinar, Jason,

From the script below it seems no adjustment for degrees of freedom
is being made.

In this case R2 can become negative because it is evaluated on a test
set that differs from the training set. Suppose the test set contains a
single extreme value that is not present in the training set. In that
case, the mean of the test values is, in terms of sum of squares, a
better predictor than your regression model, which didn't know about
this outlier. Don't forget that the mean of the test set does contain
this outlier. Hence, R2 can easily become negative when evaluated over
a different data set than the one the regression model was derived from.
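
A minimal sketch of this effect, assuming made-up numbers: the data
frames `train` and `test` and the values in them are hypothetical and
are not taken from Pinar's data. The model is fitted on the training
set only, and R2 is computed on the test set in the same way as in the
script below.

```r
# Hypothetical training data: an exact linear trend, no outliers.
train <- data.frame(x = 1:10, y = 2 * (1:10) + 1)

# Hypothetical test data: two ordinary values and one extreme (60)
# that the training set never saw.
test <- data.frame(x = c(5, 5.5, 6), y = c(11, 12, 60))

fit  <- lm(y ~ x, data = train)        # model built on train only
pred <- predict(fit, newdata = test)   # predictions at the test locations

sse <- sum((test$y - pred)^2)          # squared errors of the model
sst <- sum((test$y - mean(test$y))^2)  # squared errors of the test mean
r2  <- 1 - sse / sst
r2                                     # negative, because sse > sst here
```

The test mean is pulled towards the extreme value, so it sits closer to
it than the model prediction does, and sse exceeds sst.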

On 09/09/2010 06:25 PM, Jason Gasper wrote:
> Hello Pinar,
> 
> I don't know for sure what your calculation is, but R2 values can range
> from -inf to 1 if an adjusted R2 is being used. In other words, one
> possibility is that you're adjusting for degrees of freedom using some
> variation of 1 - ((n-1)/(n-k))(1-R2), where the adjusted R2 is
> equivalent to simple regression when k=1. So when the estimated R2 is
> less than or equal to 0, the model forecast is no better than the mean
> (really poor fit). Another way of looking at a negative R2 is that the
> fit is worse than a horizontal line, so the sum-of-squares from the
> model is larger than the sum-of-squares from a horizontal line. Again,
> poor fit.
> 
> Cheers-Jason
> 
> 
> Pinar Aslantas Bostan wrote:
>> Hi all,
>>
>> I am working on a comparison of kriging and regression methods. I
>> have one dependent variable (PREC) and seven independent variables. I
>> created 10 different test and train datasets. I use the train datasets
>> for building the models and the test datasets for calculating error
>> (RMSE) and r-squares. Once I have obtained prediction values for the
>> grid, I use overlay() to get predictions for the test dataset. For example:
>>
>> # regression kriging
>> # dem is the grid (I want to get predictions for each pixel of dem)
>> # and dem$rk.pred1 contains regression kriging predictions
>>> test1$rk.predicted = dem$rk.pred1[overlay(dem, test1)]
>>
>> # calculating r-square values based on test values
>>> ss <- (test1$PREC - mean(test1$PREC))^2   # squared deviations from the test mean
>>> sst1 <- sum(ss)                            # total sum of squares
>>> e <- (test1$PREC - test1$rk.predicted)^2  # squared prediction errors
>>> sse.rk <- sum(e)                           # error sum of squares
>>> rk1.r.square <- 1 - (sse.rk / sst1)
>>
>> My problem is that, for some datasets, the methods produce
>> negative r-squares. Here I gave an example using regression kriging,
>> but the same problem may also occur for linear regression. I checked the
>> dependent and independent variables and there is no problem with them.
>> Does anyone know another function that could replace overlay() for
>> the same purpose? (I thought that maybe the problem is caused by the
>> overlay function.) Or do you have any idea about the reason for the
>> negative r-square values?
>>
>> Best regards,
>> Pinar
>>
>>
>> ********************************************************************************
>>
>> Pinar Aslantas Bostan
>> Research Assistant
>> Department of Geodetic and
>> Geographic Information Technologies (GGIT)
>> Middle East Technical University
>> 06531 Ankara/TURKEY
>> aslantas at metu.edu.tr
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> 

-- 
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de
http://www.52north.org/geostatistics      e.pebesma at wwu.de


