[R-sig-Geo] spatialreg::predict.sarlm

Roger Bivand <Roger.Bivand at nhh.no>
Fri May 15 17:48:01 CEST 2020


On Fri, 15 May 2020, Weldensie Embaye wrote:

> I am trying to run out-of-sample prediction using the
> spatialreg::predict.sarlm function. I tried to prepare my data set as per
> the requirements. However, it did not work.
>
> Below is the code.

This is your code, but it cannot be reproduced. Please provide a fully 
reproducible example that has available data, and which others can help 
you to debug. If you can make the input data and weights files available 
on a link, that might be easier than trying to use built-in data, 
especially as you are loading weights generated outside the workflow.

A further problem is that prediction needs both data sets and the weights, 
because predictions for test observations are autocorrelated with training 
set observations through the links in the spatial weights.
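
To make that dependence concrete, here is a minimal base R sketch. It is 
not the internals of predict.sarlm (which offers several predictor types), 
only the reduced-form expression (I - rho W)^{-1} X beta of the spatial lag 
model evaluated over the combined set. The five-unit chain, the IDs h1 to 
h5, and the values of rho and beta are all made up for illustration.

```r
# Hypothetical toy setting: units h1-h4 are "training", h5 is "test".
ids <- c("h1", "h2", "h3", "h4", "h5")

# Binary contiguity for a chain, then row-standardised (rows sum to 1).
B <- matrix(0, 5, 5, dimnames = list(ids, ids))
for (i in 1:4) {
  B[i, i + 1] <- 1
  B[i + 1, i] <- 1
}
W <- B / rowSums(B)

# Assumed parameter values, chosen only for illustration.
rho  <- 0.4
beta <- c(1, 2)  # intercept and one slope

# Covariates for all units; row names carry the region IDs.
X <- cbind(1, x = c(0.5, 1.0, 1.5, 2.0, 2.5))
rownames(X) <- ids

# Reduced-form prediction over the combined set: (I - rho W)^{-1} X beta.
reduced_form <- function(X, W, rho, beta) {
  stopifnot(identical(rownames(X), rownames(W)))  # IDs must align with weights
  y <- solve(diag(nrow(W)) - rho * W, X %*% beta)
  setNames(as.vector(y), rownames(X))
}

pred_all <- reduced_form(X, W, rho, beta)

# Change a covariate of training unit h4 only and predict again.
X2 <- X
X2["h4", "x"] <- 10
pred_shift <- reduced_form(X2, W, rho, beta)

# The prediction for the test unit h5 moves as well.
pred_shift[["h5"]] - pred_all[["h5"]]
```

The difference is non-zero because h5 is linked to h4 in W, so weights 
built from the test set alone cannot carry the information that the 
prediction depends on.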

This leads to the problematic status of train/test on spatial data, 
because ever-present autocorrelation (or entity misspecification) leaks 
between training and test sets (which are then no longer independent). 
There are steps that can be taken to handle this (see for point support 
the blockCV package), but so far there is no such literature (I think) for 
spatial econometrics models.
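
As a rough illustration of the blocking idea for point support, the base R 
sketch below assigns cross-validation folds by square blocks rather than by 
individual observation, so that near neighbours tend to fall in the same 
fold. It does not use the blockCV API; the function block_folds(), the 
simulated coordinates, and the block size are hypothetical choices made 
only for this example.

```r
set.seed(1)

# Hypothetical point coordinates in the unit square.
pts <- data.frame(x = runif(200), y = runif(200))

# Assign folds per spatial block instead of per observation.
block_folds <- function(x, y, block_size, k) {
  bx <- floor(x / block_size)
  by <- floor(y / block_size)
  block <- interaction(bx, by, drop = TRUE)  # one level per occupied block
  # Spread the occupied blocks over k folds in random order.
  fold_of_block <- sample(rep_len(seq_len(k), nlevels(block)))
  fold_of_block[as.integer(block)]
}

pts$fold <- block_folds(pts$x, pts$y, block_size = 0.25, k = 4)
table(pts$fold)
```

Blocking of this kind reduces leakage between folds but does not remove it 
for observations close to a block boundary.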

Please also post plain text, not HTML, to make copying and pasting code 
simpler.

Hope this helps,

Roger

> csv_train <- read.csv("train.csv") #upload your csv. R needs to know your region ID
>> hhid <- csv_train$hhid #Explicitly created a variable to hold the regions
>> nb_train <- read.gwt2nb('train.gwt', region.id=hhid)
>> Q_train<-nb2listw(nb_train)
>> csv_test <- read.csv("test.csv") #upload your csv. R needs to know your region ID
>> hhid <- csv_test$hhid #Explicitly created a variable to hold the regions
>> nb_test <- read.gwt2nb('test.gwt', region.id=hhid)
>> Q_test<-nb2listw(nb_test)
>> y_test <- dataTable$annualrent[test]
>> ######## spatial lag process model
>> GM5<-spatialreg::lagsarlm(formula, data=traindata, listw=Q_train)
>> summary(GM5)
>
> Call:spatialreg::lagsarlm(formula = formula, data = traindata, listw = Q_train)
>
> Residuals:
>      Min        1Q    Median        3Q       Max
> -54.38677  -7.32082  -0.78772   5.69047  90.56889
>
> Type: lag
> Coefficients: (asymptotic standard errors)
>             Estimate Std. Error z value  Pr(>|z|)
> (Intercept) -20.63671    4.08382 -5.0533 4.343e-07
> coveredpri   -4.04663    4.71961 -0.8574  0.391219
> coveredsha   -3.70054    2.57827 -1.4353  0.151207
> vipprivate   -3.19640    4.57774 -0.6982  0.485022
> vipshare     -2.26742    2.30487 -0.9838  0.325238
> unoveredla   -1.29846    3.62671 -0.3580  0.720322
> flushpriva   10.22320    3.31060  3.0880  0.002015
> electricit   11.25641    2.07508  5.4246 5.810e-08
> privatetap    6.48159    2.63420  2.4606  0.013872
> publictap    -2.91930    3.46341 -0.8429  0.399285
> watertanke   -5.52987    7.58483 -0.7291  0.465959
> protectewe   -5.44122    5.17858 -1.0507  0.293389
> river        -1.76395    3.57625 -0.4932  0.621843
> numberofro   12.47265    0.94757 13.1628 < 2.2e-16
> roof         -2.83454    3.62586 -0.7818  0.434357
> externalwa   10.19840    1.92089  5.3092 1.101e-07
> floor         3.81212    2.55901  1.4897  0.136307
>
> Rho: 0.3742, LR test value: 4.3541, p-value: 0.03692
> Asymptotic standard error: 0.16111
>    z-value: 2.3227, p-value: 0.020197
> Wald statistic: 5.3948, p-value: 0.020197
>
> Log likelihood: -1842.303 for lag model
> ML residual variance (sigma squared): 273.25, (sigma: 16.53)
> Number of observations: 436
> Number of parameters estimated: 19
> AIC: 3722.6, (AIC for lm: 3725)
> LM test for residual autocorrelation
> test value: 1.3748, p-value: 0.24099
>
>> GM5_predict <- spatialreg::predict.sarlm(GM5, newdata = testdata, listw = Q_test)
> Error in spatialreg::predict.sarlm(GM5, newdata = testdata, listw = Q_test) :
>   mismatch between newdata and spatial weights. newdata should have region.id as row.names
>>
>
> Any idea, what is going on?
>
> 	[[alternative HTML version deleted]]

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en


