[R-sig-Geo] Error while using predict.sarlm

Roger Bivand Roger.Bivand at nhh.no
Sat May 25 18:18:50 CEST 2019


On Fri, 24 May 2019, Amitha Puranik wrote:

> I am facing an error while using predict.sarlm to make predictions for spatial
> lag model generated using lagsarlm. I used the following code:
>
> predicted = predict(fit.lag, listw=weightmatrix, newdata=missed_data,
> pred.type="TS", zero.policy = T)
>
> For the argument newdata, I have passed the same data missed_data which I
> used to fit the spatial lag model.
>
> When I run the above code, I get the following error message: “Error in
> predict.sarlm(fit.lag, listw = weightmatrix, newdata = missed_data,  :
> mismatch between newdata and spatial weights. newdata should have region.id
> as row.names”

The predict method has to identify which weights apply to the newdata. To
do so, it uses the region.id attribute of the neighbour object and the
row.names of the newdata object. If they do not match, it stops with an
error. If shp below was read in the typical way, the default region.id may
be the FID of the input file (0, ..., (n-1)), but the default row.names of
newdata may be 1, ..., n.
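
A minimal base-R sketch of that kind of mismatch, using made-up ids rather
than anything from your data (region_id and nd_demo are hypothetical names
introduced only for illustration):

```r
# Hypothetical ids: weights built from FIDs 0..4, while the data frame
# carries the default row.names 1..5.
region_id <- as.character(0:4)
nd_demo <- data.frame(x = c(2.1, 3.4, 1.8, 2.9, 3.0))

# The same kind of check that can be run on real objects:
# do all row.names of the data occur among the region ids?
ok_before <- all(row.names(nd_demo) %in% region_id)

# List the row.names that have no counterpart in the weights.
unmatched <- setdiff(row.names(nd_demo), region_id)

# Only if the rows are known to be in the same order as the weights
# can the ids simply be copied across.
row.names(nd_demo) <- region_id
ok_after <- all(row.names(nd_demo) %in% region_id)
```

Copying the ids across is only safe when the row order of the data is known
to be the order used to build the weights; otherwise the observations need
to be matched by a real key first.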

For example:

> library(sf)
Linking to GEOS 3.7.2, GDAL 3.0.0, PROJ 6.1.0
> boston_506 <- st_read(system.file(
+                                   "shapes/boston_tracts.shp",
+                                   package="spData")[1])
Reading layer `boston_tracts' from data source `/home/rsb/lib/r_libs/spData/shapes/boston_tracts.shp' using driver `ESRI Shapefile'
Simple feature collection with 506 features and 36 fields
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: -71.52311 ymin: 42.00305 xmax: -70.63823 ymax: 42.67307
epsg (SRID):    4267
proj4string:    +proj=longlat +datum=NAD27 +no_defs
> nb_q <- spdep::poly2nb(boston_506)
> lw_q <- spdep::nb2listw(nb_q, style="W")
> boston_489 <- boston_506[!is.na(boston_506$median),]
> nb_q_489 <- spdep::poly2nb(boston_489)
> lw_q_489 <- spdep::nb2listw(nb_q_489, style="W", zero.policy=TRUE)
> form <- formula(log(median) ~ CRIM + ZN + INDUS + CHAS +
+                 I((NOX*10)^2) + I(RM^2) + AGE + log(DIS) +
+                 log(RAD) + TAX + PTRATIO + I(BB/100) +
+                 log(I(LSTAT/100)))
> suppressPackageStartupMessages(library(spatialreg))
>
> eigs_489 <- eigenw(lw_q_489)
>
> SLM_489 <- lagsarlm(form, data=boston_489,
+           listw=lw_q_489, zero.policy=TRUE,
+           control=list(pre_eig=eigs_489))
>
> nd <- boston_506[is.na(boston_506$median),]
> t0 <- exp(predict(SLM_489, newdata=nd, listw=lw_q,
+                   pred.type="TS", zero.policy=TRUE))
> str(attr(lw_q, "region.id"))
  chr [1:506] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" ...
> str(row.names(nd))
  chr [1:17] "13" "14" "15" "17" "43" "50" "312" "313" "314" "317" "337" "346" "355" ...
> all(row.names(nd) %in% attr(lw_q, "region.id"))
[1] TRUE
> # introduce a wrong row.name
> row.names(nd)[1] <- "0"
> all(row.names(nd) %in% attr(lw_q, "region.id"))
[1] FALSE
> t0 <- exp(predict(SLM_489, newdata=nd, listw=lw_q,
+                   pred.type="TS", zero.policy=TRUE))
Error in predict.sarlm(SLM_489, newdata = nd, listw = lw_q, pred.type = "TS",  :
   mismatch between newdata and spatial weights. newdata should have region.id as row.names

In this case, the row.names of the input object to spdep::poly2nb() and
the region.id matched, as the newdata were subsetted from the same object.
We don't know the values for your data, but you should be able to check
them. It is important that they align the data with the weights correctly,
since otherwise each observation would be paired with the wrong neighbours.
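
For k-nearest-neighbour weights like yours, one possible way to make the ids
agree is to pass the row.names of the data to knn2nb() through its row.names
argument. The sketch below uses simulated points, not your data; n, xy, dat,
nb_default, nb_named and lw_named are hypothetical names, and the ids
0, ..., 181 are only an assumption about what a typically read shapefile
might carry. Whether this is what went wrong in your case can only be
settled by checking your own objects.

```r
library(spdep)

set.seed(1)
# Simulated stand-in for 182 observations (an assumption, not the real data):
# random point coordinates and a data frame whose row.names are "0".."181".
n <- 182
xy <- cbind(runif(n), runif(n))
dat <- data.frame(y = rnorm(n), row.names = as.character(0:(n - 1)))

knn <- knearneigh(xy, k = 4)

# Without row.names, knn2nb() labels the regions "1".."n".
nb_default <- knn2nb(knn)
match_default <- all(row.names(dat) %in% attr(nb_default, "region.id"))

# Passing the row.names of the data makes region.id follow the data.
nb_named <- knn2nb(knn, row.names = row.names(dat))
lw_named <- nb2listw(nb_named, style = "W")
match_named <- all(row.names(dat) %in% attr(lw_named, "region.id"))
```

With the default ids the check fails because "0" is not among "1", ..., "182";
with the row.names passed explicitly, the listw object carries the same ids
as the data. This assumes the coordinates are in the same row order as the
data frame.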

Hope this helps,

Roger

>
> I have obtained the weight matrix from the function below
>
> weightMat <- function(shp){
>  dnb <- knearneigh(coordinates(shp), k=4)
>  dnb <- knn2nb(dnb) #create nb
>  lw <- nb2listw(dnb, style="W",zero.policy=TRUE) #create lw
>  return(lw)
> }
>
> To cross check and make sure there are no discrepancies, I have run the
> following lines
>
> length(weightmatrix$weights)
> nrow(missed_data)
> nrow(coordinates(shape))
>
> For all the codes above, the result is 182, which is the sample size of
> data.
>
> Can anyone offer me some guidance in solving this problem? Thanks for your
> help.
>
>
>       Thanks & regards,
>
> *Amitha Puranik*
>
> Assistant Professor,
>
> Department of Statistics, PSPH
>
> Phone:0820-2922407
> Address:Department of Statistics,
>
> Health Sciences Library, Level 6,
>
> Manipal Academy of Higher Education,Manipal,Karnataka,India
>
> An Institute of Eminence (Status Accorded by MHRD)
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

