[R-sig-Geo] Error in predict.sarlm: non-unique row.names given
Roger Bivand
Roger.Bivand at nhh.no
Mon Jul 8 05:32:01 CEST 2019
Do provide a complete reproducible example. I really appeal to all posting
questions to give potential helpers something to work on. Asking for
reproducible examples is the absolutely dominant response to postings that
lack them, if they get any response at all.
Start with this and work backwards until you can reproduce your
misunderstanding:
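library(sf)         # provides st_read()
library(spatialreg) # provides lagsarlm() and the predict method for its fits
col <- st_read(system.file("shapes/columbus.shp", package="spData"))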
train <- col[col$EW == 1,]
test <- col[col$EW == 0,]
col.nb <- spdep::poly2nb(col)
train.nb <- spdep::poly2nb(train)
test.nb <- spdep::poly2nb(test)
attr(col.nb, "region.id")
attr(train.nb, "region.id")
attr(test.nb, "region.id")
train.mod <- lagsarlm(CRIME ~ INC + HOVAL, data=train,
listw=spdep::nb2listw(train.nb))
try(preds <- predict(train.mod, newdata=test,
listw=spdep::nb2listw(test.nb)))
preds[2]
try(preds1 <- predict(train.mod, newdata=col,
listw=spdep::nb2listw(col.nb)))
# warning
preds1[4]
try(preds2 <- predict(train.mod, newdata=test,
listw=spdep::nb2listw(col.nb)))
preds2[2]
Using the complete set of weights permits the spatial process to flow
between neighbouring members of train/test sets.
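To see what is lost when neighbours are built separately for each subset, here is a minimal sketch in base R. The five-region neighbour list and the train/test split are made up for illustration; the list only mimics the layout of an spdep "nb" object (one integer vector of neighbour indices per region) and is not taken from the columbus data.

```r
# Toy neighbour list for 5 regions on a line: 1-2-3-4-5.
# Assumption: plain list of integer vectors, same layout as an "nb" object.
nb <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L), 4L)

# Hypothetical split: regions 1, 2 and 5 are training, 3 and 4 are test.
in_train <- c(TRUE, TRUE, FALSE, FALSE, TRUE)

# Expand the list into directed (from, to) pairs.
from <- rep(seq_along(nb), lengths(nb))
to <- unlist(nb)

# A link is "cross-set" when its two ends fall in different subsets.
# These are exactly the links dropped by subsetting before poly2nb().
cross <- in_train[from] != in_train[to]
n_cross <- sum(cross) / 2  # each undirected link was counted twice
n_cross
```

With the whole-data weights these links are kept, so test observations can borrow information from neighbouring training observations.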
Your problem is probably that your two data objects do not use row.names
as expected:
attr(test.nb, "region.id") <- as.character(1:length(test.nb))
attr(train.nb, "region.id") <- as.character(1:length(train.nb))
train.mod1 <- lagsarlm(CRIME ~ INC + HOVAL, data=train,
listw=spdep::nb2listw(train.nb))
try(preds3 <- predict(train.mod, newdata=test,
listw=spdep::nb2listw(test.nb)))
# Error in predict.sarlm(train.mod, newdata = test, listw =
# spdep::nb2listw(test.nb)) :
# mismatch between newdata and spatial weights. newdata should have
# region.id as row.names
as expected. When the predict method tries to match up the newdata
neighbours (it must identify the correct rows in newdata from the
"region.id" attribute of the supplied weights), it fails as described.
Either use the weights for the whole data set when predicting for the test
set passed as newdata=, or, if the two graphs do not neighbour each other,
that is, train.nb is separate from test.nb (think two islands), make sure
that the region.ids and row.names do not overlap between test and train sets.
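A quick way to check this before calling predict is to compare the identifiers directly. The helper below is hypothetical (check_region_ids is not part of spdep or spatialreg) and uses only base R; the toy vectors stand in for row.names(test) and attr(test.nb, "region.id").

```r
# Hypothetical helper, not a function from spdep or spatialreg:
# compare the "region.id" of a weights object with the row.names of newdata.
check_region_ids <- function(region_id, newdata_rows, train_rows = character(0)) {
  region_id <- as.character(region_id)
  newdata_rows <- as.character(newdata_rows)
  list(
    # ids that occur more than once in the weights object
    duplicated_ids = unique(region_id[duplicated(region_id)]),
    # newdata rows that cannot be found among the weights' region.id
    missing_in_weights = setdiff(newdata_rows, region_id),
    # newdata rows that reuse a row name from the training data
    shared_with_train = intersect(newdata_rows, as.character(train_rows))
  )
}

# Toy stand-ins: test rows keep their original names, but the weights
# were renumbered with as.character(1:length(test.nb)).
test_rows <- c("3", "7", "9")
renumbered <- as.character(1:3)
res <- check_region_ids(renumbered, test_rows, train_rows = c("1", "2", "4"))
res$missing_in_weights
```

In a real session the call would look something like check_region_ids(attr(test.nb, "region.id"), row.names(test), row.names(train)); any non-empty element points at the kind of mismatch described above.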
Please use the example to explore the problem in your workflow, (re-)read
Goulard et al. (2017), and the help page, and report back. Remember that
you can only predict for a test set of reasonable size (because as you see
from the underlying article, you probably need an inverted nxn matrix in
the spatial lag model case).
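As a rough guide to what "reasonable size" means, the arithmetic for a dense n x n matrix of doubles can be sketched as follows. This only counts 8 bytes per element for one matrix; what the predict method actually allocates has not been checked here and may differ.

```r
# Approximate memory, in GiB, of one dense n x n matrix of doubles
# (8 bytes per element). dense_gib is a hypothetical helper name.
dense_gib <- function(n) n^2 * 8 / 1024^3

sizes <- c(1000, 10000, 50000)
round(dense_gib(sizes), 3)
```

Because the cost grows with the square of n, a tenfold increase in the number of observations needs a hundred times the memory, before counting the work of the inversion itself.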
Hope this clarifies
Roger
On Mon, 8 Jul 2019, Jiawen Ng wrote:
> Another question on predict.sarlm!
>
> Here is the line of code that is producing the error:
> pred <- spatialreg::predict.sarlm(model, df, test.listw,zero.policy = T)
>
> Here is the error:
>
> Error in mat2listw(W, row.names = region.id.mixed, style = style) :
> non-unique row.names given
> In addition: Warning messages:
> 1: In spatialreg::predict.sarlm(model, df, test.listw, :
> some region.id are both in data and newdata
> 2: In subset(attr(listw.mixed, "region.id"), attr(listw.mixed, "region.id")
> %in% :
> longer object length is not a multiple of shorter object length
>
> Any idea how I can solve the non-unique row.names error?
>
> Thank you!
--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en