[R] GAM: mismatch between nb/polys supplied area names and data area names

Sat Jan 27 03:19:53 CET 2018

Hello, I am new to R and running R version 3.4.3 (2017-11-30),
x86_64-apple-darwin15.6.0 (64-bit), macOS High Sierra 10.13.2.

I am using gam() from the mgcv package to model disease incidence (negative binomial
distribution) as a function of two covariates, and wish to incorporate
spatial correlation among areal neighbors, n = 50 polygons, identified by
"id".  For data observed over discrete spatial units, a Markov random field
can be used through the GAM syntax:
s(id, bs="mrf", xt = list(nb = nb)), where the second nb is an object
holding a neighbor list.

The error is "Error in smooth.construct.mrf.smooth.spec(object, dk$data,
dk$knots) :
mismatch between nb/polys supplied area names and data area names".

I have read the documentation, studied the function, and looked at the
traceback, and have been careful to use matching variable types for area
names in nb and data, but have not been successful.

Any advice would be appreciated. Below are code and a data sample.

Thank you,

Susan

code:

#read in data; from a numeric polygon "IDENTIFIER" create a new polygon
#"id" that is a factor
datawide <- read_csv("~/Long/ALLWIDEDATA.csv")
datawide <- transform(datawide, id = factor(formatC(IDENTIFIER, width = 2,
flag = "0")))
#NB: the new area ids are: "01", "02",..."50"

#read in the shapefile and create the neighborhood object nb,
#names(nb) must correspond to the levels of the covariate of the smooth
#(i.e. the area labels)
shape<-readOGR(dsn="~/Long/MENHShape",layer=("MENHShape2"))
nb <- poly2nb(shape, queen = TRUE, snap=100, row.names=datawide$id)
names(nb) <- attr(nb, "region.id")

#run GAM spatial neighbors plus covariates
gamspcov <- gam(y2008_2014rate~s(id, bs="mrf", xt = list(nb = nb), k=20) +
s(deermi2) + s(t14), family=nb( ), data=datawide, method="MLE")

#ERROR MESSAGE
#Error in smooth.construct.mrf.smooth.spec(object, dk$data, dk$knots) :
#  mismatch between nb/polys supplied area names and data area names
#In addition: Warning message:
#  In if (all.equal(sort(a.name), sort(levels(k))) != TRUE) stop("mismatch
#  between nb/polys supplied area names and data area names") :
#  the condition has length > 1 and only the first element will be used
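
The condition quoted in the warning compares the sorted names of the neighbor list with the sorted levels of the factor. A minimal base-R sketch of running that same comparison outside gam() is given here. The objects id and nb_toy are hypothetical stand-ins, not the real data, and the mismatch shown (an unpadded "3" versus "03") is only an illustration, not a diagnosis of the actual problem.

```r
# Hypothetical stand-ins: three areas in the data, while the neighbor list
# carries an unpadded "3" instead of "03".
id <- factor(c("01", "02", "03"))
nb_toy <- list("01" = 2L, "02" = c(1L, 3L), "3" = 2L)

data_names <- levels(id)      # area labels as seen by the smooth
nb_names   <- names(nb_toy)   # area labels attached to the neighbor list

# Labels that appear on one side only
only_in_data <- setdiff(data_names, nb_names)
only_in_nb   <- setdiff(nb_names, data_names)

# all.equal() returns TRUE or a character description of the differences,
# so isTRUE() is used to reduce it to a single logical value
names_match <- isTRUE(all.equal(sort(nb_names), sort(data_names)))

print(only_in_data)
print(only_in_nb)
print(names_match)
```

With the real objects, levels(datawide$id) and names(nb) would take the place of the toy vectors. Empty setdiff() results in both directions would suggest the labels agree.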

data sample:

IDENTIFIER (num)  id (factor)  caserate (num)  deer (num)  temp (num)  <more columns>
1                 01           0.0             2.0         -9.0        <etc.>
2                 02           3.1             8.5         -7.0        <etc.>
...
50                50           200.0           25.0        -3          <etc.>
