[R-sig-Geo] Getting spgm Models to work at Substate Levels - SAMHSA NSDUH Data - with Added nb Links

Roger Bivand Roger@B|v@nd @end|ng |rom nhh@no
Thu Aug 8 21:41:39 CEST 2019


On Thu, 8 Aug 2019, Stuart Reece wrote:

> Thankyou so much again Roger.
>
> Both the connectivity maps and regressions work fine with the state data 
> so I hope you will not mind if I use the substate data from the link 
> mentioned earlier at :
>
> https://www.samhsa.gov/data/sites/default/files/cbhsq-reports/NSDUHsubstateShapeFile2016/ShapeFile2016.zip
>
> and then just subset for one state.  New York would be good to choose as 
> it has Richmond Island off the southern tip of Long Island which is not 
> connected to any other area.  So it naturally forms a major point of 
> interest.
>
> I have done my level best to follow your instructions very carefully.

Thanks for your patience - now there is something to work on. It is easier 
to extract the code if you post plain text only, so my abbreviation is:

library(sf)
USC <- st_read(dsn="SubstateRegionData141516.shp")
USCReduced <- USC[,c(1:6,10,14,18,22,26,30,34,38,42,46,50,66)]
NYsf <- subset(USCReduced, ST_NAME=="New York")
NYsf$Ix <- 0:14
NYsf$YearDt <- as.POSIXct(as.Date("01/01/2015", format="%d/%m/%Y"),
tz="GMT")
NYsf <- NYsf[,c(19,20,1:5,9,6,7,10,14,15,12,8,13,16,17,18)]
library(spdep)
NYsfnb <- poly2nb(NYsf)
NYsfnb
NYsfnb[[8]] <- as.integer(5)
NYsfnb[[5]] <- as.integer(unique(sort(c(NYsfnb[[5]], 8))))
NYsflw  <- nb2listw(NYsfnb)
library(splm)
ml_obj <- spml(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, lag=TRUE, effect="individual", model="random",
  spatial.error="b"))
gm_obj <- spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, lag=TRUE, moments="fullweights", method="g2sls",
  model="random", spatial.error=TRUE)
# Error in x[, ii] : subscript out of bounds
gm_obj <- spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, lag=TRUE, moments="weights", method="g2sls",
  model="random", spatial.error=TRUE)
# Error in x[, ii] : subscript out of bounds
gm_obj <- spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, lag=TRUE, moments="initial", method="g2sls",
  model="random", spatial.error=TRUE)
# Error in x[, ii] : subscript out of bounds
gm_obj <- spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, moments="fullweights", spatial.error=TRUE)
# Error in x[, ii] : subscript out of bounds

Could you please try to run the model through plm - this subset seems only 
to have one time period? Perhaps plm is more forgiving than splm? Maybe 
that is confusing the internals?

summary(lm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf))
library(spatialreg)
summary(lagsarlm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw))

both complete without trouble?

Roger

>
> My code follows - obviously including the path to the file on my machine.
>
> The connectivity map - closely following your own model script of course -  is perfect.  The inserted link works just exactly correctly as you will see.
>
> The spml model also works OK albeit with some mysterious error which I do not well understand, but I do not think it matters much either.
>
>
>
> However the spgm model does not run at all.
>
> Of course this is also the most important model.
>
> This throws the usual error "Error in x[, ii] : subscript out of bounds".
>
> I have included the traceback notes, but I do not understand them at all.
>
>
>
> Thankyou again for all of your kind and gracious advice.
>
>
>
> Yours sincerely,
>
> Stuart.
>
>
>
>
>
> ################################################################################################################################################################################################
>
> ################## Using NYC as a Reproducible Example
>
> # Substate Shapefile - Single Year
>
>
>
> USC <- st_read(dsn=" [path to file]  /NSDUH/SubstateRegionData141516.shp")
>
> dim(USC)
>
> names(USC)
>
>
>
> USCReduced <- USC[,c(1:6,10,14,18,22,26,30,34,38,42,46,50,66)]
>
> dim(USCReduced)
>
> names(USCReduced)
>
>
>
> NYsf <- subset(USCReduced, ST_NAME=="New York")
>
> NYsf$Ix <- 0:14
>
> NYsf$YearDt <- as.POSIXct(as.Date("01/01/2015", format="%d/%m/%Y"), tz="GMT")
>
> class(NYsf$YearDt)
>
> dim(NYsf)
>
> names(NYsf)
>
> NYsf <- NYsf[,c(19,20,1:5,9,6,7,10,14,15,12,8,13,16,17,18)]
>
> dim(NYsf)
>
> names(NYsf)
>
> glimpse(NYsf)
>
> print(NYsf, n=15)
>
>
>
> NYsfnb <- poly2nb(NYsf)
>
> str(NYsfnb, max.level=0)
>
> str(NYsfnb, max.level=1)
>
> summary(NYsfnb)
>
>
>
> attr(NYsfnb, "region.id")[1:8]
>
>
>
> NYsfnbb <- NYsfnb
>
>
>
> attr(NYsfnbb, "region.id")
>
> NYsfnbb[[8]]                           # == Region 322 on the Original map
>
> NYsfnbb[[8]] <- as.integer(5)
>
> NYsfnbb[[8]]                           # == Region 322 on the Original map
>
> NYsfnbb[[5]]                           # == Region 319 on the Original map
>
> NYsfnbb[[5]] <- as.integer(c(6,7,8))
>
> NYsfnbb[[5]]
>
> card(NYsfnbb)
>
> summary(NYsfnbb)
>
>
>
> NYsflw  <- nb2listw(NYsfnb, zero.policy=TRUE)
>
> NYsflww <- nb2listw(NYsfnbb)
>
> str(NYsfnbb, max.level = 0)
>
> str(NYsflww, max.level = 0)
>
>
>
> can.be.simmed(NYsflww)
>
>
>
> ############### US Substate Test Mapping::
>
> coordsSubstateNY <- st_coordinates(st_centroid(st_geometry(NYsf), of_largest_polygon = TRUE))
>
>
>
> ## SubState_q1
>
> dxxx <- diffnb(NYsfnb, NYsfnbb)
>
> plot(st_geometry(NYsf), border = "grey50")
>
> plot(NYsfnb, coordsSubstateNY, add = TRUE, lwd=0.5)
>
> plot(dxxx, coordsSubstateNY, add = TRUE, col = "red", lwd=2)
>
> title(main=paste("Differences (red) in New York Sub-State GAL Queen Weights (Black) First Order - USA", cex.main=1))
>
>
>
> ### Checking w Regressions
>
> summary(spml(asinh(smiyr) ~ cigmon * asinh(mrjmon),
>
>             data=NYsf, listw=NYsflww,
>
>             lag=TRUE, effect="individual",
>
>             model="random", spatial.error="b"))
>
>
>
> summary(spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon),
>
>             data=NYsf, listw=NYsflww,
>
>             lag=TRUE, moments="fullweights", method="g2sls",
>
>             model="random", spatial.error=TRUE))
>
>
>
> #    Error in x[, ii] : subscript out of bounds
>
> #
>
> # > traceback()
>
> # 13: na.omit.data.frame(list(`yend[, i]` = c(0, 0, 0, 0, 0, 0, 0,
>
> #                                             0, 0, 0, 0, 0, 0, 0, 0), H = numeric(0)))
>
> # 12: na.omit(list(`yend[, i]` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>
> #                                  0, 0, 0, 0), H = numeric(0)))
>
> # 11: model.frame.default(formula = yend[, i] ~ H - 1, drop.unused.levels = TRUE)
>
> # 10: stats::model.frame(formula = yend[, i] ~ H - 1, drop.unused.levels = TRUE)
>
> # 9: eval(mf, parent.frame())
>
> # 8: eval(mf, parent.frame())
>
> # 7: lm(yend[, i] ~ H - 1)
>
> # 6: fitted.values(lm(yend[, i] ~ H - 1))
>
> # 5: spgm.tsls(ywithin, wywithin, Xwithin, Hwithin)
>
> # 4: ivplm.w2sls(Y = y, X = x, lag = TRUE, listw = Ws, listw2 = Ws2,
>
> #                twow = twow, lag.instruments = lag.instruments, T = T, N = N,
>
> #                NT = NT)
>
> # 3: spsarargm(formula = formula, data = data, index = index, listw = listw,
>
> #              listw2 = listw2, moments = moments, lag = lag, endog = endog,
>
> #              instruments = instruments, verbose = verbose, effects = effects,
>
> #              control = control, lag.instruments = lag.instruments, optim.method = optim.method,
>
> #              pars = pars, twow = twow)
>
> # 2: spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data = NYsf, listw = NYsflww,
>
> #         lag = TRUE, moments = "fullweights", model = "random", spatial.error = TRUE)
>
> # 1: summary(spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data = NYsf,
>
> #                 listw = NYsflww, lag = TRUE, moments = "fullweights", model = "random",
>
> #                 spatial.error = TRUE))
>
>
>
>
>
> -----Original Message-----
> From: Roger Bivand <Roger.Bivand using nhh.no>
> Sent: Thursday, 8 August, 2019 9:35 PM
> To: Stuart Reece <stuart.reece using bigpond.com>
> Cc: 'Dr Stuart Reece' <asreece using bigpond.net.au>; 'Barry Rowlingson' <b.rowlingson using gmail.com>; 'R-sig-geo Mailing List' <r-sig-geo using r-project.org>
> Subject: RE: [R-sig-Geo] Adding a Few Neighbour Relationships to a nb List
>
>
>
> On Thu, 8 Aug 2019, Stuart Reece wrote:
>
>
>
>> Thankyou for this advice Roger.
>
>> Happy to provide my work for your review but I am not sure how.
>
>> This list server has a limit of 50KB and these files are about 100MB....
>
>> Love to assist but I would need advice on how best to proceed....???
>
>
>
> Choose a built-in data set, such as Produc, change two states to ALASKA and HAWAII, drop the ones you chose from the map and any other states not on your map, and use that data set, putting the script needed at the head of your example. Then create the weights, initially with no links to ALASKA and HAWAII, and run the model. Next, edit the neighbour object in script form, and try again. Then anyone with your map object (you gave a link), your script, and the plm and splm packages can reproduce your problem.
>
>
>
> Do also run traceback() after errors, and post what you see.
>
>
>
> Roger
>
>
>
>> Many thanks again,
>
>> Stuart.
>
>>
>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en



More information about the R-sig-Geo mailing list