[R-sig-Geo] Error in simulation R-code

Roger Bivand Roger.Bivand at nhh.no
Thu Jul 16 00:13:46 CEST 2009


On Wed, 15 Jul 2009, Steve Hong wrote:

> Dear List,
>
> I have a question about simulation code. Here are the code and error
> message.
>
>> sim.sp <- function(data,CM,n,N)
> + {
> +   C <- matrix(rep(NA,N),ncol=1)
> + for(i in 1:N)
> + {
> + j <- n
> + xx <- which(colSums(CM[j,])==1)
> + V <- names(xx)
> + V <- paste(V, collapse="+")
> + V <- paste("SBA~", V)
> + rd <- round(nrow(data)*(2/3))
> + d <- sample(seq(1:nrow(data)),rd)
> + dat1 <- data[d,]
> + dat2 <- data[-d,]
> + crd <- cbind(dat1$Longitude,dat1$Latitude)
> + dist80 <- dnearneigh(crd,0,100,longlat=F)
> + dist80sw <- nb2listw(dist80, style="B")
> + fm <- errorsarlm(as.formula(V), data=dat1, listw=dist80sw)
> + pred <- predict(fm,dat2)
> + C[i,1] <- cor(dat2$SBA,pred)
> + out <- cbind(C)
> + }
> + colMeans(out)
> + }
>>
>> sim.sp(df2007.5k.s2,CM,1,1000)
> Error in nb2listw(dist80, style = "B") : Empty neighbour sets found
>
> I guess it means that the random selection leaves some observations without
> neighbours. Is there any way to proceed with the simulation, for example by
> using only the observations that have neighbour sets?
>
> Any suggestion will be appreciated!!

This is the fourth separate posting of this message, which was in addition 
cross-posted to R-help. Please do consider the bandwidth costs and the 
negative consequences of cross-posting. If you can't wait for someone else 
to use their time to solve your (simple) problem, perhaps a little 
reflection is called for?

Have you read ?dnearneigh and ?nb2listw? The examples in ?dnearneigh show 
how to use the maximum 1st nearest neighbour distances to set the d2= 
argument to a value ensuring that each observation has at least one 
neighbour. Given your use of Longitude and Latitude as coordinate names, 
are you sure that longlat= should be FALSE, or are the names simply 
careless? This could affect what you think 100 units is - if the coordinates 
are in m, it is 100m, if in US survey feet, then 100ft, if in degrees, 100 
degrees ... which affects the d2= value. dist80 looks an odd name for 
d2=100 as well, doesn't it? You can check for zero neighbour counts by:

any(card(nb) == 0)
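
A minimal sketch of the maximum first nearest neighbour distance approach, 
modelled on the examples in ?dnearneigh. The coordinates are simulated 
planar points, and the names crd, k1 and max_d1 are illustrative 
assumptions, not taken from the original data:

```r
library(spdep)

set.seed(1)                               # arbitrary seed, for reproducibility
crd <- cbind(runif(50), runif(50))        # hypothetical planar coordinates

# First nearest neighbour of each point, converted to an nb object
k1 <- knn2nb(knearneigh(crd, k = 1))

# Largest first nearest neighbour distance across all points
max_d1 <- max(unlist(nbdists(k1, crd)))

# Using it as d2= gives every point at least its nearest neighbour
nb <- dnearneigh(crd, 0, max_d1)
any(card(nb) == 0)
```

With longlat=TRUE the same steps apply, but the distances are then 
measured in kilometres rather than in coordinate units.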

In ?nb2listw, you find the zero.policy= argument, which has a 
default value of FALSE, but which you can set to TRUE, so avoiding the 
error in your simulation if used consistently in subsequent function calls 
to functions taking that argument, like errorsarlm(). So:

zp <- any(card(nb) == 0)
..., zero.policy=zp, ...

might be OK, unless you also need to check that there are any neighbours 
at all.
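
A minimal sketch of passing zero.policy= through, using four made-up 
points so that exactly one has no neighbour within d2. The coordinates 
and the d2 value are assumptions chosen for illustration only:

```r
library(spdep)

# Three points close together and one far away (hypothetical coordinates)
crd <- cbind(c(0, 1, 0, 10), c(0, 0, 1, 10))
nb <- dnearneigh(crd, 0, 1.5)

# Guard against the degenerate case of no neighbour links at all
if (all(card(nb) == 0)) stop("no neighbours found; increase d2")

# zero.policy must be TRUE when some observations have no neighbours
zp <- any(card(nb) == 0)
lw <- nb2listw(nb, style = "B", zero.policy = zp)

# The same zp would then be passed on to later calls that take the
# argument, such as errorsarlm(..., listw = lw, zero.policy = zp)
```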

I'm not at all sure that this simulation is going to get you anywhere 
sensible - why are you trying to do it? I do hope you are setting the seed 
before running it, otherwise you won't know what is going wrong in the 
situations you choose. You are posting from a gmail address, and so do not 
give any affiliation in your signature. Is this a homework problem?
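
On the seed, a minimal sketch of why it matters for the 2/3 split: with 
the same seed, sample() returns the same rows, so a failing draw can be 
reproduced and inspected. The row count and seed value are arbitrary 
assumptions:

```r
n_obs <- 90                         # hypothetical number of rows in the data
rd <- round(n_obs * (2/3))          # size of the training subset

set.seed(20090716)                  # arbitrary seed value
d_first <- sample(seq_len(n_obs), rd)

set.seed(20090716)                  # resetting the seed repeats the draw
d_second <- sample(seq_len(n_obs), rd)

identical(d_first, d_second)
```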

Again, you only took 40 minutes in sending 4 copies of the same message to 
two lists. Replying has taken about the same time. Reading two help pages 
would have taken you much less.

Roger Bivand

>
> Thank you!
>
> Steve Hong
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


