[R-sig-Geo] Error in simulation R-code

Roger Bivand Roger.Bivand at nhh.no
Thu Jul 16 12:44:13 CEST 2009

On Wed, 15 Jul 2009, Steve Hong wrote:

> First, I apologize to all of you for the annoying messages. Since I did not
> receive the mail I sent, I thought there might have been some errors.

OK. Note that there may be latency issues - occasionally, it takes much 
longer for the mail servers to process submitted postings. You can also 
check in gmane, nabble, or the list archives to see whether postings have 
got through, for this list on:




> On Wed, Jul 15, 2009 at 5:13 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>> On Wed, 15 Jul 2009, Steve Hong wrote:
>>> Dear List,
>>> I have a question about simulation code. Here are the code and error
>>> message.
>>> > sim.sp <- function(data,CM,n,N)
>>> + {
>>> +   C <- matrix(rep(NA,N),ncol=1)
>>> + for(i in 1:N)
>>> + {
>>> + j <- n
>>> + xx <- which(colSums(CM[j,])==1)
>>> + V <- names(xx)
>>> + V <- paste(V, collapse="+")
>>> + V <- paste("SBA~", V)
>>> + rd <- round(nrow(data)*(2/3))
>>> + d <- sample(seq(1:nrow(data)),rd)
>>> + dat1 <- data[d,]
>>> + dat2 <- data[-d,]
>>> + crd <- cbind(dat1$Longitude,dat1$Latitude)
>>> + dist80 <- dnearneigh(crd,0,100,longlat=F)
>>> + dist80sw <- nb2listw(dist80, style="B")
>>> + fm <- errorsarlm(as.formula(V), data=dat1, listw=dist80sw)
>>> + pred <- predict(fm,dat2)
>>> + C[i,1] <- cor(dat2$SBA,pred)
>>> + out <- cbind(C)
>>> + }
>>> + colMeans(out)
>>> + }
>>> > sim.sp(df2007.5k.s2,CM,1,1000)
>>> Error in nb2listw(dist80, style = "B") : Empty neighbour sets found
>>> I guess it means that the random selection process leaves some
>>> observations without neighbours. Is there any way to proceed with the
>>> simulation, for example by using only the observations that have
>>> neighbour sets?
>>> Any suggestion will be appreciated!!
>> This is the fourth separate posting of this message, in addition
>> cross-posted to R-help. Please do consider the bandwidth costs and the
>> negative consequences of cross-posting. If you can't wait for someone else
>> to use their time to solve your (simple) problem, perhaps a little
>> reflection is called for?
>> Have you read ?dnearneigh and ?nb2listw? The examples in ?dnearneigh show
>> how to use the maximum 1st nearest neighbour distances to set the d2=
>> argument to a value ensuring that each observation has at least one
>> neighbour. Given your use of Longitude and Latitude as coordinate names, are
>> you sure that longlat= should be FALSE, or are the names simply careless?
>> This could affect what you think 100 units is - if coordinates in m, it is
>> 100m, if in US surveyors feet, then 100ft, if degrees, 100 degrees ... which
>> affects the d2= value. dist80 looks an odd name for d2=100 as well, doesn't
>> it? You check for zero neighbour counts by:
> The names dist80, Longitude, and Latitude are careless. Please ignore that.
> Actually, Longitude and Latitude are UTM values (in that case longlat=F,
> right?). I think '100' is 100 km. It was OK when I tried to get predicted
> values without the simulation.

OK. Please check the ranges of your coordinate vectors; if the y vector is
roughly in the single digit thousands, the units are km (you are that number
of km from the Equator); if in single digit millions, they are metres.
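
A minimal sketch of that range check in base R. The simulated coordinates,
the column names easting and northing, and the cut-offs used to guess the
units are illustrative assumptions, not values from the poster's data:

```r
# Hypothetical UTM-like coordinates; replace with your own columns.
set.seed(1)
crd <- cbind(easting  = runif(50, 400000, 600000),
             northing = runif(50, 4000000, 4200000))

# Range (min, max) of each coordinate vector
rng <- apply(crd, 2, range)
print(rng)

# Rough heuristic: northings in the millions suggest metres,
# northings in the thousands suggest kilometres.
y_max <- max(abs(crd[, "northing"]))
units_guess <- if (y_max >= 1e6) {
  "metres"
} else if (y_max >= 1e3) {
  "kilometres"
} else {
  "unclear"
}

# A 100 km distance threshold expressed in the guessed coordinate units
d2 <- if (units_guess == "metres") 100 * 1000 else 100
```

If the guess is metres, passing 100 as the upper distance bound would mean
100 m rather than 100 km, which would make empty neighbour sets very likely.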

>> any(card(nb) == 0)
>> In ?nb2listw, you find the zero.policy= argument, which has a default value
>> of FALSE, but which you can set to TRUE, so avoiding the error in your
>> simulation if used consistently in subsequent function calls to functions
>> taking that argument, like errorsarlm(). So:
>> zp <- any(card(nb) == 0)
>> ..., zero.policy=zp, ...
> Where should I add this code? Can I add the first line (zp <- ...) in
> front of nb2listw? The second one should go in nb2listw(....,
> zero.policy=zp). Is that correct?

Yes, that's right. Add the ..., zero.policy=zp, ... to all the subsequent
commands using the listw object too - the object does not record the fact
that its zero.policy was set to TRUE.
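
A small sketch of that pattern, assuming the spdep package is installed. The
coordinates and the variable x are made up for illustration, and lag.listw()
merely stands in for any later call that takes a listw object and a
zero.policy= argument, such as errorsarlm(). The flag is set to TRUE exactly
when some observation has no neighbours:

```r
library(spdep)

set.seed(42)
# Hypothetical planar coordinates in km: a cluster plus one isolated point
crd <- rbind(cbind(runif(20, 0, 50), runif(20, 0, 50)),
             c(1000, 1000))

nb <- dnearneigh(crd, 0, 100, longlat = FALSE)

# TRUE when at least one observation has an empty neighbour set
zp <- any(card(nb) == 0)

# Without zero.policy = TRUE this call would stop with
# "Empty neighbour sets found"
lw <- nb2listw(nb, style = "B", zero.policy = zp)

# Pass the same flag to every later call that uses lw
x <- rnorm(nrow(crd))
wx <- lag.listw(lw, x, zero.policy = zp)
```

With zero.policy=TRUE the spatially lagged value for the isolated
observation is set to zero, so it is worth considering whether that is
acceptable for the model being fitted. Details may differ between spdep
versions.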

>> might be OK, unless you also need to check that there are any neighbours at
>> all.
>> I'm not at all sure that this simulation is going to get you anywhere
>> sensible - why are you trying to do it? I do hope you are setting the seed
>> before running it, otherwise you won't know what is going wrong in the
>> situations you choose. You are posting from a gmail address, and so do not
>> give any affiliation in your signature. Is this a homework problem?
> Fortunately, this is NOT a homework problem. I am post-graduated. I changed
> my email address to a gmail address from the work address (educational
> institute). I did that since I wanted to separate R-related emails from my
> work email address. I am in the lists of R-help, R-sig-geo, R-mixed, and
> R-ecology.

OK. When there are no indications of why work is being done, it has 
sometimes turned out to be a graduate (or undergraduate) who has been 
tasked by a clueless supervisor, who has then abandoned the unfortunate 
person with a tight deadline and no helpful advice.

I'm still not sure that just reporting the correlations between the 
out-of-sample predictions and observed values gets you anywhere useful, 
without knowing how the repeated 2/3 samples affect the autoregressive 
coefficient, which in turn affects the model coefficients. That was what I
was missing in order to understand the aim. In addition, we don't know whether
the observations are very clustered in space, which may lead to very dense 
weights, and poorer performance by the spatial model, especially if those 
weights were not those that generated the data.
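
One rough way to see how dense the weights are in a given 2/3 sample is to
look at the neighbour counts. This is only a sketch under stated assumptions:
the spdep package is installed, the coordinates are simulated, and the names
keep, nbrs and w_density are hypothetical. The seed is set so that the sample
can be reproduced:

```r
library(spdep)

set.seed(2009)  # fix the seed so the 2/3 sample can be reproduced
n <- 60
# Hypothetical clustered planar coordinates in km
crd <- cbind(rnorm(n, 0, 20), rnorm(n, 0, 20))

# Draw a 2/3 sample of the observations, as in the simulation
keep <- sample(seq_len(n), round(n * 2 / 3))
nb <- dnearneigh(crd[keep, ], 0, 100, longlat = FALSE)

# Number of neighbours per sampled observation
nbrs <- card(nb)
print(summary(nbrs))

# Share of non-zero entries in the weights matrix for this sample
w_density <- sum(nbrs) / length(nbrs)^2
print(w_density)
```

A share close to 1 means almost every observation is a neighbour of almost
every other one, which is the very dense case described above.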


>> Again, you only took 40 minutes in sending 4 copies of the same message to
>> two lists. Replying has taken about the same time. Reading two help pages
>> would have taken you much less.
> Again, I sincerely apologize for sending the same message repeatedly. It is
> totally my mistake. I thought it had not got there since I could not see it
> in my gmail account. If I use a gmail address, can't I see my own postings?
> Thank you!!
>> Roger Bivand
>>> Thank you!
>>> Steve Hong

Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
