[R-sig-Geo] R encounters fatal error in errorsarlm - sdpep package

Roger Bivand Roger.Bivand at nhh.no
Mon Sep 30 21:31:09 CEST 2013


On Mon, 30 Sep 2013, Philine Gaffron wrote:

> Thank you for the response, Roger. My answers below. I have also added a 
> question on a warning message I have received and would be glad to find out 
> whether this means I need to change the analysis approach or just have to 
> take it as additional information.

Thanks for reporting back: some points for clarification inline.

>
> Am 25.09.2013 00:53, schrieb Roger Bivand:
>> On Wed, 25 Sep 2013, Philine Gaffron wrote:
>> 
>>> Hello,
>>> 
>>> I am trying to run a spatial regression analysis using the errorsarlm
>>> function in the sdpep package. I first ran into memory problems (on a
>>> larger dataset than the one described below) but have adjusted the
>>> memory size limit:
>>> 
>>>> memory.limit(size = 800000)
>>> [1] 8e+05
>> 
>> Which sets your memory limit under Windows to 781 Gb (size is in Mb) -
>> that probably explains the error exit unless you have a lot of memory.
>> Best avoid making any changes from the default in a fresh session
>> unless you know exactly what you are doing.
>
> I did up the virtual memory of the machine to 800Gb first (it has 1Tb in 
> total and there is currently nothing else running on it) so I hope that that 
> was not the problem. Thanks for the hint, though.
>
>> 
>>> 
>>> However, when I try to execute the following code in R Studio, I
>>> receive the message "R Session aborted - R encountered a fatal error
>>> - The session was terminated"
>> 
>> Always rerun any code with problems outside R Studio, because it can
>> interfere with the reporting of warnings and errors. Certainly
>> errorsarlm() has no reason to cause R to crash - and this is what you
>> are suggesting. If you can reproduce this outside R Studio, and
>> without setting memory limits, I may need to see your data to try to
>> reproduce any error exit from the R session.
>
> I tried this and had no more crashes - especially after I made the changes 
> suggested/outlined below

Does this mean that the R error exit occurred when you used R Studio, and 
did not occur when R was run directly, using exactly the same data and 
code? If so, please also let the R Studio people know, so that they can 
continue debugging.

>
>> 
>>> 
>>>> ElDU.slm.PM25.white <- errorsarlm(PM_PM25 ~ race_white, data =
>>>> ElDorado_urban, listw=ElDoU_listw)
>> 
>> Did you read the advice on ?errorsarlm about the use of alternative
>> methods, such as "LU" or "Matrix" for larger data sets? You are using
>> the default "eigen" method, but this should certainly have worked for
>> the modest n=2081. Can you reproduce the problem with a built-in data
>> set, such as elect80?
>
> I opted for the "LU" method and changed the weights style (see below). Is 
> there any documentation that tells me the implications of using method ="LU"? 
> I only found the note in the spdep documentation "LU_order default FALSE; 
> used in “LU_prepermutate”, note warnings given for lu method" but nothing 
> further. I did receive warnings in the output, though (for several but not 
> all of my datasets) and I am not sure what they mean / what their 
> implications are for the results. Do I need to choose another method? The 
> warnings in every case were:
>

See Bivand, Hauke & Kossowski (2013) Geographical Analysis for some 
details; LU_prepermutate emulates the implementation by finding an initial 
fill-in re-ordering, but in R has little speed-up effect, hence it is 
turned off by default.

> 1: In powerWeights(W = W, rho = lambda, order = con$pWOrder, X = B,:
> not converged within order iterations
>
> 2: In powerWeights(W = t(W), rho = lambda, order = con$pWOrder, X = C,:
> not converged within order iterations

These are unusual, and may signal a mis-specified setting. The default 
number of powers is 250, and lambda^250 is a very small number unless 
lambda is close to its (probably upper) limit. The computation in fact 
only affects the accuracy of the Hausman test on the difference between 
OLS and spatial error coefficients, so may be adjusted if lambda is close 
to limits by setting control argument pWOrder to a larger value than 250, 
see ?powerWeights. As a symptom of lambda close to its limits, it suggests 
mis-specification, possibly in terms of observation support, with 
neighbouring properties sharing important drivers. This may be suggested 
by your data scenario described below; scrutinising the intermediate 
objects in your script may also put you on the track of things behaving in 
unexpected ways. I would expect many observations to have no neighbours 
with a 150m threshold, but you don't show no-neighbour control being 
turned off.

Hope the helps,

Roger

>
>
> I have not tried reproducing the error with the built-in dataset but if that 
> is of interest to you from the point of view of spdep development, I could 
> give it a try.
>
>> 
>>> 
>>> The data was imported from an ArcGIS shape file:
>>>> Parcels2008_inc_POP <- readOGR(".", "Parcels_2008_inc_pop")
>>>> ElDorado_urban <- subset(Parcels2008_inc_POP, COUNTYFP10 == "017")
>>> 
>>> The subset contains data from 2081 polygons
>>> 
>>> The listw object was created as follows:
>>> 
>>>> Eld_U_coords <- coordinates(ElDorado_urban)
>>>> Eld_U_IDs <- row.names(ElDorado_urban)
>>>> ElDoU_nb <- dnearneigh(Eld_U_coords, d1 = 0, d2 = 492, row.names =
>>>> Eld_U_IDs)
>> 
>> Is 492 a well-chosen upper distance threshold? I assume that the input
>> data are projected, so that these are measures in the units of the
>> projection - metres? If Eld_U_coords are actually geographical
>> coordinates, but longlat=NULL, d2 is taken as a planar measure, if you
>> declare longlat=TRUE, it is in km. There is no easy way to trap
>> undeclared geographical coordinates.
>
> The distance thresholds are in feet 
> (NAD_1983_StatePlane_California_II_FIPS_0402_Feet) and are related to 
> dispersion distances for road traffic emissions. For explanation: my polygons 
> are land use parcels for which I have calculated emission loads in grams/area 
> from traffic loads on a road network. Since I am assuming a dispersion 
> distance of 492 feet / 150 m of emissions away from the network in the 
> example above, I am further assuming, that parcels which are within that 
> distance of one another are affected by the same emission sources (i.e. 
> traffic loads on the network).
>
>> 
>>>> ElDoU_listw <- nb2listw(ElDoU_nb, glist=NULL, style="S",
>>>> zero.policy=NULL)
>>> 
>> 
>> Does choosing a different style make a difference ("S" is not often
>> used)?
>
> I chose "S" as a variance stabilising coding schemes following Bivand, 
> Pebesma et al. (2008), p. 253 but have now switched back to "W". Apart from 
> the warnings cited above I encountered no more issues.
>
> Thank you for your help and for making this package available.
>
> Philine
>
>> 
>> Roger
>> 
>>> Is there anything inherent in my approach that could cause this
>>> problem or would one need to look at the data? I include the session
>>> info below.
>>> 
>>> Thanks for any help,
>>> 
>>> Philine
>>> 
>>>> sessionInfo()
>>> R version 3.0.1 (2013-05-16)
>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>> 
>>> locale:
>>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
>>> States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>>> [5] LC_TIME=English_United States.1252
>>> 
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>> 
>>> other attached packages:
>>> [1] spdep_0.5-65 Matrix_1.0-12 lattice_0.20-15 rgdal_0.8-11 sp_1.0-13
>>> 
>>> loaded via a namespace (and not attached):
>>> [1] boot_1.3-9 coda_0.16-1 deldir_0.0-22 grid_3.0.1 LearnBayes_2.12
>>> MASS_7.3-26 nlme_3.1-109 splines_3.0.1 tools_3.0.1
>>> 
>>>> library(help=spdep)
>>> Information on package ?spdep?
>>> 
>>> Description:
>>> 
>>> Package: spdep
>>> Version: 0.5-65
>>> Date: 2013-09-04
>>> 
>>> 
>> 
>
>

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


More information about the R-sig-Geo mailing list