[R-sig-Geo] Issues importing BIG shapefile - Limitations of R or something else?

Roger Bivand Roger.Bivand at nhh.no
Mon Mar 30 22:49:56 CEST 2009


On Mon, 30 Mar 2009, Tyler Dean Rudolph wrote:

> Hello Roger,
>
> This certainly does shed a lot of light on the mechanism behind the error
> and your code is greatly appreciated.  Now for the question: did your final
> line of code work???  When I tried it on my machine it took over 4 hours
> without completing, at which time I stopped and tried it on an external
> machine, which gave me the following:

Still running at 13 hours, I'll leave it overnight. It seems to take up to 
10 seconds on some lines (guess). I think that this is what Adrian refered 
to saying that the next release of spatstat should do this faster. 51K 
lines is a seriously large number, after all ...

Roger

>
> *> roads<-readOGR("D:/GIS", layer="road2004")*
> OGR data source with driver: ESRI Shapefile
> Source: "D:/GIS", layer: "road2004"
> with  51218  rows and  4  columns
> Feature type: wkbLineString with 2 dimensions
> *> roads<-as(roads, "SpatialLines")
> *> ## Remove any unpaired coordinates (code by Roger Bivand)
> *> ncrds <- sapply(slot(roads, "lines"), function(x) sapply(slot(x,
> "Lines"),
> +  function(y) nrow(slot(y, "coords"))))
>> summary(unlist(ncrds))*
>   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   1.00    6.00   16.00   31.99   39.00 3241.00
> *> keep <- sapply(ncrds, function(x) all(x > 1))
>> roads_gt_1 <- roads[keep]
>> length(slot(roads, "lines"))
> *[1] 51218
> *> length(slot(roads_gt_1, "lines"))*
> [1] 51216
> *> test<-as(roads_gt_1, "psp")*
> Erreur in FUN(X[[1L]], ...) : impossible to find the function
> "superimposePSP"
>
> **** For *sessionInfo() *and *traceback()* results see the
> attached print.screen image.* ***
>
> It feels like we are very close to success yet there seems to be an issue in
> the process of conversion to a psp object.  ?
>
> Tyler
>
>
>
> On Mon, Mar 30, 2009 at 3:48 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>
>> On Sun, 29 Mar 2009, Tyler Dean Rudolph wrote:
>>
>> As you've indicated Roger there does appear to be an error in
>>> readShapeLines() under maptools.  However I can't help but wonder why I do
>>> not have any problems using the same command sequence for a subset of
>>> the same data (even after updating my version of sp)...?  With said subset
>>> I
>>> am able to import and convert the .shp file to a SpatialLinesDataFrame
>>> object and then (end result) a psp object.  This latter transformation is
>>> critical in order to produce the type of output I am trying to model using
>>> distmap().
>>>
>>> Now via the alternate path, readOGR() successfully imported the spatial
>>> lines polygon (full coverage) as a SpatialLinesDataFrame object;
>>> **
>>> *> roads<-readOGR("D:/GIS", layer="road2004")*
>>> OGR data source with driver: ESRI Shapefile
>>> Source: "D:/GIS", layer: "road2004"
>>> with  51218  rows and  4  columns
>>> Feature type: wkbLineString with 2 dimensions
>>> However from this point when I try to convert it to a psp object it fails:
>>>
>>
>> Tyler:
>>
>> Thanks again for a full report. The failure in readShapeLines() was caused
>> by a line with only a single coordinate (two, in fact). This isn't a problem
>> in the readOGR() code, and will not be in the latest release of maptools.
>>
>> In spatstat, if spatstat.options("checkpolygons") is TRUE, (xrange[2] <
>> xrange[1]) and (yrange[2] < yrange[1]) are both expected to hold, and
>> cannot, because for single coordinate lines, the min and max are equal.
>>
>> You have two choices - based on the files you made available, either drop
>> the two single coordinate lines:
>>
>> ncrds <- sapply(slot(roads, "lines"), function(x) sapply(slot(x, "Lines"),
>>  function(y) nrow(slot(y, "coords"))))
>> summary(unlist(wh_singleton))
>> keep <- sapply(ncrds, function(x) all(x > 1))
>> roads_gt_1 <- roads[keep]
>> length(slot(roads, "lines"))
>> length(slot(roads_gt_1, "lines"))
>> test <- as(roads_gt_1, "psp")
>>
>> or set checkpolygons to FALSE in spatstat.options(). I'm running the former
>> at the moment - deleting the SpatialLinesDataFrame object after coercing to
>> SpatialLines would save some memory.
>>
>> Hope this helps,
>>
>> Roger
>>
>>
>>> ** *> test<-as(roads, "psp")*
>>>
>>> Error in owin(xr, yr, poly = list(x = xr[c(1, 2, 2, 1)], y = yr[c(1, 1,  :
>>>  xrange should be a vector of length 2 giving (xmin, xmax)
>>>
>>> Here I will provide similar information as before, in hopes that we might
>>> be
>>> able to shed light on the issue, which appears to involve the window
>>> being used to spatially frame the data:
>>>
>>> *> sessionInfo()*
>>> R version 2.8.1 (2008-12-22)
>>> i386-pc-mingw32
>>>
>>> locale:
>>>
>>> LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY=English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices datasets  utils     methods   base
>>>
>>> other attached packages:
>>> [1] rgdal_0.6-7     spatstat_1.15-0 deldir_0.0-7    gpclib_1.4-3
>>> mgcv_1.4-1.1    maptools_0.7-20 sp_0.9-32       foreign_0.8-29
>>> loaded via a namespace (and not attached):
>>> [1] grid_2.8.1      lattice_0.17-17 nlme_3.1-89     tools_2.8.1
>>>
>>> *> traceback()
>>> *14: stop("xrange should be a vector of length 2 giving (xmin, xmax)")
>>> 13: owin(xr, yr, poly = list(x = xr[c(1, 2, 2, 1)], y = yr[c(1, 1,
>>>       2, 2)]), unitname = unitname(W))
>>> 12: switch(W$type, rectangle = {
>>>       xr <- W$xrange
>>>       yr <- W$yrange
>>>       return(owin(xr, yr, poly = list(x = xr[c(1, 2, 2, 1)], y = yr[c(1,
>>>           1, 2, 2)]), unitname = unitname(W)))
>>>   }, polygonal = {
>>>       return(W)
>>>   }, mask = {
>>>       stop("A mask cannot be converted to a polygon")
>>>   })
>>> 11: as.polygonal(x)
>>> 10: owin2gpc(B)
>>> 9: union.owin(W, Wlist[[i]])
>>> 8: superimposePSP(y, window = window)
>>> 7: FUN(X[[24798L]], ...)
>>> 6: lapply(lin, as.psp.Lines)
>>> 5: as.psp.SpatialLines(y, window = window, marks = marks)
>>> 4: as.psp(y, window = window, marks = marks)
>>> 3: as.psp.SpatialLinesDataFrame(from)
>>> 2: asMethod(object)
>>> 1: as(roads, "psp")
>>>
>>> Why does this error occur now when it works fine for a subset of the same
>>> data?  Is there a fix that could help with this problem?
>>> Tyler
>>>
>>>
>>>
>>> On Sun, Mar 29, 2009 at 11:21 AM, Roger Bivand <Roger.Bivand at nhh.no>
>>> wrote:
>>>
>>> On Sat, 28 Mar 2009, Tyler Dean Rudolph wrote:
>>>>
>>>> Here are the sessionInfo() and traceback() results following the failed
>>>>
>>>>> import.  It looks like I am not using the most recent version of sp so I
>>>>> am
>>>>> trying to figure out the terms in the call to update.package() and with
>>>>> the
>>>>> most recent version I will try again.
>>>>>
>>>>>
>>>> Thanks! The indications are that coordinates is being passed a numeric
>>>> vector, not a matrix, at:
>>>>
>>>> Line(coords = shape$verts[from[i]:to[i], ])
>>>>
>>>> so if from[i] == to[i], the code as shown converts a single coordinate
>>>> (one
>>>> row) matrix into a numeric vector (omitted drop=FALSE in "[" method,
>>>> perhaps
>>>> the most common coding error in S/R).
>>>>
>>>> That is fixable, but a better workaround may be to use readOGR() in rgdal
>>>> rather than readShapeLines() in maptools.
>>>>
>>>> Roger
>>>>
>>>>  The .shp file is ~30 megabytes and
>>>>
>>>>> the associated .dbf is ~15 megabytes, though I don't think the latter is
>>>>> included in the import.  I have 2G of memory on this machine....
>>>>>
>>>>> Tyler
>>>>>
>>>>> **
>>>>> *> roads<-as(as(readShapeLines(fn="D:\\GIS\\road2004.shp"),
>>>>> "SpatialLines"),
>>>>> "psp")
>>>>> *Error in function (classes, fdef, mtable)  :
>>>>>  unable to find an inherited method for function "coordinates", for
>>>>> signature "numeric"
>>>>>
>>>>> *> sessionInfo()*
>>>>> R version 2.8.1 (2008-12-22)
>>>>> i386-pc-mingw32
>>>>>
>>>>> locale:
>>>>>
>>>>>
>>>>> LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY=English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices datasets  utils     methods   base
>>>>>
>>>>> other attached packages:
>>>>> [1] spatstat_1.15-0 deldir_0.0-7    gpclib_1.4-3    mgcv_1.4-1.1
>>>>> maptools_0.7-20 foreign_0.8-29  sp_0.9-29
>>>>> loaded via a namespace (and not attached):
>>>>> [1] grid_2.8.1      lattice_0.17-17 tools_2.8.1
>>>>>
>>>>> *> traceback()*
>>>>> 13: stop("unable to find an inherited method for function \"",
>>>>> fdef at generic,
>>>>>
>>>>>      "\", for signature ", cnames)
>>>>> 12: function (classes, fdef, mtable)
>>>>>  {
>>>>>      methods <- .findInheritedMethods(classes, fdef, mtable)
>>>>>      if (length(methods) == 1)
>>>>>          return(methods[[1]])
>>>>>      else if (length(methods) == 0) {
>>>>>          cnames <- paste("\"", sapply(classes, as.character),
>>>>>              "\"", sep = "", collapse = ", ")
>>>>>          stop("unable to find an inherited method for function \"",
>>>>>              fdef at generic, "\", for signature ", cnames)
>>>>>      }
>>>>>      else stop("Internal error in finding inherited methods; didn't
>>>>> return a unique method")
>>>>>  }(list("numeric"), function (obj)
>>>>>  standardGeneric("coordinates"), <environment>)
>>>>> 11: coordinates(coords)
>>>>> 10: Line(coords = shape$verts[from[i]:to[i], ])
>>>>> 9: .shapes2LinesList(shapes[[i]], ID = IDs[i])
>>>>> 8: .shp2LinesDF(Map, proj4string = proj4string)
>>>>> 7: withCallingHandlers(expr, warning = function(w)
>>>>> invokeRestart("muffleWarning"))
>>>>> 6: suppressWarnings(.shp2LinesDF(Map, proj4string = proj4string))
>>>>> 5: readShapeLines(fn = "D:\\GIS\\road2004.shp")
>>>>> 4: .class1(object)
>>>>> 3: as(readShapeLines(fn = "D:\\GIS\\road2004.shp"), "SpatialLines")
>>>>> 2: .class1(object)
>>>>> 1: as(as(readShapeLines(fn = "D:\\GIS\\road2004.shp"), "SpatialLines"),
>>>>>     "psp")
>>>>>
>>>>>       [[alternative HTML version deleted]]
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-Geo mailing list
>>>>> R-sig-Geo at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>
>>>>>
>>>>> --
>>>> Roger Bivand
>>>> Economic Geography Section, Department of Economics, Norwegian School of
>>>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>>>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>>>> e-mail: Roger.Bivand at nhh.no
>>>>
>>>>
>>>>
>>>
>> --
>> Roger Bivand
>> Economic Geography Section, Department of Economics, Norwegian School of
>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>> e-mail: Roger.Bivand at nhh.no
>>
>>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no



More information about the R-sig-Geo mailing list