[R-sig-Geo] Current options for creating/querying vector data WITHOUT loading them into memory?

Jonathan Greenberg jgrn at illinois.edu
Fri Jan 17 19:56:31 CET 2014


Across all vector formats, which do you think would be a good
intermediate between in-memory Spatial* and PostGIS?  I'd put a few
stipulations:
1) The format should be open source and supported by existing APIs (OGR/rgeos)
2) It should be portable (file-based)
3) It should be "scalable" (able to support arbitrarily large vector databases)

Cheers!

--j

On Thu, Jan 16, 2014 at 2:49 PM, Tim Keitt <tkeitt at utexas.edu> wrote:
>
>
>
> On Thu, Jan 16, 2014 at 1:40 PM, Jonathan Greenberg <jgrn at illinois.edu>
> wrote:
>>
>> I've wondered if it would be possible to do something like what Robert
>> did with the raster package, where the analysis (read/write) is done
>> on-demand on the data rather than entirely in-memory. Elegant
>> solutions are, of course, much harder to come up with for vector data
>> than for raster data, but I think some basic functionality would be
>> great.  Perhaps spatialite as a backbone?  You can easily install the
>> sqlite executable via the RSQLite package, and there is a
>> now-abandoned but available code base in
>> http://cran.r-project.org/web/packages/SQLiteMap/ (I spoke to the
>> developer, who said he won't be updating it) that might allow for a
>> relatively easy cross-platform install of the spatialite addon.
>> Something that would fill in the gap between the Spatial* classes
>> (which won't scale to large datasets) and PostGIS (which has much
>> more complex installation requirements)?
>>
>> How does spatialite perform in terms of large queries?  I imagine not
>> as well as PostGIS, but does it at least scale memory-wise on most
>> standard queries?
>
>
> I've not used it. Generally sqlite is faster than postgresql but not as
> reliable. I just don't want to learn another syntax variation. Utilizing
> spatial indices in spatialite, for example, requires explicit modification
> of your SQL queries. The planner does not use the index automatically, as
> it does in postgresql. But it's a very useful tool, as you can do
> everything out of a single file on disk.
>
> THK
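
To make the "explicit modification" point concrete, here is a minimal
sketch of the pattern, assuming a bounding-box index kept in a separate
table that the query has to name itself. The table and column names
(parcels, idx_parcels, pkid, ...) are hypothetical, the geometry is plain
WKT text, and Python's standard sqlite3 module is used only so that the
sketch is self-contained; it does not load spatialite itself.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # a file path would work the same way

# Hypothetical feature table; geometry is kept as WKT text for simplicity.
con.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, name TEXT, wkt TEXT)")

# Separate bounding-box index table. Try SQLite's R*Tree module first and
# fall back to an ordinary table if this SQLite build does not include it.
try:
    con.execute(
        "CREATE VIRTUAL TABLE idx_parcels USING rtree(pkid, xmin, xmax, ymin, ymax)"
    )
except sqlite3.OperationalError:
    con.execute(
        "CREATE TABLE idx_parcels "
        "(pkid INTEGER PRIMARY KEY, xmin REAL, xmax REAL, ymin REAL, ymax REAL)"
    )

# Three 2x2 squares, keyed by id, given by their lower-left corner.
squares = {1: (0, 0), 2: (10, 10), 3: (20, 0)}
for pkid, (x, y) in squares.items():
    wkt = f"POLYGON(({x} {y},{x + 2} {y},{x + 2} {y + 2},{x} {y + 2},{x} {y}))"
    con.execute("INSERT INTO parcels VALUES (?, ?, ?)", (pkid, f"parcel{pkid}", wkt))
    con.execute(
        "INSERT INTO idx_parcels VALUES (?, ?, ?, ?, ?)", (pkid, x, x + 2, y, y + 2)
    )


def parcels_in_window(con, xmin, ymin, xmax, ymax):
    """Names of parcels whose bounding box intersects the query window."""
    # The index is consulted only because the subquery names it explicitly.
    sql = """
        SELECT name FROM parcels
        WHERE id IN (
            SELECT pkid FROM idx_parcels
            WHERE xmin <= ? AND xmax >= ? AND ymin <= ? AND ymax >= ?
        )
        ORDER BY id
    """
    return [row[0] for row in con.execute(sql, (xmax, xmin, ymax, ymin))]


hits = parcels_in_window(con, 9, 9, 21, 11)
```

Dropping the subquery still gives the right answer on a real spatial
table, just without any help from the index, which is the contrast with
a planner that picks the index up on its own.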
>
>>
>>
>> --j
>>
>> On Thu, Jan 16, 2014 at 1:14 PM, Tim Keitt <tkeitt at utexas.edu> wrote:
>> >
>> >
>> >
>> > On Thu, Jan 16, 2014 at 1:09 PM, Barry Rowlingson
>> > <b.rowlingson at lancaster.ac.uk> wrote:
>> >>
>> >> Well, back when I wrote 'rmap' I abstracted out the storage of the
>> >> data from the data object... So your object in R could represent a
>> >> subset of a shapefile, and the code only grabbed that chunk of the
>> >> shapefile when it was needed, for example to plot (the R object was
>> >> basically the name of the shapefile plus a selection vector).
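
The "name of the file plus a selection vector" idea can be sketched
roughly as below. This is not the rmap code, only an illustration under
stated assumptions: LazyLayer is a hypothetical class, the store is a
SQLite file rather than a shapefile, and geometries are plain WKT text.
Python's standard library is used so that the sketch runs on its own.

```python
import os
import sqlite3
import tempfile


class LazyLayer:
    """Handle to a table in an on-disk SQLite file plus a selection of ids.

    Hypothetical class for illustration; nothing is read until fetch().
    """

    def __init__(self, path, table, ids=None):
        self.path = path
        self.table = table
        self.ids = None if ids is None else list(ids)  # None means "all rows"

    def subset(self, ids):
        # Narrowing the selection is pure bookkeeping: no file access.
        keep = list(ids) if self.ids is None else [i for i in ids if i in self.ids]
        return LazyLayer(self.path, self.table, keep)

    def fetch(self):
        # Only now touch the file, and only for the selected rows.
        if self.ids is not None and not self.ids:
            return []
        con = sqlite3.connect(self.path)
        try:
            if self.ids is None:
                sql = f"SELECT id, wkt FROM {self.table} ORDER BY id"
                args = ()
            else:
                marks = ",".join("?" * len(self.ids))
                sql = (
                    f"SELECT id, wkt FROM {self.table} "
                    f"WHERE id IN ({marks}) ORDER BY id"
                )
                args = tuple(self.ids)
            return con.execute(sql, args).fetchall()
        finally:
            con.close()


# Build a small demo file with 1000 points.
path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
con = sqlite3.connect(path)
con.execute("CREATE TABLE points (id INTEGER PRIMARY KEY, wkt TEXT)")
con.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [(i, f"POINT({i} {i})") for i in range(1, 1001)],
)
con.commit()
con.close()

layer = LazyLayer(path, "points")      # no rows read yet
few = layer.subset([3, 7]).fetch()     # reads two rows, not a thousand
```

The handle itself stays tiny however large the file is; memory use is
driven by what fetch() is asked to return.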
>> >>
>> >> Then we threw that code out and sp classes were born!
>> >>
>> >>  I've often thought about restoring some of this kind of
>> >> functionality, but R's object-oriented classes just frustrate me. It's
>> >> not so simple to build a superclass of sp class objects. Or maybe it
>> >> is now? For some value of 'simple'...
>> >>
>> >>  Suppose you had a gigantic spatialite db - if you want to work with
>> >> it spatially (mapping, rgeos) you are going to have to get the bits
>> >> you need into main memory, so the simplest is just to load selections
>> >> into sp-class objects. Is that already possible with the OGR
>> >> spatialite driver? Can you also load subsets of shapefiles using some
>> >> SQL passed to the OGR shapefile driver?
>> >>
>> >>  What would you want to do on whole-dataset objects of this class?
>> >> Would you want to do the processing on the database if possible (if
>> >> it's PostGIS or Spatialite)? Or have an automatic chunking procedure
>> >> for operations that don't need the whole database at once, such as
>> >> finding centroids of polygons?
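
A chunking procedure of that kind might look roughly like this for the
centroid case. It is a sketch under assumptions: the polys table and its
JSON-encoded ring column are hypothetical stand-ins for a real geometry
store, each polygon is a single closed ring without holes, and Python's
standard library is used only to keep the example self-contained.

```python
import json
import sqlite3


def ring_centroid(ring):
    """Area-weighted centroid of a simple closed ring [[x, y], ...]."""
    a = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        a += cross                  # accumulates twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * a), cy / (3.0 * a)


def centroids_in_chunks(con, chunk_size=2):
    """Yield (id, (cx, cy)), fetching at most chunk_size rows at a time."""
    cur = con.execute("SELECT id, ring FROM polys ORDER BY id")
    while True:
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        for pid, ring_json in rows:
            yield pid, ring_centroid(json.loads(ring_json))


con = sqlite3.connect(":memory:")  # a file path would work the same way
con.execute("CREATE TABLE polys (id INTEGER PRIMARY KEY, ring TEXT)")
# Three 2x2 squares given by their lower-left corner.
for pid, (x, y) in enumerate([(0, 0), (10, 10), (20, 0)], start=1):
    ring = [[x, y], [x + 2, y], [x + 2, y + 2], [x, y + 2], [x, y]]
    con.execute("INSERT INTO polys VALUES (?, ?)", (pid, json.dumps(ring)))

result = dict(centroids_in_chunks(con))
```

Centroids suit this treatment because each one depends on a single
feature; operations that relate features to one another (overlays,
neighbours) would need more than a row-by-row pass.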
>> >>
>> >> Hmmm thoughts thoughts thoughts and no action :( Sorry!
>> >
>> >
>> > Barry,
>> >
>> > I'll have more to say on this in a couple of weeks.
>> >
>> > THK
>> >
>> >>
>> >>
>> >> Barry
>> >>
>> >>
>> >>
>> >> On Thu, Jan 16, 2014 at 6:52 PM, Jonathan Greenberg <jgrn at illinois.edu>
>> >> wrote:
>> >> > r-sig-geo'ers:
>> >> >
>> >> > As vector datasets are getting a lot larger, there is a limitation
>> >> > with the Spatial* formats in that they must be loaded into main
>> >> > memory.  I was curious what folks who have been dealing with massive
>> >> > vector files have come up with while working within the R environment.
>> >> > Has anyone played around with file geodatabases or spatialite formats
>> >> > (for instance)?  How are you creating/querying the data?
>> >> >
>> >> > Thanks!
>> >> >
>> >> > --j
>> >> >
>> >> > --
>> >> > Jonathan A. Greenberg, PhD
>> >> > Assistant Professor
>> >> > Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
>> >> > Department of Geography and Geographic Information Science
>> >> > University of Illinois at Urbana-Champaign
>> >> > 259 Computing Applications Building, MC-150
>> >> > 605 East Springfield Avenue
>> >> > Champaign, IL  61820-6371
>> >> > Phone: 217-300-1924
>> >> > http://www.geog.illinois.edu/~jgrn/
>> >> > AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype:
>> >> > jgrn3007
>> >> >
>> >> > _______________________________________________
>> >> > R-sig-Geo mailing list
>> >> > R-sig-Geo at r-project.org
>> >> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> >>
>> >
>> >
>> >
>> >
>> > --
>> > http://www.keittlab.org/
>>
>>
>>
>
>
>
>





