[R-sig-Geo] read shapefile into R larger than 2GB

Carlos Valenzuela jour4life at gmail.com
Wed Apr 4 14:49:20 CEST 2012


Thank you all for your replies.

Rainer - Regarding the size of the shapefile, it is due to the .dbf file.
There are over 500,000 cases (for the entire city). Now the kicker is that
we also want to include more data in the future. But it seems that
would be even more difficult, as all of you imply.

Roger - I am actually using a 64-bit Windows 7 machine with 12 GB RAM,
where I attempted to use both readShapePoly() and readOGR(). I also tried
this on a 64-bit Linux server with 32 GB RAM, but only with readShapePoly().
As I explained to Rainer, there are 500,000 cases and we do want to make
some inferences from the results that would incorporate even more data...
But we will see if that is possible.

I am going to reduce the number of attributes in the table, but I was hoping
that the number of observations is not the issue, because it seems like R
was only importing half of the observations.
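
A rough back-of-the-envelope check of whether trimming attributes could be
enough, sketched in Python. It assumes (this is not confirmed anywhere in
the thread) that reads fail once the byte offset into the .dbf passes the
signed 32-bit limit of 2^31 - 1 bytes, and that the first deleted feature
ID (284284) marks that point. The function names and the zero default for
the header size are made up for illustration.

```python
# Assumption: the reader cannot seek past a signed 32-bit byte offset.
OFFSET_LIMIT = 2**31 - 1


def estimated_record_bytes(first_failed_record, header_bytes=0, limit=OFFSET_LIMIT):
    """Approximate width of one .dbf record, given the index of the first
    record that could not be read (hypothetical helper)."""
    return (limit - header_bytes) // first_failed_record


def required_record_bytes(n_records, header_bytes=0, limit=OFFSET_LIMIT):
    """Largest record width that keeps n_records below the offset limit
    (hypothetical helper)."""
    return (limit - header_bytes) // n_records


if __name__ == "__main__":
    # Numbers taken from this thread: first deleted feature ID and total cases.
    print("apparent bytes per record:", estimated_record_bytes(284284))
    print("bytes per record needed for 500,000 cases:", required_record_bytes(500000))
```

If the assumption holds, comparing the two printed values gives a rough idea
of how much of each record's attribute width would have to go before all
500,000 cases fit; adding more cases later would shrink that budget further.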

Best,

Carlos


On 4/4/12 6:48 AM, "Roger Bivand" <Roger.Bivand at nhh.no> wrote:

>On Wed, 4 Apr 2012, Rainer M Krug wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 03/04/12 20:44, Carlos Valenzuela wrote:
>>> Hello all, I was hoping someone may be able to help me with this
>>> problem. I am trying to read a shapefile into R that is larger than
>>> 2GB. I've tried
>>
>> In my opinion, a 2GB shape file is insane....
>
>Yes, it doesn't seem well-considered. Is this LiDAR data? Are you
>intending to do statistics on the complete data set? What would you infer
>from the results?
>
>You have not included the output of sessionInfo() - I suspect that you are
>using a 32-bit system, which would fail in any case.
>
>Roger
>
>>
>> Is the attribute table (.dbf) file as big, or is it the .shp? If it is the
>> .dbf, you have to check whether you need all attributes. If it is the .shp,
>> you could possibly split the shapefile into more than one layer?
>> Also, importing into a SpatiaLite or even PostGIS database might help you -
>> then you can more easily import a subset of features.
>>
>> Cheers,
>>
>> Rainer
>>
>>
>>> using readShapePoly() in spdep as well as readOGR() in rgdal, with no
>>> luck.
>>>
>>> Using readShapePoly(), I get "failed on DBF filefseek" on a series of
>>> lines (over 284,000).
>>>
>>> When using rgdal, I get this:
>>>
>>> Warning message:
>>> In readOGR(".", "nameoffiles") :
>>> Deleted feature IDs: 284284, 284285,
>>> 284286, 284287, 284288, 284289, 284290, 284291, 284292, 284293,
>>>284294, 284295, 284296, 284297,
>>> 284298, 284299, 284300, 284301, 284302, 284303, 284304, 284305,
>>>284306, 284307, 284308, 284309,
>>> 284310, 284311, 284312, 284313, 284314, 284315, 284316, 284317,
>>>284318, 284319, 284320, 284321,
>>> 284322, 284323, 284324, 284325, 284326, 284327, 284328, 284329,
>>>284330, 284331, 284332, 284333,
>>> 284334, 284335, 284336, 284337, 284338, 284339, 284340, 284341,
>>>284342, 284343, 284344, 284345,
>>> 284346, 284347, 284348, 284349, 284350, 284351, 284352, 284353,
>>>284354, 284355, 284356, 284357,
>>> 284358, 284359, 284360, 284361, 284362, 284363, 284364, 284365,
>>>284366, 284367, 284368, 284369,
>>> 284370, 284371, 284372, 284373, 284374, 284375, 284376, 284377,
>>>284378, 284379, 284380, 284381,
>>> 284382, 284383, 284384, 284385, 284386, 284387, 284388, 284389,
>>>284390, 284391, 284392, 284393,
>>> 284394, 284395, 284396, 284397, 284398, 284399, 284400, 284401,
>>>284402, 284403, 284404, 284405,
>>> 284 [... truncated]
>>>
>>> In other words, I only get half of the data imported into R and need to
>>> get all of it in.
>>>
>>> Thank you,
>>>
>>> Carlos
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________ R-sig-Geo mailing list
>>>R-sig-Geo at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>>
>> - --
>> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
>>Biology, UCT), Dipl. Phys.
>> (Germany)
>>
>> Centre of Excellence for Invasion Biology
>> Stellenbosch University
>> South Africa
>>
>> Tel :       +33 - (0)9 53 10 27 44
>> Cell:       +33 - (0)6 85 62 59 98
>> Fax :       +33 - (0)9 58 10 27 44
>>
>> Fax (D):    +49 - (0)3 21 21 25 22 44
>>
>> email:      Rainer at krugs.de
>>
>> Skype:      RMkrug
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.11 (GNU/Linux)
>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>>
>> iEYEARECAAYFAk98CbsACgkQoYgNqgF2egq1jQCggSM/x65ppcpy8oT3FMptgCpP
>> dgkAn3Ls1yTRnyk5zxouGA216vU4iKKX
>> =eY6g
>> -----END PGP SIGNATURE-----
>>
>>
>
>-- 
>Roger Bivand
>Department of Economics, NHH Norwegian School of Economics,
>Helleveien 30, N-5045 Bergen, Norway.
>voice: +47 55 95 93 55; fax +47 55 95 95 43
>e-mail: Roger.Bivand at nhh.no
>


