[R-sig-Geo] Large dataset on Ripley's K function
Roger Bivand
Roger.Bivand at nhh.no
Thu Nov 22 07:48:42 CET 2007
On Thu, 22 Nov 2007, Sisi wrote:
> Dear Roger Bivand and Dan Putler,
>
> Currently I'm working on the spatial point data analysis using Kernel
> estimation (spatstat), Ripley's K function (spatstat) and space-time K
> function (splancs). My research is at the continental level and the region
> area includes Asia, Europe and Africa. The methods work on small datasets
> (as I have seen solutions posted in R-sig-Geo by others) and this is not my
> real problem. Finding a solution with such a large dataset is the key issue
> here.
>
> I have a really big data set with 3345 points.
This number is not large in itself, and:
library(spatstat)
set.seed(1)
xy <- runifpoint(3500)
res <- Kest(xy, nlarge = 3500)
works without any problems on a 1GB system:
> sessionInfo()
R version 2.6.0 (2007-10-03)
i386-pc-mingw32
locale:
LC_COLLATE=Norwegian (Bokmål)_Norway.1252;LC_CTYPE=Norwegian
(Bokmål)_Norway.1252;LC_MONETARY=Norwegian
(Bokmål)_Norway.1252;LC_NUMERIC=C;LC_TIME=Norwegian (Bokmål)_Norway.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] spatstat_1.12-3 mgcv_1.3-29
(it is always helpful to include the output of sessionInfo())
However note that here:
> xy
planar point pattern: 3500 points
window: rectangle = [0, 1] x [0, 1] units
> object.size(xy)
[1] 57760
xy only has the simplest window (the unit square), and my guess is that
your window is much richer (continental shorelines? raster landmass mask?).
If so, edge correction will probably use much more memory. If you are using
a vector shoreline, and this is the reason for the problems, have you tried
using a raster mask instead?
I have tried using coarse GSHHS shorelines without difficulties (Rgshhs in
maptools), but a shoreline with too much detail, or a raster mask at too
high a resolution, could very possibly cause problems.
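A minimal sketch of the raster-mask idea, assuming a recent spatstat. The five-vertex polygon is a made-up stand-in for a real shoreline, and the 128 x 128 mask resolution and the choice of the border correction are illustrative assumptions only, not a recommendation for the continental data:

```r
library(spatstat)

set.seed(1)

# Hypothetical polygonal window standing in for a detailed vector shoreline
# (vertices listed anticlockwise, as owin() expects for an outer boundary)
w_poly <- owin(poly = list(x = c(0, 1, 1, 0.5, 0),
                           y = c(0, 0, 1, 0.6, 1)))
xy <- runifpoint(500, win = w_poly)

# Replace the polygon by a coarse binary raster mask; dimyx sets the number
# of pixel rows and columns (128 x 128 is an arbitrary illustrative choice)
w_mask <- as.mask(w_poly, dimyx = c(128, 128))

# Keep only the points falling inside the mask and attach the mask as window
xy_mask <- xy[w_mask]

# Border correction works for any window type, including masks
K <- Kest(xy_mask, correction = "border")
```

Coarsening dimyx trades boundary detail for memory, so it may be worth starting coarse and refining only as far as the machine allows.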
Have you considered the problems involved in using geographical
coordinates?
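To illustrate why this matters, here is a small base R sketch (the helper haversine_km is hypothetical, and a spherical Earth of radius 6371 km is assumed). One degree of longitude covers very different ground distances at different latitudes, so planar distances computed directly on longitude/latitude values are not comparable across a region stretching from Africa to northern Europe and Asia:

```r
# Great-circle distance in km by the haversine formula
# (hypothetical helper; spherical Earth, radius 6371 km assumed)
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(sqrt(pmin(1, a)))
}

# One degree of longitude at the equator and at 60 degrees north
d_equator <- haversine_km(0, 0, 1, 0)
d_60n     <- haversine_km(0, 60, 1, 60)
c(equator = d_equator, lat60 = d_60n)
```

The second distance is about half the first, although both pairs of points are exactly one unit apart in degrees.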
Roger
> The spatial region includes
> Asia, Europe and Africa. When I run the "Kest" function, the error said
> "cannot allocate vector of size 382.8 Mb". I have already enlarged the memory
> to the maximum and set nlarge=3500 in "Kest". I am not sure if this is
> the correct way to increase nlarge, as the nlarge default value is 3000.
> Before I changed nlarge, the error message was "number of data points
> exceeds 3000 - computing border estimate only".
>
> Below is part of the code used where the error occurs:
>
> #Ripley's K function
> K <- Kest(p1,nlarge=3500)
> #Error: cannot allocate vector of size 381.8 Mb
> plot(K)
>
> Can you please assist?
> Thanks in advance.
>
>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no