[R-sig-Geo] Large dataset on Ripley's K function

Roger Bivand Roger.Bivand at nhh.no
Thu Nov 22 07:48:42 CET 2007


On Thu, 22 Nov 2007, Sisi wrote:

> Dear Roger Bivand and Dan Putler,
>
> Currently I'm working on the spatial point data analysis using Kernel
> estimation (spatstat), Ripley's K function (spatstat) and space-time K
> function (splancs). My research is at the continental level and the region
> area includes Asia, Europe and Africa. The methods work on small datasets
> (as I have seen solutions posted in R-sig-Geo by others) and this is not my
> real problem. Finding a solution with such a large dataset is the key issue
> here.
>
> I have a really big data set with 3345 points.

This number is not large in itself, and:

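library(spatstat)
set.seed(1)
xy <- runifpoint(3500)         # 3500 uniform random points in the unit square
res <- Kest(xy, nlarge=3500)   # 3500 points do not exceed nlarge=3500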

works without any problems on a 1GB system:

> sessionInfo()
R version 2.6.0 (2007-10-03)
i386-pc-mingw32

locale:
LC_COLLATE=Norwegian (Bokmål)_Norway.1252;LC_CTYPE=Norwegian 
(Bokmål)_Norway.1252;LC_MONETARY=Norwegian 
(Bokmål)_Norway.1252;LC_NUMERIC=C;LC_TIME=Norwegian (Bokmål)_Norway.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] spatstat_1.12-3 mgcv_1.3-29

(it is always helpful to include the output of sessionInfo())
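
As a rough check on why 3500 points should not be a problem in itself, here is some back-of-envelope arithmetic. It assumes (for illustration only, not as a statement about what Kest allocates internally) that the dominant cost is a dense n x n matrix of doubles, such as a full pairwise distance matrix:

```r
# Assumption: memory is dominated by dense n x n matrices of 8-byte doubles.
n <- 3500
bytes_per_double <- 8
matrix_mb <- n^2 * bytes_per_double / 1024^2   # size of one such matrix in MB

# Cross-check the 8 bytes per element figure on a small matrix
# (object.size also counts a small fixed header).
n_small <- 500
small_bytes <- as.numeric(object.size(matrix(0, n_small, n_small)))
bytes_per_element <- small_bytes / n_small^2

cat("one", n, "x", n, "matrix of doubles:", round(matrix_mb, 1), "MB\n")
cat("measured bytes per element:", round(bytes_per_element, 3), "\n")
```

One such matrix is under 100 MB, so a failed allocation of several hundred MB suggests that something other than the point count, such as the window, may be driving the memory use.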

However, note that xy here:

> xy
  planar point pattern: 3500 points
window: rectangle = [0, 1] x [0, 1] units
> object.size(xy)
[1] 57760

only has the simplest window, and my guess is that your window mask is 
much richer (continental shorelines? raster landmass mask?). If so, edge 
correction will probably involve much more memory use. If you are using a 
vector shoreline, and this is the reason for the problems, have you tried 
using a raster mask instead?
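
A minimal sketch of the raster mask idea, assuming a made-up wiggly polygon as a stand-in for a detailed shoreline (W_poly, W_mask and the 128 x 128 grid are illustrative choices, not taken from the original data):

```r
library(spatstat)

# Hypothetical stand-in for a detailed shoreline: a star-shaped polygon
# with 2000 vertices, listed anticlockwise as owin() expects.
theta <- seq(0, 2 * pi, length.out = 2001)[-2001]
r <- 1 + 0.1 * sin(25 * theta)
W_poly <- owin(poly = list(x = r * cos(theta), y = r * sin(theta)))

# Rasterise the polygon to a binary mask; dimyx gives the grid
# dimensions as (rows, columns). A coarser grid uses less memory.
W_mask <- as.mask(W_poly, dimyx = c(128, 128))

# Simulate points inside the mask and estimate K with the border correction.
# (As far as I know, the isotropic correction is not available for masks.)
set.seed(1)
pts <- runifpoint(500, win = W_mask)
K <- Kest(pts, correction = "border")

print(object.size(W_poly))
print(object.size(W_mask))
```

The mask resolution is the thing to tune: too coarse and the shoreline is misrepresented, too fine and the memory problem may simply come back.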

I have tried using coarse GSHHS shorelines (Rgshhs in maptools) without 
difficulties, but a shoreline with too much detail, or a raster mask with 
too high a resolution, could quite possibly cause problems.

Have you considered the problems involved in using geographical 
coordinates?
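
To illustrate one of the problems: spatstat treats coordinates as planar, so distances computed on raw longitude/latitude are in degrees, and a degree of longitude shrinks towards the poles. A minimal base R sketch (haversine_km is a hypothetical helper written for this illustration, and it assumes a spherical earth of radius 6371 km):

```r
# Great-circle distance in km between two lon/lat points given in degrees.
# Hypothetical helper; assumes a sphere of radius 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(sqrt(pmin(1, a)))
}

# Both pairs are exactly 1 degree of longitude apart, so a planar
# analysis in degrees would treat them as equally far apart.
d_equator <- haversine_km(10, 0, 11, 0)    # along the equator
d_60north <- haversine_km(10, 60, 11, 60)  # along latitude 60 N

cat("1 degree of longitude at the equator:", round(d_equator, 1), "km\n")
cat("1 degree of longitude at 60 N:       ", round(d_60north, 1), "km\n")
```

Projecting to a suitable planar coordinate system before building the point pattern reduces this distortion, though no single projection preserves distances over an area as large as Asia, Europe and Africa.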

Roger

> The spatial region includes
> Asia, Europe and Africa. When I run the "Kest" function, the error says
> "cannot allocate vector of size 382.8 Mb". I have already enlarged the memory
> to the maximum and set nlarge=3500 in "Kest". I am not sure if this is
> the correct way to increase nlarge, as the default value of nlarge is 3000.
> Before I changed nlarge, the message was "number of data points
> exceeds 3000 - computing border estimate only".
>
> Below is part of the code used where the error occurs:
>
> #Ripley's K function
> K <- Kest(p1,nlarge=3500)
> #Error: cannot allocate vector of size 381.8 Mb
> plot(K)
>
> Can you please assist?
> Thanks in advance.
>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

