[R-sig-Geo] Large dataset on Ripley's K function

Roger Bivand Roger.Bivand at nhh.no
Fri Nov 23 15:22:05 CET 2007


On Fri, 23 Nov 2007, Sisi wrote:

> Dear all,
>
> Thanks for your quick reply!
>
> For Roger, I checked the sessioninfo and mine is same as yours. My object
> size is indeed quite big, it is 275608. So I simplified the polygon
> and erased all the small islands, only hold the main land of Asia, Europe
> and Africa. After I loaded the polygon window and points, I checked the
> object size. It becomes  86768, which seems quite ok (compared with yours
> 57760). I ran the function again, and still doesn't work, the code and
> errors are as follow:
>
>
> sessionInfo()
> R version 2.6.0 (2007-10-03)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] RColorBrewer_1.0-2 spatstat_1.12-1    mgcv_1.3-27
> maptools_0.6-19    sp_0.9-16
> [6] foreign_0.8-23
>
> loaded via a namespace (and not attached):
> [1] grid_2.6.0     lattice_0.16-5 tools_2.6.0
>
> #Enlarge the memory
> memory.limit(size=4095)
> # creat polygon to be used as a window in ppp
> library(maptools)
> library(sp)
> library(spatstat)
> a <- readShapePoly("D:/5Bird/Result/New2Project.shp"[1])
> plot(a)
>
> # Define the window
> W <- as(as(a, "SpatialPolygons"), "owin")
> plot(W)
>
> # Import point data and creat ppp
> cases <- coordinates(readShapePoints("D:/5Bird/Result/BirdPoint.shp"))
> str(cases)
>
> #num [1:3345, 1:2] 5431396 5305812 5343499 1975828 2128985 ...
> #- attr(*, "dimnames")=List of 2
>  #..$ : NULL
>  #..$ : chr [1:2] "coords.x1" "coords.x2"
>
> colnames(cases)<-c("x","y")
> p1 <- ppp(cases[,1], cases[,2], window=W)
> plot(p1)
> points(cases, pch=20,col="red")
>
> object.size(p1)
> #[1] 86768
> #Ripley's K function
> K <- Kest(p1,nlarge=Inf)
> #Error: cannot allocate vector of size 38.6 Mb
> plot(K)
> #Generate envelope
> en <- envelope(p1, fun=Kest(p1,nlarge=3500), nsim=10)
> #Error: cannot allocate vector of size 38.6 Mb
> plot(en, main="Envelopes of K function based on CSR")
>
> For the question about the geographical coordinates, the projection system
> I'm using is Equidistant Cylindrical , the unit is meter. Just considering
> this projection has no distortion in distance.
>
> For Don, when I set nlarge=3500, the error was: "Error: cannot allocate
> vector of size 381.8 Mb". When I set nlarge=Inf, the error became:"Error:
> cannot allocate vector of size 38.6 Mb". I cannot figure out the cause
> of the errors. When you mentioned edge correction, I get a question to
> consult. Do I still need edge correction if my study area
> covers Asia, Europe and Africa continents. I mean the boundary is the real
> boundary, and beyond it is the sea.

The edge correction is the real issue. K is measuring the number of points 
within areas of distance bands adjusted for the area within the edges. So 
you have to correct - but shorelines may not be appropriate for birds - 
how far can your birds fly over water? Have you considered using a raster 
mask which would permit contact over channels between islands?

Roger

>
> Thank you for your time and best regards,
> Sisi
>
>
>
> On Thu, 22 Nov 2007, Roger Bivand  wrote:
>
> This number is not large in itself, and:
>
> set.seed(1)
> xy <- runifpoint(3500)
> res <- Kest(xy, nlarge=3500)
>
> works without any problems on a 1GB system:
>
>> sessionInfo()
> R version 2.6.0 (2007-10-03)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=Norwegian (Bokmål)_Norway.1252;LC_CTYPE=Norwegian
> (Bokmål)_Norway.1252;LC_MONETARY=Norwegian
> (Bokmål)_Norway.1252;LC_NUMERIC=C;LC_TIME=Norwegian (Bokmål)_Norway.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] spatstat_1.12-3 mgcv_1.3-29
>
> (it is always helpful to include the output of sessionInfo())
>
> However note that here:
>
>> xy
> planar point pattern: 3500 points
> window: rectangle = [0, 1] x [0, 1] units
>> object.size(xy)
> [1] 57760
>
> only has the simplest window, and my guess is that your window mask is
> much richer (continental shorelines? raster landmass mask?). If so, edge
> correction will probably involve much more memory use. If you are using a
> vector shoreline, and this is the reason for the problems, have you tried
> using a raster mask instead?
>
> I have tried using coarse GSHHS shorelines without difficulties (Rgshhs in
> maptools), but very possibly with a shoreline with too many details, you
> might see problems, or with a raster mask with too high resolution.
>
> Have you considered the problems involved in using geographical
> coordinates?
>
> Roger
> On 22/11/2007, Putler, Dan <dan.putler at sauder.ubc.ca> wrote:
>
>>  Hi Sisi,
>>
>> By setting nlarge to 3500 you are creating problems (given that you have
>> just over 3300 observations) since you may be using either the isotropic or
>> Ripley's method for border correction. You want to use the border method
>> (which is what was used when nlarge was at its original value of 3000). To
>> quote the documentation of the function:
>>
>> If the point pattern X contains more than about 3000 points, the isotropic
>> and translation edge cor-
>> rections can be computationally prohibitive. The computations for the
>> border method are much
>> faster, and are statistically efficient when there are large numbers of
>> points. Accordingly, if the
>> number of points in X exceeds the threshold nlarge, then only the border
>> correction will be com-
>> puted. Setting nlarge=Inf will prevent this from happening. Setting
>> nlarge=0 is equivalent
>> to selecting only the border correction with correction="border".
>>
>> My guess is that Roger's suggested approach of simplifying the boundaries
>> is correct (my experience is that his suggestions are always on target),
>> which would allow you to take advantage of the other methods, but the using
>> the border method only can be justified in your case (and a lot easier to
>> implement).
>>
>> Dan
>>
>>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


More information about the R-sig-Geo mailing list