[R-sig-Geo] Point pattern analysis

Sun Feb 15 22:17:29 CET 2009

Dear Michel,

> I'm new to Spatial Data Analysis and have just begun working through
> "Applied Spatial Data Analysis with R" by Bivand et al. For my research I
> would like to use SDA to be able to tell more about my restaurant data set
> than just pinpointing them on a google map. So far, from reading the
> literature on SDA I've been able to construct the following questions.

Interesting problem. Let me know if you need help collecting data. ;)

> 
> 1. How far / close are restaurants from each other? (answered by using
> kernel density estimation)
> 2. Which type of restaurants stand next to each other?
> 3. How are the restaurants positioned relative to each other?
> 4. What's the difference between restaurant A and restaurant B?

Questions 2 and 3 are much alike, and I believe that question 4 is too
general and not necessarily about the spatial distribution of the
restaurants.

Depending on the number of different types of restaurants, you may want
to treat the data as a multivariate (multitype) point pattern: estimate
a separate intensity surface for each type and compare them to see
whether they are similar. This addresses the question of whether the
spatial distribution of different types of restaurants is the same.
This is discussed in Diggle et al. (2005, JRSS Series A). Some of the
methods described in the paper are implemented in package spatialkernel.
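
As a rough illustration of the "one surface per type" idea, here is a
minimal sketch using spatstat rather than spatialkernel. The coordinates
are simulated placeholders, and the two type labels, the 250 m bandwidth
and all object names are assumptions made for the example, not values
from the restaurant data.

```r
library(spatstat)

# Simulated placeholder data: 200 points in a 2 km x 2 km box (metres).
set.seed(1)
x <- runif(200, 336000, 338000)
y <- runif(200, 5874000, 5876000)
type <- factor(sample(c("Italian", "Chinese"), 200, replace = TRUE))

# Multitype point pattern: the marks are the restaurant types.
pattern <- ppp(x, y, window = owin(range(x), range(y)), marks = type)

# One kernel-smoothed intensity surface per type, with a common bandwidth
# (sigma = 250 m is an arbitrary choice for this sketch).
dens_by_type <- density(split(pattern), sigma = 250)

# Local proportion of Italian restaurants; a flat surface would suggest
# that both types share the same spatial distribution.
ital <- dens_by_type$Italian
chin <- dens_by_type$Chinese
prop_italian <- eval.im(ital / (ital + chin))

plot(prop_italian, main = "Estimated proportion of Italian restaurants")
```

This only compares the surfaces by eye; the paper cited above describes
formal ways of testing whether the type probabilities vary over space.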

You may also want to compute bivariate K-functions (see 'k12hat' in
splancs; 'Kmulti' in spatstat) to detect differences between the spatial
distributions of types of restaurants. This will give you a partial
answer to Question 2.
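
A minimal sketch of a bivariate K-function with spatstat follows. It
uses 'Kcross', the spatstat function for the cross-type case between two
mark levels, on simulated placeholder data; the type labels and object
names are assumptions for illustration only.

```r
library(spatstat)

# Simulated placeholder data: 200 points in a 2 km x 2 km box (metres).
set.seed(1)
x <- runif(200, 336000, 338000)
y <- runif(200, 5874000, 5876000)
type <- factor(sample(c("Italian", "Chinese"), 200, replace = TRUE))
pattern <- ppp(x, y, window = owin(range(x), range(y)), marks = type)

# Cross-type K-function: expected number of Chinese restaurants within
# distance r of a typical Italian restaurant, scaled by the intensity
# of Chinese restaurants.
K <- Kcross(pattern, "Italian", "Chinese")

# The "theo" column is pi * r^2, the value expected when the two types
# are independent; estimates above it suggest attraction, below it
# repulsion.
plot(K, main = "Cross-type K: Italian vs Chinese")
```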

If you have a set of covariates for each restaurant and you want to
estimate their effect and how they explain the spatial distribution of
the data you can check Diggle et al. (2006, Biometrics). There is also
an example of this in Bivand et al. (2008).

I am not sure about the best way of tackling Question 3 (or why it is
important). Have you considered testing whether a certain type of
restaurant tends to appear around a particular area of the city? For
example, are Chinese restaurants clustered around Chinatown?

Finally, another option is to aggregate your data (counts per
neighbourhood, for example) and do an analysis similar to those used in
disease mapping.
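
A minimal sketch of the aggregation step, assuming spatstat and using a
regular 4 x 4 grid as a stand-in for real neighbourhood boundaries. The
data are simulated placeholders and the grid size is arbitrary.

```r
library(spatstat)

# Simulated placeholder data: 200 points in a 2 km x 2 km box (metres).
set.seed(1)
x <- runif(200, 336000, 338000)
y <- runif(200, 5874000, 5876000)
type <- factor(sample(c("Italian", "Chinese"), 200, replace = TRUE))
pattern <- ppp(x, y, window = owin(range(x), range(y)), marks = type)

# Counts of all restaurants per grid cell (cells stand in for
# neighbourhoods in this sketch).
total_counts <- quadratcount(unmark(pattern), nx = 4, ny = 4)

# Counts of one type per cell, on the same grid.
italian_counts <- quadratcount(split(pattern)$Italian, nx = 4, ny = 4)

plot(total_counts, main = "Restaurants per grid cell")
```

With real neighbourhood polygons the counts would come from a
point-in-polygon overlay instead of a grid, but the resulting table of
counts per area is what an areal analysis would start from.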

> I've exported a subset of my dataset to CSV in order to import it in R.
> Currently, my CSV file is of the form
> 
> *restaurant name; latitude; longitude; type*
> Amigo;52.996058;6.564229;Italian
> Bella Italia;52.99281;6.560353;Italian
> Isola Bella;52.993764;6.560245;Italian

I would not use long/lat but projected coordinates such as UTM for your
analysis, so that distances are measured in metres rather than degrees.
You can do this very easily with R.
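
One possible way to do the conversion is sketched below, assuming the sf
package is installed. It uses the three sample rows quoted above; the
column names 'lat' and 'lng' are assumed, and UTM zone 32N (EPSG:32632)
is chosen because the sample longitudes of about 6.56 E fall in the
6 E to 12 E band.

```r
library(sf)

# The three sample rows from the CSV; column names are assumptions.
info <- data.frame(
  name = c("Amigo", "Bella Italia", "Isola Bella"),
  lat  = c(52.996058, 52.99281, 52.993764),
  lng  = c(6.564229, 6.560353, 6.560245),
  type = "Italian"
)

# Declare the coordinates as WGS84 long/lat (x = longitude, y = latitude).
pts_ll <- st_as_sf(info, coords = c("lng", "lat"), crs = 4326)

# Reproject to UTM zone 32N.
pts_utm <- st_transform(pts_ll, 32632)

# Matrix with columns X (easting) and Y (northing), in metres.
xy <- st_coordinates(pts_utm)
```

The columns of 'xy' can then be passed to ppp() as the x and y
coordinates.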

> 
> I've tried to import the CSV in R by doing:
> 
> library(spatstat)
> info <- read.csv(file = "sample.csv", sep = ";", strip.white = TRUE)
> win <- owin(c(0,100),c(0,100))
> pattern <- ppp(info$lat, info$lng, window = win, marks=info$name)
> 
> However, if I plot the pattern, the points are all cluttered. What advice
> could you give me on setting the window size?

If you try to plot more than 10,000 points, then I am not surprised that
they are all cluttered. :) I would plot the estimated intensity of the
point patterns. Or you may aggregate your data and produce a map based
on the neighbourhoods in your area.
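
On the window itself, a minimal sketch: take the window from the range
of the coordinates, so that the points fill the plot, and then plot the
estimated intensity instead of the raw points. The coordinates below are
simulated placeholders standing in for UTM eastings and northings, and
the object names are assumptions. Note that ppp() takes x (easting or
longitude) first and y (northing or latitude) second.

```r
library(spatstat)

# Simulated placeholder coordinates in metres.
set.seed(1)
x <- runif(200, 336000, 338000)
y <- runif(200, 5874000, 5876000)

# Window = bounding box of the data, in the same units as the points.
win <- owin(range(x), range(y))
pattern <- ppp(x, y, window = win)

# Kernel-smoothed intensity with spatstat's default bandwidth.
dens <- density(pattern)
plot(dens, main = "Estimated intensity")
```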

Hope this helps.

Virgilio