[R-sig-Geo] Point pattern analysis

Nicholas Lewin-Koh nikko at hailmail.net
Mon Feb 16 17:26:50 CET 2009


Hi,
Yes, if you need help rating restaurants, put me in your grant too :-)
Seriously, there are many ways to skin a cat. I don't think cartograms
will help you much in this particular case. If you have data besides
your point pattern, e.g. postal codes, census data, zoning, ..., you
could look for the obvious patterns, e.g. Italian restaurants clustered
in Little Italy and Chinese in Chinatown, and then look for the more
interesting, less obvious patterns.

But from your description, it seems like there might be other questions
that should guide your analysis.
Context should drive your exploration of the data.

As Virgilio pointed out, you won't get much mileage plotting 10,000
points. You need some way of aggregating. Glyphs might be one way if you
have some polygonal unit that makes sense, such as census blocks. I am
not a big fan of pie charts, but if you have only a few categories they
may show a pattern. Kernel density estimation is limited: it will only
show you the spatial distribution of one particular type at a time.
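
A minimal sketch of the aggregation idea in base R, assuming a data
frame laid out like the CSV in the original question. The regular grid
stands in for census blocks or neighbourhoods, and the fourth row
("Golden Wok"), the column names and the grid resolution n_cells are
made up for illustration only.

```r
# Toy data in the layout of the CSV from the original question.
# The "Golden Wok" row is hypothetical, added so there are two types.
info <- data.frame(
  name      = c("Amigo", "Bella Italia", "Isola Bella", "Golden Wok"),
  latitude  = c(52.996058, 52.99281, 52.993764, 52.9951),
  longitude = c(6.564229, 6.560353, 6.560245, 6.5631),
  type      = c("Italian", "Italian", "Italian", "Chinese")
)

n_cells <- 2  # hypothetical grid resolution (cells per axis)

# Bin each coordinate into equal-width intervals spanning the data range;
# labels = FALSE returns the integer index of the interval.
info$col  <- cut(info$longitude, breaks = n_cells, labels = FALSE)
info$row  <- cut(info$latitude,  breaks = n_cells, labels = FALSE)
info$cell <- paste(info$row, info$col, sep = "-")

# Cross-tabulate: one row per occupied grid cell, one column per type.
counts <- table(cell = info$cell, type = info$type)
print(counts)
```

A table like this can feed a barplot or a glyph per cell. If the points
are already in a spatstat ppp object, quadratcount() does the same kind
of counting on a rectangular grid.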

Another route that might be interesting, if you have street maps, is to
look at clustering of restaurants on different streets. It may show
interesting patterns, e.g. fast food clustered near freeways and
Walmarts.

The sky is the limit. Once you have done a lot of this more basic EDA,
then think about what kind of analytical methods you want to use to
address specific questions. You are more likely to get what you want.
You might want to look at flowingdata.com; there are some nice map
visualizations there.

Nicholas

> ------------------------------
> 
> Message: 10
> Date: Sun, 15 Feb 2009 22:17:29 +0100
> From: Virgilio Gomez Rubio <Virgilio.Gomez at uclm.es>
> Subject: Re: [R-sig-Geo] Point pattern analysis
> To: Michel Barbosa <cicaboo at gmail.com>
> Cc: r-sig-geo at stat.math.ethz.ch
> Message-ID: <1234732649.8833.84.camel at Virgilio-Gomez>
> Content-Type: text/plain
> 
> Dear Michel,
> 
> > I'm new to Spatial Data Analysis and have just begun working through
> > "Applied Spatial Data Analysis with R" by Bivand et al. For my research I
> > would like to use SDA to be able to tell more about my restaurant data set
> > than just pinpointing them on a google map. So far, from reading the
> > literature on SDA I've been able to construct the following questions.
> 
> Interesting problem. Let me know if you need help collecting data. ;)
> 
> > 
> > 1. How far / close are restaurants from each other? (answered by using
> > kernel density estimation)
> > 2. Which type of restaurants stand next to each other?
> > 3. How are the restaurants positioned relative to each other?
> > 4. What's the difference between restaurant A and restaurant B?
> 
> 
> Questions 2 and 3 are much alike, and I believe that question 4 is too
> general and not necessarily about the spatial distribution of the
> restaurants.
> 
> Depending on the number of different types of restaurants, you may want
> to estimate a different surface for each type. Basically, you may
> consider a multivariate point pattern, so that you estimate a different
> surface for each type and you compare them to see if they are similar
> or not. This will address the question of whether the spatial
> distribution of different types of restaurants is the same or not. This
> is discussed in Diggle et al. (2005, JRSS Series A). Some of the methods
> described in the paper are implemented in package spatialkernel. 
> 
> You may also want to compute bivariate K-functions (see 'k12hat' in
> splancs; 'Kmulti' in spatstat) to detect differences between the spatial
> distributions of types of restaurants. This will give you a partial
> answer to Question 2.
> 
> If you have a set of covariates for each restaurant and you want to
> estimate their effect and how they explain the spatial distribution of
> the data you can check Diggle et al. (2006, Biometrics). There is also
> an example of this in Bivand et al. (2008).
> 
> I am not sure about the best way of tackling Question 3 (and why this is
> important). Have you considered testing whether a certain type of
> restaurant tends to appear around a particular area of the city? For
> example, are Chinese restaurants clustered around Chinatown?
> 
> Finally, another option is to aggregate your data (counts per
> neighbourhood, for example) and do a similar analysis as in disease
> mapping.
> 
> > I've exported a subset of my dataset to CSV in order to import it in R.
> > Currently, my CSV file is of the form
> > 
> > *restaurant name; latitude; longitude; type*
> > Amigo;52.996058;6.564229;Italian
> > Bella Italia;52.99281;6.560353;Italian
> > Isola Bella;52.993764;6.560245;Italian
> 
> I would not use long/lat but UTM to do your analysis. You can do this
> very easily with R.
> 
> > 
> > I've tried to import the CSV in R by doing:
> > 
> > library(spatstat)
> > info <- read.csv(file = "sample.csv", sep = ";", strip.white = TRUE)
> > win <- owin(c(0,100),c(0,100))
> > pattern <- ppp(info$lat, info$lng, window = win, marks=info$name)
> > 
> > However, if I plot the pattern, the points are all cluttered. What advice
> > could you give me on setting the window size?
> 
> If you try to plot more than 10,000 points, then I am not surprised that
> they are all cluttered. :) I would plot the estimated intensity of the
> point patterns. Or you may aggregate your data and produce a map based
> on the neighbourhoods in your area.
> 
> Hope this helps.
> 
> Virgilio
> 
>


