[R-sig-Geo] merging a data.frame with a map
Roger Bivand
Roger.Bivand at nhh.no
Wed Aug 20 22:13:13 CEST 2008
On Wed, 20 Aug 2008, Michael Friendly wrote:
> This may be a simple question, but I'm a newbie with the maptools and sp
> packages. I have a set of shapefiles for Ontario "Forward sortation
> areas", the first 3 characters of the postal code (FSA), that I can
> subset to give a map of Toronto. [My cudos to the team that put maptools
> and sp together! This part is a lot more straight-forward than when I
> last looked.]
>
> I have other data attributes for these areas in a separate file, that
> I'd like to display on the map, but I don't know how to merge this with
> the map object. I'm not sure if this matters, but to display the map in
> Toronto-centric terms, I also have to rotate it slightly so that the
> conventional northern boundary line is horizontal. I used elide() for
> this, all I could find in the help/examples for rotation. In the end,
> I'd like to write out a new shapefile containing the rotated coordinates
> together with the data attributes from my data file. I show the steps
> I've done so far below.
>
> # using readShapeSpatial
>
>> ontario
> <-readShapeSpatial("ForwardSortationAreas_JUL07_ON_region.shp", IDvar="FSA",
> proj4string=CRS("+proj=longlat +datum=NAD83") )
>> toronto <- ontario[ontario$F=="M",]
>> summary(toronto)
> Object of class SpatialPolygonsDataFrame
> Coordinates:
> min max
> r1 -80 -79
> r2 44 44
> Is projected: FALSE
> proj4string : [+proj=longlat +datum=NAD83]
> Data attributes:
> FSA FSA_NAME F PR M1B : 1 TORONTO :48
> K: 0 35:102 M1C : 1 NORTH YORK :22 L: 0 M1E : 1
> SCARBOROUGH:17 M:102 M1G : 1 ETOBICOKE :12 N: 0
> M1H : 1 EAST YORK : 3 P: 0 M1J : 1 ACTON : 0
> (Other):96 (Other) : 0
>>
>
> # plot with spplot, rotate so Steeles Ave. is horizontal
> torontoR <- elide(toronto, rotate=12.7)
> spplot(torontoR, zcol="FSA", colorkey=FALSE)
>
> In the associated data file, the first column is FSA. I've used that as
> the rownames attribute, but I could just use it as a separate column if
> that makes it easier to merge with the map
Have a look at ?"spCbind-methods" in maptools. That should "quasi-cbind"
the SpatialPolygonsDataFrame torontoR with the data frame crimeTO, if the
row names of crimeTO and the Polygons ID slots of torontoR match exactly.
If they don't match, it won't know how to align the geometries and the
data frame rows.
Try:
tor_ID <- sapply(slot(torontoR, "polygons"), function(x) slot(x, "ID"))
cri_ID <- row.names(crimeTO)
length(tor_ID)
length(cri_ID)
mm <- match(tor_ID, cri_ID)
length(mm)
any(is.na(mm))
all.equal(sort(tor_ID), sort(cri_ID))
The row order of crimeTO is arbitrary (if I remember correctly), but the
match must be exact. If your geometries and the data in the csv file are
from the same period (same FSA keys), it ought to be OK. In other
jurisdictions, one does see counties or municipalities merging or
splitting as geometries over time (new FIPS entering, old ones being
retired), hence the stress on exact matching. Many data collection
services now grasp why changing aggregations over time should be avoided
if possible, fortunately, though obviously other concerns (like falling
population) must also be accommodated.
So, if the IDs look OK:
torontoR2 <- spCbind(torontoR, crimeTO)
spplot(torontoR2, "Inmates.per.10K")
writeSpatialShape(torontoR2, "RotatedToronto")
(untried, but it shouldn't be too far from the mark).
Hope this helps,
Roger
PS: in the fullness of time, and following an innovation in S4 methods
made last week in R's development version, the oddish spCbind and spRbind
methods may become just regular cbind and rbind, but not quite yet ...
>
>> # read the attribute file
>> crime <- read.csv("Provincial-edited.csv", row.names=1)
>> crime[,9:12] <- crime[,9:12]*100
>> # Toronto subset
>> crimeTO <- crime[substr(rownames(crime),1,1)=="M",]
>> str(crimeTO)
> 'data.frame': 102 obs. of 13 variables:
> $ Area : Factor w/ 209 levels "Acton","Ajax",..: 166 166
> 166 166 166 166 166 166 166 166 ...
> $ Total.Jail.Cost : int 159807 69755 284576 164400 24035 86526
> 269514 140899 138122 119962 ...
> $ Ontario.Rank : int 107 239 37 104 340 204 40 123 129 147 ...
> $ Number.of.Inmates : int 10 2 11 5 3 7 13 6 3 3 ...
> $ Inmates.per.10K : num 1.6 0.6 2.4 1.8 1.4 2 2.8 2.2 1.4 1.4 ...
> $ Total.Days.Sentenced : int 1496 653 2664 1539 225 810 2523 1319 1293
> 1123 ...
> $ Population : int 64632 35695 46757 28458 23511 35005 47224
> 27825 22138 21122 ...
> $ Median.Household.Income : int 65479 91372 58015 45150 51433 48425 46191
> 44522 59426 58962 ...
> $ Low.Income : num 17 7 21 30 23 29 27 29 19 16 ...
> $ Unemployed : num 5 3 6 9 6 8 7 7 5 5 ...
> $ Completed.University : num 17 26 18 15 23 15 13 20 20 22 ...
> $ Single.Female.Parent.Homes: num 20 12 21 20 16 22 22 21 18 16 ...
> $ Public.Housing.Units : int 496 NA 1226 521 149 173 743 685 295 124
> ...
>>
>> head(crimeTO)
> Area Total.Jail.Cost Ontario.Rank Number.of.Inmates Inmates.per.10K
> Total.Days.Sentenced Population
> M1B Scarborough 159807 107 10
> 1.6 1496 64632
> M1C Scarborough 69755 239 2
> 0.6 653 35695
> M1E Scarborough 284576 37 11
> 2.4 2664 46757
> M1G Scarborough 164400 104 5
> 1.8 1539 28458
> M1H Scarborough 24035 340 3
> 1.4 225 23511
> M1J Scarborough 86526 204 7
> 2.0 810 35005
> Median.Household.Income Low.Income Unemployed Completed.University
> Single.Female.Parent.Homes
> M1B 65479 17 5 17
> 20
> M1C 91372 7 3 26
> 12
> M1E 58015 21 6 18
> 21
> M1G 45150 30 9 15
> 20
> M1H 51433 23 6 23
> 16
> M1J 48425 29 8 15
> 22
> Public.Housing.Units
> M1B 496
> M1C NA
> M1E 1226
> M1G 521
> M1H 149
> M1J 173
>>
>
> So, I now want to merge torontoR with crimeTO using FSA (or
> rownames(crimeTO)) as the area ID and write new files so I can avoid
> doing these steps in future.
>
> Can someone help?
>
>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
More information about the R-sig-Geo
mailing list