[R-sig-Geo] Match polygon and dataframe IDs after raster::extract

Edzer Pebesma edzer.pebesma at uni-muenster.de
Thu Nov 19 22:42:41 CET 2015



On 19/11/15 16:36, Loïc Dutrieux wrote:
> Thanks Edzer,
> 
> But then, how safe is it to match attributes and geometries based on the
> record order? Is there any chance that records get shuffled during the
> dataframe manipulation steps, or the raster::extract?

Can you think of a reason why the author of the code would do that?

To be sure I would read the docs and then try to understand

library(raster)
getMethod("extract", c("Raster", "SpatialPolygons"))

> Also one note, SpatialPointsDataFrame() gives a warning, but
> SpatialPolygonsDataFrame() doesn't. (I made a mistake in my initial
> example, I want to create a SpatialPolygonsDataFrame, not a
> SpatialPointsDataFrame).

as documented, SpatialPolygonsDataFrame has argument match.ID default to
TRUE.

> 
> Cheers,
> Loïc
> 
> 
> On 11/19/2015 03:08 PM, Edzer Pebesma wrote:
>>
>>
>> On 19/11/15 14:37, Loïc Dutrieux wrote:
>>> Hi all,
>>>
>>> I'm trying to look at correlation between two raster layers, for
>>> different polygons. So I use raster::extract to get all the raster
>>> values for every polygon, do the calculation and feed the output back to
>>> a SpatialPolygonDataFrame.
>>> I got it working, but I have a doubt regarding the order of the rows;
>>> and it doesn't look like I can use match.ID = TRUE.
>>>
>>> See the example below.
>>>
>>> library(raster)
>>> library(dplyr)
>>>
>>> # Create brick with 2 layers
>>> b <- brick(ncol=36, nrow=18, nl=2)
>>> b[[1]] <- rnorm(ncell(b))
>>> b[[2]] <- rnorm(ncell(b))
>>>
>>> # Create sp
>>> cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60),
>>> c(-180,-20))
>>> cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
>>> cds3 <- rbind(c(-10,20), c(50,60), c(70,-10))
>>> polys <- spPolygons(cds1, cds2, cds3)
>>>
>>> # Extract all values
>>> df0 <- extract(b, polys, df = TRUE)
>>>
>>> # Compute correlation betwen the two layers for every polygons (sorry
>>> for the pipe)
>>> df1 <- group_by(df0, ID) %>%
>>>    summarise(cor = cor(layer.1, layer.2)) %>%
>>>    data.frame()
>>>
>>> # Attach to df to spdf
>>> spdf <- SpatialPointsDataFrame(polys, df1)
>>>
>>>
>>>
>>> How do I know for sure that the order of the rows in the dataframe did
>>> not get mixed up? Can I just assume that they will remain in the same
>>> order?
>>
>> If it does reshuffle and you didn't specify match.ID, it will warn you:
>>
>>> pts = matrix(1:4,2,2,dimnames=list(c("a", "b"), NULL))
>>> pts
>>    [,1] [,2]
>> a    1    3
>> b    2    4
>>> df = data.frame(a = 2:3, row.names = c("b", "a"))
>>> df
>>    a
>> b 2
>> a 3
>>> SpatialPointsDataFrame(pts, df)
>>    coordinates a
>> a      (1, 3) 3
>> b      (2, 4) 2
>> Warning message:
>> In SpatialPointsDataFrame(pts, df) :
>>    forming a SpatialPointsDataFrame based on maching IDs, not on record
>> order. Use match.ID = FALSE to match on record order
>>> SpatialPointsDataFrame(pts, df, match.ID = FALSE)
>>    coordinates a
>> b      (1, 3) 2
>> a      (2, 4) 3
>>
>> Using match.ID = FALSE ensures the input order of coordinates and points
>> is kept.
>>
>>>
>>> The dataframe returned by extract has an ID column, but the IDs do not
>>> correspond to the polygons IDs. For instance if I remove the second
>>> polygon (which has the ID "2"), the IDs in the df extracted are still 1
>>> and 2 (instead of 1 and 3).
>>>
>>> # Quick function to get polygons IDs
>>> getPolyID <- function(x) {
>>>    sapply(x at polygons, function(x) {x at ID} )
>>> }
>>>
>>>
>>> getPolyID(polys)
>>> # [1] "1" "2" "3"
>>> getPolyID(polys[-2])
>>> # [1] "1" "3"
>>>
>>> unique(extract(b, polys[-2], df = TRUE)$ID)
>>> # [1] 1 2
>>>
>>>
>>> Any suggestions?
>>>
>>> Thanks,
>>> Loïc
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>>
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
> 
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

-- 
Edzer Pebesma
Institute for Geoinformatics (ifgi),  University of Münster,
Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
Journal of Statistical Software:   http://www.jstatsoft.org/
Computers & Geosciences:   http://elsevier.com/locate/cageo/
Spatial Statistics Society http://www.spatialstatistics.info

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 490 bytes
Desc: OpenPGP digital signature
URL: <https://stat.ethz.ch/pipermail/r-sig-geo/attachments/20151119/b0f31fe4/attachment.bin>


More information about the R-sig-Geo mailing list