[R-sig-Geo] Problem with extract {raster} function

Kalomenopoulos, Manos manos.kalomenopoulos at wur.nl
Thu Jun 6 13:18:32 CEST 2013


Thanks a lot, Robert!
Could you please tell me how you cleaned the shapefile. I will need to clean my whole dataset, as this was just a small extract of it.
Thanks in advance,
Manos

-------------------------------------------------------
Manos Kalomenopoulos
Lab. of Geo-Information Science and Remote Sensing, Center for Geo-Information - Wageningen University
Department for Biogeochemical Integration, Max-Planck-Institute for Biogeochemistry, Jena
-------------------------------------------------------


-----Original Message-----
From: Robert J. Hijmans [mailto:r.hijmans at gmail.com] 
Sent: dinsdag 28 mei 2013 3:15
To: Kalomenopoulos, Manos
Cc: r-sig-geo at r-project.org
Subject: Re: [R-sig-Geo] Problem with extract {raster} function

Manos,

I think this happens because s1pol is invalid (self-intersecting).
s2pol also has that problem but it works anyway. I have cleaned your shapefile and will send it to you via a separate email. With the cleaned shp file the problem does not occur. The script below shows the test for validity and the difference between the two data sources.

Robert

library(raster)
s1pol <- shapefile('1site.shp')
s2 <- shapefile('cln.shp')
library(rgeos)
gIsValid(s1pol)
# [1] FALSE
# Warning message:
# In RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid") :
#   Ring Self-intersection at or near point -4997474.25179449 -227339.61631866
gIsValid(s2)
#  [1] TRUE

# see the difference

r <- raster('1site.tif')
df1 <- extract(r,s1pol, df=TRUE, weights=TRUE, cellnumbers=TRUE)
names(df1) <- c('ID','cell', 'r', 'weight') # Extraction for test area a <- tapply(df1$weight, df1$cell, sum)
r1 <- raster(r)
r1[as.integer(names(a))] = a

df2 <- extract(r, s2, df=TRUE, weights=TRUE, cellnumbers=TRUE)
names(df2) <- c('ID','cell', 'r', 'weight') # Extraction for test area b <- tapply(df2$weight, df2$cell, sum)
r2 <- raster(r)
r2[as.integer(names(b))] <- b

s <- stack(r1, r2)
names(s) <- c('original', 'cleaned')
plot(s)


On Mon, May 20, 2013 at 11:25 AM, Kalomenopoulos, Manos <manos.kalomenopoulos at wur.nl> wrote:
> Hi everybody,
> I am experiencing a problem with the extract{raster} function, specifically the 'weights' argument.
>
> I have a raster and a polygon shapefile with 5 polygons (land cover classes).
> I am using extract to obtain cellnumbers and the weights (percentage of cover of each class per raster cell).
> The sum of the five weights per cell should sum to 100% (except for those cells not completely inside my AOI).
> However, there is an considerable amount of cells were that is not the case.
> I have been trying to track down the error, but can't understand it so far.
>
> I localized the erroneous cells in one example site (10x10km) and clipped the original shapefile to a smaller area containing these cells.
> The script below does the extraction once for the 10x10km sample and once for a smaller testsite containing 10 erroneous cells.
> Surprisingly, doing the extraction with the smaller testsite ALL weights sum up to 100%, even though the polygon data is exactly the same....
> Any ideas why and/or suggestions how to make it work properly for the bigger extent?
>
> Here my script:
>
>
> library(raster)
> library(reshape)
>
> # raster of AOI
>  r <- raster('1site.tif')
> # polygon shapefile of AOI with 5 different classes
>   s1pol <- shapefile('1site.shp')
>
> # polygon shapefile of test area with observed errors (clipped from s1pol)
>   s2pol <- shapefile('test.shp')
> # extract class names
>   classes <- s1pol$class
> ######################################################################
> ##
> # Extraction for complete AOI
>   df1 <- extract(r,s1pol, df=TRUE, weights=TRUE, cellnumbers=TRUE)
>     names(df1) <- c('ID','cell', 'r', 'weight1') # Extraction for test 
> area
>
> # Extraction for smaller site
>   df2 <- extract(r,s2pol, df=TRUE, weights=TRUE, cellnumbers=TRUE)
>     names(df2) <- c('ID','cell', 'r', 'weight2') 
> ######################################################################
> #####################
> ### Compairing
>
> # Reshape df's keeping only ID, cell and weight
>   df1new<-cast(df1[,c(1,2,4)],cell~ID)
>     names(df1new) <- c('cell',classes)
>
>   df2new<-cast(df2[,c(1,2,4)],cell~ID)
>     names(df2new) <- c('cell',classes) # Calculate class totals
>   df1new$totals <- mapply(df1new[,2], df1new[,3], 
> df1new[,4],df1new[,5],df1new[,6], FUN=function(v,w,x,y,z) 
> sum(c(v,w,x,y,z),na.rm=TRUE))
>
>   df2new$totals <- mapply(df2new[,2], df2new[,3], 
> df2new[,4],df2new[,5],df2new[,6], FUN=function(v,w,x,y,z) 
> sum(c(v,w,x,y,z),na.rm=TRUE))
>
> # Merge the two dataframes and select only rows with differnt totals
>   df <- as.data.frame(merge(df1new,df2new,by='cell'))
>
>     dfe <- subset(df, totals.x-totals.y !=0)
>       dfe$error <- dfe$totals.x-dfe$totals.y
>
> head(dfe)
>
> Link to the files needed to run this script:
> https://www.dropbox.com/s/n283rwew1ypxghv/extractProblem.7z
>
>
> Thanks in advance!
> -------------------------------------------------------
> Manos Kalomenopoulos
> Lab. of Geo-Information Science and Remote Sensing, Center for 
> Geo-Information - Wageningen University Department for Biogeochemical 
> Integration, Max-Planck-Institute for Biogeochemistry, Jena
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo



More information about the R-sig-Geo mailing list