[R-sig-Geo] Methodology to compare crop maps

Sat Jun 11 06:27:33 CEST 2011

Hi Thiago,

You are not saying _what_ goes wrong. That makes it difficult to help.
I am guessing that you are missing some of the steps below. You do:

cane_sa[cane_sa>0]<-16

# I do not know your exact objectives but it seems strange to use 0 here,
# as even a cell that is covered for 1/600 of its area with cane would
be classified as cane.
# but irrespective of the threshold you choose, I think you need:

cane_sa[cane_sa < 16] <- NA
veg_sa <- crop(vegmap,south_america)
newveg_sa <- cover(cane_sa, veg_sa)

# and then continue with your code again:

new_vegmap <- merge(newveg_sa, vegmap)

Robert

On Thu, Jun 9, 2011 at 3:51 PM, Thiago Veloso <thi_veloso at yahoo.com.br> wrote:
>  Robert, Rich and Luca,
>
>  Thank you very much for the suggestions.
>
>  Robert, I still haven't managed to implement all of your directions due to the burocracy to obtain the sugarcane raster (or shape) files from INPE.
>
>  In the meanwhile, I am trying to convert mcgill data (you're right about the units) from fraction of land covered by a crop to presence/absence. This is because my ultimate goal is to add a new plant functional type (PFT) to a map where 15 PFTs already exist (http://www.sage.wisc.edu/download/potveg/global_potveg.html). This way, cells where cane is present should contain the value "16".
>
>  It seemed to be a simple task, but what I am doing is changing the map substantially. Please take a peek at the code:
>
> #Loading required packages
> library(raster)
> library(maptools)
>
> #Loading SAGE vegetation map and McGill sugarcane map
> vegmap<-raster("data/vegtype.nc")
> cane<-raster("data/sugarcane_5min.nc")
>
>> vegmap
> class       : RasterLayer
> dimensions  : 360, 720, 259200  (nrow, ncol, ncell)
> resolution  : 0.5, 0.5  (x, y)
> extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
> coord. ref. : +proj=longlat +datum=WGS84
> values      : data/vegtype.nc
> min         : ?
> max         : ?
>
>> cane
> class       : RasterLayer
> dimensions  : 2160, 4320, 9331200  (nrow, ncol, ncell)
> resolution  : 0.08333333, 0.08333334  (x, y)
> extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
> coord. ref. : +proj=longlat +datum=WGS84
> values      : data/sugarcane_5min.nc
> min         : ?
> max         : ?
>
> #Resampling McGill map to match SAGE's resolution
> cane_coarser<-aggregate(cane,fact=6.00000024,fun=mean)
>
>> cane_coarser
> class       : RasterLayer
> dimensions  : 360, 720, 259200  (nrow, ncol, ncell)
> resolution  : 0.5, 0.5  (x, y)
> extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
> coord. ref. : +proj=longlat +datum=WGS84
> values      : /tmp/R_raster_tmp/raster_tmp_39924224710.grd
> min value   : 0
> max value   : 0.5680773
>
> #Loading SA shapefile
> south_america<-readShapePoly("shapes/southamerica.shp")
>
> #Cropping sugar cane map
> cane_sa<-crop(cane_coarser,south_america)
>
>> cane_sa
> class       : RasterLayer
> dimensions  : 137, 93, 12741  (nrow, ncol, ncell)
> resolution  : 0.5, 0.5  (x, y)
> extent      : -81.49999, -34.99999, -56, 12.5  (xmin, xmax, ymin, ymax)
> coord. ref. : +proj=longlat +datum=WGS84
> values      : in memory
> min value   : 0
> max value   : 0.5237272
>
> #Below is an attempt to replace fraction with presence
> cane_sa[cane_sa>0]<-16
>
> #Check result
> plot(cane_coarser)
>
> #Adding cane map to SAGE vegetation map
> new_vegmap<merge(vegmap,cane_sa)
>
> #Writing out updated map to a netcdf file
> if(require(ncdf)){
> writeRaster(new_vegmap,filename="/home/thiago/IBIS/data/inpue/new_vegmap.nc",format="CDF",overwrite=TRUE)
> }
>
>  Everything goes fine until cropping the resampled map. However, replacing grater than zero values results in a totally different map.
>
>  What am I doing wrong? Maybe spatial data frames are harder to deal with (for newbies like me), or maybe the massive amount of NA's and tiny numbers (0.000000e+00, 2.433714e-03 and etc) produced after agregation is the problem...
>
>  Thanks in advance and best wishes,
>
>  Thiago.
>
> --- On Fri, 3/6/11, Robert Hijmans <r.hijmans at gmail.com> wrote:
>
>> From: Robert Hijmans <r.hijmans at gmail.com>
>> Subject: Re: [R-sig-Geo] Methodology to compare crop maps
>> To: r-sig-geo at r-project.org
>> Date: Friday, 3 June, 2011, 13:17
>> >  I am working with crops
>> planted area maps from two distinct sources.
>> > One of the maps is based on a maximum NDVI
>> composition, and the other
>> > map uses joint information from satellite and census
>> to estimate the
>> > planted area.
>> >
>> >  Although the sources employ different methodologies
>> to map the area
>> > where the crop exists, the results should be
>> comparable.
>> >
>> >  After downloading the datasets, I have performed a
>> visual inpection,
>> > and they show reasonable agreement. However, I need a
>> more robust
>> > comparison method. Could anybody point out a
>> methodology which allows
>> > me to show the difference between both maps?
>> >
>> >  Here is an example of each one of the maps:
>> > http://www.geog.mcgill.ca/landuse/pub/Data/175crops2000/NetCDF/sugarcane_5min.nc.gz
>> > (in netcdf) and http://www.dsr.inpe.br/laf/canasat/en/map.html (not
>> > available to download directly, but I can get it in
>> shapefile)
>> >
>>
>>
>> Thiago,
>>
>> I assume that the Brazilian data has a much higher spatial
>> resolution than
>> the mcgill data (that I think I am familiar with), and it
>> probably has a
>> different CRS. And I assume that you can get it as a the
>> original raster
>> file (and not as shapefile) for the Brazilian data. If I am
>> not mistaken,
>> the mcgill data has the fraction of land area covered by a
>> crop. I assume
>> that the Brazilan data is presence/absence. If so I would
>> use the raster
>> package and aggregate the Brazilian data to a cell size
>> that is similar to
>> the mcgill data (~9 km), computing the fraction of cells
>> that have sugarcane
>> (sum divided by the number of cells, make sure to handle NA
>> values). Then
>> use function projectRaster to transform the mcgill data to
>> the same
>> extent/resolution as the aggregated Brazilian data. Now you
>> have two layers
>> that you can compare in different ways.
>>
>> You can make plots, compute correlation, etc. Of course the
>> p-values are no
>> good because of spatial autocorrelation.
>>
>> library(raster)
>> x <- y <- raster(nc=100, nr=100)
>> x[] <- runif(ncell(r))
>> y[] <- runif(ncell(r))
>> plot(x, y)
>> m <- lm(values(x), values(y))
>> summary(m)
>> abline(m)
>>
>> hist(x-y)
>> plot(x-y)
>> cor(values(x), values(y))
>>
>>
>> Perhaps you want to treat your data as presence/absence
>> (with presence being
>> > 0 or some another threshold). These can then be easily
>> compared with the
>> crosstab function which returns, in this case, a confusion
>> matrix which can
>> be directly interpreted or used to compute some statistics
>> from.
>> crosstab(x>0, y>0)
>>
>> crosstab(x>0.5, y>0.5)
>>
>> And there surely are many other approaches possible, which
>> is why I think
>> that R is the way to go in this case: it is easy, flexible
>> and fast.
>>
>> Robert
>>
>>
>> --
>> View this message in context: http://r-sig-geo.2731867.n2.nabble.com/Methodology-to-compare-crop-maps-tp6431598p6435902.html
>> Sent from the R-sig-geo mailing list archive at
>> Nabble.com.
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
>