[R-sig-Geo] Methodology to compare crop maps

Thu Jun 16 13:53:15 CEST 2011

Dear Thiago,

That's why I suggested:

> cane_sa[cane_sa < 16] <- NA
> veg_sa <- crop(vegmap,south_america)
> newveg_sa <- cover(cane_sa, veg_sa)
> new_vegmap <- merge(newveg_sa, vegmap)

i.e. set all values you do not want to NA, and then use 'cover' to
replace NA values in the new raster with the values of the old raster,
and then use merge.

Best, Robert

On Thu, Jun 16, 2011 at 4:11 AM, Thiago Veloso <thi_veloso at yahoo.com.br> wrote:
>  Dear Robert,
>
>  Sorry for not stating my question concisely. Still, you captured what was wrong in my code: that was exactly the threshold of the existence of the crop. Raising the limit helped to alleviate the crop existence map at a coarser resolution.
>
>  Following the code, I experienced another difficulty. The merge function replaces the whole contents from one map to the other. Is there any way to impose conditions? Something like "merge if pixel value is greater than..."?
>
>  Best wishes,
>
>  Thiago Veloso.
>
> --- On Sat, 11/6/11, Robert J. Hijmans <r.hijmans at gmail.com> wrote:
>
>> From: Robert J. Hijmans <r.hijmans at gmail.com>
>> Subject: Re: [R-sig-Geo] Methodology to compare crop maps
>> To: "Thiago Veloso" <thi_veloso at yahoo.com.br>
>> Cc: r-sig-geo at r-project.org
>> Date: Saturday, 11 June, 2011, 1:27
>> Hi Thiago,
>>
>> You are not saying _what_ goes wrong. That makes it
>> difficult to help.
>> I am guessing that you are missing some of the steps below.
>> You do:
>>
>> cane_sa[cane_sa>0]<-16
>>
>> # I do not know your exact objectives but it seems strange
>> to use 0 here,
>> # as even a cell that is covered for 1/600 of its area with
>> cane would
>> be classified as cane.
>> # but irrespective of the threshold you choose, I think you
>> need:
>>
>> cane_sa[cane_sa < 16] <- NA
>> veg_sa <- crop(vegmap,south_america)
>> newveg_sa <- cover(cane_sa, veg_sa)
>>
>> # and then continue with your code again:
>>
>> new_vegmap <- merge(newveg_sa, vegmap)
>>
>> Robert
>>
>> On Thu, Jun 9, 2011 at 3:51 PM, Thiago Veloso <thi_veloso at yahoo.com.br>
>> wrote:
>> >  Robert, Rich and Luca,
>> >
>> >  Thank you very much for the suggestions.
>> >
>> >  Robert, I still haven't managed to implement all of
>> your directions due to the burocracy to obtain the sugarcane
>> raster (or shape) files from INPE.
>> >
>> >  In the meanwhile, I am trying to convert mcgill data
>> (you're right about the units) from fraction of land covered
>> by a crop to presence/absence. This is because my ultimate
>> goal is to add a new plant functional type (PFT) to a map
>> where 15 PFTs already exist (http://www.sage.wisc.edu/download/potveg/global_potveg.html).
>> This way, cells where cane is present should contain the
>> value "16".
>> >
>> >  It seemed to be a simple task, but what I am doing
>> is changing the map substantially. Please take a peek at the
>> code:
>> >
>> > #Loading required packages
>> > library(raster)
>> > library(maptools)
>> >
>> > #Loading SAGE vegetation map and McGill sugarcane map
>> > vegmap<-raster("data/vegtype.nc")
>> > cane<-raster("data/sugarcane_5min.nc")
>> >
>> >> vegmap
>> > class       : RasterLayer
>> > dimensions  : 360, 720, 259200  (nrow, ncol, ncell)
>> > resolution  : 0.5, 0.5  (x, y)
>> > extent      : -180, 180, -90, 90  (xmin, xmax,
>> ymin, ymax)
>> > coord. ref. : +proj=longlat +datum=WGS84
>> > values      : data/vegtype.nc
>> > min         : ?
>> > max         : ?
>> >
>> >> cane
>> > class       : RasterLayer
>> > dimensions  : 2160, 4320, 9331200  (nrow, ncol,
>> ncell)
>> > resolution  : 0.08333333, 0.08333334  (x, y)
>> > extent      : -180, 180, -90, 90  (xmin, xmax,
>> ymin, ymax)
>> > coord. ref. : +proj=longlat +datum=WGS84
>> > values      : data/sugarcane_5min.nc
>> > min         : ?
>> > max         : ?
>> >
>> > #Resampling McGill map to match SAGE's resolution
>> >
>> cane_coarser<-aggregate(cane,fact=6.00000024,fun=mean)
>> >
>> >> cane_coarser
>> > class       : RasterLayer
>> > dimensions  : 360, 720, 259200  (nrow, ncol, ncell)
>> > resolution  : 0.5, 0.5  (x, y)
>> > extent      : -180, 180, -90, 90  (xmin, xmax,
>> ymin, ymax)
>> > coord. ref. : +proj=longlat +datum=WGS84
>> > values      :
>> /tmp/R_raster_tmp/raster_tmp_39924224710.grd
>> > min value   : 0
>> > max value   : 0.5680773
>> >
>> > #Loading SA shapefile
>> >
>> south_america<-readShapePoly("shapes/southamerica.shp")
>> >
>> > #Cropping sugar cane map
>> > cane_sa<-crop(cane_coarser,south_america)
>> >
>> >> cane_sa
>> > class       : RasterLayer
>> > dimensions  : 137, 93, 12741  (nrow, ncol, ncell)
>> > resolution  : 0.5, 0.5  (x, y)
>> > extent      : -81.49999, -34.99999, -56, 12.5
>>  (xmin, xmax, ymin, ymax)
>> > coord. ref. : +proj=longlat +datum=WGS84
>> > values      : in memory
>> > min value   : 0
>> > max value   : 0.5237272
>> >
>> > #Below is an attempt to replace fraction with
>> presence
>> > cane_sa[cane_sa>0]<-16
>> >
>> > #Check result
>> > plot(cane_coarser)
>> >
>> > #Adding cane map to SAGE vegetation map
>> > new_vegmap<merge(vegmap,cane_sa)
>> >
>> > #Writing out updated map to a netcdf file
>> > if(require(ncdf)){
>> >
>> writeRaster(new_vegmap,filename="/home/thiago/IBIS/data/inpue/new_vegmap.nc",format="CDF",overwrite=TRUE)
>> > }
>> >
>> >  Everything goes fine until cropping the resampled
>> map. However, replacing grater than zero values results in a
>> totally different map.
>> >
>> >  What am I doing wrong? Maybe spatial data frames are
>> harder to deal with (for newbies like me), or maybe the
>> massive amount of NA's and tiny numbers (0.000000e+00,
>> 2.433714e-03 and etc) produced after agregation is the
>> problem...
>> >
>> >  Thanks in advance and best wishes,
>> >
>> >  Thiago.
>> >
>> > --- On Fri, 3/6/11, Robert Hijmans <r.hijmans at gmail.com>
>> wrote:
>> >
>> >> From: Robert Hijmans <r.hijmans at gmail.com>
>> >> Subject: Re: [R-sig-Geo] Methodology to compare
>> crop maps
>> >> To: r-sig-geo at r-project.org
>> >> Date: Friday, 3 June, 2011, 13:17
>> >> >  I am working with crops
>> >> planted area maps from two distinct sources.
>> >> > One of the maps is based on a maximum NDVI
>> >> composition, and the other
>> >> > map uses joint information from satellite and
>> census
>> >> to estimate the
>> >> > planted area.
>> >> >
>> >> >  Although the sources employ different
>> methodologies
>> >> to map the area
>> >> > where the crop exists, the results should be
>> >> comparable.
>> >> >
>> >> >  After downloading the datasets, I have
>> performed a
>> >> visual inpection,
>> >> > and they show reasonable agreement. However,
>> I need a
>> >> more robust
>> >> > comparison method. Could anybody point out a
>> >> methodology which allows
>> >> > me to show the difference between both maps?
>> >> >
>> >> >  Here is an example of each one of the
>> maps:
>> >> > http://www.geog.mcgill.ca/landuse/pub/Data/175crops2000/NetCDF/sugarcane_5min.nc.gz
>> >> > (in netcdf) and http://www.dsr.inpe.br/laf/canasat/en/map.html (not
>> >> > available to download directly, but I can get
>> it in
>> >> shapefile)
>> >> >
>> >>
>> >>
>> >> Thiago,
>> >>
>> >> I assume that the Brazilian data has a much higher
>> spatial
>> >> resolution than
>> >> the mcgill data (that I think I am familiar with),
>> and it
>> >> probably has a
>> >> different CRS. And I assume that you can get it as
>> a the
>> >> original raster
>> >> file (and not as shapefile) for the Brazilian
>> data. If I am
>> >> not mistaken,
>> >> the mcgill data has the fraction of land area
>> covered by a
>> >> crop. I assume
>> >> that the Brazilan data is presence/absence. If so
>> I would
>> >> use the raster
>> >> package and aggregate the Brazilian data to a cell
>> size
>> >> that is similar to
>> >> the mcgill data (~9 km), computing the fraction of
>> cells
>> >> that have sugarcane
>> >> (sum divided by the number of cells, make sure to
>> handle NA
>> >> values). Then
>> >> use function projectRaster to transform the mcgill
>> data to
>> >> the same
>> >> extent/resolution as the aggregated Brazilian
>> data. Now you
>> >> have two layers
>> >> that you can compare in different ways.
>> >>
>> >> You can make plots, compute correlation, etc. Of
>> course the
>> >> p-values are no
>> >> good because of spatial autocorrelation.
>> >>
>> >> library(raster)
>> >> x <- y <- raster(nc=100, nr=100)
>> >> x[] <- runif(ncell(r))
>> >> y[] <- runif(ncell(r))
>> >> plot(x, y)
>> >> m <- lm(values(x), values(y))
>> >> summary(m)
>> >> abline(m)
>> >>
>> >> hist(x-y)
>> >> plot(x-y)
>> >> cor(values(x), values(y))
>> >>
>> >>
>> >> Perhaps you want to treat your data as
>> presence/absence
>> >> (with presence being
>> >> > 0 or some another threshold). These can then
>> be easily
>> >> compared with the
>> >> crosstab function which returns, in this case, a
>> confusion
>> >> matrix which can
>> >> be directly interpreted or used to compute some
>> statistics
>> >> from.
>> >> crosstab(x>0, y>0)
>> >>
>> >> crosstab(x>0.5, y>0.5)
>> >>
>> >> And there surely are many other approaches
>> possible, which
>> >> is why I think
>> >> that R is the way to go in this case: it is easy,
>> flexible
>> >> and fast.
>> >>
>> >> Robert
>> >>
>> >>
>> >> --
>> >> View this message in context: http://r-sig-geo.2731867.n2.nabble.com/Methodology-to-compare-crop-maps-tp6431598p6435902.html
>> >> Sent from the R-sig-geo mailing list archive at
>> >> Nabble.com.
>> >>
>> >> _______________________________________________
>> >> R-sig-Geo mailing list
>> >> R-sig-Geo at r-project.org
>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> >>
>> >
>> >
>>
>
>
>