[R-sig-Geo] Methodology to compare crop maps
Robert Hijmans
r.hijmans at gmail.com
Fri Jun 3 18:17:06 CEST 2011
> I am working with crops planted area maps from two distinct sources.
> One of the maps is based on a maximum NDVI composition, and the other
> map uses joint information from satellite and census to estimate the
> planted area.
>
> Although the sources employ different methodologies to map the area
> where the crop exists, the results should be comparable.
>
> After downloading the datasets, I have performed a visual inspection,
> and they show reasonable agreement. However, I need a more robust
> comparison method. Could anybody point out a methodology which allows
> me to show the difference between both maps?
>
> Here is an example of each one of the maps:
> http://www.geog.mcgill.ca/landuse/pub/Data/175crops2000/NetCDF/sugarcane_5min.nc.gz
> (in netcdf) and http://www.dsr.inpe.br/laf/canasat/en/map.html (not
> available to download directly, but I can get it in shapefile)
>
Thiago,
I assume that the Brazilian data has a much higher spatial resolution than
the McGill data (which I think I am familiar with), and that it probably has
a different CRS. I also assume that you can get the Brazilian data as the
original raster file (and not as a shapefile). If I am not mistaken, the
McGill data has the fraction of land area covered by a crop, and I assume
that the Brazilian data is presence/absence. If so, I would use the raster
package and aggregate the Brazilian data to a cell size that is similar to
that of the McGill data (~9 km), computing the fraction of cells that have
sugarcane (sum divided by the number of cells; make sure to handle NA
values). Then use the function projectRaster to transform the McGill data to
the same extent/resolution as the aggregated Brazilian data. Now you have two
layers that you can compare in different ways.
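The two steps above could look roughly like the sketch below. Everything in
it is made up for illustration: the simulated rasters, their extents, the
UTM zone 22S CRS for the Brazilian data, the 1 km cell size and the object
names (br, mc, br_frac, mc_proj) are all assumptions, not properties of the
real datasets.

```r
library(raster)
set.seed(1)

# Hypothetical high-resolution presence/absence raster (1 = sugarcane, 0 = none),
# assumed here to be in UTM zone 22S with 1 km cells (270 x 270 cells).
br <- raster(nrows=270, ncols=270, xmn=400000, xmx=670000,
             ymn=7500000, ymx=7770000,
             crs="+proj=utm +zone=22 +south +datum=WGS84 +units=m")
values(br) <- rbinom(ncell(br), 1, 0.3)

# Hypothetical coarse raster with crop fractions, lon/lat, 5 arc-minute cells.
mc <- raster(nrows=60, ncols=60, xmn=-53, xmx=-48, ymn=-24, ymx=-19,
             crs="+proj=longlat +datum=WGS84")
values(mc) <- runif(ncell(mc))

# Aggregate 9 x 9 blocks of 1 km cells to 9 km cells. The mean of 0/1 values is
# the fraction of cells with sugarcane; na.rm=TRUE ignores NA cells. If NA
# actually means "no crop" in your data, set those cells to 0 first.
br_frac <- aggregate(br, fact=9, fun=mean, na.rm=TRUE)

# Transform the coarse layer to the grid of the aggregated layer.
mc_proj <- projectRaster(mc, br_frac, method="bilinear")

# Both layers now share extent, resolution and CRS and can be compared.
same_grid <- compareRaster(br_frac, mc_proj)
r <- cor(values(br_frac), values(mc_proj), use="complete.obs")
```

With real data you would replace the simulated layers with the files read
through raster("..."), and choose the aggregation factor from the ratio of
the two cell sizes.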
You can make plots, compute correlation, etc. Of course the p-values are no
good because of spatial autocorrelation.
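library(raster)
# two rasters with random values stand in for the two crop maps
x <- y <- raster(ncols=100, nrows=100)
x[] <- runif(ncell(x))
y[] <- runif(ncell(y))
# scatter plot of the cell values of x against those of y
plot(x, y)
# linear regression of the values of y on the values of x
vx <- values(x)
vy <- values(y)
m <- lm(vy ~ vx)
summary(m)
abline(m)
# histogram and map of the cell-by-cell differences
hist(x-y)
plot(x-y)
cor(vx, vy)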
Perhaps you want to treat your data as presence/absence (with presence being
values > 0, or above some other threshold). These can then be easily compared
with the crosstab function, which returns, in this case, a confusion matrix
that can be interpreted directly or used to compute statistics.
crosstab(x>0, y>0)
crosstab(x>0.5, y>0.5)
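As one example of statistics that could be computed from such a confusion
matrix, here is a minimal sketch in base R of overall agreement and Cohen's
kappa. The counts are invented for illustration, and the sketch assumes you
have arranged the crosstab output as a 2 x 2 matrix of cell counts (the names
cm, overall, expected and kappa_hat are mine).

```r
# Hypothetical confusion matrix of cell counts (rows: map A, columns: map B).
cm <- matrix(c(700,  50,
               100, 150), nrow=2, byrow=TRUE,
             dimnames=list(mapA=c("absent", "present"),
                           mapB=c("absent", "present")))

n <- sum(cm)

# Overall agreement: proportion of cells where both maps give the same class.
overall <- sum(diag(cm)) / n

# Agreement expected by chance, from the row and column totals.
expected <- sum(rowSums(cm) * colSums(cm)) / n^2

# Cohen's kappa: agreement beyond chance, scaled to a maximum of 1.
kappa_hat <- (overall - expected) / (1 - expected)

print(c(overall=overall, expected=expected, kappa=kappa_hat))
```

Like the correlation above, these numbers describe agreement only; any
significance test on them would suffer from the same spatial autocorrelation
problem.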
And there surely are many other approaches possible, which is why I think
that R is the way to go in this case: it is easy, flexible and fast.
Robert