[R-sig-Geo] Isolating only land areas on a global map for computing averages

Sat Nov 9 05:04:41 CET 2019

Rain--
Yes, it is possible to do it with your extant raster stack.  In fact,
pretty much all reasonable approaches will do that.  Anything you do will
create a raster layer with values at every one of your 8192 raster cells:
some values will be 0 or NA.

The trivial answer if you only had a small range of latitude, and small
raster cell sizes relative to your polygon, would be the
raster::rasterize() function to generate a raster mask.

But, with a global raster from +90 to -90, any interpretable answer
averaging over area needs to take into account the different area of a grid
cell at 70 deg latitude vs 0 deg.  See:
https://gis.stackexchange.com/questions/29734/how-to-calculate-area-of-1-x-1-degree-cells-in-a-raster
At that point, you should go ahead and account for fractions of grid cells
over land so your averaging would be over land area.

I'm reticent to give you a complete code solution because without a name,
you may well be a student with this as an assignment.  But, my approach
would be:

create a SpatialPolygonsDataFrame object "gridPoly" from any of your raster
layers via raster::rasterToPolygons()
generate an id value for each of those polygons as row & column indices
from the raster, or as the cell number.
get land polygon SpatialPolygons object "landPoly" with polygons of land
only, not ocean.
create a new SpatialPolygons object from gIntersect::gridPoly, landPoly,
byid = c(TRUE, FALSE))
calculate an area of each polygon in that object via
geosphere::areaPolygon()  {rgeos::gArea()  only works for projected CRS}
create your mask/weights raster layer with either all NA or all 0.
either:
     parse the id values to row & column values.
     use the triplets of row, column, and area to replace the corresponding
NA or 0 values in that mask
or:
     if you used cell numbers, just use the cell numbers and area values in
replacement in that mask

create a second weight raster via raster::area() on one of your raster
layers.
raster multiply your polygon-area and your raster::area values to give the
actual weighs to use.

This still is an approximation, but likely +/- 1-2%.

If this is still complete gibberish to you, either I need more coffee or
you need to consult a good reference on working with spatial data in
general.

On Thu, Nov 7, 2019 at 8:25 AM <rain1290 using aim.com> wrote:

> Hi Tom and others,
>
> Thank you for your response and suggestions!
>
> Yes, I loaded and used the maptools package previously to create
> continents on my world map, among other things. I do think that the easiest
> approach would be to create a raster layer for land, and then water, with
> the values that I have. However, my precipitation values are globally
> distributed - every grid cell has a precipitation value for each year (i.e.
> each grid cell has 138 values/layers/years). So, if I were to create a
> raster of only land areas, how would I have my grid cells coincide with the
> land areas only on that raster?
>
> Also, would it be possible to accomplish this with the raster stack that I
> already have? If so, is there a way to separate all land/water areas this
> way using the maptools package?
>
> Thanks, again, and I really appreciate your help!
>
> -----Original Message-----
> From: Tom Philippi <tephilippi using gmail.com>
> To: rain1290 <rain1290 using aim.com>
> Cc: r-sig-geo <r-sig-geo using r-project.org>
> Sent: Thu, Nov 7, 2019 12:44 am
> Subject: Re: [R-sig-Geo] Isolating only land areas on a global map for
> computing averages
>
> The easiest approach would be to create a separate aligned raster layer
> for land vs water.  There are plenty of coastline polygons available out
> there (e.g., maptools, rworldmap, rnaturalearth packages): choose one in
> your raster CRS (or choose one and spTransform() it).  Then, use a grid
> version of your raster to extract values from that land/water
> SpatialPolygons object.
>
> 1: Your idea of extracting the land/water value at each grid cell centroid
> gives one estimate.  Instead of TRUE/FALSE, think of the numeric
> equivalents 1,0,  then using those as weights for averaging across your
> grid cells.
> 2: A "better" estimate would be to compute the fraction of each grid cell
> that is land, then use those fractional [0, 1] values as weights to compute
> a weighted average of precipitation over land.  At 2.8deg grid cells, a lot
> of heavy rainfall coastal areas would have the grid cell centroid offshore
> and be omitted by approach #1.
> 3: I recommend that you think hard about averaging across cells in Lat Lon
> to estimate average precipitation over land.  The actual area of a ~2.8 by
> 2.8 deg grid cell at the equator is much larger than the same at 70 deg N.
> I would spend the extra hour computing the actual area (in km^2) of land in
> each of your 8192 grid cells, then using those in a raster as weights for
> whatever calculations you do on the raster stack.  [Or you can cheat, as
> the area of a grid cell in degrees is a function of only the latitudes, and
> your required weights are multiplicative.]
>
> Your mileage may vary...
>
> Tom
>
> On Wed, Nov 6, 2019 at 6:18 PM rain1290--- via R-sig-Geo <
> r-sig-geo using r-project.org> wrote:
>
> Hi there,
> I am interested in calculating precipitation medians globally. However, I
> only want to isolate land areas to compute the median. I already first
> created a raster stack, called "RCP1pctCO2median", which contains the
> median values. There are 138 layers, with each layer representing one
> year.  This raster stack has the following attributes:
> class       : RasterStack
> dimensions  : 64, 128, 8192, 138  (nrow, ncol, ncell, nlayers)
> resolution  : 2.8125, 2.789327  (x, y)
> extent      : -181.4062, 178.5938, -89.25846, 89.25846  (xmin, xmax, ymin,
> ymax)
> coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
> names       :    layer.1,    layer.2,    layer.3,    layer.4,
> layer.5,    layer.6,    layer.7,    layer.8,    layer.9,   layer.10,
> layer.11,   layer.12,   layer.13,   layer.14,   layer.15, ...
> min values  : 0.42964514, 0.43375653, 0.51749371, 0.50838983, 0.45366730,
> 0.53099146, 0.49757186, 0.45697752, 0.41382199, 0.46082401, 0.45516687,
> 0.51857087, 0.41005131, 0.45956529, 0.47497867, ...
> max values  :   96.30350,  104.08584,   88.92751,   97.49373,
> 89.57201,   90.58570,   86.67651,   88.33519,   96.94720,  101.58247,
> 96.07792,   93.21948,   99.59785,   94.26218,   90.62138, ...
>
> Previously, I was isolating a specific region by specifying a range of
> longitudes and latitudes to obtain the medians for that region, like this:
> expansion1<-expand.grid(103:120, 3:15)lonlataaa<-extract(RCP1pctCO2Median,
> expansion1)Columnaaa<-colMeans(lonlataaa)
>
> However, with this approach, too much water can mix with land areas, and
> if I narrow the latitude/longitude range on land, I might miss too much
> land to compute the median meaningfully.
> Therefore, with this data, would it be possible to use an IF/ELSE
> statement to tell R that if the "center point" of each grid cell happens to
> fall on land, then it would be considered as land (i.e. that would be TRUE
> - if not, then FALSE)? Even if a grid cell happens to have water mixed with
> land, but the center point of the grid is on land, that would be considered
> land. But can R even tell what is land or water in this case?
> Thank you, and I would greatly appreciate any assistance!
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>

	[[alternative HTML version deleted]]