[R-sig-Geo] How to calculate climatology in rasterbricks

Loïc Dutrieux loic.dutrieux at wur.nl
Sat Jun 4 15:48:45 CEST 2016


Hi Thiago,

Yes, I can reproduce.
It's probably because zApply() defaults to na.rm = TRUE (in fact the 
na.rm = argument is ignored if you pass it), and

sum(c(NA, NA), na.rm = TRUE)
[1] 0 (I was surprised by that).

Perhaps the stackApply() call in the zApply() function definition is 
missing an ellipsis.

Cheers,
Loïc

On 06/03/2016 10:25 PM, Thiago V. dos Santos wrote:
> Dear Vijay and Loïc,
>
>
> I have identified some artifacts in the results when using zApply on my data. To reproduce it, please download this shapefile (https://www.dropbox.com/s/in2z10mlerr2sfu/southern.zip?dl=0) and run the code below (see the comments after the plot commands):
>
> -----------------------
> library(raster)
> library(zoo)
>
> # Create the date sequence
> idx <- seq(as.Date("1961/1/1"), as.Date("1990/12/31"), by = "day")
>
> # Create raster stack and assign dates
> r <- raster(ncol=24, nrow=21, xmn=-58, xmx=-47.5, ymn=-34, ymx=-22, resolution=0.5)
> s <- stack(lapply(1:length(idx), function(x) setValues(r, runif(ncell(r)))))
> s <- setZ(s, idx)
>
> # Load shapefile and crop data
> shp <- shapefile('~/Downloads/southern.shp', warnPRJ=F)
> s.c <- crop(s, extent(shp))
> s.c <- mask(s.c, shp)
> plot(s.c,1)
> plot(shp,add=T) # Looks good
>
> # Loïc's approach using zApply
> sYM <- zApply(s.c, by = as.yearmon, sum)
> sM <- zApply(sYM, by = months, mean)
> plot(sYM,1)  # Note that the "empty" space across the extent was filled with 0's
> plot(sM,1)  # The same "gray area" here
>
> # Vijay's approach using calc
> idxYM <- as.integer(strftime(idx,"%Y%m"))
> idxM <- unique(idxYM)%%100
> meanYM <- calc(s.c, fun=function(x) { by(x, idxYM, sum) })
> meanM <- calc(meanYM, fun=function(x) { by(x, idxM, mean) })
> plot(meanYM,1)  # Note that no 0 was added across the extent
> plot(meanM,1)  # This is OK too, no grey area
> -----------------------
>
> Basically, a portion of the raster extent was filled with 0's after using zApply. It doesn't happen when using calc.
>
> Any ideas on what can be causing this?
>
>   Greetings,
>   -- Thiago V. dos Santos
>
> PhD student
> Land and Atmospheric Science
> University of Minnesota
>
>
>
> On Friday, June 3, 2016 1:47 PM, Vijay Lulla <vijaylulla at gmail.com> wrote:
> Looking at Loïc's answer and from reading ?zApply I learned of
> stackApply, which is used internally by zApply.  While zApply is for
> time series of layers, stackApply is more general and can be used for
> applying functions to groups of layers in a stack/brick.  It is very
> similar to base R's `tapply`.  And it is also fast!
>
>> system.time(meanYM1 <- stackApply(s, idxYM, mean))
>     user  system elapsed
>     0.45    0.03    0.48
>> system.time(meanYM <- calc(s, fun=function(x) { by(x, idxYM, mean)}))
>     user  system elapsed
>    17.50    0.01   17.61
>> all(meanYM[] == meanYM1[])
> [1] TRUE
>
> Thanks Loïc for pointing out zApply.
>
>
> On Fri, Jun 3, 2016 at 11:30 AM, Thiago V. dos Santos via R-sig-Geo
> <r-sig-geo at r-project.org> wrote:
>> Cool Loïc, thanks for showing one more option.
>>
>> By the way, in this case zApply is surprisingly faster than calc:
>>
>>> system.time(meanYM <- calc(s,fun=function(x) { by(x, idxYM, sum) }))
>> user  system elapsed
>> 21.700   0.248  22.095
>>
>>
>>> system.time(sYM <- zApply(s, by = as.yearmon, sum))
>> user  system elapsed
>> 0.811   0.047   0.866
>>
>>   Cheers,
>>   -- Thiago V. dos Santos
>>
>> PhD student
>> Land and Atmospheric Science
>> University of Minnesota
>>
>>
>>
>> On Friday, June 3, 2016 4:25 AM, Loïc Dutrieux <loic.dutrieux at wur.nl> wrote:
>> This can also be done with zApply:
>>
>> library(zoo)
>>
>> sYM <- zApply(s, by = as.yearmon, sum)
>> sM <- zApply(sYM, by = months, mean)
>>
>> Cheers,
>> Loïc
>>
>> On 06/03/2016 02:02 AM, Vijay Lulla wrote:
>>> I think the following StackOverflow question has the answer:
>>> http://stackoverflow.com/questions/16135877/applying-a-function-to-a-multidimensional-array-with-grouping-variable/16136775#16136775
>>>
>>> Following the instructions listed on that page for your case might go
>>> something like below:
>>>
>>>> idxYM <- as.integer(strftime(idx,"%Y%m"))
>>>> idxM <- unique(idxYM)%%100
>>>> meanYM <- calc(s,fun=function(x) { by(x, idxYM, mean) })
>>>> meanYM
>>> class       : RasterBrick
>>> dimensions  : 20, 20, 400, 360  (nrow, ncol, ncell, nlayers)
>>> resolution  : 18, 9  (x, y)
>>> extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
>>> coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
>>> data source : in memory
>>> names       : X196101, X196102, X196103, X196104, X196105, X196106,
>>> X196107, X196108, X196109, X196110, X196111, X196112, X196201,
>>> X196202, X196203, ...
>>> min values  :  0.3728,  0.2725,  0.3421,  0.3652,  0.3342,  0.3185,
>>> 0.3130,  0.3780,  0.3376,  0.3727,  0.3537,  0.3737,  0.3515,  0.3588,
>>>    0.3334, ...
>>> max values  :  0.6399,  0.6652,  0.6583,  0.6640,  0.6359,  0.6761,
>>> 0.6442,  0.6800,  0.6397,  0.6769,  0.6489,  0.6388,  0.6471,  0.6661,
>>>    0.6255, ...
>>>
>>>> meanM <- calc(meanYM, fun=function(x) { by(x, idxM, mean) })
>>>> meanM
>>> class       : RasterBrick
>>> dimensions  : 20, 20, 400, 12  (nrow, ncol, ncell, nlayers)
>>> resolution  : 18, 9  (x, y)
>>> extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
>>> coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
>>> data source : in memory
>>> names       :     X1,     X2,     X3,     X4,     X5,     X6,     X7,
>>>      X8,     X9,    X10,    X11,    X12
>>> min values  : 0.4645, 0.4715, 0.4768, 0.4717, 0.4749, 0.4705, 0.4697,
>>> 0.4724, 0.4629, 0.4774, 0.4736, 0.4708
>>> max values  : 0.5274, 0.5275, 0.5293, 0.5259, 0.5285, 0.5276, 0.5269,
>>> 0.5260, 0.5256, 0.5281, 0.5279, 0.5286
>>>
>>>>
>>>
>>> I'm not sure how [in]efficient this is for actual (i.e. not toy
>>> example) data.  Maybe others more experienced, and knowledgeable,
>>> members can provide better answers.
>>>
>>> HTH,
>>> Vijay.
>>>
>>> On Thu, Jun 2, 2016 at 4:30 PM, Thiago V. dos Santos via R-sig-Geo
>>> <r-sig-geo at r-project.org> wrote:
>>>> Dear all,
>>>>
>>>> I am working with daily time series of meteorological variables. This is an example of the dataset:
>>>>
>>>> library(raster)
>>>>
>>>> # Create date sequence
>>>> idx <- seq(as.Date("1961/1/1"), as.Date("1990/12/31"), by = "day")
>>>>
>>>> # Create raster stack and assign dates
>>>> r <- raster(ncol=20, nrow=20)
>>>> s <- stack(lapply(1:length(idx), function(x) setValues(r, runif(ncell(r)))))
>>>> s <- setZ(s, idx)
>>>>
>>>>
>>>> Now, let's assume those values represent daily precipitation. What I need to do is to integrate daily to monthly values,
>>>> and then take a monthly climatology. Climatology in this case means multi-year average of selected months, e.g., an average of the 30 Octobers from 1961 to 1990, an average of the 30 Novembers from 1961 to 1990 and etc.
>>>>
>>>> On the other hand, let's assume the raster values represent daily temperature. Integrating daily to monthly temperature doesn't make sense. Hence, instead of integrating daily values, I need to take monthly means (e.g. mean value of all days in every month), and then calculate the climatology.
>>>>
>>>> What would be the best approach to achieve that using the raster package?
>>>>
>>>>    Greetings,
>>>>    -- Thiago V. dos Santos
>>>>
>>>> PhD student
>>>> Land and Atmospheric Science
>>>> University of Minnesota
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>>>
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo



More information about the R-sig-Geo mailing list