[R-sig-Geo] How to calculate climatology in rasterbricks
Vijay Lulla
vijaylulla at gmail.com
Fri Jun 3 20:47:17 CEST 2016
Looking at Loïc's answer and from reading ?zApply I learned of
stackApply, which is used internally by zApply. While zApply is for
time series of layers, stackApply is more general and can be used for
applying functions to groups of layers in a stack/brick. It is very
similar to base R's `tapply`. And it is also fast!
> system.time(meanYM1 <- stackApply(s, idxYM, mean))
user system elapsed
0.45 0.03 0.48
> system.time(meanYM <- calc(s, fun=function(x) { by(x, idxYM, mean)}))
user system elapsed
17.50 0.01 17.61
> all(meanYM[] == meanYM1[])
[1] TRUE
Thanks Loïc for pointing out zApply.
On Fri, Jun 3, 2016 at 11:30 AM, Thiago V. dos Santos via R-sig-Geo
<r-sig-geo at r-project.org> wrote:
> Cool Loïc, thanks for showing one more option.
>
> By the way, in this case zApply is surprisingly faster than calc:
>
>> system.time(meanYM <- calc(s,fun=function(x) { by(x, idxYM, sum) }))
> user system elapsed
> 21.700 0.248 22.095
>
>
>> system.time(sYM <- zApply(s, by = as.yearmon, sum))
> user system elapsed
> 0.811 0.047 0.866
>
> Cheers,
> -- Thiago V. dos Santos
>
> PhD student
> Land and Atmospheric Science
> University of Minnesota
>
>
>
> On Friday, June 3, 2016 4:25 AM, Loïc Dutrieux <loic.dutrieux at wur.nl> wrote:
> This can also be done with zApply:
>
> library(zoo)
>
> sYM <- zApply(s, by = as.yearmon, sum)
> sM <- zApply(sYM, by = months, mean)
>
> Cheers,
> Loïc
>
> On 06/03/2016 02:02 AM, Vijay Lulla wrote:
>> I think the following StackOverflow question has the answer:
>> http://stackoverflow.com/questions/16135877/applying-a-function-to-a-multidimensional-array-with-grouping-variable/16136775#16136775
>>
>> Following the instructions listed on that page for your case might go
>> something like below:
>>
>>> idxYM <- as.integer(strftime(idx,"%Y%m"))
>>> idxM <- unique(idxYM)%%100
>>> meanYM <- calc(s,fun=function(x) { by(x, idxYM, mean) })
>>> meanYM
>> class : RasterBrick
>> dimensions : 20, 20, 400, 360 (nrow, ncol, ncell, nlayers)
>> resolution : 18, 9 (x, y)
>> extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
>> coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
>> data source : in memory
>> names : X196101, X196102, X196103, X196104, X196105, X196106,
>> X196107, X196108, X196109, X196110, X196111, X196112, X196201,
>> X196202, X196203, ...
>> min values : 0.3728, 0.2725, 0.3421, 0.3652, 0.3342, 0.3185,
>> 0.3130, 0.3780, 0.3376, 0.3727, 0.3537, 0.3737, 0.3515, 0.3588,
>> 0.3334, ...
>> max values : 0.6399, 0.6652, 0.6583, 0.6640, 0.6359, 0.6761,
>> 0.6442, 0.6800, 0.6397, 0.6769, 0.6489, 0.6388, 0.6471, 0.6661,
>> 0.6255, ...
>>
>>> meanM <- calc(meanYM, fun=function(x) { by(x, idxM, mean) })
>>> meanM
>> class : RasterBrick
>> dimensions : 20, 20, 400, 12 (nrow, ncol, ncell, nlayers)
>> resolution : 18, 9 (x, y)
>> extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
>> coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
>> data source : in memory
>> names : X1, X2, X3, X4, X5, X6, X7,
>> X8, X9, X10, X11, X12
>> min values : 0.4645, 0.4715, 0.4768, 0.4717, 0.4749, 0.4705, 0.4697,
>> 0.4724, 0.4629, 0.4774, 0.4736, 0.4708
>> max values : 0.5274, 0.5275, 0.5293, 0.5259, 0.5285, 0.5276, 0.5269,
>> 0.5260, 0.5256, 0.5281, 0.5279, 0.5286
>>
>>>
>>
>> I'm not sure how [in]efficient this is for actual (i.e. not toy
>> example) data. Maybe others more experienced, and knowledgeable,
>> members can provide better answers.
>>
>> HTH,
>> Vijay.
>>
>> On Thu, Jun 2, 2016 at 4:30 PM, Thiago V. dos Santos via R-sig-Geo
>> <r-sig-geo at r-project.org> wrote:
>>> Dear all,
>>>
>>> I am working with daily time series of meteorological variables. This is an example of the dataset:
>>>
>>> library(raster)
>>>
>>> # Create date sequence
>>> idx <- seq(as.Date("1961/1/1"), as.Date("1990/12/31"), by = "day")
>>>
>>> # Create raster stack and assign dates
>>> r <- raster(ncol=20, nrow=20)
>>> s <- stack(lapply(1:length(idx), function(x) setValues(r, runif(ncell(r)))))
>>> s <- setZ(s, idx)
>>>
>>>
>>> Now, let's assume those values represent daily precipitation. What I need to do is to integrate daily to monthly values,
>>> and then take a monthly climatology. Climatology in this case means multi-year average of selected months, e.g., an average of the 30 Octobers from 1961 to 1990, an average of the 30 Novembers from 1961 to 1990 and etc.
>>>
>>> On the other hand, let's assume the raster values represent daily temperature. Integrating daily to monthly temperature doesn't make sense. Hence, instead of integrating daily values, I need to take monthly means (e.g. mean value of all days in every month), and then calculate the climatology.
>>>
>>> What would be the best approach to achieve that using the raster package?
>>>
>>> Greetings,
>>> -- Thiago V. dos Santos
>>>
>>> PhD student
>>> Land and Atmospheric Science
>>> University of Minnesota
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>>
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
More information about the R-sig-Geo
mailing list