[R-sig-Geo] raster: stackApply problems..
Vijay Lulla
v|j@y|u||@ @end|ng |rom gm@||@com
Wed Nov 27 23:22:19 CET 2019
Very nice!
And, just fyi: if you wish to get stackapply_mean to have layers in same
order as ver_mean then you can use stackapply_mean <-
stack(stackapply_mean, layers=order(names(stackapply_mean)))
On Wed, Nov 27, 2019 at 12:43 PM Leonidas Liakos <leonidas_liakos using yahoo.gr>
wrote:
> Thank you for your help!
>
> I tried to fix stackApply according to your instructions.
>
> Now the indices of names are the same and consistent with indices
> enumeration (gist for validation and tests:
> https://gist.github.com/kokkytos/93f315a5ecf59c0b183f9788754bc170).
>
> I've attached a patch file here:
>
> https://gist.github.com/kokkytos/ca2c319134677b19900579665267a7a7
> > stackapply_mean
> class : RasterBrick
> dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
> resolution : 500, 500 (x, y)
> extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
> crs : NA
> source : /tmp/Rtmp9W8UNc/raster/r_tmp_2019-11-27_191205_2929_20324.grd
> names : index_5, index_6, index_7, index_1, index_2, index_3,
> index_4
> min values : 444.6946, 440.2028, 429.6900, 442.7436, 440.0467, 444.9182,
> 437.1589
> max values : 565.8671, 560.1375, 561.7972, 556.2471, 563.8341, 561.7687,
> 560.4509
>
> > ver_mean
> class : RasterStack
> dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
> resolution : 500, 500 (x, y)
> extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
> crs : NA
> names : layer.1, layer.2, layer.3, layer.4, layer.5, layer.6,
> layer.7
> min values : 442.7436, 440.0467, 444.9182, 437.1589, 444.6946, 440.2028,
> 429.6900
> max values : 556.2471, 563.8341, 561.7687, 560.4509, 565.8671, 560.1375,
> 561.7972
>
>
>
>
>
> On 11/26/19 10:58 PM, Vijay Lulla wrote:
>
> Hmm...it appears that stackApply is using different conditions which might
> be causing this problem. Below is the snippet of the code which I think
> might be the problem.
>
> ## For canProcessInMemory
> if (rowcalc) {
> v <- lapply(uin, function(i) fun(x[, ind == i, drop = FALSE], na.rm =
> na.rm))
> }
> else {
> v <- lapply(uin, function(i, ...) apply(x[, ind == i, drop = FALSE], 1,
> fun, na.rm = na.rm))
> }
>
>
> ## If canProcessInMemory is not TRUE
> if (rowcalc) {
> v <- lapply(uin, function(i) fun(a[, ind == uin[i], drop = FALSE], na.rm
> = na.rm))
> }
> else {
> v <- lapply(uin, function(i, ...) apply(a[, ind == uin[i], drop =
> FALSE], 1, fun, na.rm = na.rm))
> }
>
> I think they should both be same but it appears that they aren't and
> that's what you've discovered. Maybe you can try fix(stackApply) to see if
> it really is the problem and can tell us what you find. Anyways, good
> catch...and...sorry for wasting your time.
> Cordially,
> Vijay.
>
> On Tue, Nov 26, 2019 at 2:53 PM Leonidas Liakos <leonidas_liakos using yahoo.gr>
> wrote:
>
>> Thank you!
>> The problem is not with the resulting values but with the index mapping.
>> Values are correct in all three cases.
>>
>> As I wrote in a previous post in the thread (
>> https://stat.ethz.ch/pipermail/r-sig-geo/2019-November/027821.html) ,
>> stackApply behaves inconsistently depending on whether the exported stack
>> will remain in memory or it will be stored, due to its large size, on the
>> hard disk.
>>
>> In the first case the indices are identical to my function (ver_mean) and
>> the lubridate::wday indexing system (and are correct) while in the second
>> they are shuffled.
>>
>> So, while Sunday has index 1 and while in the first case (when the result
>> is in memory) stackApply returns the correct index, in the second case
>> (when the result is stored on the hard disk) it returns index_4! So how can
>> one be sure if index_1 corresponds to Sunday or another day using
>> stackApply since it sometimes enumerates it with index_1 and sometimes
>> index_4?
>>
>>
>> Try to run this example (when the resulting stack remains in memory) to
>> see that the indexes are identical (stackApply = ver_median =
>> lubridate::wday)
>> https://gist.github.com/kokkytos/5d554b5a725bb48d2189e2d1fa0e2206
>>
>> Thank you again
>> On 11/26/19 9:00 PM, Vijay Lulla wrote:
>>
>> I'm sorry for the miscommunication. What I meant to say is that the
>> output from stackApply and zApply are the same (because zApply uses
>> stackApply internally) except the names. The behavior of stackApply makes
>> sense because AFAIUI R doesn't automatically reorder vectors/indices that
>> it receives. Your observation about inconsistent result with ver_mean is
>> very valid though! And, that's what I meant with my comment that using
>> sapply with the explicit ordering that you want is the best way to control
>> what raster package will output. In R the input order should be maintained
>> (this is the prime difference between SQL and R) but packages/tools do not
>> always adhere to this...so it's never clear how the output will be
>> ordered. Sorry for the confusion.
>>
>>
>> On Tue, Nov 26, 2019 at 12:22 PM Leonidas Liakos <
>> leonidas_liakos using yahoo.gr> wrote:
>>
>>> Why do they seem logical since they do not match?
>>>
>>> Check for example index 1 (Sunday). The results are different for the
>>> three processes
>>>
>>> > stackapply_mean
>>> class : RasterBrick
>>> dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
>>> resolution : 500, 500 (x, y)
>>> extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
>>> crs : NA
>>> source :
>>> /tmp/RtmpkRMXLb/raster/r_tmp_2019-11-26_191359_7710_20324.grd
>>> names : index_5, index_6, index_7, index_1, index_2,
>>> index_3, index_4
>>> min values : 440.0467, 444.9182, 437.1589, 444.6946, 440.2028, 429.6900,
>>> 442.7436
>>> max values : 563.8341, 561.7687, 560.4509, 565.8671, 560.1375, 561.7972,
>>> 556.2471
>>>
>>>
>>> > ver_mean
>>> class : RasterStack
>>> dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
>>> resolution : 500, 500 (x, y)
>>> extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
>>> crs : NA
>>> names : layer.1, layer.2, layer.3, layer.4, layer.5,
>>> layer.6, layer.7
>>> min values : 442.7436, 440.0467, 444.9182, 437.1589, 444.6946, 440.2028,
>>> 429.6900
>>> max values : 556.2471, 563.8341, 561.7687, 560.4509, 565.8671, 560.1375,
>>> 561.7972
>>>
>>>
>>> > z
>>> class : RasterBrick
>>> dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
>>> resolution : 500, 500 (x, y)
>>> extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
>>> crs : NA
>>> source :
>>> /tmp/RtmpkRMXLb/raster/r_tmp_2019-11-26_191439_7710_04780.grd
>>> names : X1, X2, X3, X4, X5,
>>> X6, X7
>>> min values : 440.0467, 444.9182, 437.1589, 444.6946, 440.2028, 429.6900,
>>> 442.7436
>>> max values : 563.8341, 561.7687, 560.4509, 565.8671, 560.1375, 561.7972,
>>> 556.2471
>>> : 1, 2, 3, 4, 5, 6, 7
>>>
>>>
>>> On 11/26/19 7:03 PM, Vijay Lulla wrote:
>>>
>>> If you read the code/help for `stackApply` and `zApply` you'll see that
>>> the results that you obtain make sense (at least they seem
>>> sensible/reasonable to me). IMO, if you want to control the ordering of
>>> your layers then just use sapply, like how you've used for ver_mean. IMO,
>>> this is the only reliable (safe?), and quite a readable, way to accomplish
>>> what you're trying to do.
>>> Just my 2 cents.
>>> -- Vijay.
>>>
>>> On Tue, Nov 26, 2019 at 11:19 AM Leonidas Liakos via R-sig-Geo <
>>> r-sig-geo using r-project.org> wrote:
>>>
>>>> I added raster::zApply in my tests to validate the results. However, the
>>>> indices of the names of the results are different now. Recall that the
>>>> goal is to calculate from a raster stack time series the mean per day of
>>>> the week. And that problem I have is that stackApply, zApply and
>>>> calc/sapply return different indices in the result names. New code is
>>>> available here:
>>>> https://gist.github.com/kokkytos/93f315a5ecf59c0b183f9788754bc170
>>>> I'm really curious about missing something.
>>>>
>>>>
>>>> On 11/20/19 3:30 AM, Frederico Faleiro wrote:
>>>> > Hi Leonidas,
>>>> >
>>>> > both results are in the same order, but the name is different.
>>>> > You can rename the first as in the second:
>>>> > names(res) <- names(res2)
>>>> >
>>>> > I provided an example to help you understand the logic.
>>>> >
>>>> > library(raster)
>>>> > beginCluster(2)
>>>> > r <- raster()
>>>> > values(r) <- 1
>>>> > # simple sequential stack from 1 to 6 in all cells
>>>> > s <- stack(r, r*2, r*3, r*4, r*5, r*6)
>>>> > s
>>>> > res <- clusterR(s, stackApply, args = list(indices=c(2,2,3,3,1,1), fun
>>>> > = mean))
>>>> > res
>>>> > res2 <- stackApply(s, c(2,2,3,3,1,1), mean)
>>>> > res2
>>>> > dif <- res - res2
>>>> > # exatly the same order because the difference is zero for all layers
>>>> > dif
>>>> > # rename
>>>> > names(res) <- names(res2)
>>>> >
>>>> > Best regards,
>>>> >
>>>> > Frederico Faleiro
>>>> >
>>>> > On Tue, Nov 19, 2019 at 4:15 PM Leonidas Liakos via R-sig-Geo
>>>> > <r-sig-geo using r-project.org <mailto:r-sig-geo using r-project.org>> wrote:
>>>> >
>>>> > I run the example with clusterR:
>>>> >
>>>> > no_cores <- parallel::detectCores() -1
>>>> > raster::beginCluster(no_cores)
>>>> > ?????? res <- raster::clusterR(inp, raster::stackApply, args =
>>>> > list(indices=c(2,2,3,3,1,1),fun = mean))
>>>> > raster::endCluster()
>>>> >
>>>> > And the result is:
>>>> >
>>>> > > res
>>>> > class?????????? : RasterBrick
>>>> > dimensions : 180, 360, 64800, 3?? (nrow, ncol, ncell, nlayers)
>>>> > resolution : 1, 1?? (x, y)
>>>> > extent???????? : -180, 180, -90, 90?? (xmin, xmax, ymin, ymax)
>>>> > crs?????????????? : +proj=longlat +datum=WGS84 +ellps=WGS84
>>>> > +towgs84=0,0,0
>>>> > source???????? : memory
>>>> > names?????????? : layer.1, layer.2, layer.3
>>>> > min values :???????? 1.5,???????? 3.5,???????? 5.5
>>>> > max values :???????? 1.5,???????? 3.5,???????? 5.5??
>>>> >
>>>> >
>>>> > layer.1, layer.2, layer.3 (?)
>>>> >
>>>> > So what corrensponds to what?
>>>> >
>>>> >
>>>> > If I run:
>>>> >
>>>> > res2 <- stackApply(inp,c(2,2,3,3,1,1),mean)
>>>> >
>>>> > The result is:
>>>> >
>>>> > > res2
>>>> > class : RasterBrick
>>>> > dimensions : 180, 360, 64800, 3 (nrow, ncol, ncell, nlayers)
>>>> > resolution : 1, 1 (x, y)
>>>> > extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
>>>> > crs : +proj=longlat +datum=WGS84 +ellps=WGS84
>>>> +towgs84=0,0,0
>>>> > source : memory
>>>> > names : index_2, index_3, index_1
>>>> > min values : 1.5, 3.5, 5.5
>>>> > max values : 1.5, 3.5, 5.5
>>>> >
>>>> > There is no consistency with the names of the output and obscure
>>>> > correspondence with the indices in the case of clusterR
>>>> >
>>>> >
>>>> > [[alternative HTML version deleted]]
>>>> >
>>>> > _______________________________________________
>>>> > R-sig-Geo mailing list
>>>> > R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
>>>> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>> >
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo using r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>
>>>
>>>
>>
>
--
Vijay Lulla
Assistant Professor,
Dept. of Geography, IUPUI
425 University Blvd, CA-207C.
Indianapolis, IN-46202
vlulla using iupui.edu
ORCID: https://orcid.org/0000-0002-0823-2522
Online: http://vijaylulla.com | https://github.com/vlulla
[[alternative HTML version deleted]]
More information about the R-sig-Geo
mailing list