[R-sig-Geo] raster: stackApply problems..
Leonidas Liakos
|eon|d@@_||@ko@ @end|ng |rom y@hoo@gr
Wed Nov 27 18:43:13 CET 2019
Thank you for your help!
I tried to fix stackApply according to your instructions.
Now the indices of names are the same and consistent with indices
enumeration (gist for validation and tests:
https://gist.github.com/kokkytos/93f315a5ecf59c0b183f9788754bc170).
I've attached a patch file here:
https://gist.github.com/kokkytos/ca2c319134677b19900579665267a7a7
> stackapply_mean
class : RasterBrick
dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
resolution : 500, 500 (x, y)
extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
crs : NA
source : /tmp/Rtmp9W8UNc/raster/r_tmp_2019-11-27_191205_2929_20324.grd
names : index_5, index_6, index_7, index_1, index_2,
index_3, index_4
min values : 444.6946, 440.2028, 429.6900, 442.7436, 440.0467, 444.9182,
437.1589
max values : 565.8671, 560.1375, 561.7972, 556.2471, 563.8341, 561.7687,
560.4509
> ver_mean
class : RasterStack
dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
resolution : 500, 500 (x, y)
extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
crs : NA
names : layer.1, layer.2, layer.3, layer.4, layer.5,
layer.6, layer.7
min values : 442.7436, 440.0467, 444.9182, 437.1589, 444.6946, 440.2028,
429.6900
max values : 556.2471, 563.8341, 561.7687, 560.4509, 565.8671, 560.1375,
561.7972
On 11/26/19 10:58 PM, Vijay Lulla wrote:
> Hmm...it appears that stackApply is using different conditions which
> might be causing this problem. Below is the snippet of the code which
> I think might be the problem.
>
> ## For canProcessInMemory
> if (rowcalc) {
> v <- lapply(uin, function(i) fun(x[, ind == i, drop = FALSE], na.rm
> = na.rm))
> }
> else {
> v <- lapply(uin, function(i, ...) apply(x[, ind == i, drop = FALSE],
> 1, fun, na.rm = na.rm))
> }
>
>
> ## If canProcessInMemory is not TRUE
> if (rowcalc) {
> v <- lapply(uin, function(i) fun(a[, ind == uin[i], drop = FALSE],
> na.rm = na.rm))
> }
> else {
> v <- lapply(uin, function(i, ...) apply(a[, ind == uin[i], drop =
> FALSE], 1, fun, na.rm = na.rm))
> }
>
> I think they should both be same but it appears that they aren't and
> that's what you've discovered. Maybe you can try fix(stackApply) to
> see if it really is the problem and can tell us what you find.
> Anyways, good catch...and...sorry for wasting your time.
> Cordially,
> Vijay.
>
> On Tue, Nov 26, 2019 at 2:53 PM Leonidas Liakos
> <leonidas_liakos using yahoo.gr <mailto:leonidas_liakos using yahoo.gr>> wrote:
>
> Thank you!
> The problem is not with the resulting values but with the index
> mapping. Values are correct in all three cases.
>
> As I wrote in a previous post in the thread
> (https://stat.ethz.ch/pipermail/r-sig-geo/2019-November/027821.html)
> , stackApply behaves inconsistently depending on whether the
> exported stack will remain in memory or it will be stored, due to
> its large size, on the hard disk.
>
> In the first case the indices are identical to my function
> (ver_mean) and the lubridate::wday indexing system (and are
> correct) while in the second they are shuffled.
>
> So, while Sunday has index 1 and while in the first case (when the
> result is in memory) stackApply returns the correct index, in the
> second case (when the result is stored on the hard disk) it
> returns index_4! So how can one be sure if index_1 corresponds to
> Sunday or another day using stackApply since it sometimes
> enumerates it with index_1 and sometimes index_4?
>
>
> Try to run this example (when the resulting stack remains in
> memory) to see that the indexes are identical (stackApply =
> ver_median = lubridate::wday)
> https://gist.github.com/kokkytos/5d554b5a725bb48d2189e2d1fa0e2206
>
> Thank you again
>
> On 11/26/19 9:00 PM, Vijay Lulla wrote:
>> I'm sorry for the miscommunication. What I meant to say is that
>> the output from stackApply and zApply are the same (because
>> zApply uses stackApply internally) except the names. The
>> behavior of stackApply makes sense because AFAIUI R doesn't
>> automatically reorder vectors/indices that it receives. Your
>> observation about inconsistent result with ver_mean is very valid
>> though! And, that's what I meant with my comment that using
>> sapply with the explicit ordering that you want is the best way
>> to control what raster package will output. In R the input order
>> should be maintained (this is the prime difference between SQL
>> and R) but packages/tools do not always adhere to this...so it's
>> never clear how the output will be ordered. Sorry for the confusion.
>>
>>
>> On Tue, Nov 26, 2019 at 12:22 PM Leonidas Liakos
>> <leonidas_liakos using yahoo.gr <mailto:leonidas_liakos using yahoo.gr>> wrote:
>>
>> Why do they seem logical since they do not match?
>>
>> Check for example index 1 (Sunday). The results are different
>> for the three processes
>>
>> > stackapply_mean
>> class : RasterBrick
>> dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
>> resolution : 500, 500 (x, y)
>> extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
>> crs : NA
>> source :
>> /tmp/RtmpkRMXLb/raster/r_tmp_2019-11-26_191359_7710_20324.grd
>> names : index_5, index_6, index_7, index_1,
>> index_2, index_3, index_4
>> min values : 440.0467, 444.9182, 437.1589, 444.6946,
>> 440.2028, 429.6900, 442.7436
>> max values : 563.8341, 561.7687, 560.4509, 565.8671,
>> 560.1375, 561.7972, 556.2471
>>
>>
>> > ver_mean
>> class : RasterStack
>> dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
>> resolution : 500, 500 (x, y)
>> extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
>> crs : NA
>> names : layer.1, layer.2, layer.3, layer.4,
>> layer.5, layer.6, layer.7
>> min values : 442.7436, 440.0467, 444.9182, 437.1589,
>> 444.6946, 440.2028, 429.6900
>> max values : 556.2471, 563.8341, 561.7687, 560.4509,
>> 565.8671, 560.1375, 561.7972
>>
>>
>> > z
>> class : RasterBrick
>> dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
>> resolution : 500, 500 (x, y)
>> extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
>> crs : NA
>> source :
>> /tmp/RtmpkRMXLb/raster/r_tmp_2019-11-26_191439_7710_04780.grd
>> names : X1, X2, X3, X4,
>> X5, X6, X7
>> min values : 440.0467, 444.9182, 437.1589, 444.6946,
>> 440.2028, 429.6900, 442.7436
>> max values : 563.8341, 561.7687, 560.4509, 565.8671,
>> 560.1375, 561.7972, 556.2471
>> : 1, 2, 3, 4, 5, 6, 7
>>
>>
>> On 11/26/19 7:03 PM, Vijay Lulla wrote:
>>> If you read the code/help for `stackApply` and `zApply`
>>> you'll see that the results that you obtain make sense (at
>>> least they seem sensible/reasonable to me). IMO, if you
>>> want to control the ordering of your layers then just use
>>> sapply, like how you've used for ver_mean. IMO, this is the
>>> only reliable (safe?), and quite a readable, way to
>>> accomplish what you're trying to do.
>>> Just my 2 cents.
>>> -- Vijay.
>>>
>>> On Tue, Nov 26, 2019 at 11:19 AM Leonidas Liakos via
>>> R-sig-Geo <r-sig-geo using r-project.org
>>> <mailto:r-sig-geo using r-project.org>> wrote:
>>>
>>> I added raster::zApply in my tests to validate the
>>> results. However, the
>>> indices of the names of the results are different now.
>>> Recall that the
>>> goal is to calculate from a raster stack time series the
>>> mean per day of
>>> the week. And that problem I have is that stackApply,
>>> zApply and
>>> calc/sapply return different indices in the result
>>> names. New code is
>>> available here:
>>> https://gist.github.com/kokkytos/93f315a5ecf59c0b183f9788754bc170
>>> I'm really curious about missing something.
>>>
>>>
>>> On 11/20/19 3:30 AM, Frederico Faleiro wrote:
>>> > Hi Leonidas,
>>> >
>>> > both results are in the same order, but the name is
>>> different.
>>> > You can rename the first as in the second:
>>> > names(res) <- names(res2)
>>> >
>>> > I provided an example to help you understand the logic.
>>> >
>>> > library(raster)
>>> > beginCluster(2)
>>> > r <- raster()
>>> > values(r) <- 1
>>> > # simple sequential stack from 1 to 6 in all cells
>>> > s <- stack(r, r*2, r*3, r*4, r*5, r*6)
>>> > s
>>> > res <- clusterR(s, stackApply, args =
>>> list(indices=c(2,2,3,3,1,1), fun
>>> > = mean))
>>> > res
>>> > res2 <- stackApply(s, c(2,2,3,3,1,1), mean)
>>> > res2
>>> > dif <- res - res2
>>> > # exatly the same order because the difference is zero
>>> for all layers
>>> > dif
>>> > # rename
>>> > names(res) <- names(res2)
>>> >
>>> > Best regards,
>>> >
>>> > Frederico Faleiro
>>> >
>>> > On Tue, Nov 19, 2019 at 4:15 PM Leonidas Liakos via
>>> R-sig-Geo
>>> > <r-sig-geo using r-project.org
>>> <mailto:r-sig-geo using r-project.org>
>>> <mailto:r-sig-geo using r-project.org
>>> <mailto:r-sig-geo using r-project.org>>> wrote:
>>> >
>>> > I run the example with clusterR:
>>> >
>>> > no_cores <- parallel::detectCores() -1
>>> > raster::beginCluster(no_cores)
>>> > ?????? res <- raster::clusterR(inp,
>>> raster::stackApply, args =
>>> > list(indices=c(2,2,3,3,1,1),fun = mean))
>>> > raster::endCluster()
>>> >
>>> > And the result is:
>>> >
>>> > > res
>>> > class?????????? : RasterBrick
>>> > dimensions : 180, 360, 64800, 3?? (nrow, ncol,
>>> ncell, nlayers)
>>> > resolution : 1, 1?? (x, y)
>>> > extent???????? : -180, 180, -90, 90?? (xmin, xmax,
>>> ymin, ymax)
>>> > crs?????????????? : +proj=longlat +datum=WGS84
>>> +ellps=WGS84
>>> > +towgs84=0,0,0
>>> > source???????? : memory
>>> > names?????????? : layer.1, layer.2, layer.3
>>> > min values :???????? 1.5,???????? 3.5,???????? 5.5
>>> > max values :???????? 1.5,???????? 3.5,???????? 5.5??
>>> >
>>> >
>>> > layer.1, layer.2, layer.3 (?)
>>> >
>>> > So what corrensponds to what?
>>> >
>>> >
>>> > If I run:
>>> >
>>> > res2 <- stackApply(inp,c(2,2,3,3,1,1),mean)
>>> >
>>> > The result is:
>>> >
>>> > > res2
>>> > class : RasterBrick
>>> > dimensions : 180, 360, 64800, 3 (nrow, ncol,
>>> ncell, nlayers)
>>> > resolution : 1, 1 (x, y)
>>> > extent : -180, 180, -90, 90 (xmin, xmax,
>>> ymin, ymax)
>>> > crs : +proj=longlat +datum=WGS84
>>> +ellps=WGS84 +towgs84=0,0,0
>>> > source : memory
>>> > names : index_2, index_3, index_1
>>> > min values : 1.5, 3.5, 5.5
>>> > max values : 1.5, 3.5, 5.5
>>> >
>>> > There is no consistency with the names of the
>>> output and obscure
>>> > correspondence with the indices in the case of
>>> clusterR
>>> >
>>> >
>>> > [[alternative HTML version deleted]]
>>> >
>>> > _______________________________________________
>>> > R-sig-Geo mailing list
>>> > R-sig-Geo using r-project.org
>>> <mailto:R-sig-Geo using r-project.org>
>>> <mailto:R-sig-Geo using r-project.org
>>> <mailto:R-sig-Geo using r-project.org>>
>>> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>> >
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>>
>>>
>>
>
[[alternative HTML version deleted]]
More information about the R-sig-Geo
mailing list