[R-sig-Geo] 'xyValues' from 'raster' package slow

Robert J. Hijmans r.hijmans at gmail.com
Tue Dec 1 22:14:33 CET 2009


Karl and others: There is a new version of xyValues (in raster version
0.9.8-31)  that --in a single test-- is 30 times faster then the
previous version (for files read with rgdal; I have not yet updated
the function for the 'native formats'). Robert

On Tue, Dec 1, 2009 at 10:29 AM, Robert J. Hijmans <r.hijmans at gmail.com> wrote:
> Dear Karl,
>
> I agree that xyValues is not very fast; I can probably speed it up a
> bit (there can be a lot of '==' in there, I do not know where all the
> gc()s comes from), but for now here are some suggestions:
>
> # given
> x = seq(0, 20, length = 100)
> y = seq(55, 70, length = 100)
> xy = expand.grid(x, y)
>
> # reading all values into memory first will speed things up
> # but this may not be feasible (memory wise):
> r = raster("file.ers", values=TRUE)
> z = xyValues(r, xy)
>
> An alternative, assuming that what you want is all values for a
> rectangular area (I cannot determine that from the example).
> r = raster("file.ers")
> sr <- rowFromY(r, min(x))
> er <- rowFromY(r, max(x))
> nr <- er-sr+1
> sc <- colFromX(r, min(y))
> ec <- colFromX(r, max(y))
> r <- readBlock(r, sr, nr, sc, ec)
> z <- values(r)
>
> ## Or this (same assumption; but much less verbose):
> r = raster("file.ers")
> rr <- crop(r, extent(min(x), max(x), min(y), max(y)))
> z <- values(rr)
>
> # If you do not want all the values for the rectangle, it could still be
> # quicker to do the above,  followed by xyValues,
> # under the assumption that rr is small enough to keep all values
> # in memory while r was not.
>
> r = raster("file.ers")
> rr <- crop(r, extent(min(x), max(x), min(y), max(y)))
> z = xyValues(rr, xy)
>
>
> Hope this helps,
> Robert
>
> On Tue, Dec 1, 2009 at 4:11 AM, Karl Ove Hufthammer <karl at huftis.org> wrote:
>> Dear list members
>>
>> I'm using the 'raster' package to load a rather large (~2 GB) file of
>> bathymetry data. To extract elevation/depth data for various positions,
>> I use the 'xyValues' function.
>>
>> However, it turns that this operations is extremely slow for extracting
>> a even moderately large number of values. Here is in essence my code:
>>
>> r = raster("file.ers")
>> x = seq(0, 20, length = 100)
>> y = seq(55, 70, length = 100)
>> xy = expand.grid(x, y)
>> z = xyValues(r, xy)
>>
>> Am I doing something wrong here? Using 'readBin' or 'readBinFragments'
>> on the same file, to extract the same values, takes almost no time. But
>> then I have to calculate the indices, positions &c. myself, which is
>> tedious and easy get wrong (having to take into account half-pixel
>> shifts and other details).
>>
>> Using Rprof indicates that a lot of 'self.time' is taken up by 'gc' (and
>> '=='), which does sound a bit strange.
>>
>> --
>> Karl Ove Hufthammer
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>



More information about the R-sig-Geo mailing list