[R-sig-Geo] 'xyValues' from 'raster' package slow

Robert J. Hijmans r.hijmans at gmail.com
Tue Dec 1 19:29:52 CET 2009


Dear Karl,

I agree that xyValues is not very fast; I can probably speed it up a
bit (there can be a lot of '==' in there, I do not know where all the
gc()s comes from), but for now here are some suggestions:

# given
x = seq(0, 20, length = 100)
y = seq(55, 70, length = 100)
xy = expand.grid(x, y)

# reading all values into memory first will speed things up
# but this may not be feasible (memory wise):
r = raster("file.ers", values=TRUE)
z = xyValues(r, xy)

An alternative, assuming that what you want is all values for a
rectangular area (I cannot determine that from the example).
r = raster("file.ers")
sr <- rowFromY(r, min(x))
er <- rowFromY(r, max(x))
nr <- er-sr+1
sc <- colFromX(r, min(y))
ec <- colFromX(r, max(y))
r <- readBlock(r, sr, nr, sc, ec)
z <- values(r)

## Or this (same assumption; but much less verbose):
r = raster("file.ers")
rr <- crop(r, extent(min(x), max(x), min(y), max(y)))
z <- values(rr)

# If you do not want all the values for the rectangle, it could still be
# quicker to do the above,  followed by xyValues,
# under the assumption that rr is small enough to keep all values
# in memory while r was not.

r = raster("file.ers")
rr <- crop(r, extent(min(x), max(x), min(y), max(y)))
z = xyValues(rr, xy)


Hope this helps,
Robert

On Tue, Dec 1, 2009 at 4:11 AM, Karl Ove Hufthammer <karl at huftis.org> wrote:
> Dear list members
>
> I'm using the 'raster' package to load a rather large (~2 GB) file of
> bathymetry data. To extract elevation/depth data for various positions,
> I use the 'xyValues' function.
>
> However, it turns that this operations is extremely slow for extracting
> a even moderately large number of values. Here is in essence my code:
>
> r = raster("file.ers")
> x = seq(0, 20, length = 100)
> y = seq(55, 70, length = 100)
> xy = expand.grid(x, y)
> z = xyValues(r, xy)
>
> Am I doing something wrong here? Using 'readBin' or 'readBinFragments'
> on the same file, to extract the same values, takes almost no time. But
> then I have to calculate the indices, positions &c. myself, which is
> tedious and easy get wrong (having to take into account half-pixel
> shifts and other details).
>
> Using Rprof indicates that a lot of 'self.time' is taken up by 'gc' (and
> '=='), which does sound a bit strange.
>
> --
> Karl Ove Hufthammer
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>



More information about the R-sig-Geo mailing list