[R-sig-Geo] Tiled processing...

Ashton Shortridge ashton at msu.edu
Thu Oct 30 14:46:15 CET 2008


Hi Jonathan,

I use GDAL to read subsets of raster and image files, though usually I employ 
the python bindings. I don't actually have answers, but maybe ideas about how 
to get there...

> 1) The code doesn't actually seem to work -- I get a tiny, unreadable
> TIFF out of the back of this algorithm -- what is wrong?
I don't know; however, you can check to see that you are actually reading the 
data that you think you are. Insert a line like print(x) or print(elevnew) in 
the middle. If that works, then the problem is in how you are writing it.

> 2) I'm actually unclear about exactly what this is doing -- am I really
> just creating a large in-memory matrix (ds2) and then writing the entire
> output image out to disk, or is this somehow writing line-by-line?  If
> the former is true, how do I modify this to write within the loop,
> rather than all-at-once?
Yes, that is what your code seems to be doing - you are writing each line to 
an object in memory and then saving it outside of the loop as a tiff. In lots 
of batch processing you can add to a file incrementally (and therefore work 
with really large files in a memory-limited environment), but from my limited 
perspective, it doesn't appear that this is practical with gdal.

Your code therefore is not saving you lots of memory. It is saving you some - 
you don't have the entire input image in memory, just a row at a time, but 
the output is sitting there the whole time.

Another possibility might be to write small tiffs within the loop and then 
merge the tiffs, possibly (probably) outside of R, e.g. in a batch file 
employing command-line gdal.

Hope this helps, 

Ashton

On Wednesday 29 October 2008 09:44:51 pm Jonathan Greenberg wrote:
> I've recently got back into using R to perform spatial analyses, and I'm
> trying to figure out how to perform "true" tiled processing, e.g.
> controlled reading of subsets of an input file, performing a function on
> this subset, and writing the output, subset by subset, to an output file
> and, finally, setting up the appropriate "header" info (the metadata).
>
> Using suggestions Tim Keitt and Roger Bivand gave me some years back, I
> wrote this simple code (all it should do is take an elevation geotiff
> and add 1000 to the elevations, writing the output):
>
> ***
>
> library(rgdal)
> ds1 <- GDAL.open("elev.tif")
> driver <- new('GDALDriver', 'GTiff')
> ds2 <- new('GDALTransientDataset', driver, nrow(ds1), ncol(ds1), type =
> "Float32")
>   for (row in nrow(ds1)) {
>     row
>     x <- getRasterData(ds1, offset = c(row - 1, 0), region = c(1,
> ncol(ds1)))
>     elevnew <- x+1000
>     putRasterData(ds2, elevnew, offset = c(row - 1, 0))
>
>   }
> saveDataset(ds2, "out.tif")
> closeDataset(ds2)
> closeDataset(ds1)
>
> ***
>
> Couple of issues/questions:
> 1) The code doesn't actually seem to work -- I get a tiny, unreadable
> TIFF out of the back of this algorithm -- what is wrong?
> 2) I'm actually unclear about exactly what this is doing -- am I really
> just creating a large in-memory matrix (ds2) and then writing the entire
> output image out to disk, or is this somehow writing line-by-line?  If
> the former is true, how do I modify this to write within the loop,
> rather than all-at-once?
>
> Thanks!
>
> --j



-- 
Ashton Shortridge
Associate Professor			ashton at msu.edu
Dept of Geography			http://www.msu.edu/~ashton
235 Geography Building		ph (517) 432-3561
Michigan State University		fx (517) 432-1671




More information about the R-sig-Geo mailing list