[R-sig-Geo] SpatialGridDataFrame to data.frame

Ned Horning horning at amnh.org
Wed Feb 11 19:40:28 CET 2009


Roger,

I used the path to the full path to the image.
spot <- 
stack('/home/nedhorning/IPY/ak_grassdata/293318101_1_1_0/PERMANENT/cellhd/subset.red', 
'/home/nedhorning/IPY/ak_grassdata/293318101_1_1_0/PERMANENT/cellhd/subset.green', 
'/home/nedhorning/IPY/ak_grassdata/293318101_1_1_0/PERMANENT/cellhd/subset.blue')

GDAL Error 4: not recognized as a supported file format.

Ned

Roger Bivand wrote:
> On Wed, 11 Feb 2009, Ned Horning wrote:
>
>> Robert and Roger,
>>
>> Thanks for the information and pointers. The raster package looks 
>> quite interesting and I'll try to get up to speed on some of its 
>> capabilities. Are the man pages the best way to do that or is that a 
>> single document available?
>>
>> I made some progress but still have some questions. I followed the 
>> steps laid out by Robert and everything went fine except I ran into 
>> an error with "predrast <- setValues(predrast, pred, r)" in the for 
>> loop when I tried processing one line at a time and "r <- 
>> setValues(r, pred)" when I ran the full image in one go. The error 
>> was: "values must be a vector." Any idea what I'm doing wrong?
>>
>> I tried to read the GRASS files directly but got a message saying it 
>> is not a supported file format. Can you confirm that is the case or 
>> am I doing something wrong? I was able to read a tiff version of the 
>> image. I am able to run gdalinfo on GRASS files just fine from a 
>> terminal window.
>
> Could you quote verbatim what the actual fname= argument was that you 
> used in readGDAL for the plugin - the incantation isn't obvious?
>
> In readRAST6() it is:
>
> paste(gg$GISDBASE, gg$LOCATION_NAME, mapset, "cellhd", vname[1], sep="/")
>
> where all of GISDBASE, LOCATION_NAME, and mapset need to be discovered 
> - you can use gmeta6() for the first two, and .g_findfile() for the 
> third, please read the code in spgrass6 to see how they work.
>
> Roger
>
>>
>> Thanks again for the help.
>>
>> Ned
>>
>>
>> Robert Hijmans wrote:
>>> Ned,
>>>
>>> This is an example of running a RandomForest prediction with the
>>> raster package (for the simple case that there are no NA values in the
>>> raster data; if there are, you have to into account that "predict'
>>> does not return any values (not even NA) for those cells).
>>>
>>> Robert
>>>
>>> #install.packages("raster", repos="http://R-Forge.R-project.org")
>>> require(raster)
>>> require(randomForest)
>>>
>>> # for single band files
>>> spot <- stack('b1.tif', 'b2.tif', 'b3.tif')
>>> # for multiple band files
>>> # spot <- stackFromFiles(c('bands.tif', 'bands.tif', 'bands.tif'), 
>>> c(1,2,3) )
>>>
>>> # simulate random points and values to model with
>>> xy <- xyFromCell(spot, round(runif(100) * ncell(spot)))
>>> response <- runif(100) * 100
>>> # read values of raster layers at points, and bind to respinse
>>> trainvals <- cbind(response, xyValues(spot, xy))
>>>
>>> # run RandomForest
>>> randfor <- randomForest(response ~ b1 + b2 + b3, data=trainvals)
>>>
>>> # apply the prediction, row by row
>>> predrast <- setRaster(spot)
>>> filename(predrast) <- 'RF_pred.grd'
>>> for (r in 1:nrow(spot)) {
>>>     spot <- readRow(spot, r)
>>>     rowvals <- values(spot, names=TRUE)
>>> # this next line should not be necessary, but it is
>>> # I'll fix that
>>>     colnames(rowvals) <- c('b1', 'b2', 'b3')
>>>     pred <- predict(randfor, rowvals)
>>>     predrast <- setValues(predrast, pred, r)
>>>     predrast <- writeRaster(predrast, overwrite=TRUE)
>>> }
>>>
>>> plot(predrast)
>>>
>>>
>>>
>>>
>>> On Wed, Feb 11, 2009 at 5:09 PM, Roger Bivand <Roger.Bivand at nhh.no> 
>>> wrote:
>>>
>>>> Ned:
>>>>
>>>>
>>>> The three bands are most likely treated as 4-byte integers, so the 
>>>> object
>>>> will be 2732 by 3058 by 3 by 4 plus a little bit. That's 100MB. 
>>>> They may
>>>> get copied too. There are no single byte user-level containers for you
>>>> (there is a raw data type, but you can't calculate with it). Possibly
>>>> saying spot_frame <- slot(spot, "data") will save one copying 
>>>> operation,
>>>> but its hard to tell - your choice of method first adds inn all the
>>>> coordinates, which are 8-byte numbers, so more than doubles its 
>>>> size and
>>>> makes more copies (to 233MB for each copy). Running gc() several times
>>>> manually between steps often helps by making the garbage collector 
>>>> more
>>>> aggressive.
>>>>
>>>> I would watch the developments in the R-Forge package "raster", which
>>>> builds on some of these things, and try to see how that works. If 
>>>> you have
>>>> the GDAL-GRASS plugin for rasters, you can use readGDAL to read 
>>>> from GRASS
>>>> - which would work with raster package functions now. Look at the 
>>>> code of
>>>> recent readRAST6 to see which incantations are needed. If you are 
>>>> going to
>>>> use randomForest for prediction, you can use smaller tiles until 
>>>> you find
>>>> an alternative solution. Note that feeding a data frame of integers 
>>>> to a
>>>> model fitting or prediction function will result in coercion to a
>>>> matrix of doubles, so your subsequent workflow should take that into
>>>> account.
>>>>  Getting more memory is another option, and may be very cost and 
>>>> especially
>>>> time effective - at the moment your machine is swapping. Buying 
>>>> memory may
>>>> save you time programming around too little memory.
>>>>
>>>> Hope this helps,
>>>>
>>>> Roger
>>>>
>>>>
>>>> ---
>>>> Roger Bivand, NHH, Helleveien 30, N-5045 Bergen,
>>>> Roger.Bivand at nhh.no
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: r-sig-geo-bounces at stat.math.ethz.ch on behalf of Ned Horning
>>>> Sent: Wed 11.02.2009 07:40
>>>> To: r-sig-geo at stat.math.ethz.ch
>>>> Subject: [R-sig-Geo] SpatialGridDataFrame to data.frame
>>>>
>>>> Greetings,
>>>>
>>>> I am trying to read an image from GRASS using the spgrass6 command
>>>> readRAST6 and then convert it into a data.frame object so I can use it
>>>> with randomForest. The byte image I'm reading is 2732 rows x 3058
>>>> columns x 3 bands. It's a small subset of a larger image I would 
>>>> like to
>>>> use eventually. I have no problem reading the image using readRAST6 
>>>> but
>>>> when I try to convert it to a data.frame object my linux system
>>>> resources (1BG RAM, 3GB swap) nearly get maxed out and it runs for a
>>>> couple hours before I kill the process. The image is less than 25MB so
>>>> I'm surprised it requires this level of memory. Can someone let me 
>>>> know
>>>> why this is. Should I use something other than the GRASS interface for
>>>> this? These are the commands I'm using:
>>>>
>>>> spot <- readRAST6(c("subset.red", "subset.green", "subset.blue"))
>>>> spot_frame <- as(spot, "data.frame")
>>>>
>>>> Any help would be appreciated.
>>>>
>>>> All the best,
>>>>
>>>> Ned
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>>
>>>>
>>>>
>>>>        [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>>
>>>
>>>
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>



More information about the R-sig-Geo mailing list