[R-sig-Geo] SpatialGridDataFrame to data.frame

Ned Horning horning at amnh.org
Wed Feb 11 19:43:50 CET 2009


Robert,

Using predrast <- setValues(predrast, as.vector(pred), r) I got another 
error: values must be numeric, integer or logical.

class(pred) = "factor"
dim(pred) = NULL
class(v) = "character"
length(v) == ncol(spot) = TRUE

Ned

Robert Hijmans wrote:
> Strange. You could try
>       predrast <- setValues(predrast, as.vector(pred), r)
>
> But it would be good to know what pred is.
>
> Can you do this:
>
> class(pred)
> dim(pred)
> v <- as.vector(pred)
> class(v)
> length(v) == ncol(spot)
>
>
> Robert
>
>
>
>
> On Wed, Feb 11, 2009 at 11:16 PM, Ned Horning <horning at amnh.org> wrote:
>   
>> Robert and Roger,
>>
>> Thanks for the information and pointers. The raster package looks quite
>> interesting and I'll try to get up to speed on some of its capabilities. Are
>> the man pages the best way to do that or is that a single document
>> available?
>>
>> I made some progress but still have some questions. I followed the steps
>> laid out by Robert and everything went fine except I ran into an error with
>> "predrast <- setValues(predrast, pred, r)" in the for loop when I tried
>> processing one line at a time and "r <- setValues(r, pred)" when I ran the
>> full image in one go. The error was: "values must be a vector." Any idea
>> what I'm doing wrong?
>>
>> I tried to read the GRASS files directly but got a message saying it is not
>> a supported file format. Can you confirm that is the case or am I doing
>> something wrong? I was able to read a tiff version of the image. I am able
>> to run gdalinfo on GRASS files just fine from a terminal window.
>>
>> Thanks again for the help.
>>
>> Ned
>>
>>
>> Robert Hijmans wrote:
>>     
>>> Ned,
>>>
>>> This is an example of running a RandomForest prediction with the
>>> raster package (for the simple case that there are no NA values in the
>>> raster data; if there are, you have to into account that "predict'
>>> does not return any values (not even NA) for those cells).
>>>
>>> Robert
>>>
>>> #install.packages("raster", repos="http://R-Forge.R-project.org")
>>> require(raster)
>>> require(randomForest)
>>>
>>> # for single band files
>>> spot <- stack('b1.tif', 'b2.tif', 'b3.tif')
>>> # for multiple band files
>>> # spot <- stackFromFiles(c('bands.tif', 'bands.tif', 'bands.tif'),
>>> c(1,2,3) )
>>>
>>> # simulate random points and values to model with
>>> xy <- xyFromCell(spot, round(runif(100) * ncell(spot)))
>>> response <- runif(100) * 100
>>> # read values of raster layers at points, and bind to respinse
>>> trainvals <- cbind(response, xyValues(spot, xy))
>>>
>>> # run RandomForest
>>> randfor <- randomForest(response ~ b1 + b2 + b3, data=trainvals)
>>>
>>> # apply the prediction, row by row
>>> predrast <- setRaster(spot)
>>> filename(predrast) <- 'RF_pred.grd'
>>> for (r in 1:nrow(spot)) {
>>>        spot <- readRow(spot, r)
>>>        rowvals <- values(spot, names=TRUE)
>>> # this next line should not be necessary, but it is
>>> # I'll fix that
>>>        colnames(rowvals) <- c('b1', 'b2', 'b3')
>>>        pred <- predict(randfor, rowvals)
>>>        predrast <- setValues(predrast, pred, r)
>>>        predrast <- writeRaster(predrast, overwrite=TRUE)
>>> }
>>>
>>> plot(predrast)
>>>
>>>
>>>
>>>
>>> On Wed, Feb 11, 2009 at 5:09 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>>>
>>>       
>>>> Ned:
>>>>
>>>>
>>>> The three bands are most likely treated as 4-byte integers, so the object
>>>> will be 2732 by 3058 by 3 by 4 plus a little bit. That's 100MB. They may
>>>> get copied too. There are no single byte user-level containers for you
>>>> (there is a raw data type, but you can't calculate with it). Possibly
>>>> saying spot_frame <- slot(spot, "data") will save one copying operation,
>>>> but its hard to tell - your choice of method first adds inn all the
>>>> coordinates, which are 8-byte numbers, so more than doubles its size and
>>>> makes more copies (to 233MB for each copy). Running gc() several times
>>>> manually between steps often helps by making the garbage collector more
>>>> aggressive.
>>>>
>>>> I would watch the developments in the R-Forge package "raster", which
>>>> builds on some of these things, and try to see how that works. If you
>>>> have
>>>> the GDAL-GRASS plugin for rasters, you can use readGDAL to read from
>>>> GRASS
>>>> - which would work with raster package functions now. Look at the code of
>>>> recent readRAST6 to see which incantations are needed. If you are going
>>>> to
>>>> use randomForest for prediction, you can use smaller tiles until you find
>>>> an alternative solution. Note that feeding a data frame of integers to a
>>>> model fitting or prediction function will result in coercion to a
>>>> matrix of doubles, so your subsequent workflow should take that into
>>>> account.
>>>>  Getting more memory is another option, and may be very cost and
>>>> especially
>>>> time effective - at the moment your machine is swapping. Buying memory
>>>> may
>>>> save you time programming around too little memory.
>>>>
>>>> Hope this helps,
>>>>
>>>> Roger
>>>>
>>>>
>>>> ---
>>>> Roger Bivand, NHH, Helleveien 30, N-5045 Bergen,
>>>> Roger.Bivand at nhh.no
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: r-sig-geo-bounces at stat.math.ethz.ch on behalf of Ned Horning
>>>> Sent: Wed 11.02.2009 07:40
>>>> To: r-sig-geo at stat.math.ethz.ch
>>>> Subject: [R-sig-Geo] SpatialGridDataFrame to data.frame
>>>>
>>>> Greetings,
>>>>
>>>> I am trying to read an image from GRASS using the spgrass6 command
>>>> readRAST6 and then convert it into a data.frame object so I can use it
>>>> with randomForest. The byte image I'm reading is 2732 rows x 3058
>>>> columns x 3 bands. It's a small subset of a larger image I would like to
>>>> use eventually. I have no problem reading the image using readRAST6 but
>>>> when I try to convert it to a data.frame object my linux system
>>>> resources (1BG RAM, 3GB swap) nearly get maxed out and it runs for a
>>>> couple hours before I kill the process. The image is less than 25MB so
>>>> I'm surprised it requires this level of memory. Can someone let me know
>>>> why this is. Should I use something other than the GRASS interface for
>>>> this? These are the commands I'm using:
>>>>
>>>> spot <- readRAST6(c("subset.red", "subset.green", "subset.blue"))
>>>> spot_frame <- as(spot, "data.frame")
>>>>
>>>> Any help would be appreciated.
>>>>
>>>> All the best,
>>>>
>>>> Ned
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>>
>>>>
>>>>
>>>>       [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>>
>>>>         
>>>       
>>     
>
>



More information about the R-sig-Geo mailing list