[R-sig-Geo] SpatialGridDataFrame to data.frame

Robert Hijmans r.hijmans at gmail.com
Thu Feb 12 07:40:52 CET 2009


Hi Ned,

Good to hear that. For your other question, have a look at:

require(maptools)
?readShapePoly

require(sp)
?sample.Polygons

Robert


On Thu, Feb 12, 2009 at 2:30 PM, Ned Horning <horning at amnh.org> wrote:
> Robert,
>
> This worked - thanks. It's always uplifting to see an actual image after
> working on something for a while. Now I can start playing with parameters
> and playing with different approaches. I'm just (re)starting my R education
> and I'm pretty slow getting the hang of it but your examples help a lot.
> They also give me more avenues to discover other functionality and different
> ways of doing things. I hope I can keep working with R and GRASS until I
> know it this time. My goal is to get to the point where I can be productive
> with these packages and start training other folks.
>
> The raster package looks very nice and I'll keep an eye on its development.
> One step that I am currently doing in GRASS is to read a Shapefile with
> training data (polygons with an integer attribute representing a land cover
> type) and then randomly create 100 points within each set of polygons
> representing a specific land cover type. I do this for each land cover type
> and then concatenate the results into a text file. This file has the
> coordinates that I use in xyValues to get the pixels values from the SPOT
> image. Is there a way to do this sampling using the raster package or
> another R package that you are familiar with? In GRASS I convert the
> Shapefile to a raster image before doing the random sampling and it would be
> nice if I could skip this step.
>
> All the best,
>
> Ned
>
> Robert Hijmans wrote:
>>
>> I see, in my example, I had a single quantitative variable but you
>> probably have land cover classes or something like that. If the
>> classes are in fact numbers stored as text then use
>>
>> pred <- as.numeric(pred)
>>
>> but if you have words such as 'forest', 'crops', 'water' you could do
>> something like
>>
>> ...
>>   pred <- predict(randfor, rowvals)
>>   pred[pred=='forest'] <- 1
>>   pred[pred=='crops'] <- 2
>>   pred[pred=='water'] <- 3
>>   pred <- as.numeric(pred)
>>   predrast <- setValues(predrast, pred, r)
>> ...
>>
>> not pretty, you could also fit RF with classes that can be interpreted
>> as numbers..
>> Make sure you do not get:
>>
>> Warning message:
>> NAs introduced by coercion
>>
>>
>> which would suggest that some character values could not be
>> transformed to numbers.
>>
>> On Thu, Feb 12, 2009 at 1:56 AM, Ned Horning <horning at amnh.org> wrote:
>>
>>>
>>> Robert,
>>>
>>> Using predrast <- setValues(predrast, as.vector(pred), r) I got another
>>> error: values must be numeric, integer or logical.
>>>
>>> class(pred) = "factor"
>>> dim(pred) = NULL
>>> class(v) = "character"
>>> length(v) == ncol(spot) = TRUE
>>>
>>>
>>>
>>>
>>>
>>> Ned
>>>
>>> Robert Hijmans wrote:
>>>
>>>>
>>>> Strange. You could try
>>>>     predrast <- setValues(predrast, as.vector(pred), r)
>>>>
>>>> But it would be good to know what pred is.
>>>>
>>>> Can you do this:
>>>>
>>>> class(pred)
>>>> dim(pred)
>>>> v <- as.vector(pred)
>>>> class(v)
>>>> length(v) == ncol(spot)
>>>>
>>>>
>>>> Robert
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Feb 11, 2009 at 11:16 PM, Ned Horning <horning at amnh.org> wrote:
>>>>
>>>>
>>>>>
>>>>> Robert and Roger,
>>>>>
>>>>> Thanks for the information and pointers. The raster package looks quite
>>>>> interesting and I'll try to get up to speed on some of its
>>>>> capabilities.
>>>>> Are
>>>>> the man pages the best way to do that or is that a single document
>>>>> available?
>>>>>
>>>>> I made some progress but still have some questions. I followed the
>>>>> steps
>>>>> laid out by Robert and everything went fine except I ran into an error
>>>>> with
>>>>> "predrast <- setValues(predrast, pred, r)" in the for loop when I tried
>>>>> processing one line at a time and "r <- setValues(r, pred)" when I ran
>>>>> the
>>>>> full image in one go. The error was: "values must be a vector." Any
>>>>> idea
>>>>> what I'm doing wrong?
>>>>>
>>>>> I tried to read the GRASS files directly but got a message saying it is
>>>>> not
>>>>> a supported file format. Can you confirm that is the case or am I doing
>>>>> something wrong? I was able to read a tiff version of the image. I am
>>>>> able
>>>>> to run gdalinfo on GRASS files just fine from a terminal window.
>>>>>
>>>>> Thanks again for the help.
>>>>>
>>>>> Ned
>>>>>
>>>>>
>>>>> Robert Hijmans wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> Ned,
>>>>>>
>>>>>> This is an example of running a RandomForest prediction with the
>>>>>> raster package (for the simple case that there are no NA values in the
>>>>>> raster data; if there are, you have to into account that "predict'
>>>>>> does not return any values (not even NA) for those cells).
>>>>>>
>>>>>> Robert
>>>>>>
>>>>>> #install.packages("raster", repos="http://R-Forge.R-project.org")
>>>>>> require(raster)
>>>>>> require(randomForest)
>>>>>>
>>>>>> # for single band files
>>>>>> spot <- stack('b1.tif', 'b2.tif', 'b3.tif')
>>>>>> # for multiple band files
>>>>>> # spot <- stackFromFiles(c('bands.tif', 'bands.tif', 'bands.tif'),
>>>>>> c(1,2,3) )
>>>>>>
>>>>>> # simulate random points and values to model with
>>>>>> xy <- xyFromCell(spot, round(runif(100) * ncell(spot)))
>>>>>> response <- runif(100) * 100
>>>>>> # read values of raster layers at points, and bind to respinse
>>>>>> trainvals <- cbind(response, xyValues(spot, xy))
>>>>>>
>>>>>> # run RandomForest
>>>>>> randfor <- randomForest(response ~ b1 + b2 + b3, data=trainvals)
>>>>>>
>>>>>> # apply the prediction, row by row
>>>>>> predrast <- setRaster(spot)
>>>>>> filename(predrast) <- 'RF_pred.grd'
>>>>>> for (r in 1:nrow(spot)) {
>>>>>>      spot <- readRow(spot, r)
>>>>>>      rowvals <- values(spot, names=TRUE)
>>>>>> # this next line should not be necessary, but it is
>>>>>> # I'll fix that
>>>>>>      colnames(rowvals) <- c('b1', 'b2', 'b3')
>>>>>>      pred <- predict(randfor, rowvals)
>>>>>>      predrast <- setValues(predrast, pred, r)
>>>>>>      predrast <- writeRaster(predrast, overwrite=TRUE)
>>>>>> }
>>>>>>
>>>>>> plot(predrast)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 11, 2009 at 5:09 PM, Roger Bivand <Roger.Bivand at nhh.no>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Ned:
>>>>>>>
>>>>>>>
>>>>>>> The three bands are most likely treated as 4-byte integers, so the
>>>>>>> object
>>>>>>> will be 2732 by 3058 by 3 by 4 plus a little bit. That's 100MB. They
>>>>>>> may
>>>>>>> get copied too. There are no single byte user-level containers for
>>>>>>> you
>>>>>>> (there is a raw data type, but you can't calculate with it). Possibly
>>>>>>> saying spot_frame <- slot(spot, "data") will save one copying
>>>>>>> operation,
>>>>>>> but its hard to tell - your choice of method first adds inn all the
>>>>>>> coordinates, which are 8-byte numbers, so more than doubles its size
>>>>>>> and
>>>>>>> makes more copies (to 233MB for each copy). Running gc() several
>>>>>>> times
>>>>>>> manually between steps often helps by making the garbage collector
>>>>>>> more
>>>>>>> aggressive.
>>>>>>>
>>>>>>> I would watch the developments in the R-Forge package "raster", which
>>>>>>> builds on some of these things, and try to see how that works. If you
>>>>>>> have
>>>>>>> the GDAL-GRASS plugin for rasters, you can use readGDAL to read from
>>>>>>> GRASS
>>>>>>> - which would work with raster package functions now. Look at the
>>>>>>> code
>>>>>>> of
>>>>>>> recent readRAST6 to see which incantations are needed. If you are
>>>>>>> going
>>>>>>> to
>>>>>>> use randomForest for prediction, you can use smaller tiles until you
>>>>>>> find
>>>>>>> an alternative solution. Note that feeding a data frame of integers
>>>>>>> to
>>>>>>> a
>>>>>>> model fitting or prediction function will result in coercion to a
>>>>>>> matrix of doubles, so your subsequent workflow should take that into
>>>>>>> account.
>>>>>>>  Getting more memory is another option, and may be very cost and
>>>>>>> especially
>>>>>>> time effective - at the moment your machine is swapping. Buying
>>>>>>> memory
>>>>>>> may
>>>>>>> save you time programming around too little memory.
>>>>>>>
>>>>>>> Hope this helps,
>>>>>>>
>>>>>>> Roger
>>>>>>>
>>>>>>>
>>>>>>> ---
>>>>>>> Roger Bivand, NHH, Helleveien 30, N-5045 Bergen,
>>>>>>> Roger.Bivand at nhh.no
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: r-sig-geo-bounces at stat.math.ethz.ch on behalf of Ned Horning
>>>>>>> Sent: Wed 11.02.2009 07:40
>>>>>>> To: r-sig-geo at stat.math.ethz.ch
>>>>>>> Subject: [R-sig-Geo] SpatialGridDataFrame to data.frame
>>>>>>>
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I am trying to read an image from GRASS using the spgrass6 command
>>>>>>> readRAST6 and then convert it into a data.frame object so I can use
>>>>>>> it
>>>>>>> with randomForest. The byte image I'm reading is 2732 rows x 3058
>>>>>>> columns x 3 bands. It's a small subset of a larger image I would like
>>>>>>> to
>>>>>>> use eventually. I have no problem reading the image using readRAST6
>>>>>>> but
>>>>>>> when I try to convert it to a data.frame object my linux system
>>>>>>> resources (1BG RAM, 3GB swap) nearly get maxed out and it runs for a
>>>>>>> couple hours before I kill the process. The image is less than 25MB
>>>>>>> so
>>>>>>> I'm surprised it requires this level of memory. Can someone let me
>>>>>>> know
>>>>>>> why this is. Should I use something other than the GRASS interface
>>>>>>> for
>>>>>>> this? These are the commands I'm using:
>>>>>>>
>>>>>>> spot <- readRAST6(c("subset.red", "subset.green", "subset.blue"))
>>>>>>> spot_frame <- as(spot, "data.frame")
>>>>>>>
>>>>>>> Any help would be appreciated.
>>>>>>>
>>>>>>> All the best,
>>>>>>>
>>>>>>> Ned
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-sig-Geo mailing list
>>>>>>> R-sig-Geo at stat.math.ethz.ch
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     [[alternative HTML version deleted]]
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-sig-Geo mailing list
>>>>>>> R-sig-Geo at stat.math.ethz.ch
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>



More information about the R-sig-Geo mailing list