[Rd] [.raster bug {was "str() on raster objects fails .."}

Martin Maechler maechler at stat.math.ethz.ch
Mon Feb 7 08:36:41 CET 2011


>>>>> Simon Urbanek <simon.urbanek at r-project.org>
>>>>>     on Sun, 6 Feb 2011 20:53:01 -0500 writes:

    > On Feb 6, 2011, at 8:10 PM, Paul Murrell wrote:

    >> Hi
    >> 
    >> On 3/02/2011 1:23 p.m., Simon Urbanek wrote:
    >>> 
    >>> On Feb 2, 2011, at 7:00 PM, Paul Murrell wrote:
    >>> 
    >>>> Hi
    >>>> 
    >>>> Martin Maechler wrote:
    >>>>> On Wed, Feb 2, 2011 at 23:30, Simon
    >>>>> Urbanek<simon.urbanek at r-project.org> wrote:
    >>>>>> On Feb 1, 2011, at 8:16 PM, Paul Murrell wrote:
    >>>>>> 
    >>>>>>> Hi
    >>>>>>> 
    >>>>>>> On 2/02/2011 2:03 p.m., Henrik Bengtsson wrote:
    >>>>>>>> On Tue, Feb 1, 2011 at 4:46 PM, Paul
    >>>>>>>> Murrell<p.murrell at auckland.ac.nz> wrote:
    >>>>>>>>> Hi
    >>>>>>>>> 
    >>>>>>>>> On 1/02/2011 9:22 p.m., Martin Maechler wrote:
    >>>>>>>>>>>>>>> Henrik Bengtsson<hb at biostat.ucsf.edu> on
    >>>>>>>>>>>>>>> Mon, 31 Jan 2011 11:16:59 -0800 writes:
    >>>>>>>>>>> Hi, str() on raster objects fails for certain
    >>>>>>>>>>> dimensions.  For example:
    >>>>>>>>>> 
    >>>>>>>>>>>> str(as.raster(0, nrow=1, ncol=100))
    >>>>>>>>>>> 'raster' chr [1, 1:100] "#000000" "#000000" "#000000" "#000000" ...
    >>>>>>>>>> 
    >>>>>>>>>>>> str(as.raster(0, nrow=1, ncol=101))
    >>>>>>>>>>> Error in `[.raster`(object, seq_len(max.len)) : subscript out of bounds
    >>>>>>>>>> 
    >>>>>>>>>>> This seems to do with how str() and "[.raster"()
    >>>>>>>>>>> is coded; when subsetting as a vector, which
    >>>>>>>>>>> str() relies on, "[.raster"() still returns a
    >>>>>>>>>>> matrix-like object, e.g.
    >>>>>>>>>> 
    >>>>>>>>>>>> img<- as.raster(1:25, max=25, nrow=5, ncol=5);
    >>>>>>>>>>>> img[1:2]
    >>>>>>>>>>>      [,1]      [,2]      [,3]      [,4]      [,5]
    >>>>>>>>>>> [1,] "#0A0A0A" "#3D3D3D" "#707070" "#A3A3A3" "#D6D6D6"
    >>>>>>>>>>> [2,] "#141414" "#474747" "#7A7A7A" "#ADADAD" "#E0E0E0"
    >>>>>>>>>> 
    >>>>>>>>>>> compare with:
    >>>>>>>>>> 
    >>>>>>>>>>>> as.matrix(img)[1:2]
    >>>>>>>>>>> [1] "#0A0A0A" "#3D3D3D"
    >>>>>>>>>> 
    >>>>>>>>>> 
    >>>>>>>>>>> The easy but incomplete fix is to do:
    >>>>>>>>>> 
    >>>>>>>>>>> str.raster<- function(object, ...) {
    >>>>>>>>>>> str(as.matrix(object), ...); }
    >>>>>>>>>> 
    >>>>>>>>>>> Other suggestions?
    >>>>>>>>>> 
    >>>>>>>>>> The informal "raster" class is behaving
    >>>>>>>>>> ``illogically'' in the following sense:
    >>>>>>>>>> 
    >>>>>>>>>>> r<- as.raster(0, nrow=1, ncol=11)
    >>>>>>>>>>> r[seq_along(r)]
    >>>>>>>>>> Error in `[.raster`(r, seq_along(r)) : subscript
    >>>>>>>>>> out of bounds
    >>>>>>>>>> 
    >>>>>>>>>> or, here equivalently,
    >>>>>>>>>>> r[1:length(r)]
    >>>>>>>>>> Error in `[.raster`(r, 1:length(r)) : subscript
    >>>>>>>>>> out of bounds
    >>>>>>>>>> 
    >>>>>>>>>> When classes do behave in such a way, they
    >>>>>>>>>> definitely need their own str() method.
    >>>>>>>>>> 
    >>>>>>>>>> However, the bug really is in "[.raster":
    >>>>>>>>>> Currently, r[i] is equivalent to r[i,] which is
    >>>>>>>>>> not at all matrix-like and its help clearly says
    >>>>>>>>>> that subsetting should work as for matrices. A
    >>>>>>>>>> recent thread on R-help/R-devel has mentioned the
    >>>>>>>>>> fact that "[" methods for matrix-like classes
    >>>>>>>>>> need to use both nargs() and missing() and that
    >>>>>>>>>> "[.data.frame" has been the example to follow
    >>>>>>>>>> "forever", IIRC already in S and S-plus as of 20
    >>>>>>>>>> years ago.
    >>>>>>>>> The main motivation for non-standard behaviour
    >>>>>>>>> here is to make sure that a subset of a raster
    >>>>>>>>> object NEVER produces a vector (because the
    >>>>>>>>> conversion back to a raster object then produces a
    >>>>>>>>> single-column raster and that may be a
    >>>>>>>>> "surprise").  Thanks for making the code more
    >>>>>>>>> standard and robust.
    >>>>>>>>> 
    >>>>>>>>> The r[i] case is still tricky.  The following
    >>>>>>>>> behaviour is quite convenient ...
    >>>>>>>>> 
    >>>>>>>>> r[r == "black"]<- "white"
    >>>>>>>>> 
    >>>>>>>>> ... but the next behaviour is quite jarring (at
    >>>>>>>>> least in terms of the raster image that results
    >>>>>>>>> from it) ...
    >>>>>>>>> 
    >>>>>>>>> r2<- r[1:(nrow(r) + 1)]
    >>>>>>>>> 
    >>>>>>>>> So I think there is some justification for further
    >>>>>>>>> non-standardness to try to ensure that the subset
    >>>>>>>>> of a raster image always produces a sensible
    >>>>>>>>> image.  A simple solution would be just to outlaw
    >>>>>>>>> r[i] for raster objects and force the user to
    >>>>>>>>> write r[i, ] or r[, j], depending on what they
    >>>>>>>>> want.
    >>>>>>>> FYI, I've tried out Martin's updated version and it
    >>>>>>>> seems like a one-column raster matrix is now
    >>>>>>>> returned for r[i], e.g.
    >>>>>>> Yes, that's what I've been looking at ...
    >>>>>>> 
    >>>>>>>>> r<- as.raster(1:8, max=8, nrow=2, ncol=4); r
    >>>>>>>>      [,1]      [,2]      [,3]      [,4]
    >>>>>>>> [1,] "#202020" "#606060" "#9F9F9F" "#DFDFDF"
    >>>>>>>> [2,] "#404040" "#808080" "#BFBFBF" "#FFFFFF"
    >>>>>>>> 
    >>>>>>>>> r[1:length(r)]
    >>>>>>>>      [,1]
    >>>>>>>> [1,] "#202020"
    >>>>>>>> [2,] "#404040"
    >>>>>>>> [3,] "#606060"
    >>>>>>>> [4,] "#808080"
    >>>>>>>> [5,] "#9F9F9F"
    >>>>>>>> [6,] "#BFBFBF"
    >>>>>>>> [7,] "#DFDFDF"
    >>>>>>>> [8,] "#FFFFFF"
    >>>>>>> ... and the above is exactly the sort of thing that
    >>>>>>> will fry your mind if the image that you are
    >>>>>>> subsetting is, for example, a photo.
    >>>>>>> 
    >>>>>> Why doesn't raster behave consistently like any matrix
    >>>>>> object? I would expect simply
    >>>>>> 
    >>>>>>> r[1:length(r)]
    >>>>>> [1] "#202020" "#404040" "#606060" "#808080" "#9F9F9F" "#BFBFBF" "#DFDFDF"
    >>>>>> [8] "#FFFFFF"
    >>>>>> 
    >>>>>> Where it's obvious what happened. I saw the comment about the
    >>>>>> vector but I'm not sure I get it - why don't you want a vector?
    >>>>>> The raster is no different than matrices - you still need to
    >>>>>> define the dimensions when going back anyway, moreover what you
    >>>>>> get now is not consistent at all since the raster never had
    >>>>>> that dimension anyway ...
    >>>>>> 
    >>>>>> Cheers, Simon
    >>>>> I agree that this would be the most "logical" and
    >>>>> notably least surprising behavior, which I find the
    >>>>> most important argument (I'm sorry my last message was
    >>>>> cut off as it was sent accidentally before being
    >>>>> finished completely).
    >>>> 
    >>>> I think this behaviour might surprise some ...
    >>>> 
    >>>> download.file("http://cran.r-project.org/Rlogo.jpg",
    >>>>               "Rlogo.jpg")
    >>>> library(ReadImages)
    >>>> logo<- read.jpeg("Rlogo.jpg")
    >>>> 
    >>>> rLogo<- as.raster(logo)
    >>>> rLogoBit<- rLogo[50:60, 50:60]
    >>>> 
    >>>> library(grid)
    >>>> # Original image
    >>>> grid.raster(rLogoBit)
    >>>> grid.newpage()
    >>>> # Subset produces a vector
    >>>> grid.raster(rLogoBit[1:length(rLogoBit)])
    >>>> 
    >>> 
    >>> But this should fail IMHO since you're supplying a vector but
    >>> grid.raster (assuming it's the same as rasterImage)
    >>> requires a matrix - exactly as you would expect in the
    >>> matrix case - if a function requires a matrix and you
    >>> pass a vector, it will bark. I think you are explaining
    >>> why going to vector *is* desirable ;). In the current
    >>> case it simply generates the wrong dimensions instead of
    >>> resulting in a vector, right?
    >> 
    >> The raster subsetting always produces a raster, but
    >> grid.raster() works with vectors anyway because
    >> as.raster() has a vector method.
    >> 

    > Well, isn't that the actual problem? ;) It could make sense
    > but it should fail if dimensions are not specified for
    > exactly the reason you mentioned - it is fatal if what you
    > have is really an image ...

    > Cheers, Simon


    >> Anyway, I'm happy to go with things as they now are.  I
    >> think at worst it will encourage people to specify two
    >> indices when subsetting a raster object, and that's not a
    >> bad thing.
    >> 
    >> Paul

I (and maybe others) am getting a bit lost.

AFAIK:

- Simon proposes that     r[i]  should return a simple character vector
  such that raster images behave more naturally like matrices.

- Paul  seems happy with  r[i]  returning  a (k x 1) raster object
  -- where  k  is almost completely unrelated to the original
  dim(r) -- with the argument that raster subsetting must always
  return a "raster".

My vote would be for Simon's proposal, hence raster subsetting
should return a raster only when  [i,j] or [i,] or [,j]  syntax
is used.
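
For illustration only, here is a minimal sketch of that rule. It uses a
made-up toy class "toyraster" with a hypothetical constructor
as_toyraster(); it is not the grDevices code. It relies on nargs() and
missing() to tell r[i] apart from r[i, ] and r[, j]:

```r
# Hypothetical toy class for illustration; NOT the grDevices "raster" code.
as_toyraster <- function(x, nrow, ncol) {
  structure(matrix(x, nrow = nrow, ncol = ncol), class = "toyraster")
}

`[.toyraster` <- function(x, i, j, ..., drop = FALSE) {
  m <- unclass(x)                      # plain character matrix
  # nargs() counts x plus every index slot, even empty ones:
  #   x[i] -> 2;  x[i, ], x[, j], x[i, j] -> 3  (a supplied 'drop' adds 1)
  n <- nargs() - (!missing(drop))
  if (n < 3L) {
    # vector-style indexing: behave like a matrix, return a plain vector
    return(if (missing(i)) x else m[i])
  }
  # matrix-style indexing: always keep both dimensions and the class
  # ('drop' is deliberately ignored in this sketch)
  res <- if (missing(i)) {
    if (missing(j)) m else m[, j, drop = FALSE]
  } else if (missing(j)) {
    m[i, , drop = FALSE]
  } else {
    m[i, j, drop = FALSE]
  }
  structure(res, class = "toyraster")
}

r <- as_toyraster(c("#202020", "#404040", "#606060", "#808080",
                    "#9F9F9F", "#BFBFBF", "#DFDFDF", "#FFFFFF"),
                  nrow = 2, ncol = 4)
v  <- r[1:length(r)]   # plain character vector, no dim
r1 <- r[1, ]           # still a "toyraster", one row
r2 <- r[, 2:3]         # still a "toyraster", two columns
```

With this, a function that needs an image would have to be handed a
two-index subset (or explicit dimensions), which is the point under
discussion.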

Martin


