[R] bug in rle?

Philippe Grosjean phgrosjean at sciviews.org
Wed Jan 8 20:36:30 CET 2014


Wouldn't it make sense to be able to use rle() on factor/ordered too?
For instance:

rle2 <- function (x) {
    if (!is.factor(x)) return(rle(x))
        
    ## Special case for factor and ordered
    res <- rle(as.integer(x))
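    ## Change $values into factor or ordered with correct levels.
    ## Map the integer codes back to their labels first: passing the codes
    ## directly only works when the levels happen to be "1", "2", ...
    if (is.ordered(x)) {
        res$values <- ordered(levels(x)[res$values], levels = levels(x))
    } else res$values <- factor(levels(x)[res$values], levels = levels(x))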
    res
}

## Example
fac <- factor(sample(1:3, 20, replace = TRUE))
ord <- as.ordered(fac)
(fac.rle <- rle2(fac))
(ord.rle <- rle2(ord))

## inverse.rle() does not need to change:
identical(fac, inverse.rle(fac.rle))
identical(ord, inverse.rle(ord.rle))
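
The example above uses levels "1", "2", "3", where codes and labels coincide. As a further check, here is a minimal, self-contained sketch of the same idea on a factor whose labels differ from its integer codes. The names runs and decoded are illustrative only, and rep() is used for the round trip here purely as an assumption-free alternative; it is not a proposal for how rle() itself should be changed.

```r
## Illustrative sketch: run-length encode a factor via its integer codes,
## then map the codes back to labels. Variable names are hypothetical.
fac <- factor(c("lo", "lo", "hi", "hi", "hi", "mid"),
              levels = c("lo", "mid", "hi"))

## A factor is atomic, but is.vector() is FALSE because it carries
## attributes (levels, class) other than names
is.atomic(fac)
is.vector(fac)

runs <- rle(as.integer(fac))          # encode the underlying codes
decoded <- factor(levels(fac)[runs$values], levels = levels(fac))

runs$lengths                          # run lengths: 2 3 1
as.character(decoded)                 # run values: "lo" "hi" "mid"

## Round trip: expanding the runs recovers the original factor
identical(rep(decoded, runs$lengths), fac)
```

Indexing levels(fac) by the codes keeps the labels intact regardless of what the levels are called.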

Best,

Philippe Grosjean

On 08 Jan 2014, at 18:48, Bert Gunter <gunter.berton at gene.com> wrote:

> Thanks Bill:
> 
> Personally, I don't need it. Once Brian made me aware of the
> underlying issue, I can handle it.
> 
> Cheers,
> Bert
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
> 
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> H. Gilbert Welch
> 
> 
> 
> 
> On Wed, Jan 8, 2014 at 9:30 AM, William Dunlap <wdunlap at tibco.com> wrote:
>> If you need an rle for factor data (or lists, or anything for
>> which match(), unique(), and x[i] act in a coherent way), try the
>> following.  It is based on the S+, all-S code, version of rle.
>> 
>> (It does not work on data.frames because unique is row oriented
>> and match is column oriented for data.frames.  If that were
>> changed, it still would need a x[ends,] instead of x[ends] in the
>> closing statement.)
>> 
>> myRle <- function (x)
>> {
>>    if (length(x) == 0) {
>>        list(lengths = integer(0L), values = x)
>>    }
>>    else {
>>        x.int <- match(x, unique(x))
>>        ends <- c(diff(x.int) != 0L, TRUE)
>>        list(lengths = diff(c(0L, seq_along(x)[ends])), values = x[ends])
>>    }
>> }
>> 
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>> 
>> 
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
>>> Of Bert Gunter
>>> Sent: Wednesday, January 08, 2014 8:56 AM
>>> To: Prof Brian Ripley
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] bug in rle?
>>> 
>>> Thank you Brian for your clear and informative answer. I was
>>> (obviously!) unaware of this and appreciate the response.
>>> 
>>> Best,
>>> Bert
>>> 
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>> (650) 467-7374
>>> 
>>> "Data is not information. Information is not knowledge. And knowledge
>>> is certainly not wisdom."
>>> H. Gilbert Welch
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Jan 8, 2014 at 8:53 AM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
>>>> On 08/01/2014 16:23, Bert Gunter wrote:
>>>>> 
>>>>> Is the following a bug?
>>>>> ##(R version 3.0.2 (2013-09-25)
>>>>> ## Platform: i386-w64-mingw32/i386 (32-bit))
>>>>> 
>>>>> 
>>>>> d <- data.frame(a=rep(letters[1:3],4:6))
>>>>>  rle(d$a)
>>>>> ##Error in rle(d$a) : 'x' must be an atomic vector
>>>>> 
>>>>> is.atomic(d$a)
>>>>> ##[1] TRUE
>>>> 
>>>> 
>>>> But
>>>> 
>>>>> is.vector(d$a)
>>>> [1] FALSE
>>>> 
>>>> The discrepancies in what a 'vector' is in R are very long standing, but a
>>>> factor is not a vector.
>>>> 
>>>> 
>>>>> rle(c(d$a))
>>>> 
>>>> 
>>>> That loses the class and other attributes, giving a vector.
>>>> 
>>>>> ## Run Length Encoding
>>>>> ##  lengths: int [1:3] 4 5 6
>>>>>  ##  values : int [1:3] 1 2 3
>>>>> 
>>>>> Cheers,
>>>>> Bert
>>>>> 
>>>>> Bert Gunter
>>>>> Genentech Nonclinical Biostatistics
>>>>> (650) 467-7374
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>>>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>>>> University of Oxford,             Tel:  +44 1865 272861 (self)
>>>> 1 South Parks Road,                     +44 1865 272866 (PA)
>>>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
> 
