[R-SIG-Finance] removing repeating values from xts series

Wed Sep 15 09:33:14 CEST 2010

Hi Ulrich
I see. Ad hoc, I'd use rle (run-length encoding) and some function of cumsum(rle(y)$lengths) to get the indexes of the non-duplicates.
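
A minimal sketch of that rle idea, using the example vector y from the message quoted below. The names runs, ends and starts are only illustrative, and this is one possible reading of "some function of cumsum", not necessarily the one intended:

```r
# Example vector taken from the quoted message
y <- c(1, 1, 2, 3, 2, 2, 2, 2, 1)

runs   <- rle(y)                    # run values and run lengths
ends   <- cumsum(runs$lengths)      # index of the last element of each run
starts <- ends - runs$lengths + 1   # index of the first element of each run

starts       # 1 3 4 5 9
y[starts]    # 1 2 3 2 1
runs$values  # the same values, read directly from rle
```

Note that rle works on atomic vectors only, so for a multi-column object the indexes would have to come from a single column or from some per-row key.
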
Regards, david

-----Original Message-----
From: Ulrich Staudinger [mailto:ustaudinger at gmail.com] 
Sent: Wednesday, September 15, 2010 9:25 AM
To: Lüthi David (XICD 1)
Cc: r-sig-finance
Subject: Re: [R-SIG-Finance] removing repeating values from xts series

Hi David,

as far as I understand, duplicated works internally very much like
unique.

With a vector y (not a time series in this case), duplicated yields:
> y
[1] 1 1 2 3 2 2 2 2 1
> duplicated(y)
[1] FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE

But what I would like to have is:
FALSE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE
or ...
1 2 3 2 1
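
The flags wanted here can be sketched by comparing each element with its predecessor. This assumes a vector of length at least 1 without NA values; the name repeats is only illustrative:

```r
y <- c(1, 1, 2, 3, 2, 2, 2, 2, 1)

# TRUE where an element equals the one directly before it;
# the first element has no predecessor, so it is never a repeat.
repeats <- c(FALSE, y[-1] == y[-length(y)])

repeats      # FALSE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE
y[!repeats]  # 1 2 3 2 1
```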

I am not so sure that duplicated is what I want, unless I didn't spot
something ... some other approach maybe?
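
For the multi-column case, one other approach might be to compare each row with the row before it. The sketch below is only an illustration: is_repeat_row, quotes and the column names are hypothetical, the data is the example from the original post, and rows containing NA are not handled. The xts part runs only if the xts package is installed.

```r
# Flag rows that are identical to the row directly before them.
# The first row has no predecessor, so it is never flagged.
is_repeat_row <- function(m) {
  m <- as.matrix(m)
  n <- nrow(m)
  if (n < 2) return(rep(FALSE, n))
  same <- m[-1, , drop = FALSE] == m[-n, , drop = FALSE]
  c(FALSE, rowSums(same) == ncol(m))
}

# Example quotes from the original post (column names are assumed)
quotes <- matrix(
  c(100, 1, 101, 1,
    100, 1, 101, 1,
    100, 1, 101, 1,
    101, 1, 102, 1,
    102, 1, 102, 1,
    100, 1, 101, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("bid", "bid_vol", "ask", "ask_vol"))
)

keep <- !is_repeat_row(quotes)
quotes[keep, , drop = FALSE]   # rows 1, 4, 5 and 6 remain

# The same logical index can be used to subset an xts object
if (requireNamespace("xts", quietly = TRUE)) {
  times <- as.POSIXct("2010-01-01 09:00:00", tz = "UTC") + 1:6
  x <- xts::xts(quotes, order.by = times)
  print(x[!is_repeat_row(x), ])
}
```

Unlike duplicated or unique, this keeps a row that reappears later in the series, as long as it differs from the row immediately before it.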

Regards,
Ulrich

On Wed, Sep 15, 2010 at 9:08 AM, Lüthi David (XICD 1)
<david.luethi at claridenleu.com> wrote:
> Ulrich,
> try duplicated(xts.object, ...) or possibly duplicated(as.data.frame(xts.object), ...) if all columns should be considered.
> Regards, david
>
> -----Original Message-----
> From: r-sig-finance-bounces at stat.math.ethz.ch [mailto:r-sig-finance-bounces at stat.math.ethz.ch] On Behalf Of Ulrich Staudinger
> Sent: Wednesday, September 15, 2010 8:28 AM
> To: r-sig-finance
> Subject: [R-SIG-Finance] removing repeating values from xts series
>
> Hi fellows,
>
> I am facing a case that I cannot solve with my limited knowledge of R,
> unless I write the function myself - which I would like to avoid
> (reusing is better than reinventing the wheel). The relevant
> information follows.
>
> Input scenario:
> An xts time series object with duplicates; the object contains bid,
> bid volume, ask, and ask volume.
> Example:
> 01-01-2010 09:00:01     100     1       101     1
> 01-01-2010 09:00:02     100     1       101     1
> 01-01-2010 09:00:03     100     1       101     1
> 01-01-2010 09:00:04     101     1       102     1
> 01-01-2010 09:00:05     102     1       102     1
> 01-01-2010 09:00:06     100     1       101     1
> ...
>
> Goal:
> A time series with only non-repeating values, i.e. with consecutive
> duplicates removed.
>
> I tried "unique" already, but that one returns only the unique values
> from within the whole time series and not on a running basis.
>
>
> Example code:
> The following example code exemplifies with a non-xts series what I
> want to achieve ...
>> y = c(1,1,2,2,1,1,1,2,3,4,3,3,3,3,3,1)
>> removeDuplicates <- function(input)
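> {
>        # Keep the first element, then every element that differs from its predecessor.
>        # Inputs shorter than 2 are returned as-is, since 2:length(input) would count down.
>        if(length(input) < 2) return(input)
>        index = 2
>        ret = c(input[1])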
>        for(i in 2:length(input))
>        {
>                if(input[i]!=input[i-1])
>                {
>                        ret[index] = input[i]
>                        index = index + 1
>                }
>        }
>        ret
> }
>>
>> removeDuplicates(y)
> [1] 1 2 1 2 3 4 3 1
>>
>
>
>
> How can I do this with an xts series? Is there a function for this?
>
> Thanks in advance,
> with kind regards,
> Ulrich
>
> --
> Ulrich Staudinger
> activequant.org

-- 
Ulrich Staudinger
activequant.org