[R-SIG-Finance] Vectorized rolling computation on xts series
Aleks Clark
aleks.clark at gmail.com
Wed Oct 7 11:11:17 CEST 2009
Another approach would be to use zoo's or xts's lag function to
generate a data frame or matrix that holds, in each row, the current
day's data and the N previous periods. If your data is univariate, this
shouldn't be a problem: with a little arithmetic you can work out
how many periods of data to "go back". You'd do something like
this:
starting with:
d1
d2
d3
d4
d5
use Lag (or lag, they behave differently), then t() and apply() and end up with:
d1 na na na
d2 d1 na na
d3 d2 d1 na
d4 d3 d2 d1
d5 d4 d3 d2
you can then run your computations row by row in vectorized form using
the apply family of functions.
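
To make the layout above concrete, here is a minimal sketch in base R. It is
only an illustration, not a tested solution for your series: the toy vector x,
the window length n, and the names lagged, lo, hi and pct are all made up for
the example, and the sapply() call is a stand-in for Lag()/lag() so that it
runs without any packages.

```r
# Toy data: one value per observation (hypothetical stand-in for NonCommNet).
x <- c(44580, 32835, 38385, 19150, 15245)
n <- 3  # assumed window length in observations, including the current one

# Column k+1 holds x lagged by k observations, padded with NA at the top,
# which gives the "d3 d2 d1" layout: one row per day, one column per lag.
lagged <- sapply(0:(n - 1), function(k) c(rep(NA, k), head(x, length(x) - k)))

# Row-wise min and max. Any NA in a row (incomplete window) propagates,
# so rows without a full window come out as NA.
lo <- apply(lagged, 1, min)
hi <- apply(lagged, 1, max)

# Position of the current value within the window's range, in percent.
# If hi == lo for a row, the division yields NaN.
pct <- (x - lo) / (hi - lo) * 100
print(pct)
```

Note that the window here is counted in rows, not calendar time, so with gaps
in the dates n rows will not always span the same number of days.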
On Wed, Oct 7, 2009 at 3:30 AM, Mark Breman <breman.mark at gmail.com> wrote:
> Hi Shane,
> I had a look at these functions but they do not satisfy my constraints:
>
> - apply.monthly works with 'calendar months', but I need a function that
> allows me to specify for instance 1995-01-06 until 1995-02-06 (i.e.
> 'duration' of one month) for the computation of element x = 1995-02-06
>
> - rollapply (and also rollmax, rollmin) need a specification of the number
> of previous elements from the series if I understand it correctly. As you
> can see in the example it is daily data but with lots of gaps, so this would
> be very difficult to do if at all possible.
>
> Thanks for your quick response though,
>
> Kind regards,
>
> -Mark-
>
> 2009/10/7 Shane <shane.conway at gmail.com>
>
>> I think you want the apply.monthly function in xts. It also has other time
>> periods (eg daily).
>>
>> You may also want to look at rollapply in zoo.
>>
>> Sent from my iPhone
>>
>>
>> On Oct 7, 2009, at 4:05 AM, Mark Breman <breman.mark at gmail.com> wrote:
>>
>>> Hi,
>>> I have a univariate xts timeseries (daily data) for which I need to apply a
>>> computation for each element. The computation for element x needs the last y
>>> months of the data from the timeseries. What's more, I need a "vectorized"
>>> computation because looping over all elements is too slow (it's a large
>>> timeseries).
>>>
>>> I think this is what is called a "rolling" or "running" computation in R.
>>>
>>> The computation I need to do for element x is:
>>> - calculate the percentage of the value x within the range of values from
>>> the last y months, i.e. determine the min() and max() of the last y months
>>> of data (including x), and determine what percentage of this range the value
>>> x is. For example: min(last 1 months) == 10, max(last 1 months) == 50, x ==
>>> 20 would yield: 25%
>>> - elements for which y months of previous data (including x itself) is not
>>> available should become NaN or some other "special value".
>>>
>>> An example
>>> So let's say I have a timeseries called "data":
>>>
>>> > data
>>>            NonCommNet
>>> 1995-01-03 44580
>>> 1995-01-04 44580
>>> 1995-01-05 44580
>>> 1995-01-06 44580
>>> 1995-01-09 44580
>>> 1995-01-10 32835
>>> 1995-01-11 32835
>>> 1995-01-12 32835
>>> 1995-01-13 32835
>>> 1995-01-16 32835
>>> 1995-01-17 38385
>>> 1995-01-18 38385
>>> 1995-01-19 38385
>>> 1995-01-20 38385
>>> 1995-01-23 38385
>>> 1995-01-24 19150
>>> 1995-01-25 19150
>>> 1995-01-26 19150
>>> 1995-01-27 19150
>>> 1995-01-30 19150
>>> 1995-01-31 15245
>>> 1995-02-01 15245
>>> 1995-02-02 15245
>>> 1995-02-03 15245
>>> 1995-02-06 15245
>>> 1995-02-07 24110
>>> 1995-02-08 24110
>>> 1995-02-09 24110
>>> 1995-02-10 24110
>>> 1995-02-13 24110
>>> 1995-02-14 17615
>>> 1995-02-15 17615
>>> 1995-02-16 17615
>>> 1995-02-17 17615
>>> 1995-02-21 -23080
>>> 1995-02-22 -23080
>>> 1995-02-23 -23080
>>> 1995-02-24 -23080
>>> 1995-02-27 -23080
>>> 1995-02-28 -17445
>>>
>>> I tried the following "vectorized" solution (example with y = 1 month):
>>>
>>> > ((data - min(last(data, "1 months"))) / (max(last(data, "1 months")) -
>>>     min(last(data, "1 months")))) * 100
>>> NonCommNet
>>> 1995-01-03 143.37783
>>> 1995-01-04 143.37783
>>> 1995-01-05 143.37783
>>> 1995-01-06 143.37783
>>> 1995-01-09 143.37783
>>> 1995-01-10 118.48909
>>> 1995-01-11 118.48909
>>> 1995-01-12 118.48909
>>> 1995-01-13 118.48909
>>> 1995-01-16 118.48909
>>> 1995-01-17 130.25005
>>> 1995-01-18 130.25005
>>> 1995-01-19 130.25005
>>> 1995-01-20 130.25005
>>> 1995-01-23 130.25005
>>> 1995-01-24 89.48930
>>> 1995-01-25 89.48930
>>> 1995-01-26 89.48930
>>> 1995-01-27 89.48930
>>> 1995-01-30 89.48930
>>> 1995-01-31 81.21424
>>> 1995-02-01 81.21424
>>> 1995-02-02 81.21424
>>> 1995-02-03 81.21424
>>> 1995-02-06 81.21424
>>> 1995-02-07 100.00000
>>> 1995-02-08 100.00000
>>> 1995-02-09 100.00000
>>> 1995-02-10 100.00000
>>> 1995-02-13 100.00000
>>> 1995-02-14 86.23649
>>> 1995-02-15 86.23649
>>> 1995-02-16 86.23649
>>> 1995-02-17 86.23649
>>> 1995-02-21 0.00000
>>> 1995-02-22 0.00000
>>> 1995-02-23 0.00000
>>> 1995-02-24 0.00000
>>> 1995-02-27 0.00000
>>> 1995-02-28 11.94109
>>>
>>> This does not satisfy my constraints because:
>>> 1) the first month of data should have become NaN or some other special
>>> value as there is not a full month of previous data available. I think this
>>> is caused by the last() function, which simply returns the available data if
>>> the requested amount of data is greater than the available amount of data.
>>> 2) the results for the second month of data are wrong. For instance look at
>>> the result for 1995-02-06 which is 81.21424%. This should have been 0%. The
>>> last month's min() is 15245 (from 1995-02-06), the max() is 44580 (from
>>> element 1995-01-06) so it should yield 0%.
>>>
>>> From analyzing the results I get the impression that the last() function is
>>> not suited for a "vectorized" solution but I'm not really sure...
>>>
>>> I also had a look at runMin() and runMax() from the TTR package, but you
>>> can't specify a calendar range with these functions as you can with last()
>>> and first() from the xts package.
>>>
>>> Now my question is: am I doing something wrong here or do you know another
>>> vectorized function that satisfies my constraints?
>>>
>>> Kind regards,
>>>
>>> -Mark-
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-SIG-Finance at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>>> -- Subscriber-posting only.
>>> -- If you want to post, subscribe first.
>>>
>>
>
--
Aleks Clark