[R-SIG-Finance] re-indexing data under the zoo package

Tue Mar 1 14:21:37 CET 2011

Thanks, Gabor, for the solution and the general tips. As this is a
common task for me, answer no. 4 will save a lot of typing.

Aidan

On Mon, Feb 28, 2011 at 8:59 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> On Mon, Feb 28, 2011 at 2:38 PM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>> On Mon, Feb 28, 2011 at 12:36 PM, Aidan Corcoran
>> <aidan.corcoran11 at gmail.com> wrote:
>>> Dear all,
>>>
>>> I was hoping someone could help me to generate an index based on data
>>> in zooreg format. The data are of the form
>>>
>>>> head(dq)
>>>        sphsxs eunrfi irhont
>>> 1995(1)  670.8   82.9     NA
>>> 1995(2)  686.0   82.9     NA
>>> 1995(3)  682.6   83.0     NA
>>> 1995(4)  692.7   82.7     NA
>>> 1996(1)  686.0   81.5   33.6
>>> 1996(2)  697.8   82.0   34.6
>>>
>>> and I would like to index each of the three variables to 100 in
>>> 1996(1). I have made a few failed attempts based on extracting the
>>> values at that date
>>>
>>>> dq[index(dq)==1996.00]
>>>        sphsxs eunrfi irhont
>>> 1996(1)    686   81.5   33.6
>>>
>>> and then trying to divide the series by those values
>>>
>>>> dq/dq[index(dq)==1996.00]
>>>     sphsxs eunrfi irhont
>>> 1996      1      1      1
>>>
>>> but this results in a single row. One option might be to replicate the
>>> 1996 row using rep, but
>>>
>>>> rep(dq[index(dq)==1996.00],2)
>>> [1] 686.0  81.5  33.6 686.0  81.5  33.6
>>>
>>> seems to repeat the data within a single vector, and I'm not sure how
>>> to get it to repeat the row down through a zoo object (and suspect
>>> there might be an easier way).
>>>
>>> Any help much appreciated.
>>>
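The single row in the attempt above comes from how zoo does arithmetic: both operands are aligned on their common index values first, so dividing by a one-row object leaves only that one date. A minimal sketch of this, and of repeating the base row down a matrix, using made-up toy data (z, base, b, denom and ratio are illustrative names, not from the thread):

```r
library(zoo)

# Toy quarterly data, made up for illustration; same layout as dq.
z <- zooreg(cbind(a = c(10, 20, 40), b = c(1, 2, 4)),
            start = 1995, frequency = 4)

base <- z[index(z) == 1995.25]   # one-row zoo object

# zoo arithmetic keeps only the dates present in both operands,
# so just the 1995.25 row survives the division.
length(index(z / base))

# Strip the index, then repeat the base row once per row of z.
b <- as.numeric(coredata(base))
denom <- matrix(b, nrow = nrow(z), ncol = ncol(z), byrow = TRUE)

# Copy z to keep its index and column names, then overwrite the data.
ratio <- z
ratio[] <- coredata(z) / denom
```

Working on coredata() sidesteps the index alignment entirely, which is also what the replies below rely on.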
>>
>> Any of these will refer to the data at 1996:
>>
>> library(zoo)
>> # dq <- ... shown at end ...
>>
>> dq[ I(1996) ]
>> dq[ "1996" ]
>> window(dq, 1996, 1996)
>>
>> 1. Here is a slight variation of Arun.stat's solution that will
>> produce values relative to 1996:
>>
>> dq100 <- dq
>> dq100[] <- t(apply(dq, 1, "/", coredata(dq["1996"])))
>>
>>> dq100
>>           sphsxs   eunrfi   irhont
>> 1995(1) 0.9778426 1.017178       NA
>> 1995(2) 1.0000000 1.017178       NA
>> 1995(3) 0.9950437 1.018405       NA
>> 1995(4) 1.0097668 1.014724       NA
>> 1996(1) 1.0000000 1.000000 1.000000
>> 1996(2) 1.0172012 1.006135 1.029762
>>
>> 2. Another thing you might consider would be to use "yearqtr" class for dq:
>>
>> # convert time to yearqtr
>> dq.yq <- dq
>> time(dq.yq) <- as.yearqtr(time(dq.yq))
>>
>> # index relative to 1996 Q1
>> dq.yq100 <- dq.yq
>> dq.yq100[] <- t(apply(dq.yq, 1, "/", coredata(dq.yq[as.yearqtr("1996 Q1")])))
>>
>>> dq.yq100
>>           sphsxs   eunrfi   irhont
>> 1995 Q1 0.9778426 1.017178       NA
>> 1995 Q2 1.0000000 1.017178       NA
>> 1995 Q3 0.9950437 1.018405       NA
>> 1995 Q4 1.0097668 1.014724       NA
>> 1996 Q1 1.0000000 1.000000 1.000000
>> 1996 Q2 1.0172012 1.006135 1.029762
>>
>> 3. ts class would work here too since the series is regularly spaced:
>>
>> tt <- as.ts(dq)
>> tt[] <- t(apply(tt, 1, "/", window(tt, 1996, 1996)))
>> tt
>>
>> Here is the dq used above:
>>
>> dq <-
>> structure(c(670.8, 686, 682.6, 692.7, 686, 697.8, 82.9, 82.9,
>> 83, 82.7, 81.5, 82, NA, NA, NA, NA, 33.6, 34.6), .Dim = c(6L,
>> 3L), .Dimnames = list(NULL, c("sphsxs", "eunrfi", "irhont")), index = c(1995,
>> 1995.25, 1995.5, 1995.75, 1996, 1996.25), class = c("zooreg",
>> "zoo"), frequency = 4)
>
> 4. And here is one more solution, slightly simpler because it avoids
> the transposition:
>
> dq2 <- dq
> dq2[] <- mapply("/", as.data.frame(dq), coredata(dq["1996"]))
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>
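
Since the original question asked for each series to equal 100 at the base date, and the solutions above return ratios equal to 1 there, the following is a rough sketch of a reusable helper that also applies the scaling. The name rebase() and its arguments are hypothetical (not part of zoo), and it swaps in sweep() for the mapply() call of answer 4; it assumes a multi-column zoo object whose index contains the base date exactly.

```r
library(zoo)

# Same sample data as in the thread, rebuilt with the zooreg() constructor.
dq <- zooreg(
  cbind(sphsxs = c(670.8, 686, 682.6, 692.7, 686, 697.8),
        eunrfi = c(82.9, 82.9, 83, 82.7, 81.5, 82),
        irhont = c(NA, NA, NA, NA, 33.6, 34.6)),
  start = 1995, frequency = 4)

# rebase() is a hypothetical helper name, not a zoo function.
rebase <- function(z, base, scale = 100) {
  # Base-period values as a plain numeric vector, one per column.
  b <- as.numeric(coredata(window(z, start = base, end = base)))
  stopifnot(length(b) == ncol(z))   # base date must match exactly one row
  out <- z
  # sweep() divides column j by b[j], so no transposition is needed.
  out[] <- scale * sweep(coredata(z), 2, b, "/")
  out
}

dq100 <- rebase(dq, 1996)
dq100
```

The result keeps the zooreg index and column names of dq, and columns that are NA before the base date (irhont here) stay NA.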