[R-SIG-Finance] question on zoo data manipulation

Achim Zeileis Achim.Zeileis at wu-wien.ac.at
Mon Apr 14 23:39:31 CEST 2008


On Mon, 14 Apr 2008, Manoj wrote:

> Hi Zoo-experts,
>       I am working on the data-set below.
>
> Ticker	Date	BrokerName	Acc_Yr	Measure	lag
> XXX	20080320	BRK1	200806	2.2	0
> XXX	20080320	BRK1	200906	2.5	0
> XXX	20080320	BRK2	200806	2.3	0
> XXX	20080320	BRK2	200906	2.8	0
> XXX	20080320	BRK3	200806	3.3	0
> XXX	20080218	BRK1	200806	2.2	1
> XXX	20080218	BRK1	200906	2.5	1
> XXX	20080218	BRK2	200806	2.4	1
> XXX	20080218	BRK2	200906	2.8	1

The data is not really a straightforward time series but has more
structure, like a panel data set. Hence, I wouldn't try to represent it in
zoo in its un-aggregated form. Instead I would put it into a data.frame
using appropriate classes for the columns, e.g., "Date" for the Date and
"factor" for the BrokerName etc.
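
As a rough sketch of that first step, the posted rows could be read in
and converted roughly as follows. The object name 'dat' and the choice of
classes for Acc_Yr and lag are only assumptions for illustration.

```r
## Sketch only: the rows from the original post, read as character first
## so that each column can be given an explicit class afterwards.
dat <- read.table(header = TRUE, colClasses = "character", text = "
Ticker Date     BrokerName Acc_Yr Measure lag
XXX    20080320 BRK1       200806 2.2     0
XXX    20080320 BRK1       200906 2.5     0
XXX    20080320 BRK2       200806 2.3     0
XXX    20080320 BRK2       200906 2.8     0
XXX    20080320 BRK3       200806 3.3     0
XXX    20080218 BRK1       200806 2.2     1
XXX    20080218 BRK1       200906 2.5     1
XXX    20080218 BRK2       200806 2.4     1
XXX    20080218 BRK2       200906 2.8     1
")

## "Date" for the Date, "factor" for the grouping variables
dat$Date       <- as.Date(dat$Date, format = "%Y%m%d")
dat$BrokerName <- factor(dat$BrokerName)
dat$Acc_Yr     <- factor(dat$Acc_Yr)   # assumption: treated as a label
dat$Measure    <- as.numeric(dat$Measure)
dat$lag        <- as.integer(dat$lag)

str(dat)
```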

Then I would use the aggregate() method for data.frames to accomplish the
aggregation you look for. You can then collect various aggregations of
your data in a zoo object (if you've got unique Dates after aggregation).
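
A minimal sketch of how this might look for the posted data, assuming the
rows have already been put into a data.frame (rebuilt here by hand so the
example runs on its own; all object names are made up):

```r
library(zoo)

## Posted data, with Date as "Date" and the grouping columns as factors
dat <- data.frame(
  Date = as.Date(c(rep("2008-03-20", 5), rep("2008-02-18", 4))),
  BrokerName = factor(c("BRK1", "BRK1", "BRK2", "BRK2", "BRK3",
                        "BRK1", "BRK1", "BRK2", "BRK2")),
  Acc_Yr = factor(c(200806, 200906, 200806, 200906, 200806,
                    200806, 200906, 200806, 200906)),
  Measure = c(2.2, 2.5, 2.3, 2.8, 3.3, 2.2, 2.5, 2.4, 2.8)
)

## Mean Measure over brokers for each Date / Acc_Yr combination
agg <- aggregate(Measure ~ Date + Acc_Yr, data = dat, FUN = mean)

## Within one accounting year the dates are unique, so each year can be
## stored as a zoo series ...
a06 <- subset(agg, Acc_Yr == "200806")
a09 <- subset(agg, Acc_Yr == "200906")
z06 <- zoo(a06$Measure, order.by = a06$Date)
z09 <- zoo(a09$Measure, order.by = a09$Date)

## ... and the series collected in a single multivariate zoo object
z <- merge(y200806 = z06, y200906 = z09)
z
```

For 200806 on 2008-03-20 this gives the 2.6 from the question (the mean
of 2.2, 2.3 and 3.3).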

hth,
Z

>
>
> Using a zoo object, is there a quicker/more efficient way of manipulating
> the data as per the following criteria?
>
> 1) For any given date/lag - compute mean of column "measure" grouped
> by different broker & different accounting year?
>           so the output data-set should look like:
>
> Ticker 	Date	Mean Measure	Acc_Yr	Lag
> XXX	20080320	2.6	200806	0
>
> 2) For any lag >= 1, calculate returns on  aggregate "measure"
> constrained on "intersection" of broker-name across lag 0 & lag 1 (so
> BRK3 should drop out) ?
>
> i.e:  the intermediate data-set should look like
>
> Ticker 	Date	Mean Measure	Acc_Yr	Lag
> XXX	20080320	2.25	200806	0
> XXX	20080218	2.3	200806	1
>
>
> Note that for 200806, the mean changes from 2.6 as measured above to
> 2.25 (since BRK3 is dropped from the calculation). The final data-set
> should then be:
>
> Ticker 	Date	Pct_Change	Acc_Yr	Lag
> XXX	20080218	0.02	200806	1
>
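
One possible way to do the intersection step on a plain data.frame is
sketched below. The object names are made up, and the direction of the
change (lag-1 mean over lag-0 mean, minus one) is only an assumption
chosen because it reproduces the 0.02 quoted above. The brokers are
intersected across lags as a whole, not separately per accounting year.

```r
## Posted data, reduced to the columns needed for this step
dat <- data.frame(
  BrokerName = c("BRK1", "BRK1", "BRK2", "BRK2", "BRK3",
                 "BRK1", "BRK1", "BRK2", "BRK2"),
  Acc_Yr  = c(200806, 200906, 200806, 200906, 200806,
              200806, 200906, 200806, 200906),
  Measure = c(2.2, 2.5, 2.3, 2.8, 3.3, 2.2, 2.5, 2.4, 2.8),
  lag     = c(0, 0, 0, 0, 0, 1, 1, 1, 1)
)

## Brokers present at both lag 0 and lag 1 (BRK3 drops out)
common <- intersect(dat$BrokerName[dat$lag == 0],
                    dat$BrokerName[dat$lag == 1])
sub <- dat[dat$BrokerName %in% common, ]

## Mean Measure by accounting year (rows) and lag (columns)
m <- tapply(sub$Measure, list(sub$Acc_Yr, sub$lag), mean)

## Relative change between the two lags; direction is an assumption
pct <- m[, "1"] / m[, "0"] - 1
round(pct, 2)
```
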
> --------------------
>
> I can accomplish the results using a combination of tapply &
> subsetting the data-set for each lag but I thought this kind of
> data-structure is ideal for zoo manipulation, hence the help request.
>
> Thanks in Advance.
>
> Manoj


