[R] Securities earning covariance
ANGELO.LINARDI at bancaditalia.it
Thu Jun 5 17:41:44 CEST 2008
Good morning,
I am a new R user and I am trying to learn how to use it.
I am trying to solve this problem.
I have a data frame df holding one year of daily earnings for a set of
securities, as follows:
SEC_ID DAY EARNING
IT0000001 20070101 5.467
IT0000001 20070102 5.456
IT0000001 20070103 4.954
IT0000001 20070104 3.456
..........................
IT0000002 20070101 1.456
IT0000002 20070102 1.345
IT0000002 20070103 1.233
..........................
IT0000003 20070101 0.345
IT0000003 20070102 0.367
IT0000003 20070103 0.319
..........................
And so on: about 800 different SEC_ID and about 180000 rows.
I have to calculate the "covariance" for each pair of securities x and
y according to the formula:
Cov(x,y) = (sum[(x-x')*(y-y')]/N)/(sx*sy)
where x' and y' are the means of each security's earnings over the year,
N is the number of observations, and sx and sy are the standard
deviations of x and y.
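As a small check on what the formula computes, here is a sketch that applies it to the first three sample rows of IT0000001 and IT0000002. It assumes sx and sy are population standard deviations (denominator N); under that assumption the result is the Pearson correlation that cor() returns. If R's sd() is used instead (denominator N-1), the result is smaller by a factor (N-1)/N. The variable names are illustrative only.

```r
# First three days of IT0000001 and IT0000002, taken from the sample above
x <- c(5.467, 5.456, 4.954)
y <- c(1.456, 1.345, 1.233)
N <- length(x)

# Numerator of the formula: mean cross-product of deviations
cross <- sum((x - mean(x)) * (y - mean(y))) / N

# Assumption: sx and sy use the population definition (divide by N)
sx <- sqrt(mean((x - mean(x))^2))
sy <- sqrt(mean((y - mean(y))^2))
r_manual <- cross / (sx * sy)

# Built-in Pearson correlation for comparison
r_builtin <- cor(x, y)
print(all.equal(r_manual, r_builtin))

# Same formula with sd(), which divides by N - 1
r_with_sd <- cross / (sd(x) * sd(y))
print(all.equal(r_with_sd, r_builtin * (N - 1) / N))
```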
To apply the formula I could build a data frame df2 like this:
DAY       SEC_ID.x   SEC_ID.y   EARNING.x  EARNING.y  x'  y'  sx  sy
20070101  IT0000001  IT0000002  5.467      1.456      a   b   aa  bb
20070101  IT0000001  IT0000003  5.467      0.345      a   c   aa  cc
20070101  IT0000002  IT0000003  1.456      0.345      b   c   bb  cc
20070102  IT0000001  IT0000002  5.456      1.345      a   b   aa  bb
20070102  IT0000001  IT0000003  5.456      0.367      a   c   aa  cc
20070102  IT0000002  IT0000003  1.345      0.367      b   c   bb  cc
..........................
(by merging df with itself on DAY under the condition SEC_ID.x < SEC_ID.y)
and then easily calculate the formula, but the result is too big: the
process stops with an out-of-memory message.
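A rough size estimate shows why the self-merge runs out of memory. The sketch below uses the approximate figures quoted above and assumes, for simplicity, that every security has the same number of daily observations.

```r
n_sec  <- 800      # "about 800 different SEC_ID"
n_rows <- 180000   # "about 180000 rows"

# Assumption: observations are spread evenly across securities
n_days <- n_rows / n_sec

# Number of unordered pairs with SEC_ID.x < SEC_ID.y
n_pairs <- choose(n_sec, 2)

# One merged row per pair per day
merged_rows <- n_pairs * n_days
print(c(days = n_days, pairs = n_pairs, merged_rows = merged_rows))

# For comparison, a DAY x SEC_ID matrix holds only this many cells
print(n_days * n_sec)
```

Under these assumptions the merged frame has roughly 72 million rows of nine columns each, while the same information fits in a matrix of 180000 cells.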
Besides partitioning the input and using a loop, are there any smarter
solutions (possibly using split or other ways of "subgroup merging")
to solve the problem?
Are there any "shortcuts" using built-in statistical functions (e.g.
cov, vcov)?
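One possible shortcut, sketched here on a toy subset of the sample rows, is to reshape df from long to wide (one row per DAY, one column per SEC_ID) and pass the resulting matrix to cor(), which returns all pairwise values at once. This assumes the formula is meant as a Pearson correlation with population standard deviations; cov() on the same matrix would give ordinary covariances instead. vcov() is intended for fitted model objects, so it is probably not the right tool here. The names wide, cc and pair_list are illustrative.

```r
# Toy subset in the same long layout as df (values taken from the sample)
df <- data.frame(
  SEC_ID  = rep(c("IT0000001", "IT0000002", "IT0000003"), each = 3),
  DAY     = rep(c(20070101, 20070102, 20070103), times = 3),
  EARNING = c(5.467, 5.456, 4.954,
              1.456, 1.345, 1.233,
              0.345, 0.367, 0.319)
)

# Long to wide: rows are days, columns are securities.
# Missing (DAY, SEC_ID) combinations would become NA.
wide <- tapply(df$EARNING, list(df$DAY, df$SEC_ID), mean)

# All pairwise correlations in one call; NA days are dropped pair by pair
cc <- cor(wide, use = "pairwise.complete.obs")

# Keep each pair once, with SEC_ID.x < SEC_ID.y (upper triangle)
idx <- which(upper.tri(cc), arr.ind = TRUE)
pair_list <- data.frame(
  SEC_ID.x = rownames(cc)[idx[, "row"]],
  SEC_ID.y = colnames(cc)[idx[, "col"]],
  COR      = cc[idx]
)
print(pair_list)
```

With about 800 securities the wide matrix stays small, and the result is an 800 by 800 matrix rather than a merged frame with one row per pair per day.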
Thank you in advance
Angelo Linardi