[R] How to use ddply
Amitabh Dugar
cleverchap at yahoo.com
Mon Jan 13 22:29:42 CET 2014
I have never posted a question to the R-help list before; is sending this email the right way to do so?
I am trying to use the ddply function in the plyr package to accomplish the following:
I have a data frame of the type:
     ticker monthend_n wgtdiff    ret
156      AA   19990228  0.7172  -2.58
545    AAPL   19990228 -0.0828 -15.48
925    ABCW   19990228  0.0966  -7.36
1041   ABFS   19990228  0.1320  -8.89
1165    ABI   19990228  0.2355   4.61
1482    ABS   19990228  0.1668  -6.56
1563    ABT   19990228  0.1650  -0.27
1790   ACAT   19990228  0.1540 -13.82
2498    ACN   19990228  0.0000  12.15
2532    ACO   19990228  0.1320   8.48
2857    ACV   19990228  0.1540  -6.54
2942   ACXM   19990228  0.0000  -6.13
3303   ADCT   19990228  0.1035   1.73
3568    ADM   19990228  0.1540   0.33
4072   ADSK   19990228 -0.1035  -9.19
4672    AEH   19990228  0.1650     NA
4673   AEIC   19990228  0.1314  -6.95
4867    AEP   19990228  0.1540  -3.62
157      AA   19990331  0.1932   1.70
546    AAPL   19990331  0.0330   3.23
1005    ABF   19990331  0.1540 -20.51
1166    ABI   19990331  0.2860   8.33
1255    ABK   19990331  0.0966  -3.57
1483    ABS   19990331  0.0000  -4.50
1564    ABT   19990331  0.3955   1.08
1733    ABX   19990331  0.2340  -3.53
2533    ACO   19990331  0.0966   5.26
3304   ADCT   19990331  0.2925  17.75
3418    ADI   19990331  0.2688  18.70
3724    ADP   19990331  0.1540 -38.43
4514    AEE   19990331  0.1540  -1.31
4868    AEP   19990331 -0.0966  -4.65
I am trying to generate quintile cutoff points across the distribution of tickers for every month, using the command:
> result <- ddply(test, .(monthend_n), .fun=cut, test$wgtdiff,5)
I get the message:
Error in cut.default(piece, ...) : 'x' must be numeric
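A note on what seems to be happening here, with a hedged sketch of one possible alternative. ddply() hands each monthly piece, which is a whole data frame, to .fun as its first argument, so cut() receives a data frame as 'x' rather than a numeric vector. One way around this is to let transform() be the .fun and call cut() on the column inside it. The sketch assumes a small hand-typed subset of the data above; the new column names bin and interval are illustrative only.

```r
library(plyr)

# Hand-typed subset of the data frame shown above (two months, five tickers).
test <- data.frame(
  ticker     = c("AA", "AAPL", "ABI", "ABS", "ABT",
                 "AA", "AAPL", "ABI", "ABS", "ABT"),
  monthend_n = rep(c(19990228, 19990331), each = 5),
  wgtdiff    = c(0.7172, -0.0828, 0.2355, 0.1668, 0.1650,
                 0.1932,  0.0330, 0.2860, 0.0000, 0.3955)
)

# transform() is applied to each monthly piece, so wgtdiff below refers
# to that month's values only.
#   bin      : integer code 1..5 of the interval each row falls in
#   interval : the interval label as text (kept as character so that
#              months with different cut points combine cleanly)
result <- ddply(test, .(monthend_n), transform,
                bin      = cut(wgtdiff, 5, labels = FALSE),
                interval = as.character(cut(wgtdiff, 5)))

print(result)
```

Note that cut(x, 5) divides the range of x into five equal-width intervals, which is not necessarily the same as five groups with equal numbers of tickers.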
I also tried splitting the data frame into a list of monthly data frames, extracting the wgtdiff column from each, and passing the result to the cut function, but that did not work either:
pieces <- split(test,test$monthend_n)
vectors<- lapply(pieces,"[[","wgtdiff")
quintiles <- lapply(vectors,cut(vectors[1:2],5))
Error in cut.default(vectors[1:2], 5) : 'x' must be numeric
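A likely explanation for this second error, with a minimal sketch: lapply() expects a function as its second argument, but cut(vectors[1:2], 5) is evaluated immediately, and vectors[1:2] is a list rather than a numeric vector. Passing cut itself, with breaks = 5 forwarded as an extra argument, applies it to each month's vector in turn. The sketch below is base R only and assumes a small hand-typed subset of the data; cutoffs is an illustrative name.

```r
# Hand-typed subset of the data frame shown earlier (two months, five tickers).
test <- data.frame(
  ticker     = c("AA", "AAPL", "ABI", "ABS", "ABT",
                 "AA", "AAPL", "ABI", "ABS", "ABT"),
  monthend_n = rep(c(19990228, 19990331), each = 5),
  wgtdiff    = c(0.7172, -0.0828, 0.2355, 0.1668, 0.1650,
                 0.1932,  0.0330, 0.2860, 0.0000, 0.3955)
)

pieces  <- split(test, test$monthend_n)       # one data frame per month
vectors <- lapply(pieces, "[[", "wgtdiff")    # one numeric vector per month

# Pass the function, not a call: lapply() supplies each vector as 'x'
# and forwards breaks = 5 to cut().
quintiles <- lapply(vectors, cut, breaks = 5)

# Interval labels (the cut points) for each month.
cutoffs <- lapply(quintiles, levels)
print(cutoffs)
```

If equal-count groups are wanted rather than equal-width intervals, the breaks could instead come from quantile(), for example cut(v, breaks = quantile(v, probs = seq(0, 1, 0.2), na.rm = TRUE), include.lowest = TRUE), though that call fails when tied values produce duplicate breaks.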
However, the cut function does the job correctly when I pass it only an individual month's data, as below:
first <- pieces[[1]]
quintiles <- cut(first$wgtdiff,5)
levels(quintiles)
What is the correct way to solve this problem?
Thanks for your help, everyone!