[R] data.frame and ddply
arnaud Gaboury
arnaud.gaboury at gmail.com
Thu Apr 15 18:04:32 CEST 2010
Dear group,
I have the following data.frame:
> c
      DESCRIPTION CREATED.DATE QUANITY CLOSING.PRICE
26 PRM HGH GD ALU   2010-04-09      -1    2,415.9000
27 PRM HGH GD ALU   2010-04-09       1    2,415.9000
28 PRIMARY NICKEL   2010-03-04       1   25,755.7100
29 PRIMARY NICKEL   2010-03-05      -1   25,755.7100
30 PRIMARY NICKEL   2010-03-10      -1   25,760.8600
31 PRIMARY NICKEL   2010-03-10       1   25,760.8600
32  STANDARD LEAD   2010-04-01       1    2,355.9600
33  STANDARD LEAD   2010-04-01      -1    2,355.9600
34  STANDARD LEAD   2010-04-01       1    2,355.9600
35  STANDARD LEAD   2010-04-01      -1    2,355.9600
36  STANDARD LEAD   2010-04-01      -1    2,355.9600
37  STANDARD LEAD   2010-04-01       1    2,355.9600
38  STANDARD LEAD   2010-04-01      -1    2,355.9600
39  STANDARD LEAD   2010-04-06       1    2,357.1200
40  STANDARD LEAD   2010-04-08       1    2,420.7300
41 SPCL HIGH GRAD   2010-04-08       1    2,420.7300
42 SPCL HIGH GRAD   2010-04-08      -1    2,420.7300
43 SPCL HIGH GRAD   2010-04-09      -1    2,421.0500
44 SPCL HIGH GRAD   2010-04-09       1    2,421.0500
45 SPCL HIGH GRAD   2010-04-09      -1    2,421.0500
46 SPCL HIGH GRAD   2010-04-09       1    2,421.0500
47 SPCL HIGH GRAD   2010-04-09       1    2,421.0500
48 SPCL HIGH GRAD   2010-01-13       1    2,388.4300
49 SPCL HIGH GRAD   2010-01-25      -1    2,388.4300
The goal is to get a smaller data.frame with one row per DESCRIPTION
(PRM HGH GD ALU, PRIMARY NICKEL, STANDARD LEAD, SPCL HIGH GRAD), where the
QUANITY column is the sum of QUANITY and the DATE column is the max
CREATED.DATE, together with the CLOSING.PRICE corresponding to that date.
If I run this:
> op <- ddply(c, c("DESCRIPTION"), summarise, POSITION = sum(QUANITY),
+             DATE = max(CREATED.DATE))
It returns this:
> op
     DESCRIPTION POSITION       DATE
1 PRIMARY NICKEL        0 2010-03-10
2 PRM HGH GD ALU        0 2010-04-09
3 SPCL HIGH GRAD        2 2010-04-09
4  STANDARD LEAD        0 2010-04-06
Not bad: I get my 4 groups, the sum of QUANITY for each one, and the max
DATE for each one. But I would also like a CLOSING.PRICE column holding the
CLOSING.PRICE that corresponds to the max date.
I have no idea how to do this.
Any help would be appreciated.
TY
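
One possible approach, sketched here under stated assumptions rather than as a definitive answer: inside the same summarise call, index CLOSING.PRICE by the position of the latest CREATED.DATE within each group. The data frame name `trades` and its five rows are a hypothetical subset of the data above. The sketch also assumes that CREATED.DATE is (or can be converted to) a Date and that CLOSING.PRICE was read in as text with thousands separators.

```r
library(plyr)

# Hypothetical subset of the rows shown above; `trades` is an illustrative name.
trades <- data.frame(
  DESCRIPTION   = c("PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL",
                    "PRM HGH GD ALU", "PRM HGH GD ALU"),
  CREATED.DATE  = as.Date(c("2010-03-04", "2010-03-05", "2010-03-10",
                            "2010-04-09", "2010-04-09")),
  QUANITY       = c(1, -1, -1, -1, 1),
  CLOSING.PRICE = c("25,755.7100", "25,755.7100", "25,760.8600",
                    "2,415.9000", "2,415.9000"),
  stringsAsFactors = FALSE
)

# Assumption: prices arrived as text such as "2,415.9000"; strip the commas
# and convert to numeric before summarising.
trades$CLOSING.PRICE <- as.numeric(gsub(",", "", trades$CLOSING.PRICE))

op <- ddply(trades, "DESCRIPTION", summarise,
            POSITION      = sum(QUANITY),
            DATE          = max(CREATED.DATE),
            # which.max() returns the index of the first row with the latest date
            CLOSING.PRICE = CLOSING.PRICE[which.max(as.numeric(CREATED.DATE))])
```

When several rows in a group share the latest date, which.max() picks the first of them. In the data shown, rows with the same DESCRIPTION and date carry the same CLOSING.PRICE, so the choice should not matter here.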