[R] data.frame and ddply

arnaud Gaboury arnaud.gaboury at gmail.com
Thu Apr 15 18:04:32 CEST 2010


Dear group,

I have the following data.frame:


> c
      DESCRIPTION CREATED.DATE QUANITY CLOSING.PRICE
26 PRM HGH GD ALU   2010-04-09      -1    2,415.9000
27 PRM HGH GD ALU   2010-04-09       1    2,415.9000
28 PRIMARY NICKEL   2010-03-04       1   25,755.7100
29 PRIMARY NICKEL   2010-03-05      -1   25,755.7100
30 PRIMARY NICKEL   2010-03-10      -1   25,760.8600
31 PRIMARY NICKEL   2010-03-10       1   25,760.8600
32 STANDARD LEAD    2010-04-01       1    2,355.9600
33 STANDARD LEAD    2010-04-01      -1    2,355.9600
34 STANDARD LEAD    2010-04-01       1    2,355.9600
35 STANDARD LEAD    2010-04-01      -1    2,355.9600
36 STANDARD LEAD    2010-04-01      -1    2,355.9600
37 STANDARD LEAD    2010-04-01       1    2,355.9600
38 STANDARD LEAD    2010-04-01      -1    2,355.9600
39 STANDARD LEAD    2010-04-06       1    2,357.1200
40 STANDARD LEAD    2010-04-08       1    2,420.7300
41 SPCL HIGH GRAD   2010-04-08       1    2,420.7300
42 SPCL HIGH GRAD   2010-04-08      -1    2,420.7300
43 SPCL HIGH GRAD   2010-04-09      -1    2,421.0500
44 SPCL HIGH GRAD   2010-04-09       1    2,421.0500
45 SPCL HIGH GRAD   2010-04-09      -1    2,421.0500
46 SPCL HIGH GRAD   2010-04-09       1    2,421.0500
47 SPCL HIGH GRAD   2010-04-09       1    2,421.0500
48 SPCL HIGH GRAD   2010-01-13       1    2,388.4300
49 SPCL HIGH GRAD   2010-01-25      -1    2,388.4300

The goal is to get a smaller data.frame with one row per DESCRIPTION
(PRM HGH GD ALU, PRIMARY NICKEL, STANDARD LEAD, SPCL HIGH GRAD), a QUANITY
column holding the sum of QUANITY, and a DATE column holding the max
CREATED.DATE together with the corresponding CLOSING.PRICE.

If I run this:

    > op = ddply(c, c("DESCRIPTION"), summarise,
    +            POSITION = sum(QUANITY), DATE = max(CREATED.DATE))

It returns this:

> op
     DESCRIPTION POSITION       DATE
1 PRIMARY NICKEL        0 2010-03-10
2 PRM HGH GD ALU        0 2010-04-09
3 SPCL HIGH GRAD        2 2010-04-09
4 STANDARD LEAD         0 2010-04-06

Not bad: I have my 4 elements, the sum of QUANITY for each one and the max
DATE for each one. BUT I would also like a CLOSING.PRICE column holding the
CLOSING.PRICE that corresponds to the max date.
I have no idea how to do this.

Any help would be appreciated.
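
For illustration only, one direction that might work is to index
CLOSING.PRICE by the position of the latest date inside the same summarise
call. The sketch below is not a confirmed answer. It assumes that
CREATED.DATE is of class Date and that CLOSING.PRICE is stored as text, as
printed above; the name `trades` is hypothetical and the rows are a few
copied from the data frame shown earlier.

```r
library(plyr)

# A few rows copied from the data frame above. The name `trades` is
# hypothetical. Assumptions: CREATED.DATE is a Date and CLOSING.PRICE is
# kept as text, exactly as it prints.
trades <- data.frame(
  DESCRIPTION   = c("PRM HGH GD ALU", "PRM HGH GD ALU",
                    "PRIMARY NICKEL", "PRIMARY NICKEL",
                    "PRIMARY NICKEL", "PRIMARY NICKEL"),
  CREATED.DATE  = as.Date(c("2010-04-09", "2010-04-09",
                            "2010-03-04", "2010-03-05",
                            "2010-03-10", "2010-03-10")),
  QUANITY       = c(-1, 1, 1, -1, -1, 1),
  CLOSING.PRICE = c("2,415.9000", "2,415.9000",
                    "25,755.7100", "25,755.7100",
                    "25,760.8600", "25,760.8600"),
  stringsAsFactors = FALSE
)

# Same call as in the question, plus one extra summary column:
# which.max() gives the position of the (first) latest date within each
# group, and that position is used to pick the matching price.
op <- ddply(trades, "DESCRIPTION", summarise,
            POSITION      = sum(QUANITY),
            DATE          = max(CREATED.DATE),
            CLOSING.PRICE = CLOSING.PRICE[which.max(as.numeric(CREATED.DATE))])
print(op)
```

When several rows share the latest date, which.max() picks the first of
them; in the data above the price is the same for all rows of a given day,
so that choice should not matter here.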

Thank you.


