[R] Equivalent of 'first.var' or 'last.var' from SAS in R?

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Sep 25 21:26:28 CEST 2008


Matthew Pettis wrote:
> Hi,
>
> I want to sort a data frame by multiple columns and then take the
> first record in each unique level of the "by" group I used to sort the
> data frame.  Does someone have an example of how to do this?
>
> Thanks,
> Matt
>
>   
Something like this

 > aggregate(airquality,airquality["Month"],head,1)
  Month Ozone Solar.R Wind Temp Month Day
1     5    41     190  7.4   67     5   1
2     6    NA     286  8.6   78     6   1
3     7   135     269  4.1   84     7   1
4     8    39      83  6.9   81     8   1
5     9    96     167  6.9   91     9   1

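where you probably want to drop the first column: it is the grouping
column that aggregate() prepends, and it duplicates the Month column
already present in the data.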
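
A minimal sketch of dropping that extra column, assuming the built-in
airquality data set; the name first_by_month is only illustrative:

```r
# First record per month; [-1] drops the grouping column that
# aggregate() puts in front, leaving the original six columns.
first_by_month <- aggregate(airquality, airquality["Month"], head, 1)[-1]
print(first_by_month)
```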

or, taking aq to be a copy of airquality (aq <- airquality):

 > unsplit(lapply(split(aq,aq$Month), head,1),5:9)
    Ozone Solar.R Wind Temp Month Day
1      41     190  7.4   67     5   1
32     NA     286  8.6   78     6   1
62    135     269  4.1   84     7   1
93     39      83  6.9   81     8   1
124    96     167  6.9   91     9   1

This also works, but the "tail" variant (last row of each group) is
harder, since "[" would need the number of rows in each piece:

 > unsplit(lapply(split(aq,aq$Month), "[",1,),5:9)
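
For the original question (sort by several columns, then keep the first
or last record of each "by" group), one possible approach is to sort with
order() and flag group boundaries with duplicated(). This is a sketch
rather than anything from the reply above: it assumes aq is a copy of
airquality, and by_cols, first_flag and last_flag are illustrative names.

```r
aq <- airquality

# Sort by the "by" column first, then by the within-group key
# (here: hottest day first within each month).
aq <- aq[order(aq$Month, -aq$Temp), ]

by_cols <- "Month"

# TRUE on the first / last row of each by-group, roughly what
# first.Month and last.Month would flag in a SAS data step.
first_flag <- !duplicated(aq[by_cols])
last_flag  <- !duplicated(aq[by_cols], fromLast = TRUE)

first_rows <- aq[first_flag, ]
last_rows  <- aq[last_flag, ]

# The "tail" variant via split(), for comparison.
last_via_tail <- do.call(rbind, lapply(split(aq, aq$Month), tail, 1))

print(first_rows)
print(last_rows)
```

With more than one "by" column, by_cols can be a character vector such as
c("Month", "Day"), provided the data frame was sorted by those columns
first.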



-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907


