[R] Simple indexing conundrum

Martin Henry H. Stevens HStevens at MUOhio.edu
Fri Jul 1 13:54:54 CEST 2005


My apologies in advance for my thickness but I can't seem to solve the 
following, seemingly simple, data manipulation problem:

I have a data frame that contains multiple factors and multiple 
continuous response variables, but some factor combinations occur 
more than once. The duplicated rows contain bad data, so I would like 
to eliminate them. For each factor combination, I would like to retain 
only the entire row that has the maximum value of one particular 
continuous response variable.

For instance,

 > data(airquality)

 > str(airquality)
`data.frame':	153 obs. of  6 variables:
  $ Ozone  : int  41 36 12 18 NA 28 23 19 8 NA ...
  $ Solar.R: int  190 118 149 313 NA NA 299 99 19 194 ...
  $ Wind   : num  7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
  $ Temp   : int  67 72 74 62 56 66 65 59 61 69 ...
  $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
  $ Day    : int  1 2 3 4 5 6 7 8 9 10 ...

I would like to subset airquality, retaining only the rows containing 
the maximum Solar.R for each month.
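
To make the desired result concrete, here is a minimal sketch of one 
possible approach in base R, not necessarily the best one. It assumes 
that rows with a missing Solar.R can be ignored and that, if several 
rows tie for a month's maximum, all of them are kept. The names 
month_max, is_max and result are made up for this illustration.

```r
data(airquality)

# For every row, compute the maximum Solar.R within that row's Month.
# ave() applies FUN within each group and returns a vector aligned
# with the original rows; na.rm = TRUE skips missing values.
month_max <- ave(airquality$Solar.R, airquality$Month,
                 FUN = function(x) max(x, na.rm = TRUE))

# Keep rows whose Solar.R is non-missing and equals the monthly maximum.
is_max <- !is.na(airquality$Solar.R) & airquality$Solar.R == month_max

result <- airquality[is_max, ]
result
```

With more than one grouping factor, the additional factors could be 
passed to ave() as further grouping arguments after the first.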

Any solution would be greatly appreciated.

Regards,
Hank



Dr. Martin Henry H. Stevens, Assistant Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/botany/bot/henry.html
http://www.muohio.edu/ecology/
http://www.muohio.edu/botany/
"E Pluribus Unum"



