[R] searching for data.frame rows / processing of rows
Gabor Grothendieck
ggrothendieck at gmail.com
Thu Oct 5 08:02:45 CEST 2006
Group the data frame by the first two columns, apply colMeans to
each group, and then rbind the pieces of the resulting "by" object together:
do.call(rbind, by(DF, DF[2:1], colMeans, na.rm = TRUE))
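As a quick check, here is a minimal sketch that runs the one-liner on the seven sample rows quoted below. The construction of DF and the name res are my own reconstruction for illustration, not the poster's actual data or code:

```r
# Sample rows reconstructed from the quoted post (an assumption;
# the real data.frame has about 10,000 rows).
DF <- data.frame(
  Location = c(1, 1, 1, 2, 2, 2, 2),
  Time     = c(0, 3, 6, 0, 3, 3, 6),
  X        = c(1.6, 4.2, 2.7, 0.5, NA, 5.0, 3.4),
  Y        = c(9.3, 10.4, 16.3, 2.1, 3.6, 0.06, 14.0)
)

# One group per (Time, Location) combination; colMeans averages every
# column within the group, dropping NAs, and rbind stacks the results.
res <- do.call(rbind, by(DF, DF[2:1], colMeans, na.rm = TRUE))

# The result is a numeric matrix; convert it if a data.frame is needed.
res <- as.data.frame(res)
print(res)
```

Listing Time before Location in DF[2:1] makes Time vary fastest, so the rows come out ordered by Location and then by Time. The two Time = 3 rows for Location = 2 should collapse to the single row 2, 3, 5.0, 1.83 asked for in the question.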
On 10/5/06, Greg Tarpinian <sasprog474 at yahoo.com> wrote:
> R 2.3.1, WinXP:
>
> I have a puzzling problem that I suspect may be solved using
> grep or a regular expression but am at a loss how to actually do it...
> My data.frame looks like
>
> Location  Time    X      Y
> --------  ----  ---  -----
>        1     0  1.6    9.3
>        1     3  4.2   10.4
>        1     6  2.7   16.3
>        2     0  0.5    2.1
>        2     3   NA    3.6
>        2     3  5.0   0.06
>        2     6  3.4   14.0
>
> and so forth. I would like to search for duplicate Time values
> within a Location and take the numerical average (where possible)
> of the elements in X and Y. These numerical averages should
> then be used to create a single row where multiple rows once
> existed. So, I would like to obtain
>
> 2 3 5.0 1.83
>
> for the two Time = 3 rows for Location = 2 and use it to replace
> these two rows. Ideally, avoiding for(i in 1:blah) loops would be
> nice because the data.frame has about 10,000 rows that need to
> be searched and processed. My intent is to do some comparing of
> SAS to R -- the DATA step processing in SAS is quite fast and
> using the RETAIN statement along with the LAG( ) function allows
> this sort of thing to be done rapidly.
>
>
> Thanks in advance,
>
> Greg