[R] average duplicated rows?
Pieter Schoonees
schoonees at ese.eur.nl
Fri Oct 12 18:18:44 CEST 2012
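You will have to split() the data by gene_id, average FL_EARLY within each piece, and then recombine the pieces. Note that unsplit() restores the original row layout, so once each piece has been collapsed to a single row, do.call(rbind, ...) is probably the simpler way to recombine. Have a look at the plyr package for functions that wrap this split-apply-combine pattern.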
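
The split-and-recombine idea can be sketched in base R as follows. This is only a minimal sketch, not a tested solution for the full data set: the data frame name dat and the five rows in it are copied by hand from the example in the original post, and it assumes that FL_LATE and sample_2 are identical within each gene_id, as they are in that example.

```r
# A few rows copied from the example data in the original post
# ("dat" is a name chosen here for illustration)
dat <- data.frame(
  gene_id  = c("Eucgr.A00054", "Eucgr.A00101", "Eucgr.A00101",
               "Eucgr.A00350", "Eucgr.A00350"),
  sample_1 = c("fl_S1E", "fl_S1E", "fl_S2", "fl_S1E", "fl_S2"),
  sample_2 = "fl_S1L",
  FL_EARLY = c(13.170800, 0.367960, 0.356625, 7.381070, 10.674500),
  FL_LATE  = c(22.2605, 14.1202, 14.1202, 43.9275, 43.9275),
  stringsAsFactors = FALSE
)

# Split into one data frame per gene_id
pieces <- split(dat, dat$gene_id)

# Collapse each piece to a single row
collapsed <- lapply(pieces, function(p) {
  data.frame(
    gene_id  = p$gene_id[1],
    sample_2 = p$sample_2[1],      # assumed constant within a gene_id
    FL_EARLY = mean(p$FL_EARLY),   # average over the duplicated rows
    FL_LATE  = p$FL_LATE[1],       # assumed constant within a gene_id
    stringsAsFactors = FALSE
  )
})

# Recombine the one-row pieces into a single data frame
result <- do.call(rbind, collapsed)
rownames(result) <- NULL
result
```

The sample_1 column is dropped because it no longer has a single value once the rows are averaged. If the assumptions above hold, aggregate(FL_EARLY ~ gene_id + sample_2 + FL_LATE, data = dat, FUN = mean) should give the same values in one call, with the columns in a different order.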
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Vining, Kelly
> Sent: Friday 12 October 2012 5:42
> To: r-help at r-project.org
> Subject: [R] average duplicated rows?
>
> Dear useRs,
>
> I have a slightly complicated data structure and am stuck trying to
> extract what I need. I'm pasting an example of this data below. In some
> cases, there are duplicates in the "gene_id" column because there are
> two different "sample 1" values for a given "sample 2" value. Where
> these duplicates exist, I need to average the corresponding "FL_EARLY"
> values and retain the "FL_LATE" value and replace those two rows with a
> row containing the "FL_EARLY" average so that I no longer have any
> "gene_id" duplicates.
>
> Seems like this is a job for some version of the apply function, but
> searching and puzzling over this has not gotten me anywhere. Any help
> will be much appreciated!
>
> Example data:
>
>
>              gene_id sample_1 sample_2   FL_EARLY  FL_LATE
> 763938  Eucgr.A00054   fl_S1E   fl_S1L  13.170800  22.2605
> 763979  Eucgr.A00101   fl_S1E   fl_S1L   0.367960  14.1202
> 1273243 Eucgr.A00101    fl_S2   fl_S1L   0.356625  14.1202
> 764169  Eucgr.A00350   fl_S1E   fl_S1L   7.381070  43.9275
> 1273433 Eucgr.A00350    fl_S2   fl_S1L  10.674500  43.9275
> 1273669 Eucgr.A00650    fl_S2   fl_S1L  33.669100  50.0169
> 764480  Eucgr.A00744   fl_S1E   fl_S1L 132.429000 747.2770
> 1273744 Eucgr.A00744    fl_S2   fl_S1L 142.659000 747.2770
> 764595  Eucgr.A00890   fl_S1E   fl_S1L   2.937760  14.9647
> 764683  Eucgr.A00990   fl_S1E   fl_S1L   8.681250  48.5492
> 1273947 Eucgr.A00990    fl_S2   fl_S1L  10.553300  48.5492
> 764710  Eucgr.A01020   fl_S1E   fl_S1L   0.000000  57.9273
> 1273974 Eucgr.A01020    fl_S2   fl_S1L   0.000000  57.9273
> 764756  Eucgr.A01073   fl_S1E   fl_S1L   8.504710 101.1870
> 1274020 Eucgr.A01073    fl_S2   fl_S1L   5.400010 101.1870
> 764773  Eucgr.A01091   fl_S1E   fl_S1L   3.448910  15.7756
> 764826  Eucgr.A01152   fl_S1E   fl_S1L  69.565700 198.2320
> 764831  Eucgr.A01158   fl_S1E   fl_S1L   7.265640  30.9565
> 764845  Eucgr.A01172   fl_S1E   fl_S1L   3.248020  16.9127
> 764927  Eucgr.A01269   fl_S1E   fl_S1L  18.710200  76.6918
>
>
>
> --Kelly V.