[BioC] finding and deleting repeated observations
mervi.alanne at wri.fi
Fri May 28 19:27:03 CEST 2010
Dear all,
I'm a novice with R and could use some help. How could I find repeated
observations based on one column and select the one to keep based on
another column?
In more detail, this is what I want to achieve:
- The data.frame has 4 columns: GeneSymbol, A, B, pvalue
- A value in column GeneSymbol may be repeated 1-6 times
- The data also contain unique observations
- Of the repeated observations, keep the one with the lowest pvalue
- Do not discard the data in columns A and B
Example input data:
GeneSymbol A B pvalue
ABC1 12 44 0.01
ABC1 2 32 0.05
AB 4 55 0.2
ABCD1 15 25 0.005
ABCD1 11 27 0.002
ABCD1 9 18 0.0001
I'd like the output to look like this:
GeneSymbol A B pvalue
ABC1 12 44 0.01
AB 4 55 0.2
ABCD1 9 18 0.0001
Any suggestions?
-Mervi
Wihuri Research Institute
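
One possible approach to the question above, sketched in base R: sort the rows by pvalue, then drop every row whose GeneSymbol has already been seen, so that only the lowest-pvalue row per symbol remains and its A and B values come along with it. The object names df, sorted and best are illustrative assumptions, not taken from the post, and the data are the example rows given above.

```r
# Example data copied from the post (df is a hypothetical object name)
df <- data.frame(
  GeneSymbol = c("ABC1", "ABC1", "AB", "ABCD1", "ABCD1", "ABCD1"),
  A          = c(12, 2, 4, 15, 11, 9),
  B          = c(44, 32, 55, 25, 27, 18),
  pvalue     = c(0.01, 0.05, 0.2, 0.005, 0.002, 0.0001),
  stringsAsFactors = FALSE
)

# Sort by pvalue so the smallest pvalue of each GeneSymbol comes first
sorted <- df[order(df$pvalue), ]

# duplicated() flags the second and later occurrences of a GeneSymbol;
# negating it keeps only the first (lowest-pvalue) row per symbol
best <- sorted[!duplicated(sorted$GeneSymbol), ]

# Optional: restore the order in which symbols first appear in the input
best <- best[order(match(best$GeneSymbol, unique(df$GeneSymbol))), ]
rownames(best) <- NULL

print(best)
```

Because whole rows are kept, the ABC1 row retains A = 12 and B = 44, the values that belong to its pvalue of 0.01. If two rows of the same GeneSymbol share the lowest pvalue, this sketch keeps only one of them.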