[R] Remove duplicates from a data frame but with some special requirements
gcam
gcam032 at gmail.com
Thu Dec 17 04:55:52 CET 2009
Hi all.
So I have a data frame with multiple columns/variables. The first variable
is a major sample name for which there are some sub-samples. Currently I
have used the following command to remove the duplicates:
Samps_working <- Samps[!duplicated(Samps$ESR_Ref_edit), ]
This removes all of the duplicated sample rows.
However, I have just realised that this simply keeps the first observation
of each duplicated set and drops the rest, whatever they contain. Instead, I
wish to retain any that have the code "Y" in another variable, Samps$Loaded.
I'm at a bit of a loss as to how best to approach this problem.
To reiterate: I want to remove all duplicate rows based on sample name, but
when choosing which rows of a duplicated set to remove, preference should go
to those that do not have a "Y" in the Loaded column.
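
One possible way to express this requirement is sketched below. It is only an illustration: the column names ESR_Ref_edit and Loaded come from the description above, but the toy values are invented, and the helper names is_loaded, ord and Samps_sorted are hypothetical. The idea is to sort the rows so that any "Y" row comes first within its sample name, and then keep the first row of each sample, since duplicated() flags only the second and later occurrences.

```r
# Toy data: column names follow the post, values are made up for illustration.
Samps <- data.frame(
  ESR_Ref_edit = c("A1", "A1", "B2", "B2", "B2", "C3"),
  Loaded       = c("N",  "Y",  "N",  "N",  "Y",  "N"),
  stringsAsFactors = FALSE
)

# TRUE for rows marked "Y"; missing values are treated as not loaded (assumption).
is_loaded <- !is.na(Samps$Loaded) & Samps$Loaded == "Y"

# Sort by sample name, then put "Y" rows first within each sample.
# !is_loaded is FALSE for "Y" rows, and FALSE sorts before TRUE.
ord <- order(Samps$ESR_Ref_edit, !is_loaded)
Samps_sorted <- Samps[ord, ]

# duplicated() flags the second and later occurrences of each sample name,
# so negating it keeps the first row of each sample (a "Y" row if one exists).
Samps_working <- Samps_sorted[!duplicated(Samps_sorted$ESR_Ref_edit), ]

print(Samps_working)
```

This keeps exactly one row per sample name. If every "Y" row should survive even when a sample has several of them, the filter would instead need to keep rows that are either not duplicated or marked "Y".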
--
View this message in context: http://n4.nabble.com/Remove-duplicates-from-a-data-frame-but-with-some-special-requirements-tp965745p965745.html
Sent from the R help mailing list archive at Nabble.com.