[R] remove duplicated row according to NA condition

William Dunlap wdunlap at tibco.com
Wed May 28 17:43:17 CEST 2014


It would help if you said what you want done when none, all, or some
of the col1-col2 duplicates have NAs in col3.  E.g., what do you
want the function to do for the following input?

> data2 <- data.frame(col1=c("a","a","a","b","b","c","c","d","d","e"),
    col2=c(1,1,1,2,2,3,3,4,4,5),
    col3=c("A1",NA,"A3",NA,"B2","C1","C2",NA,NA,NA))
> data2
   col1 col2 col3
1     a    1   A1
2     a    1 <NA>
3     a    1   A3
4     b    2 <NA>
5     b    2   B2
6     c    3   C1
7     c    3   C2
8     d    4 <NA>
9     d    4 <NA>
10    e    5 <NA>

(You may want it to return a data.frame or you may want the function
to stop because the data is not considered legal, but you should
decide what it should do.)
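
For illustration only, here is a minimal sketch of one possible policy.
The rules are assumptions, not something stated in the original
question: within each col1-col2 group, rows with NA in col3 are dropped
if the group has at least one non-NA col3; if every col3 in the group is
NA, the first row is kept; duplicates with different non-NA col3 values
are all kept. The name dropNaDuplicates is hypothetical.

```r
# Example input with mixed, all-NA and no-NA groups.
data2 <- data.frame(col1=c("a","a","a","b","b","c","c","d","d","e"),
    col2=c(1,1,1,2,2,3,3,4,4,5),
    col3=c("A1",NA,"A3",NA,"B2","C1","C2",NA,NA,NA))

# Hypothetical helper implementing the assumed policy described above.
dropNaDuplicates <- function(df) {
    # One key per col1-col2 combination; "\r" is an arbitrary separator
    # assumed not to occur in the data.
    key <- paste(df$col1, df$col2, sep = "\r")
    isNa <- is.na(df$col3)
    # TRUE for every row whose group has at least one non-NA col3.
    groupHasValue <- ave(as.numeric(!isNa), key, FUN = max) > 0
    # Groups with a value: keep the non-NA rows.
    # Groups that are all NA: keep only the first row.
    keep <- (groupHasValue & !isNa) | (!groupHasValue & !duplicated(key))
    df[keep, , drop = FALSE]
}

result <- dropNaDuplicates(data2)
print(result)
```

Under these assumed rules the call keeps rows 1, 3, 5, 6, 7, 8 and 10
of data2. If mixed or all-NA groups should be treated as illegal input
instead, the function could call stop() when it detects them.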

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, May 28, 2014 at 5:35 AM, jeff6868
<geoffrey_klein at etu.u-bourgogne.fr> wrote:
> Hi everybody,
>
> I have a little problem in my R code which seems to be easy to solve, but I
> haven't been able to find the solution by myself so far.
>
> Here's an example of the form of my data:
>
> data <-
> data.frame(col1=c("a","a","b","b"),col2=c(1,1,2,2),col3=c(NA,"ST001","ST002",NA))
>
> I would like to remove duplicated rows based on the first two columns
> (col1,col2), but in both cases here, I would like to remove the duplicated
> row whose col3 is NA.
>
> Here's the data.frame I would like to obtain:
>
> data2 <- data.frame(col1=c("a","b"),col2=c(1,2),col3=c("ST001","ST002"))
>
> I've been trying to mix duplicated() with is.na() but it doesn't work yet.
>
> Can someone tell me the best and easiest way to do this?
>
> Thanks a lot!