[R] extract duplications from list

arun smartpink111 at yahoo.com
Mon May 19 18:01:11 CEST 2014


Hi,

You may try:

myList <- list(structure(list(X = c("FBgn0000008", "FBgn0000014", "FBgn0000028", 
"FBgn0000109", "FBgn0000114", "FBgn0000120"), NAME = c("FBgn0000008", 
"FBgn0000014", "FBgn0000028", "FBgn0000109", "FBgn0000114", "FBgn0000120"
), MEM.SHIP = c(0.9304502, 1, 1, 1, 0.4839886, 1)), .Names = c("X", 
"NAME", "MEM.SHIP"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6")), structure(list(X = c("FBgn0000251", 
"FBgn0001168", "FBgn0001941", "FBgn0003053", "FBgn0003159", "FBgn0000120"
), NAME = c("FBgn0000251", "FBgn0001168", "FBgn0001941", "FBgn0000028", 
"FBgn0003159", "FBgn0003162"), MEM.SHIP = c(0.313865, 0.8995011, 
0.7485548, 0.4426997, 0.4843226, 0.655629)), .Names = c("X", 
"NAME", "MEM.SHIP"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6")))

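library(data.table)
# Stack all data.frames in the list into one data.table
dt1 <- rbindlist(myList)
# Flag every occurrence of a value that appears more than once
fun1 <- function(val) duplicated(val) | duplicated(val, fromLast = TRUE)
# Add logical columns Col1 and Col2 marking duplicates in X and NAME
dt1[, paste0("Col", 1:2) := lapply(.SD, fun1), .SDcols = 1:2]
dt1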
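
If it also helps to see which data.frame each duplicated row came from, a base R variant might look like the sketch below. The toy list `lst`, the helper `dup_all`, and the column name `source` are made up for illustration and are not part of your data. rbind() only needs matching column names, so differing row counts should not be a problem.

```r
# Toy input: two data.frames with different numbers of rows (hypothetical values)
lst <- list(
  data.frame(X = c("a", "b", "c"), NAME = c("a", "b", "c"),
             stringsAsFactors = FALSE),
  data.frame(X = c("d", "c"), NAME = c("d", "b"),
             stringsAsFactors = FALSE)
)

# Stack all data.frames into one; row counts may differ between elements
all_rows <- do.call(rbind, lst)

# Record which list element each row came from
all_rows$source <- rep(seq_along(lst), vapply(lst, nrow, integer(1)))

# TRUE for every occurrence of a value that appears more than once
dup_all <- function(v) duplicated(v) | duplicated(v, fromLast = TRUE)

# Keep only rows whose X or NAME value occurs more than once in that column
dups <- all_rows[dup_all(all_rows$X) | dup_all(all_rows$NAME), ]
dups
```

With the real data, replacing `lst` by `myList` should be enough, provided all 40 data.frames share the same column names.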

A.K.




On Monday, May 19, 2014 8:50 AM, Assa Yeroslaviz <frymor at gmail.com> wrote:
Hi,

I have a list of 40 data.frames.

I would like to identify duplicated entries in the whole list, not only in
one specific data.frame, but in all 40.

Here is my list:

> myList
[[1]]
            X        NAME  MEM.SHIP
1 FBgn0000008 FBgn0000008 0.9304502
2 FBgn0000014 FBgn0000014 1.0000000
3 FBgn0000028 FBgn0000028 1.0000000
4 FBgn0000109 FBgn0000109 1.0000000
5 FBgn0000114 FBgn0000114 0.4839886
6 FBgn0000120 FBgn0000120 1.0000000

[[2]]
            X        NAME  MEM.SHIP
1 FBgn0000251 FBgn0000251 0.3138650
2 FBgn0001168 FBgn0001168 0.8995011
3 FBgn0001941 FBgn0001941 0.7485548
4 FBgn0003053 FBgn0000028 0.4426997
5 FBgn0003159 FBgn0003159 0.4843226
6 FBgn0000120 FBgn0003162 0.6556290


I would like to know whether there are duplicated entries in the first
and/or second column across all of them. In this list there are two
duplications: FBgn0000120 in row 6 of both data.frames, and FBgn0000028
in row 3 of df1 and row 4 of df2.


Is there a way to do this? With unique() I don't get any results, and I
cannot convert the list into a data.frame, as the number of items in each
df is different.

Thanks
Assa
