[R] is there is a way to extract lines in between 3 files that are in common based on one column?

Ana Marija @okov|c@@n@m@r|j@ @end|ng |rom gm@||@com
Mon Jun 1 23:50:09 CEST 2020


Hello,

I have 3 data frames which have about 3.4 mill lines (but they don't have
exactly the same number of lines)...they look like this:

> neu1=neu[order(neu$Marker),]
> head(neu1)
       Chr        BP          Marker      MAF A1 A2 Direction   pValue    N
209565   1 100000012 1:100000012:G:T 0.229925  T  G         + 0.650403 1594
209566   1 100000827 1:100000827:C:T 0.287014  T  C         + 0.955449 1594
209567   1 100002713 1:100002713:C:T 0.097867  T  C         - 0.290455 1594
209568   1 100002882 1:100002882:T:G 0.287014  G  T         + 0.955449 1594
209569   1 100002991 1:100002991:G:A 0.097867  A  G         - 0.290455 1594
209570   1 100004726 1:100004726:G:A 0.132058  A  G         + 0.115005 1594
> nep1=nep[order(nep$Marker),]
> head(nep1)
       Chr        BP          Marker       MAF A1 A2 Direction    pValue
 N
209642   1 100000012 1:100000012:G:T 0.2300430  T  G         - 0.1420030
1641
209643   1 100000827 1:100000827:C:T 0.2867150  T  C         - 0.2045580
1641
209644   1 100002713 1:100002713:C:T 0.0975015  T  C         - 0.0555507
1641
209645   1 100002882 1:100002882:T:G 0.2867150  G  T         - 0.2045580
1641
209646   1 100002991 1:100002991:G:A 0.0975015  A  G         - 0.0555507
1641
209647   1 100004726 1:100004726:G:A 0.1325410  A  G         - 0.8725660
1641
> ret1=ret[order(ret$Marker),]
> head(ret1)
        Chr        BP          Marker       MAF A1 A2 Direction   pValue
 N
865453    1 100000012 1:100000012:G:T 0.2322760  T  G         - 0.230383
1608
451596    1 100000827 1:100000827:C:T 0.2882460  T  C         - 0.120356
1608
1026046   1 100002713 1:100002713:C:T 0.0982587  T  C         - 0.272936
1608
451597    1 100002882 1:100002882:T:G 0.2882460  G  T         - 0.120356
1608
1026047   1 100002991 1:100002991:G:A 0.0982587  A  G         - 0.272936
1608
2234642   1 100004726 1:100004726:G:A 0.1340170  A  G         - 0.594538
1608

Is there is a way to create another 3 data frames, say neu2, nep2, ret2
which would only contain lines that have the same entries in Marker column
for all 3 data frames?

Thanks
Ana

	[[alternative HTML version deleted]]



More information about the R-help mailing list