[R] comparing columns and printing overlapping rows

MacQueen, Don macqueen1 at llnl.gov
Sat Jun 3 01:06:47 CEST 2017


There are missing details, such as:

  what do you mean by "overlapping"?
  do the "files" have the same number of rows?
  do you care whether the "overlapping" entries are in the same row?
  what kind of R data structure do you have the "files" stored in?
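
The answers matter because they change the code. As a small illustration (the vectors x and y below are made up for this sketch, they are not the poster's data), %in% asks whether a value occurs anywhere in the other vector, while == compares position by position:

```r
## hypothetical example vectors, not taken from the original question
x <- c("a", "b", "c", "d")
y <- c("b", "a", "c", "e")

## TRUE where a value of x occurs anywhere in y
anywhere <- x %in% y

## TRUE only where x and y hold the same value in the same position
## (assumes equal lengths; otherwise R recycles the shorter vector)
same_row <- x == y

x[anywhere]
x[same_row]
```

If "overlapping" means "present in both, regardless of row", the first form is probably the one wanted.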

Assuming your file 1 is stored in a vector and your file 2 is stored in a data frame,
then this example shows one possibility.

## make some example data
f1 <- c('a','b')

f2 <- data.frame( c1=sample(letters[1:5] , 6, replace=TRUE),
                 c2=1:6,
                 c3=month.abb[1:6]
                 )

## find rows in f2 in which a value in the first column of f2 is found in f1
subset(f2, f2$c1 %in% f1)
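
If the data really do live in two text files, the same idea might look like the sketch below. The temporary files, their contents, and the assumption of whitespace-separated columns with no header line are all invented for illustration; adjust the read.table() arguments to match the real files.

```r
## write two small hypothetical files so the sketch is self-contained
file1 <- tempfile(fileext = ".txt")
file2 <- tempfile(fileext = ".txt")
writeLines(c("a", "b"), file1)
write.table(data.frame(c1 = c("a", "c", "b", "e"), c2 = 1:4),
            file2, row.names = FALSE, col.names = FALSE, quote = FALSE)

## read them back; assumes whitespace-separated columns and no header line
ids <- read.table(file1, stringsAsFactors = FALSE)[[1]]
dat <- read.table(file2, stringsAsFactors = FALSE)

## keep the rows of the second file whose first column occurs in the first file
hits <- dat[dat[[1]] %in% ids, ]
hits
```

Bracket indexing is used here instead of subset() only because the columns have no meaningful names; either approach selects the same rows.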


Oh, and:

  Where's your reproducible example? One is generally expected; please see the posting guide (http://www.R-project.org/posting-guide.html).
  Please don't send HTML email to r-help; it usually makes the message unreadable on the list.


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062


On 6/2/17, 2:22 PM, "R-help on behalf of Aanchal Sharma" <r-help-bounces at r-project.org on behalf of aanchalsharma833 at gmail.com> wrote:

    Hi All,
    
    I have two files.
    1. with only one column
    2. data matrix
    
    I need to compare first columns of both files and print the rows from
    second file for the overlapping entries. I have solutions for awk and sed,
    but I need how to do it in R.
    Thanks
    
    Regards
    Anchal
    -- 
    Anchal Sharma, PhD
    Postdoctoral Fellow
    195, Little Albany street,
    Cancer Institute of New Jersey
    Rutgers University
    NJ-08901
    


