[R] Compare two dataframes

Mark Na mtb954 at gmail.com
Thu Dec 16 20:02:29 CET 2010


Hello,

I have two dataframes DF1 and DF2 that should be identical but are not
(DF1 has some rows that aren't in DF2, and vice versa). I would like
to produce a new dataframe DF3 containing rows in DF1 that aren't in
DF2 (and similarly DF4 would contain rows in DF2 that aren't in DF1).

I have a solution for this problem (see self-contained example below)
but it's awkward and requires making a new "ID" column by pasting
together all of the columns in each DF and then comparing the two DFs
based on this unique ID.

Is there a better way?

Many thanks for your help,

Mark



#compare two dataframes and extract uncommon rows

#MAKE SOME DATA
#create unique ID field by pasting all columns together
#(the "_" separator keeps IDs unambiguous, e.g. 1 and 23 vs. 12 and 3)
cars$id<-paste(cars$speed,cars$dist,sep="_")
cars1<-cars[1:35,]
cars2<-cars[16:50,]

#EXTRACT UNIQUE ROWS
#rows unique to cars1 (i.e., not in cars2)
cars1_unique<-cars1[cars1$id %in% setdiff(cars1$id,cars2$id),]
#rows unique to cars2 (i.e., not in cars1)
cars2_unique<-cars2[cars2$id %in% setdiff(cars2$id,cars1$id),]
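
For reference, the same pasted-ID idea can be wrapped in a small function so
that it works for any number of columns without adding an "id" column to the
data. This is only a rough sketch: rows_not_in is a made-up name (it is not
from any package), and it assumes both data frames share the same column
names and that no value contains the "\r" character used as the separator.

```r
# Hypothetical helper (not from any package): returns the rows of x
# that have no exact match in y, comparing all shared columns.
rows_not_in <- function(x, y) {
  cols <- intersect(names(x), names(y))
  # Build one key string per row; unname() so that a column called
  # "sep" or "collapse" cannot be mistaken for a paste() argument.
  make_key <- function(df) {
    do.call(paste, c(unname(as.list(df[cols])), sep = "\r"))
  }
  x[!(make_key(x) %in% make_key(y)), , drop = FALSE]
}

cars1 <- datasets::cars[1:35, ]
cars2 <- datasets::cars[16:50, ]

only_in_cars1 <- rows_not_in(cars1, cars2)  # rows in cars1 but not cars2
only_in_cars2 <- rows_not_in(cars2, cars1)  # rows in cars2 but not cars1
```

Note that rows duplicated within one data frame are kept or dropped
together, since the comparison is on row content rather than row position.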


