[R] Compare two dataframes

Fri Dec 17 00:56:44 CET 2010

Hello Mark,

This is how I do it but it's longer than your code :)

unique.rows <- function (df1, df2) {
# Returns any rows of df1 that are not in df2
  out <- NULL
  for (i in 1:nrow(df1)) {
  	found <- FALSE
    for (j in 1:nrow(df2)) {
      if (all(df1[i,] == df2[j,])) {
      	found <- TRUE
        break
      }
    }
    if (!found) out <- rbind(out, df1[i,])
  }
  out
}

Michael

On 17 December 2010 06:02, Mark Na <mtb954 at gmail.com> wrote:
> Hello,
>
> I have two dataframes DF1 and DF2 that should be identical but are not
> (DF1 has some rows that aren't in DF2, and vice versa). I would like
> to produce a new dataframe DF3 containing rows in DF1 that aren't in
> DF2 (and similarly DF4 would contain rows in DF2 that aren't in DF1).
>
> I have a solution for this problem (see self contained example below)
> but it's awkward and requires making a new "ID" column by pasting
> together all of the columns in each DF and them comparing the two DFs
> based on this unique ID.
>
> Is there a better way?
>
> Many thanks for your help,
>
> Mark
>
>
>
> #compare two dataframes and extract uncommon rows
>
> #MAKE SOME DATA
> cars$id<-paste(cars$speed,cars$dist,sep="") $create unique ID field by
> pasting all columns together
> cars1<-cars[1:35,]
> cars2<-cars[16:50,]
>
> #EXTRACT UNIQUE ROWS
> cars1_unique<-cars1[cars1$id %in% setdiff(cars1$id,cars2$id),] #rows
> unique to cars1 (i.e., not in cars2)
> cars2_unique<-cars2[cars2$id %in% setdiff(cars2$id,cars1$id),] #rows
> unique to cars2
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>