[R] efficiently diff two data frames

Liviu Andronic landronimirc at gmail.com
Tue Apr 16 19:42:42 CEST 2013


Dear all,
What is the quickest and most efficient way to diff two data frames,
so as to obtain a vector of indices (or logical) for rows/columns that
differ in the two data frames?  For example,
> Xe <- head(mtcars)
> Xf <- head(mtcars)
> Xf[2:4,3:5] <- 55
> all.equal(Xe, Xf)
[1] "Component 3: Mean relative difference: 0.6863118"
[2] "Component 4: Mean relative difference: 0.4728435"
[3] "Component 5: Mean relative difference: 14.23546"

I could use all.equal(), but it returns only human-readable messages that
cannot easily be used programmatically. It also gives no information on
the rows. Another way would be to:
> require(prob)
> setdiff(Xe, Xf)
                mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1

But again, this returns neither subsetting indices nor any information on
the columns. Any suggestions on how to approach this?
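
For illustration, here is a minimal sketch of the kind of result wanted. It is not a definitive solution: it assumes both data frames have identical dimensions and the same row and column order, and that neither contains NA values. The names neq, rows_diff and cols_diff are made up for this example.

```r
Xe <- head(mtcars)
Xf <- head(mtcars)
Xf[2:4, 3:5] <- 55

# Elementwise comparison of the two data frames; the result is a
# logical matrix (assumes identical shape and ordering, no NAs)
neq <- Xe != Xf

# Indices of rows and columns holding at least one differing cell
rows_diff <- which(rowSums(neq) > 0)
cols_diff <- which(colSums(neq) > 0)

# Row/column positions of every differing cell
cells_diff <- which(neq, arr.ind = TRUE)
```

Here rows_diff and cols_diff are named integer vectors that can be used directly for subsetting, e.g. Xf[rows_diff, cols_diff].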

Regards,
Liviu


