[R] How to locate the difference from two data frames

David Winsemius dwinsemius at comcast.net
Thu Apr 8 05:04:37 CEST 2010


On Apr 7, 2010, at 9:55 PM, Erik Iverson wrote:

> Jun Shen wrote:
>> Hi, David,
>> Thanks for the reply. However str() doesn't tell me exactly which  
>> element is
>> different. I expect to see a is identical to b. But if there is  
>> some minor
>> difference (usually by human mistake), I want to know which element
>> (numerical or character) is different.
>
> So you're saying you have two data.frames with the same dimensions  
> and want to know which elements are different?  What about different  
> attributes such as names, labels, etc?
>
> A very naive first attempt might be something like:
>
> df1 <- data.frame(a = 1:10, b = 2:11)
> df2 <- data.frame(a = 1:10, b = c(2:10, 12))
>
> mapply("==", df1, df2)

That might or might not reveal the differences that identical() would  
pick up, because "==" compares only the column values and ignores  
attributes such as row.names:

 > df1 <- data.frame(a = 1:10, b = 2:11)
 > df2 <- data.frame(a = 1:10, b = 2:11)
 > attr(df1, "row.names") <- letters[1:10]
 > mapply("identical", attributes(df1), attributes(df2))
     names row.names     class
      TRUE     FALSE      TRUE
 > mapply("==", df1, df2)
          a    b
  [1,] TRUE TRUE
  [2,] TRUE TRUE
  [3,] TRUE TRUE
  [4,] TRUE TRUE
  [5,] TRUE TRUE
  [6,] TRUE TRUE
  [7,] TRUE TRUE
  [8,] TRUE TRUE
  [9,] TRUE TRUE
[10,] TRUE TRUE
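
To get at the original question (which element differs), one possible sketch is to combine the two checks: use which(..., arr.ind = TRUE) on the elementwise comparison for the values, and compare the attributes by name separately. The data below just reuses the example frames from this thread, and the object names (neq, diffs, attr_same) are made up for illustration. It assumes both data frames have the same dimensions and column order, and that there are no NAs (which() silently drops NA comparisons).

```r
# Example data from earlier in the thread: one value differs in column b
df1 <- data.frame(a = 1:10, b = 2:11)
df2 <- data.frame(a = 1:10, b = c(2:10, 12))

# Elementwise comparison, column by column; TRUE marks a mismatch
neq <- mapply("!=", df1, df2)

# Row and column indices of every mismatching element
diffs <- which(neq, arr.ind = TRUE)
diffs

# Attribute differences are invisible to "!=", so check them separately.
# Matching by attribute name avoids relying on the attributes' order.
attr(df1, "row.names") <- letters[1:10]
a1 <- attributes(df1)
a2 <- attributes(df2)
nms <- union(names(a1), names(a2))
attr_same <- vapply(nms, function(n) identical(a1[[n]], a2[[n]]), logical(1))

# Names of the attributes that are not identical
names(attr_same)[!attr_same]
```

Here diffs should point at row 10 of the second column, and the attribute check should flag row.names. For numeric columns with possible rounding noise, all.equal() might be a better fit than an exact "!=" comparison.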


David Winsemius, MD
West Hartford, CT


