[R] comparing columns in a dataframe

markleeds at verizon.net
Sun Apr 5 00:45:02 CEST 2009


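   Hi: you need to take the setdiff in both directions to get the values
   that appear in only one of the two columns, because setdiff is not
   commutative: setdiff(a,b) does not, in general, equal setdiff(b,a).
   Once you have both, the counts should add up:
   length(setdiff(a,b)) + length(setdiff(b,a)) + length(intersect(a,b))
   should equal length(union(a,b)).
   Going by the counts you report, setdiff(x$redc1, x$rm1) should then have
   303 - 35 - 130 = 138 values. If it doesn't, that would be weird and more
   investigation would be needed.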
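
A minimal sketch of that check, using two made-up vectors in place of x$rm1 and x$redc1 (the data and the names only_a, only_b, both and either are assumptions for illustration, not taken from the original spreadsheet):

```r
# Toy vectors standing in for x$rm1 and x$redc1 (hypothetical data).
a <- c("apple", "pear", "plum", "fig")
b <- c("pear", "fig", "kiwi")

only_a <- setdiff(a, b)    # values in a but not in b
only_b <- setdiff(b, a)    # values in b but not in a
both   <- intersect(a, b)  # values present in both
either <- union(a, b)      # values present in at least one

# The two one-sided differences and the intersection are disjoint,
# so their sizes should sum to the size of the union.
stopifnot(length(only_a) + length(only_b) + length(both) == length(either))

# The same three pieces, pooled, should contain exactly the union's values.
stopifnot(setequal(c(only_a, only_b, both), either))
```

Note that union, intersect and setdiff all work on the distinct values and drop duplicates, so their results describe which values differ, not which rows.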

   On Apr 4, 2009, Bob Green <bgreen at dyson.brisnet.org.au> wrote:

     hello,
     I am hoping for some advice regarding comparing variables from 3
     versions of a spreadsheet which have been combined into a single
     dataframe. The aim is to identify which rows have been changed.
     The dataframe contains 177 rows of data (each cell contains text).
     'intersect' produced a file with 35 rows, 'union' a file with 303
     rows and 'setdiff' a file with 130 rows
     Below is the code that I have started with.
     Ideally I would like to identify the actual row numbers where there
     is a difference in the variables (either pairwise or between 3 variables).
     x <- read.csv("c:/rec_compare.csv", header=TRUE, as.is=TRUE)
     u <- union(x$rm1, x$redc1)
     write.csv(u,"c:/union_test.csv")
     i <- intersect(x$rm1, x$redc1)
     write.csv(i,"c:/intersect_test.csv")
     sd <- setdiff(x$rm1, x$redc1)
     write.csv(sd,"c:/setdiff_test.csv")
     Any suggestions are appreciated.
     regards
     Bob
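
For the stated aim of finding the row numbers that changed, a row-by-row comparison may be closer to what is wanted than the set functions. The following is only a sketch under assumptions: the small data frame is invented, the column names rm1 and redc1 are taken from the post, and treating two missing cells as "unchanged" is a choice made here, not something the post specifies.

```r
# Hypothetical stand-in for the combined spreadsheet.
x <- data.frame(rm1   = c("a", "b", "c", NA, "e"),
                redc1 = c("a", "B", "c", NA, NA),
                stringsAsFactors = FALSE)

# Two cells count as the same when both are NA,
# or when both are present and hold identical text.
same <- (is.na(x$rm1) & is.na(x$redc1)) |
        (!is.na(x$rm1) & !is.na(x$redc1) & x$rm1 == x$redc1)

# Row numbers where the two versions differ.
changed_rows <- which(!same)
print(changed_rows)

# The differing rows themselves, side by side.
print(x[changed_rows, c("rm1", "redc1")])
```

A third version of the column could be handled the same way by building one such logical vector per pair and combining them with | before calling which.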