[R] comparing columns in a dataframe
markleeds at verizon.net
Sun Apr 5 00:45:02 CEST 2009
Hi: you need to run setdiff in both directions to get the values that are
unique to each column, because setdiff is not commutative: setdiff(a,b)
does not equal setdiff(b,a). Once you have both, the counts should satisfy
( setdiff1 + setdiff2 + intersect ) = union,
because the three pieces are disjoint and together make up the union.
If they don't add up, that would be odd and more investigation would be
needed.
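
As a minimal sketch of the check described above, using two small made-up vectors (a and b are hypothetical stand-ins for columns such as x$rm1 and x$redc1, not the actual data):

```r
# Hypothetical stand-ins for two columns of the dataframe
a <- c("apple", "pear", "plum", "fig")
b <- c("pear", "fig", "kiwi")

only_a <- setdiff(a, b)    # values found in a but not in b
only_b <- setdiff(b, a)    # values found in b but not in a
both   <- intersect(a, b)  # values found in both
all_u  <- union(a, b)      # every distinct value from either vector

# The three pieces are disjoint, so their counts should add up to the union
length(only_a) + length(only_b) + length(both) == length(all_u)
```

Note that these functions work on distinct values, not rows. Assuming the counts reported in the quoted message (union 303, intersect 35, setdiff(rm1, redc1) 130), the reverse setdiff would be expected to contain 303 - 35 - 130 = 138 values.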
On Apr 4, 2009, Bob Green <bgreen at dyson.brisnet.org.au> wrote:
Hello,
I am hoping for some advice regarding comparing variables from 3
versions of a spreadsheet which have been combined into a single
dataframe. The aim is to identify which rows have been changed.
The dataframe contains 177 rows of data (each cell contains text).
'intersect' produced a file with 35 rows, 'union' a file with 303
rows, and 'setdiff' a file with 130 rows.
Below is the code that I have started with.
Ideally I would like to identify the actual row numbers where there
is a difference in the variables (either pairwise or across all 3 variables).
x <- read.csv("c:/rec_compare.csv", header=TRUE, as.is=TRUE)
u <- union(x$rm1, x$redc1)
write.csv(u,"c:/union_test.csv")
i <- intersect(x$rm1, x$redc1)
write.csv(i,"c:/intersect_test.csv")
sd <- setdiff(x$rm1, x$redc1)
write.csv(sd,"c:/setdiff_test.csv")
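
The set functions above operate on distinct values and discard row positions, so one possible way to get row numbers is an element-wise comparison with which(). The following is only a sketch on made-up data: the third column name (v3) and the helper differs() are hypothetical, and it assumes the versions are aligned row by row.

```r
# Hypothetical data standing in for the combined spreadsheet;
# the column name v3 is invented for illustration
x <- data.frame(
  rm1   = c("a", "b", "c", NA),
  redc1 = c("a", "B", "c", "d"),
  v3    = c("a", "b", "x", "d"),
  stringsAsFactors = FALSE
)

# TRUE where two columns disagree; a value missing on one side only
# counts as a difference, two missing values count as equal
differs <- function(u, v) {
  (is.na(u) != is.na(v)) | (!is.na(u) & !is.na(v) & u != v)
}

# Row numbers where rm1 and redc1 differ (pairwise comparison)
pair_rows <- which(differs(x$rm1, x$redc1))

# Row numbers where at least one of the three columns differs from another
any3 <- which(differs(x$rm1, x$redc1) |
              differs(x$rm1, x$v3) |
              differs(x$redc1, x$v3))

pair_rows
any3
```

The comparison is exact, so differences in case or stray whitespace are flagged as changes; whether that is wanted depends on the data.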
Any suggestions are appreciated.
regards
Bob
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.