[R] multiple column comparison

Petr PIKAL petr.pikal at precheza.cz
Mon Jan 30 11:48:53 CET 2012


Hi
 
I did not see any response, and I cannot offer a ready-made solution 
either. Problems like this can be solved in several ways, from explicit 
loops to the *apply functions, reshape, or plyr.
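
To make the base R route concrete, here is a minimal sketch, not a tested 
solution for the real data. It assumes the long layout shown in the quoted 
message (one row per ID and coder); the toy values are copied from the 
first two variables of that table, and the names dat, wide, agree and pct 
are only illustrative. It uses reshape() to put the coders side by side 
and combn() to enumerate the coder pairs.

```r
# Toy data in the long layout from the quoted message (V1 and V2 only)
dat <- data.frame(
  ID    = rep(c("ID1", "ID2"), times = 3),
  CODER = rep(c("C1", "C2", "C3"), each = 2),
  V1    = c("y", "y", "n", "n", "n", "n"),
  V2    = c("int", "ext", "int", "int", "int", "int"),
  stringsAsFactors = FALSE
)

vars   <- setdiff(names(dat), c("ID", "CODER"))  # any number of variables
coders <- unique(dat$CODER)                      # any number of coders

# Long to wide: one row per ID, columns named V1.C1, V2.C1, V1.C2, ...
wide <- reshape(dat, idvar = "ID", timevar = "CODER",
                direction = "wide", sep = ".")

# One 0/1 agreement column per coder pair and variable, e.g. C1.C2.V1
agree <- data.frame(ID = wide$ID, stringsAsFactors = FALSE)
for (p in combn(coders, 2, simplify = FALSE)) {
  for (v in vars) {
    a <- wide[[paste(v, p[1], sep = ".")]]
    b <- wide[[paste(v, p[2], sep = ".")]]
    # a == b is NA where either coder left the value missing
    agree[[paste(p[1], p[2], v, sep = ".")]] <- as.integer(a == b)
  }
}

# Percentage agreement for each coder pair and variable
pct <- colMeans(agree[-1]) * 100
```

The loops are kept explicit for readability; the same comparison could be 
written with the *apply functions or plyr. Missing codes would need a 
decision (for example na.rm = TRUE in colMeans) before the percentages 
mean anything.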

However, for anybody to get started, it would help to have a clearer 
description together with a small toy data set (preferably produced by 
dput) and the desired result.
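
As an illustration of what such a posting could contain, the sketch below 
builds a small made-up data frame (the object name toy and its values are 
hypothetical) and prints it with dput(), so that readers can paste the 
output straight into their own R session.

```r
# Hypothetical toy data: two IDs, two coders, one variable
toy <- data.frame(
  ID    = c("ID1", "ID2", "ID1", "ID2"),
  CODER = c("C1", "C1", "C2", "C2"),
  V1    = c("y", "y", "n", "n"),
  stringsAsFactors = FALSE
)

# dput() prints an R expression that recreates the object;
# copy that output into the message
dput(toy)

# What a reader does with the pasted text: evaluate it to rebuild the object
txt  <- paste(capture.output(dput(toy)), collapse = "\n")
toy2 <- eval(parse(text = txt))
```

Unlike a pasted table, the dput() output keeps column types and factor 
levels, so nobody has to guess how the data were stored.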

Regards
Petr


> Hello,
> I have a very large content analysis project, which I've just begun to
> collect training data on. I have three coders, who are entering data on
> up to 95 measurements. Traditionally, I've used Excel to check coder
> agreement (e.g., percentage agreement), by lining up each coder's
> measurements side-by-side, creating a new column with the results using
> if statements. That is, if (a=b, 1, 0). With this many variables, I am
> clearly interested in something that I don't have to create manually
> every time I check percentage agreement for coders.
> 
> The data are set up like this: 
> 
> ID   CODER  V1  V2   V3   V4  ...  V95
> ID1  C1     y   int  doc  y
> ID2  C1     y   ext  doc  y
> ID1  C2     n   int  doc  y
> ID2  C2     n   int  doc  y
> ID1  C3     n   int  doc  y
> ID2  C3     n   int  doc  y
> 
> I would like to write a script to do the following:
> For each variable, compare each pair of coders using if statements
> (e.g., if C1.V1 == C2.V1, 1, 0)
> 
> ID   C1.V1  C2.V1  C3.V1
> ID1  y      y      y
> ID2  y      y      y
> 
> For each coding pair, enter the resulting 1s and 0s into a new column. 
> 
> The new column name would reflect the results of the comparison, such as
> C1.C2.V1
> 
> I'd ideally like to create this so that it can handle any number of
> variables and any number of coders. 
> 
> I appreciate any thoughts, help, and pointers on this. 
> 
> Thanks in advance. 
> 
> Best,
> Ryan Fuller
> Doctoral Candidate, Communication
> Graduate Student Researcher, Carsey-Wolf Center
> http://carseywolf.ucsb.edu
> University of California, Santa Barbara


