drflxms drflxms at googlemail.com
Fri Aug 22 14:12:20 CEST 2008

Dear R-colleagues,

I am quite new to R and am struggling with what is probably a fairly
simple problem: generating artificial data with defined features.

I am conducting a study of inter-observer agreement in
pediatric bronchoscopy. One of the most important measures is Fleiss'
kappa, which is conveniently available in R through the irr package.
Unfortunately, medical doctors like me don't really understand much
statistics. Therefore I'd like to give the reader an easily understandable
example of Fleiss' kappa in the Methods section. For this purpose, I
obtained a table with the results of the 2005 German election:

party     number of votes    percent

SPD              16194665       34.2
CDU              13136740       27.8
CSU               3494309        7.4
Gruene            3838326        8.1
FDP               4648144        9.8
PDS               4118194        8.7

I want to show the agreement among voters as measured by Fleiss' kappa. To
calculate this with the kappam.fleiss function of irr, I need a
data.frame like this:

                (id of 1st voter) (id of 2nd voter)

party             spd                         cdu

Of course I don't plan to calculate this with the millions of cases
listed in the table above (I am working on a small laptop). Dividing
the counts by 1000 would be more than sufficient for this example. The
exact format of the table is not very important, as I could reshape
nearly any format with the help of the reshape package.

Unfortunately I could not figure out how to create such an
artificial dataset as described above. Any data.frame that at least
preserves the percentages would be fine. The string IDs of the parties
could of course be replaced by numbers (which would be even better for
the kappam.fleiss function in irr!).
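
To make the request concrete, here is a minimal sketch of one possible
way to build such a dataset in base R. It assumes that each vote count
is divided by 1000 and rounded to whole voters; the names votes, n and
voters are hypothetical and chosen only for illustration. It produces a
long format (one row per artificial voter), which may still need
reshaping before it suits kappam.fleiss.

```r
# Vote counts copied from the table above (German election 2005)
votes <- c(SPD = 16194665, CDU = 13136740, CSU = 3494309,
           Gruene = 3838326, FDP = 4648144, PDS = 4118194)

# Assumption: scale down by 1000 and round to whole artificial voters
n <- round(votes / 1000)

# One row per artificial voter: repeat each party label n times
voters <- data.frame(
  id    = seq_len(sum(n)),
  party = factor(rep(names(n), times = n), levels = names(n))
)

# Numeric party codes (1 = SPD, ..., 6 = PDS), in case numbers are preferred
voters$party_id <- as.integer(voters$party)

# Shares among the six listed parties only; these differ slightly from
# the "percent" column, which does not add up to 100
round(100 * prop.table(table(voters$party)), 1)
```

Because rep() simply repeats each label, the relative sizes of the
parties are preserved up to rounding. If a random order of voters is
wanted, the rows could be shuffled afterwards, for example with
voters[sample(nrow(voters)), ].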

I would appreciate any kind of help very much indeed.
Greetings from Munich,

Felix Mueller-Sarnowski

