[R] simple generation of artificial data with defined features
drflxms
drflxms at googlemail.com
Fri Aug 22 14:12:20 CEST 2008
Dear R-colleagues,
I am quite new to R and am struggling with what is probably a simple
problem: generating artificial data with defined features.
I am conducting a study of inter-observer agreement in bronchoscopy in
children. One of the most important measures is Fleiss' kappa, which is
conveniently available in R through the irr package.
Unfortunately, medical doctors like me don't understand much about
statistics. Therefore I'd like to give the reader an easily
understandable example of Fleiss' kappa in the Methods section. To
achieve this, I obtained a table with the results of the German federal
election of 2005:
party    number of votes   percent
SPD             16194665      34.2
CDU             13136740      27.8
CSU              3494309       7.4
Gruene           3838326       8.1
FDP              4648144       9.8
PDS              4118194       8.7
I want to show the agreement of voters as measured by Fleiss' kappa. To
calculate this with the kappam.fleiss function of irr, I need a
data.frame like this:
        (id of 1st voter)   (id of 2nd voter)   ...
party   spd                 cdu                 ...
Of course I don't plan to calculate this with the millions of cases
listed in the table above (I am working on a small laptop). Dividing the
vote counts by 1000 would be more than sufficient for this example. The
exact format of the table is not very important, as I could reshape
nearly any format with the help of the reshape package.
Unfortunately, I could not figure out how to create such an artificial
dataset. Any data.frame that at least preserves the percentages would
be fine. The party names could of course be replaced by numeric codes
(which would be even better for the kappam.fleiss function in irr!).
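
A minimal sketch of one way such a dataset could be built in base R,
using the vote counts from the table above. The object names (votes, n,
voters, wide) are illustrative, and rounding the scaled counts to whole
voters is an assumption. Note that the listed percentages add up to
96.0, so the shares in the artificial data are relative to these six
parties only and will differ slightly from the percent column.

```r
# Vote counts copied from the table in the post
votes <- c(SPD = 16194665, CDU = 13136740, CSU = 3494309,
           Gruene = 3838326, FDP = 4648144, PDS = 4118194)

# Scale down by 1000 as suggested; rounding to whole voters is an assumption
n <- round(votes / 1000)

# Long format: one row per artificial voter, party repeated n times
voters <- data.frame(
  id    = seq_len(sum(n)),
  party = factor(rep(names(n), times = n), levels = names(n))
)

# Shares among the six listed parties, in percent
shares <- round(100 * prop.table(table(voters$party)), 1)
print(shares)

# Wide format: a single row with one column per voter,
# parties stored as numeric codes 1..6 (order as in 'votes')
wide <- as.data.frame(t(as.integer(voters$party)))
```

The long format can be reshaped as needed; the wide format corresponds
to the one-row layout sketched earlier, with numeric codes in place of
party names.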
I would appreciate any kind of help very much indeed.
Greetings from Munich,
Felix Mueller-Sarnowski