Re: [R] simple generation of artificial data with defined features

Greg Snow Fri, 22 Aug 2008 09:41:30 -0700

I don't think the election data are the right data to demonstrate Kappa: you 
need subjects that are each classified by 2 or more different raters/methods.  
The election data could be seen as classifying each voter by the party they 
voted for, but then you have only 1 rater.  If you had survey data showing 
which party each voter chose in 2 or more elections, that might make a good 
example dataset.  Otherwise you may want to stick with the sample datasets.
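
To make the required data layout concrete, here is a minimal sketch of the 
standard Fleiss' Kappa calculation on raw data with one row per subject and 
one column per rating.  It is plain Python rather than R and does not use the 
irr package; the function name fleiss_kappa and the survey data are made up 
for illustration.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for subjects that each received the same number of ratings.

    ratings[i][k] is the category that rater k assigned to subject i.
    (Hypothetical helper for illustration; not the irr implementation.)
    """
    n_subjects = len(ratings)
    if n_subjects == 0:
        raise ValueError("need at least one subject")
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("need at least 2 ratings per subject")
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every subject needs the same number of ratings")

    # Per subject: how many raters chose each category.
    counts = [Counter(row) for row in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over subjects.
    pairs = n_raters * (n_raters - 1)
    p_obs = sum(
        (sum(n * n for n in c.values()) - n_raters) / pairs for c in counts
    ) / n_subjects

    # Chance agreement, from the overall proportion of each category.
    totals = Counter()
    for c in counts:
        totals.update(c)
    n_ratings = n_subjects * n_raters
    p_exp = sum((n / n_ratings) ** 2 for n in totals.values())

    if p_exp == 1.0:
        raise ValueError("all ratings fall in one category; kappa is undefined")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Made-up survey: party chosen by 6 voters in 2 elections (not real data).
survey = [
    ["SPD", "SPD"],
    ["CDU", "CDU"],
    ["SPD", "CDU"],
    ["FDP", "FDP"],
    ["CDU", "CDU"],
    ["SPD", "SPD"],
]
print(fleiss_kappa(survey))
```

With a single column (one election, i.e. one rater) the function refuses to 
compute anything, which is the problem with the election table.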


There are other packages that compute Kappa values as well (I don't know if 
others calculate this particular version), but some of those take the summary 
data as input rather than the raw data, which may be easier if you just have 
the summary tables.
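
As a rough illustration of the summary-table route, the sketch below computes 
Cohen's Kappa (the two-rater version, not Fleiss' version) from a square 
table of counts.  Again this is plain Python rather than R; the names 
cohen_kappa_from_table and tabulate are hypothetical and do not correspond to 
any particular package.

```python
def cohen_kappa_from_table(table):
    """Cohen's kappa from a square table of counts.

    table[i][j] = number of subjects put in category i by rater A
    and in category j by rater B.
    """
    k = len(table)
    if k == 0 or any(len(row) != k for row in table):
        raise ValueError("table must be square")
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("table contains no subjects")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    # Observed agreement: proportion of subjects on the diagonal.
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the two raters' marginal proportions.
    p_exp = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    if p_exp == 1.0:
        raise ValueError("both raters used a single category; kappa is undefined")
    return (p_obs - p_exp) / (1.0 - p_exp)


def tabulate(pairs, categories):
    """Turn raw (rater A, rater B) pairs into the summary table used above."""
    index = {cat: i for i, cat in enumerate(categories)}
    table = [[0] * len(categories) for _ in categories]
    for a, b in pairs:
        table[index[a]][index[b]] += 1
    return table
```

If you only have raw pairs, tabulate() builds the table first; if you already 
have the table, you can pass it straight to cohen_kappa_from_table().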


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111



> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of drflxms
> Sent: Friday, August 22, 2008 6:12 AM
> To: r-help@r-project.org
> Subject: [R] simple generation of artificial data with
> defined features
>
> Dear R-colleagues,
>
> I am quite a newbie to R fighting my stupidity to solve a
> probably quite simple problem of generating artificial data
> with defined features.
>
> I am conducting a study of inter-observer agreement in
> child bronchoscopy. One of the most important measures is
> Kappa according to Fleiss, which is very conveniently
> available in R through the irr package.
> Unfortunately medical doctors like me don't really understand
> much of statistics. Therefore I'd like to give the reader an
> easily understandable example of Fleiss' Kappa in the Methods
> part. To achieve this, I obtained a table with the results of
> the German election from 2005:
>
> party    number of votes    percent
>
> SPD             16194665       34,2
> CDU             13136740       27,8
> CSU              3494309        7,4
> Gruene           3838326        8,1
> FDP              4648144        9,8
> PDS              4118194        8,7
>
> I want to show the agreement of voters measured by
> Fleiss-Kappa. To calculate this with the
> kappam.fleiss-function of irr, I need a data.frame like this:
>
>                 (id of 1st voter) (id of 2nd voter)
>
> party             spd                         cdu
>
> Of course I don't plan to calculate this with the millions of
> cases mentioned in the table above (I am working on a small
> laptop). Dividing the counts by 1000 would be more than enough
> for this example. The exact format of the table is generally not
> so important, as I could reshape nearly every format with the
> help of the reshape package.
>
> Unfortunately I could not figure out how to create such a
> fictitious/artificial dataset as described above. Any data.frame
> that at least preserves the percentages would be fine. String IDs
> of parties could of course be replaced by numbers (which would
> be even better for the function kappam.fleiss in irr!).
>
> I would appreciate any kind of help very much indeed.
> Greetings from Munich,
>
> Felix Mueller-Sarnowski
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

