On the general question of how to create a dataset that matches the frequencies in a table, the function as.data.frame can be useful. It takes as its argument an object of class 'table' and returns a data frame of frequencies, with one row per cell of the table.
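As a quick illustration of what that conversion returns, here is a minimal sketch on a made-up 2x3 table. The counts and the dimension names rater1 and rater2 are assumptions for illustration only and do not come from any real dataset.

```r
# Hypothetical 2x3 table of counts (values and labels are invented).
# matrix() fills column by column, so (yes, low) = 5, (no, low) = 3, and so on.
tab <- as.table(matrix(c(5, 3, 2, 8, 1, 6), nrow = 2,
                       dimnames = list(rater1 = c("yes", "no"),
                                       rater2 = c("low", "mid", "high"))))

# One row per cell of the table: the two factors plus a Freq column.
freq <- as.data.frame(tab)
freq
```

The Freq column holds the cell counts, which is what makes it possible to replicate rows later on.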
Consider for example table 6.1 of Fleiss et al (3rd Ed):

> birth.weight <- c(10,15,40,135)
> attr(birth.weight, "class") <- "table"
> attr(birth.weight, "dim") <- c(2,2)
> attr(birth.weight, "dimnames") <- list(c("A", "Ab"), c("B", "Bb"))
> birth.weight
    B  Bb
A  10  40
Ab 15 135
> summary(birth.weight)
Number of cases in table: 200
Number of factors: 2
Test for independence of all factors:
        Chisq = 3.429, df = 1, p-value = 0.06408
>
> bw.dt <- as.data.frame(birth.weight)

Observations (rows) in this data frame can then be replicated according to their corresponding frequencies to yield an expanded dataset that conforms to the original table. The row indices are repeated Freq times each, and the Freq column itself is dropped.

> bw.dt.exp <- bw.dt[rep(1:nrow(bw.dt), bw.dt$Freq), -ncol(bw.dt)]
> dim(bw.dt.exp)
[1] 200   2
> table(bw.dt.exp)
    Var2
Var1  B  Bb
  A  10  40
  Ab 15 135

The above approach is not restricted to 2x2 tables, and it should be straightforward to generate datasets that conform to arbitrary n x m frequency tables.

-Christos Hatzis

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Greg Snow
> Sent: Friday, August 22, 2008 12:41 PM
> To: drflxms; r-help@r-project.org
> Subject: Re: [R] simple generation of artificial data with
> defined features
>
> I don't think that the election data is the right data to
> demonstrate Kappa, you need subjects that are classified by 2
> or more different raters/methods. The election data could be
> considered classifying the voters into which party they voted
> for, but you only have 1 rater. Maybe if you had some survey
> data that showed which party each voter voted for in 2 or
> more elections, then that may be a good example dataset.
> Otherwise you may want to stick with the sample datasets.
>
> There are other packages that compute Kappa values as well (I
> don't know if others calculate this particular version), but
> some of those take the summary data as input rather than the
> raw data, which may be easier if you just have the summary tables.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [EMAIL PROTECTED]
> (801) 408-8111
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of drflxms
> > Sent: Friday, August 22, 2008 6:12 AM
> > To: r-help@r-project.org
> > Subject: [R] simple generation of artificial data with defined
> > features
> >
> > Dear R-colleagues,
> >
> > I am quite a newbie to R fighting my stupidity to solve a probably
> > quite simple problem of generating artificial data with defined
> > features.
> >
> > I am conducting a study of inter-observer-agreement in
> > child-bronchoscopy. One of the most important measures is Kappa
> > according to Fleiss, which is very comfortable available in R through
> > the irr-package.
> > Unfortunately medical doctors like me don't really understand much of
> > statistics. Therefore I'd like to give the reader an easy
> > understandable example of Fleiss-Kappa in the Methods part. To achieve
> > this, I obtained a table with the results of the German election from
> > 2005:
> >
> > party     number of votes   percent
> >
> > SPD              16194665      34,2
> > CDU              13136740      27,8
> > CSU               3494309       7,4
> > Gruene            3838326       8,1
> > FDP               4648144       9,8
> > PDS               4118194       8,7
> >
> > I want to show the agreement of voters measured by Fleiss-Kappa. To
> > calculate this with the kappam.fleiss-function of irr, I need a
> > data.frame like this:
> >
> >          (id of 1st voter)   (id of 2nd voter)
> >
> > party    spd                 cdu
> >
> > Of course I don't plan to calculate this with the million of cases
> > mentioned in the table above (I am working on a small laptop). A
> > division by 1000 would be more than perfect for this example. The
> > exact format of the table is generally not so important, as I could
> > reshape nearly every format with the help of the reshape-package.
> >
> > Unfortunately I could not figure out how to create such a
> > fictive/artificial dataset as described above. Any data.frame would be
> > nice, that keeps at least the percentage. String-IDs of parties could
> > be substituted by numbers of course (would be even better for function
> > kappam.fleiss in irr!).
> >
> > I would appreciate any kind of help very much indeed.
> > Greetings from Munich,
> >
> > Felix Mueller-Sarnowski
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.