Dear Ilai,

Thank you for your helpful response. My question had two parts.
1. Are mosaic plots a good way to visualise multiple response data, or are
   there better alternatives?
2. How can I do my modified chi-square tests in R? (You were able to answer
   this one, so thank you very much :) )

All the best,
Marcos

On 14 March 2012 05:03, ilai <ke...@math.montana.edu> wrote:
> Not sure I understand your question (or if there is one) and I am not
> familiar with vcd::mosaic. But if you are asking whether there is a
> simpler way, then yes:
> 1. work with ?array and ?aperm
> 2. create the array directly in R from the original data - not Excel
> 3. ?mosaicplot (no package required - it's in graphics)
>
> Here is what I mean based on your f.tbl:
>
> f.tbl = structure(c(10, 15, 25, 45, 30, 50), .Dim = 2:3,
>   .Dimnames = structure(list(Sex = c("F", "M"),
>     Responses = c("A", "B", "total subjects")),
>     .Names = c("Sex", "Responses")), class = "table")
>
> # Calculate the No-A No-B columns:
> (ff.tbl <- rbind(f.tbl[,1:2], f.tbl[,3]-f.tbl[,1:2]))
> # rearrange to a CxRxB (in this case 2x2x2) array:
> dim(ff.tbl) <- c(2,2,2)
> # give some names
> dimnames(ff.tbl) <- list(Sex=c('F','M'),c('yes','no'),Response=c('A','B'))
> ff.tbl
> # plot
> mosaicplot(ff.tbl)
> # or plot
> mosaicplot(aperm(ff.tbl,3:1))
> # or test
> apply(ff.tbl, 3, chisq.test) # and sum the result
>
> Hope this helps get you started
>
> > f.tbl
> >    Responses
> > Sex  A  B total subjects
> >   F 10 25             30
> >   M 15 45             50
> >
> > The answer I have is to adjust my data and then use the mosaic() function
> > in package:vcd; however, I'm not sure that's the best way forward and I
> > don't have a very efficient way of getting there. I will present my
> > solution so you guys can take a look.
> >
> > The fundamental problem is that because of the multiple response data,
> > you can't simply apply a normal chi-square test to the contingency table.
> > There's a raft of approaches, but I've decided to use a simple technique
> > introduced by (A. Agresti, I. Liu, Modeling a categorical variable
> > allowing arbitrarily many category choices, Biometrics 55 (1999) 936-43.)
> > and refined by Thomas and Decady and Bilder and Loughin. In summary, the
> > test statistic (a modified chi-square statistic) is calculated by summing
> > up the individual chi-square statistics for each of the c marginal r × 2
> > tables relating the single response variable to the multiple response
> > variable (with df = c(r - 1)). Note that instead of using the row totals
> > (total number of responses), the test statistic is calculated with the
> > total number of subjects per row.
> >
> > (phew, I hope that made sense :) ) Unfortunately, my google-research has
> > not revealed an easy way to transform my one data table into c x r x 2
> > tables for analysis. So I end up having to create the two different
> > tables myself, shown below (note that the Not-A/B columns are calculated
> > as the difference between the main data column (A/B) and the total number
> > of subjects listed above).
> >
> > > g.mtrx=matrix(c(10,15,20,35),nrow=2)
> > > g.tbl=as.table(g.mtrx)
> > > dimnames(g.tbl)=list(Sex=c("F","M"),Responses=c("A","Not-A"))
> > > g.tbl
> >    Responses
> > Sex  A Not-A
> >   F 10    20
> >   M 15    35
> >
> > > h.mtrx=matrix(c(25,45,5,5),nrow=2)
> > > h.tbl=as.table(h.mtrx)
> > > dimnames(h.tbl)=list(Sex=c("F","M"),Responses=c("B","Not-B"))
> > > h.tbl
> >    Responses
> > Sex  B Not-B
> >   F 25     5
> >   M 45     5
> >
> > If I then perform the normal chi-square test on each of the two tables
> > (chisq.test()) and then sum up the results, I get the answer I want.
> > Clearly this is cumbersome, which is why I do it in Excel at the moment
> > (I know, shame on me). However, I really want to take advantage of the
> > mosaic function in vcd. So what I have to do at the moment is create the
> > tables above and use abind() (package:abind) to bring my two matrices
> > together to form a multidimensional matrix. Example:
> >
> > > library(abind)
> > > gh.abind = abind(g.mtrx,h.mtrx,along=3)
> > > dimnames(gh.abind)=list(Sex=c("F","M"),Responses=c("Yes","No"),Factors=c("A","B"))
> > > gh.abind
> > , , Factors = A
> >
> >    Responses
> > Sex Yes No
> >   F  10 20
> >   M  15 35
> >
> > , , Factors = B
> >
> >    Responses
> > Sex Yes No
> >   F  25  5
> >   M  45  5
> >
> > Now I can use the simple mosaic function to plot the combined matrix:
> >
> > > mosaic(gh.abind)
> >
> > So that's it. I don't use any pearson-r shading in mosaic since I
> > don't think it would be appropriate to try and model my weird multiple
> > response tables (at the moment), but what I will do is look at the
> > odds-ratio table and then manually colour the mosaic cells with high
> > odds-ratios (greater than 2).
> >
> > I am literally having to type all this by hand into R, and as you can
> > imagine, it gets cumbersome with large multi-column tables (which I
> > have). Does anybody have any thoughts on my approach of using mosaic
> > for this sort of data? And if so, any insight on how I can be a bit
> > slicker with my R code?
> >
> > All help is appreciated and I hope that this question wasn't too long
> > to read through.
> >
> > All the best,
> > Marcos
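
For wider tables, the steps quoted above (compute the "no" counts from the subject totals, stack everything into an r x 2 x c array, then sum the per-response chi-square statistics) could be wrapped in one function. The following is only a minimal sketch: the name mr_chisq and its arguments are hypothetical and not from any package, and correct = FALSE is my assumption so that the 2 x 2 case is not continuity-corrected. It returns the summed statistic and df = c(r - 1) only; it does not implement the Thomas and Decady or Bilder and Loughin refinements.

```r
# Hypothetical helper (not from the thread or any package).
# counts: r x c matrix of "yes" counts, one column per response item
# totals: length-r vector with the number of subjects in each row
mr_chisq <- function(counts, totals) {
  stopifnot(is.matrix(counts), length(totals) == nrow(counts))
  r <- nrow(counts)
  k <- ncol(counts)
  # "no" counts: totals is recycled down each column, so row i uses totals[i]
  no <- totals - counts
  # r x 2 x c array: row variable, yes/no, response item
  arr <- array(NA_real_, dim = c(r, 2, k),
               dimnames = list(rownames(counts), c("yes", "no"),
                               colnames(counts)))
  arr[, 1, ] <- counts
  arr[, 2, ] <- no
  # one Pearson chi-square statistic per marginal r x 2 table
  # (assumption: no continuity correction)
  stats <- apply(arr, 3, function(m) {
    unname(chisq.test(m, correct = FALSE)$statistic)
  })
  list(array = arr, per.item = stats,
       statistic = sum(stats), df = k * (r - 1))
}

# Example with the counts from f.tbl
counts <- matrix(c(10, 15, 25, 45), nrow = 2,
                 dimnames = list(Sex = c("F", "M"), Response = c("A", "B")))
totals <- c(F = 30, M = 50)
res <- mr_chisq(counts, totals)
res$statistic
res$df
```

The returned res$array has the same layout as ff.tbl and gh.abind, so it can be passed to mosaicplot() or to mosaic() in vcd. chisq.test() may warn about small expected counts (the Not-B column here); I have left that warning visible rather than suppressing it.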

-- 
PhD Engineering Candidate
University of Cambridge
Department of Engineering
Centre for Sustainable Development
mp...@cam.ac.uk <mp...@cam.ac.uk>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.