Dear Ilai,

Thank you for your helpful response. My question had two parts.

1. Are mosaic plots a good way to visualise multiple response data? Or are
there better alternatives?
2. How can I do my modified chi-square tests in R? (You were able to
answer this one, so thank you very much :) )

All the best,
Marcos



On 14 March 2012 05:03, ilai <ke...@math.montana.edu> wrote:

> Not sure I understand your question (or if there is one), and I am not
> familiar with vcd::mosaic. But if you are asking whether there is a simpler
> way, then yes:
> 1. work with ?array and ?aperm
> 2. create the array directly in R from the original data - not Excel
> 3. ?mosaicplot (no extra package required - it's in graphics)
>
> Here is what I mean based on your f.tbl:
>
> f.tbl = structure(c(10, 15, 25, 45, 30, 50), .Dim = 2:3,
>   .Dimnames = list(Sex = c("F", "M"),
>                    Responses = c("A", "B", "total subjects")),
>   class = "table")
>
> # Calculate the No-A No-B columns:
> (ff.tbl <- rbind(f.tbl[,1:2],f.tbl[,3]-f.tbl[,1:2]))
> # rearrange to a CxRxB (in this case 2x2x2) array:
> dim(ff.tbl) <- c(2,2,2)
> # give some names
>  dimnames(ff.tbl) <- list(Sex=c('F','M'),c('yes','no'),Response=c('A','B'))
> ff.tbl
> # plot
>  mosaicplot(ff.tbl)
> # or plot
> mosaicplot(aperm(ff.tbl,3:1))
> # or test
> apply(ff.tbl, 3 , chisq.test) # and sum the result
>
>
> Hope this helps get you started
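
For the "and sum the result" step, here is a minimal sketch of how the summing could look. It rebuilds the counts from f.tbl so that it runs on its own. The names yes, total and item_stat are made up for the example, and using correct = FALSE (no continuity correction) is my assumption about which statistic should be summed, not something stated above.

```r
# Counts from f.tbl in this thread: rows are Sex, columns are items A and B.
yes   <- matrix(c(10, 15, 25, 45), nrow = 2,
                dimnames = list(Sex = c("F", "M"), Response = c("A", "B")))
total <- c(F = 30, M = 50)   # subjects per row, not responses per row

# One r x 2 (yes/no) table per item; keep only the Pearson statistic.
# Assumption: the uncorrected statistic is wanted, hence correct = FALSE.
item_stat <- sapply(colnames(yes), function(item) {
  tab <- cbind(yes = yes[, item], no = total - yes[, item])
  unname(chisq.test(tab, correct = FALSE)$statistic)
})

stat <- sum(item_stat)                        # modified chi-square statistic
df   <- ncol(yes) * (nrow(yes) - 1)           # df = c(r - 1)
p    <- pchisq(stat, df = df, lower.tail = FALSE)
```

chisq.test() may warn about small expected counts for item B; the warning does not stop the calculation.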
>
>
> > f.tbl
> >    Responses
> > Sex  A  B total subjects
> >   F 10 25             30
> >   M 15 45             50
> >
> >
> > The answer I have is to adjust my data and then use the mosaic() function
> > in package:vcd; however, I'm not sure that's the best way forward and I
> > don't have a very efficient way of getting there. I will present my
> > solution so you guys can take a look.
> >
> > The fundamental problem is that because of the multiple response data, you
> > can't simply apply a normal Chi-square test to the contingency table.
> > There's a raft of approaches, but I've decided to use a simple technique
> > introduced by Agresti and Liu (A. Agresti, I. Liu, Modeling a categorical
> > variable allowing arbitrarily many category choices, Biometrics 55 (1999)
> > 936-43.) and refined by Thomas and Decady and Bilder and Loughin. In
> > summary, the test statistic (a modified Chi-square statistic) is calculated
> > by summing up the individual chi-square statistics for each of the c
> > marginal r × 2 tables relating the single response variable to the multiple
> > response variable, with df = c(r - 1). Note that instead of using the row
> > totals (total number of responses), the test statistic is calculated with
> > the total number of subjects per row.
> >
> > (phew, I hope that made sense :) ) Unfortunately, my Google research has
> > not revealed an easy way to transform my one data table into the c separate
> > r x 2 tables for analysis. So I end up having to create the two different
> > tables myself, shown below (note that the Not-A/B columns are calculated as
> > the difference between the main data column (A/B) and the total number of
> > subjects listed above).
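
One way to avoid building each table by hand is a small helper that derives all the yes/no tables from the single table in one go. This is only a sketch: to_item_array is a made-up name, not a function from any package, and it assumes the subject totals sit in the last column, as in f.tbl.

```r
# Hypothetical helper: the name and arguments are illustrative only.
# tbl is an r x (c + 1) matrix or table of "yes" counts whose last column
# holds the number of subjects per row (like f.tbl in this thread).
to_item_array <- function(tbl, total_col = ncol(tbl)) {
  tbl   <- unclass(tbl)
  yes   <- tbl[, -total_col, drop = FALSE]
  total <- tbl[, total_col]
  no    <- total - yes                  # total is recycled down each column
  # Stacking yes over no puts each item's r x 2 slice in column-major order.
  array(rbind(yes, no),
        dim = c(nrow(yes), 2, ncol(yes)),
        dimnames = list(Row = rownames(yes),
                        Chosen = c("yes", "no"),
                        Item = colnames(yes)))
}

f.tbl <- matrix(c(10, 15, 25, 45, 30, 50), nrow = 2,
                dimnames = list(Sex = c("F", "M"),
                                Responses = c("A", "B", "total subjects")))
arr <- to_item_array(f.tbl)
arr[, , "B"]   # the Sex x yes/no table for item B
```

The result is an r x 2 x c array, so it should work the same way with more than two response columns, and each slice arr[, , k] can be passed to chisq.test() or the whole array to a mosaic function.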
> >
> > g.mtrx = matrix(c(10,15,20,35), nrow=2)
> > g.tbl = as.table(g.mtrx)
> > dimnames(g.tbl) = list(Sex=c("F","M"), Responses=c("A","Not-A"))
> > g.tbl
> >    Responses
> > Sex  A Not-A
> >   F 10    20
> >   M 15    35
> >
> > h.mtrx = matrix(c(25,45,5,5), nrow=2)
> > h.tbl = as.table(h.mtrx)
> > dimnames(h.tbl) = list(Sex=c("F","M"), Responses=c("B","Not-B"))
> > h.tbl
> >    Responses
> > Sex  B Not-B
> >   F 25     5
> >   M 45     5
> >
> >
> > If I then perform the normal Chi-square test on each of the two tables
> > (chisq.test()) and then sum up the results, I get the answer I want.
> > Clearly this is cumbersome, which is why I do it in Excel at the moment (I
> > know, shame on me). However, I really want to take advantage of the mosaic
> > function in vcd. So what I have to do at the moment is create the tables
> > above and use abind() (package:abind) to bring my two matrices together to
> > form a multidimensional array. Example:
> >
> > library(abind)
> > gh.abind = abind(g.mtrx, h.mtrx, along=3)
> > dimnames(gh.abind) = list(Sex=c("F","M"), Responses=c("Yes","No"), Factors=c("A","B"))
> > gh.abind
> > , , Factors = A
> >
> >    Responses
> > Sex Yes No
> >   F  10 20
> >   M  15 35
> >
> > , , Factors = B
> >
> >    Responses
> > Sex Yes No
> >   F  25  5
> >   M  45  5
> >
> > Now I can use the simple mosaic function to plot the combined array:
> >
> > mosaic(gh.abind)
> >
> > So that's it. I don't use any pearson-r shading in mosaic since I
> > don't think it would be appropriate to try and model my weird multiple
> > response tables (at the moment), but what I will do is look at the
> > odds-ratio table and then manually colour the mosaic cells with high
> > odds-ratios (greater than 2).
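
The odds-ratio check could be scripted along these lines. This is a rough sketch that only covers 2 x 2 slices (two rows, yes/no); it rebuilds gh.abind with base R so that it runs without the abind package, and the names gh, odds_ratio and high are made up for the example.

```r
# Same counts as gh.abind in this thread: Sex x Responses x Factors.
gh <- array(c(10, 15, 20, 35,    # item A: Yes (F, M), then No (F, M)
              25, 45,  5,  5),   # item B: Yes (F, M), then No (F, M)
            dim = c(2, 2, 2),
            dimnames = list(Sex = c("F", "M"),
                            Responses = c("Yes", "No"),
                            Factors = c("A", "B")))

# Odds ratio of each 2 x 2 slice: (F yes * M no) / (F no * M yes).
odds_ratio <- apply(gh, 3, function(m) (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1]))

# Flag the items above the threshold of 2 mentioned above.
high <- odds_ratio > 2
```

Note that an odds ratio below 1/2 is an association of the same strength in the other direction, and a zero cell makes the ratio 0, Inf or NaN, so both cases would need separate handling.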
> >
> > I am literally having to type all this by hand into R, and as you can
> > imagine, it gets cumbersome with large multi-column tables (which I
> > have). Does anybody have any thoughts on my approach of using mosaic
> > for this sort of data? And if so, any insight on how I can be a bit
> > slicker with my R code?
> >
> > All help is appreciated and I hope that this question wasn't too long
> > to read through.
> >
> > All the best,
> > Marcos
> >
> >
> >
> >
> > --
> > PhD Engineering Candidate
> > University of Cambridge
> > Department of Engineering
> > Centre for Sustainable Development
> > mp...@cam.ac.uk <mp...@cam.ac.uk>
> >
> >        [[alternative HTML version deleted]]
> >
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >



