On 02/10/2009 03:33 PM, Rolf Turner wrote:
>
> I am appealing to the general collective wisdom of this
> list in respect of a statistics (rather than R) question. This question
> comes to me from a friend who is a veterinary oncologist. In a study that
> she is writing up there were 73 cats who were treated with a drug called
> piroxicam. None of the cats were observed to be subject to vomiting prior
> to treatment; 12 of the cats were subject to vomiting after treatment
> commenced. She wants to be able to say that the treatment had a
> ``significant'' impact with respect to this unwanted side-effect.
>
> Initially she did a chi-squared test. (Presumably on the matrix
> matrix(c(73,0,61,12),2,2) --- she didn't give details and I didn't pursue
> this.) I pointed out to her that because of the dependence --- same 73
> cats pre- and post- treatment --- the chi-squared test is inappropriate.
>
> So what *is* appropriate? There is a dependence structure of some sort,
> but it seems to me to be impossible to estimate.
>
> After mulling it over for a long while (I'm slow!) I decided that a
> non-parametric approach, along the following lines, makes sense:
>
> We have 73 independent pairs of outcomes (a,b) where a or b is 0
> if the cat didn't barf, and is 1 if it did barf.
>
> We actually observe 61 (0,0) pairs and 12 (0,1) pairs.
>
> If there is no effect from the piroxicam, then (0,1) and (1,0) are
> equally likely. So given that the outcome is in {(0,1),(1,0)} the
> probability of each is 1/2.
>
> Thus we have a sequence of 12 (0,1)-s where (under the null hypothesis)
> the probability of each entry is 1/2. Hence the probability of this
> sequence is (1/2)^12 = 0.00024. So the p-value of the (one-sided) test
> is 0.00024. Hence the result is ``significant'' at the usual levels,
> and my vet friend is happy.
>
> I would very much appreciate comments on my reasoning. Have I made any
> goof-ups, missed any obvious pit-falls? Gone down a wrong garden path?
>
> Is there a better approach?
>
> Most importantly (!!!): Is there any literature in which this approach is
> spelled out? (The journal in which she wishes to publish will almost
> surely demand a citation. They *won't* want to see the reasoning spelled
> out in the paper.)
>
> I would conjecture that this sort of scenario must arise reasonably often
> in medical statistics and the suggested approach (if it is indeed valid
> and sensible) would be ``standard''. It might even have a name! But I
> have no idea where to start looking, so I thought I'd ask this wonderfully
> learned list.
>
> Thanks for any input.
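As a quick numerical check of the arithmetic quoted above, here is a minimal sketch in R. It conditions on the 12 discordant pairs, all of which fell in the (0,1) direction, so the one-sided p-value is the probability of 12 successes in 12 fair trials, i.e. (1/2)^12 = 1/4096. The variable names are illustrative assumptions, not anything from the study.

```r
# Counts taken from the quoted message: 12 discordant pairs, all of type (0,1).
n_discordant <- 12   # pairs falling in {(0,1), (1,0)}
n_01 <- 12           # pairs where vomiting appeared only after treatment

# Direct calculation: under the null, each discordant pair is (0,1)
# with probability 1/2, independently of the others.
p_direct <- 0.5^n_discordant

# The same one-sided p-value obtained from an exact binomial (sign) test
# restricted to the discordant pairs.
p_sign <- binom.test(n_01, n_discordant, p = 0.5,
                     alternative = "greater")$p.value

print(p_direct)
print(p_sign)
```

Both numbers should agree, since the binomial tail P(X >= 12) for 12 trials consists of the single term (1/2)^12.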
Rolf,

I am a little confused, perhaps due to lack of sleep (sick dog with CHF).

Typically in this type of study, essentially looking at the efficacy/safety profile of a treatment, there are two options.

One does a two-arm randomized study, whereby "subjects" are randomized to one of two treatments. The two treatments may both be "active" or one may be a placebo. Then a typical two-sample comparison of the primary hypothesis is made. In this setting, you would have a second group of 73 cats who received a comparative treatment (or a placebo) to compare against the 16.4% (12/73) observed in this treatment group.

For example, say that patients were undergoing cancer treatment, which has nausea and vomiting as a side effect. Due to the side effect, it is common to see a reduction in dosing, which of course reduces treatment effectiveness. You might want to study a treatment that favorably reduces that side effect, to enable improved treatment dosing and patient tolerance.

The other option would be to perform a single-sample study, where there is an a priori hypothesis, based upon prior work, of the expected incidence of the adverse event or perhaps a "clinically acceptable" incidence of the adverse event.

This would seem to be the scenario indicated above. What is lacking is some a priori expectation of the incidence of the event in question, so that one can show that the incidence has been reduced from the expected value. 50% would not make sense here, though if it did, a single-sample binomial test would be used, presuming a two-sided hypothesis:

> binom.test(12, 73, 0.5)$p.value
[1] 4.802197e-09

That none of them had vomiting prior to treatment seems to be of little interest here. You could just as easily argue that there was a significant increase in the incidence of vomiting from 0% to 16.4% due to the treatment.

What am I missing?
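For reference, the single-sample calculation can be laid out a little more fully. This is only a sketch: the null value `p0 = 0.5` is the placeholder already flagged above as not sensible for this study, and the variable names are illustrative assumptions.

```r
# Counts from the study as described: 12 of 73 treated cats vomited.
n_cats <- 73
n_vomit <- 12

# Placeholder null incidence. This is an assumption for illustration only,
# not a clinically motivated value.
p0 <- 0.5

# Observed incidence of vomiting after treatment.
obs_incidence <- n_vomit / n_cats

# Exact single-sample binomial test; two-sided by default.
res <- binom.test(n_vomit, n_cats, p = p0)

print(round(100 * obs_incidence, 1))  # observed incidence in percent
print(res$p.value)                    # two-sided exact p-value
print(res$conf.int)                   # exact 95% interval for the incidence
```

If an a priori expected or "clinically acceptable" incidence were available, it would take the place of `p0`, and the reported confidence interval could be compared against it directly.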
Regards,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help