Thank you, Joris. Your explanation makes sense.
 
What nQuery does is confusing, though. The software simply asks for p1 and p2
at a given time t and then calculates the sample size using the formula. For
example, the interpretation can be something like "100 patients per group are
needed to detect the difference between p1=0.8 and p2=0.6 at time t at the 5%
significance level with 80% power". It seems that, to calculate the sample
size, the user just needs to provide p1 and p2 at ANY given time during
follow-up. This is where my confusion arose, because the sample size will
differ depending on the time point at which p1 and p2 are chosen.
 
My guess is that the time t at which p1 and p2 are chosen is not an arbitrary
time point. It seems to be the end of follow-up, i.e. time t is the length of
follow-up. Let's say t=1 year: the above example should then read "100
patients per group have to be followed up for 1 year to detect the difference
between p1=0.8 and p2=0.6 at 1 year at the 5% significance level with 80%
power". If t=5 years, the interpretation is "100 patients per group have to be
followed up for 5 years to detect the difference between p1=0.8 and p2=0.6 at
5 years at the 5% significance level with 80% power".

Any comments are appreciated.
 
John

--- On Thu, 5/6/10, Joris Meys <jorism...@gmail.com> wrote:


From: Joris Meys <jorism...@gmail.com>
Subject: Re: [R] sample size for survival curves
To: "array chip" <arrayprof...@yahoo.com>
Date: Thursday, May 6, 2010, 8:12 PM


It sounds logical to get different sample sizes depending on how long you run
the experiment. Say you expect fixed death rates of 5% and 10% per year in the
two groups. Take 20 patients in each group, and after one year you have 19 and
18 survivors, respectively. After 5 years, you have 15 and 10 survivors, which
is a bigger difference and can hence be detected more easily.
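A minimal sketch of the same point in Python, under assumptions that are not in the thread: the yearly death rates compound (so S(t) = (1 - rate)^t, which gives slightly different 5-year counts than the round numbers above), and the number of events needed is fixed at a purely hypothetical 50. The per-group sample size is then the events needed divided by (2 - p1 - p2), the same denominator that appears in the formula quoted further down the thread.

```python
import math

def survival_at(t_years, annual_death_rate):
    """Survival probability after t_years, assuming the yearly death
    rate compounds: S(t) = (1 - rate) ** t (an illustrative assumption)."""
    return (1.0 - annual_death_rate) ** t_years

def n_per_group(events_needed, p1, p2):
    """Patients per group so that the expected total number of events,
    n * (1 - p1) + n * (1 - p2), equals events_needed."""
    return events_needed / (2.0 - p1 - p2)

EVENTS_NEEDED = 50  # hypothetical value, chosen only for illustration

for t in (1, 5):
    p1 = survival_at(t, 0.05)
    p2 = survival_at(t, 0.10)
    n = math.ceil(n_per_group(EVENTS_NEEDED, p1, p2))
    print(f"t={t} years: p1={p1:.3f}, p2={p2:.3f}, n per group={n}")
```

With longer follow-up more patients have an event, so fewer patients are needed to observe the same number of events.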

Cheers
Joris


On Fri, May 7, 2010 at 1:45 AM, array chip <arrayprof...@yahoo.com> wrote:

Dear R users, I am not asking a question specifically about R, but I know
there are many statistical experts here in the R community, so here goes my
question:

Freedman (1982) proposed an approximation for sample size/power calculation
based on the log-rank test, using the formula below (this is what nQuery does):
            (Z(1-α/side)+Z(power))^2*(hazard.ratio+1)^2
     N  =  ---------------------------------------------
                    (2-p1-p2)*(hazard.ratio-1)^2

where Z is the standard normal quantile function (the inverse of the
cumulative distribution), and p1 and p2 are the survival probabilities of the
2 groups at a given time, say t.
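The formula above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `freedman_n_per_group` is made up here, N is read as the number of patients per group (in line with the "per group" wording elsewhere in the thread), and the hazard ratio is derived from p1 and p2 as log(p2)/log(p1), which holds only under proportional hazards.

```python
from math import log
from statistics import NormalDist

def freedman_n_per_group(p1, p2, alpha=0.05, power=0.80, sides=2):
    """Per-group sample size from the formula quoted above.

    p1, p2: survival probabilities of the two groups at time t.
    The hazard ratio is taken as log(p2) / log(p1), an assumption
    that requires proportional hazards.
    """
    z = NormalDist().inv_cdf  # standard normal quantile function
    hazard_ratio = log(p2) / log(p1)
    z_sum = z(1 - alpha / sides) + z(power)
    numerator = z_sum ** 2 * (hazard_ratio + 1) ** 2
    denominator = (2 - p1 - p2) * (hazard_ratio - 1) ** 2
    return numerator / denominator

# p1 = 0.8 and p2 = 0.6 at time t, 5% two-sided significance, 80% power
print(round(freedman_n_per_group(0.8, 0.6), 1))
```

The expression is unchanged if the hazard ratio is inverted, so it does not matter which group is labelled 1 and which 2.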

As you can see, the sample size depends on the survival probabilities, p1 and
p2. This is where my question lies. Let’s say we have 2 survival curves. I
can choose p1 and p2 at time 1 year and calculate a sample size. I can also
choose p1 and p2 at time 5 years (still the same hazard ratio, since they are
the same 2 survival curves) and calculate a different sample size. How should
I interpret the 2 estimates of sample size?

This problem doesn’t occur when we calculate the number of events required
using this formula:
              4*(Z(1-α/side)+Z(power))^2
             ----------------------------
                 (log(hazard.ratio))^2

because the number of events required depends only on the hazard ratio.
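For comparison, a short Python sketch of the events formula. The function name `events_required` is hypothetical, and the quantile is taken at 1 - α/side so that it matches the sample-size formula earlier in this message; that reading is an assumption.

```python
from math import log
from statistics import NormalDist

def events_required(hazard_ratio, alpha=0.05, power=0.80, sides=2):
    """Total number of events from the formula above.

    Assumes the quantile is evaluated at 1 - alpha/sides, as in the
    sample-size formula quoted earlier in the thread.
    """
    z = NormalDist().inv_cdf  # standard normal quantile function
    z_sum = z(1 - alpha / sides) + z(power)
    return 4 * z_sum ** 2 / log(hazard_ratio) ** 2

# Hazard ratio implied by p1 = 0.8, p2 = 0.6 under proportional hazards;
# any other time point on the same two curves gives the same ratio.
hr = log(0.6) / log(0.8)
print(round(events_required(hr), 1))
```

No survival probability enters the calculation, which is why the result does not change with the time point chosen.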

Thanks for any suggestions.

John




______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering 
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be 
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php




      