Thank you, Joris. Your explanation makes sense.

What nQuery does is confusing, though. The software simply asks for p1 and p2 at a given time t and then calculates the sample size using the formula. For example, the interpretation can be something like "100 patients per group are needed to detect the difference between p1=0.8 and p2=0.6 at time t at the 5% significance level with 80% power". It seems that, to calculate the sample size, the user just needs to provide p1 and p2 at ANY given time during the follow-up. This is where my confusion arose, because the sample size will differ depending on the time point at which p1 and p2 are chosen.

My guess is that the time t at which p1 and p2 are chosen cannot be just any time point. It seems to be the end of follow-up, i.e. time t is the length of follow-up. Let's say t=1 year; then the above example should read "100 patients per group have to be followed up for 1 year to detect the difference between p1=0.8 and p2=0.6 at 1 year at the 5% significance level with 80% power". If t=5 years, then the interpretation is "100 patients per group have to be followed up for 5 years to detect the difference between p1=0.8 and p2=0.6 at 5 years at the 5% significance level with 80% power".

Any comments are appreciated.

John

--- On Thu, 5/6/10, Joris Meys <jorism...@gmail.com> wrote:

From: Joris Meys <jorism...@gmail.com>
Subject: Re: [R] sample size for survival curves
To: "array chip" <arrayprof...@yahoo.com>
Date: Thursday, May 6, 2010, 8:12 PM

It sounds logical to get different sample sizes depending on how long you run the experiment. Say you expect fixed death rates of 5% and 10% per year in the two groups. Take 20 patients in every group, and after one year you have 19 and 18 survivors, respectively. After 5 years, you have 15 and 10 survivors, which is a bigger difference and can hence be detected more easily.

Cheers
Joris

On Fri, May 7, 2010 at 1:45 AM, array chip <arrayprof...@yahoo.com> wrote:

Dear R users,

I am not asking a question specifically about R, but I know there are many statistical experts here in the R community, so here is my question:

Freedman (1982) proposed an approximation for the sample size/power calculation based on the log-rank test, using the formula below (this is what nQuery does):

         (Z(1-α/side) + Z(power))^2 * (hazard.ratio + 1)^2
    N = ---------------------------------------------------
               (2 - p1 - p2) * (hazard.ratio - 1)^2

where Z is the quantile of the standard normal distribution (the inverse of its cumulative distribution function), and p1 and p2 are the survival probabilities of the 2 groups at a given time, say t.

As you can see, the sample size depends on the survival probabilities p1 and p2. This is where my question lies. Let's say we have 2 survival curves. I can choose p1 and p2 at time 1 year and calculate a sample size. I can also choose p1 and p2 at time 5 years (still the same hazard ratio, since they are the same 2 survival curves) and calculate a different sample size. How should the 2 estimates of sample size be interpreted?

This problem doesn't occur when we calculate the number of events required using this formula:

     4 * (Z(1-α/side) + Z(power))^2
    --------------------------------
         (log(hazard.ratio))^2

because the number of events required depends only on the hazard ratio.

Thanks for any suggestions.

John

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
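The question raised in this thread (why N changes with the chosen time point while the required number of events does not) can be checked numerically. Below is a minimal sketch in Python of the two formulas quoted above. It is not nQuery's actual implementation. The function names, the exponential survival curves, and the hazards of 0.05 and 0.10 per year (loosely echoing the 5% and 10% death rates in Joris's example) are all assumptions made for illustration.

```python
from math import exp, log
from statistics import NormalDist


def z(p):
    """Standard normal quantile (inverse CDF)."""
    return NormalDist().inv_cdf(p)


def freedman_n(p1, p2, hazard_ratio, alpha=0.05, power=0.80, sides=2):
    """N from the Freedman-type formula as quoted in the thread.

    p1 and p2 are the survival probabilities of the two groups at time t.
    """
    zsum = z(1 - alpha / sides) + z(power)
    numerator = zsum**2 * (hazard_ratio + 1) ** 2
    denominator = (2 - p1 - p2) * (hazard_ratio - 1) ** 2
    return numerator / denominator


def events_log_hr(hazard_ratio, alpha=0.05, power=0.80, sides=2):
    """Events from the second formula in the thread (hazard ratio only)."""
    zsum = z(1 - alpha / sides) + z(power)
    return 4 * zsum**2 / log(hazard_ratio) ** 2


def exp_survival(hazard, t):
    """Survival probability at time t under a constant hazard (assumption)."""
    return exp(-hazard * t)


if __name__ == "__main__":
    h1, h2 = 0.05, 0.10  # hypothetical yearly hazards for the two groups
    hr = h2 / h1         # same hazard ratio at every time point

    for t in (1, 5):
        p1, p2 = exp_survival(h1, t), exp_survival(h2, t)
        n = freedman_n(p1, p2, hr)
        # N * (2 - p1 - p2) removes the only time-dependent factor
        print(f"t={t}: p1={p1:.3f} p2={p2:.3f} "
              f"N={n:.1f} N*(2-p1-p2)={n * (2 - p1 - p2):.1f}")

    print(f"events from the log(hazard.ratio) formula: {events_log_hr(hr):.1f}")
```

Under these assumptions N is much larger at t=1 than at t=5, while the product N*(2-p1-p2) is identical at both time points. The only time-dependent factor in the formula is (2-p1-p2), which grows as more patients have had an event by time t. That fits the reading proposed above: t acts as the length of follow-up, and a shorter follow-up needs more patients for the same number of events to accrue.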