Re: [R] How to calculate confidence interval of C statistic by rcorr.cens

khosoda Sun, 22 May 2011 22:26:04 -0700

Dear Prof. Harrell,

I'm sorry to say this, but I'm afraid I cannot understand what you write very well. Do you mean that the method to calculate confidence intervals for Dxy or C statistics in a logistic model penalized for overfitting has not been established yet and what I did is wrong?

Could you elaborate on it or point me to some references?


Kohkichi

(11/05/23 4:22), Frank Harrell wrote:

Hi Kohkichi,
What we really need to figure out is how to make validate give you
confidence intervals for Dxy or C while it is penalizing for overfitting.
Some people have ad hoc solutions for that but nothing is nailed down yet.
Frank

khosoda wrote:


Thank you for your comment, Prof Harrell.

I changed the function:

CstatisticCI<- function(x)   # x is object of rcorr.cens.
    {
      se<- x["S.D."]/2
      Low95<- x["C Index"] - 1.96*se
      Upper95<- x["C Index"] + 1.96*se

      cbind(x["C Index"], Low95, Upper95)
    }

  >  CstatisticCI(MyModel.lrm.penalized.rcorr)
                        Low95   Upper95
C Index 0.8222785 0.7195828 0.9249742

I obtained a wider CI than the previous incorrect one.
Regarding your comments on overfitting, this is the sample used in model
development. However, I performed penalization with pentrace and lrm in the
rms package. The CI above is the CI of the penalized model. The validation
results for each model are as follows:

First model
  >  validate(MyModel.lrm, bw=F, B=1000)
            index.orig training    test optimism index.corrected    n
Dxy           0.6385   0.6859  0.6198   0.0661          0.5724 1000
R2            0.3745   0.4222  0.3388   0.0834          0.2912 1000
Intercept     0.0000   0.0000 -0.1446   0.1446         -0.1446 1000
Slope         1.0000   1.0000  0.8266   0.1734          0.8266 1000
Emax          0.0000   0.0000  0.0688   0.0688          0.0688 1000
D             0.2784   0.3248  0.2474   0.0774          0.2010 1000
U            -0.0192  -0.0192  0.0200  -0.0392          0.0200 1000
Q             0.2976   0.3440  0.2274   0.1166          0.1810 1000
B             0.1265   0.1180  0.1346  -0.0167          0.1431 1000
g             1.7010   2.0247  1.5763   0.4484          1.2526 1000
gp            0.2414   0.2512  0.2287   0.0225          0.2189 1000

penalized model
  >  validate(MyModel.lrm.penalized, bw=F, B=1000)
            index.orig training    test optimism index.corrected    n
Dxy           0.6446   0.6898  0.6256   0.0642          0.5804 1000
R2            0.3335   0.3691  0.3428   0.0264          0.3072 1000
Intercept     0.0000   0.0000  0.0752  -0.0752          0.0752 1000
Slope         1.0000   1.0000  1.0547  -0.0547          1.0547 1000
Emax          0.0000   0.0000  0.0249   0.0249          0.0249 1000
D             0.2718   0.2744  0.2507   0.0236          0.2481 1000
U            -0.0192  -0.0192 -0.0027  -0.0165         -0.0027 1000
Q             0.2910   0.2936  0.2534   0.0402          0.2508 1000
B             0.1279   0.1192  0.1336  -0.0144          0.1423 1000
g             1.3942   1.5259  1.5799  -0.0540          1.4482 1000
gp            0.2141   0.2188  0.2298  -0.0110          0.2251 1000

The optimism of the intercept and slope improved from 0.1446 and 0.1734 to
-0.0752 and -0.0547, respectively. Emax improved from 0.0688 to
0.0249. Therefore, I thought overfitting was reduced at least to some
extent, though I'm not sure whether this is enough improvement.
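
For readers who think in terms of C rather than Dxy, the two validate() tables above can be put on the C scale with the identity C = 0.5 + Dxy/2. The following is only a small sketch in base R: the numbers are copied by hand from the tables, and the vector names are made up for illustration. It gives point estimates only, not confidence intervals.

```r
# Dxy values copied from the validate() output above (first vs. penalized model)
dxy_orig      <- c(first = 0.6385, penalized = 0.6446)
dxy_corrected <- c(first = 0.5724, penalized = 0.5804)

# C = 0.5 + Dxy/2 converts Somers' Dxy to the C index
c_orig      <- 0.5 + dxy_orig / 2
c_corrected <- 0.5 + dxy_corrected / 2

print(rbind(c_orig, c_corrected))
```

On these figures the optimism-corrected C is about 0.786 for the first model and 0.790 for the penalized model.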

--
Kohkichi

(11/05/22 23:27), Frank Harrell wrote:

S.D. is the standard deviation (standard error) of Dxy.  It already includes
the effective sample size in its computation so the sqrt(n) term is not
needed.  The help file for rcorr.cens has an example where the confidence
interval for C is computed.  Note that you are making the strong assumption
that there is no overfitting in the model or that you are evaluating C on a
sample not used in model development.
Frank
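
The point above can be checked with a few lines of base R. Since C = 0.5 + Dxy/2, the standard error of C is half the reported S.D. of Dxy. This is only a sketch: a named numeric vector stands in for the real rcorr.cens object, with values copied from the output quoted in this thread, and c_index_ci is a hypothetical helper, not a function from Hmisc or rms. The interval is a simple normal approximation and carries the same no-overfitting assumption.

```r
# Stand-in for the rcorr.cens result; values copied from this thread
rc <- c("C Index" = 0.8222785, "Dxy" = 0.6445570, "S.D." = 0.1047916, "n" = 104)

# Hypothetical helper: normal-approximation CI for C from the S.D. of Dxy
c_index_ci <- function(x, level = 0.95) {
  z   <- qnorm(1 - (1 - level) / 2)
  se  <- unname(x["S.D."]) / 2      # C = 0.5 + Dxy/2, so SE(C) = SE(Dxy)/2
  est <- unname(x["C Index"])
  c(C = est, lower = est - z * se, upper = est + z * se)
}

ci <- c_index_ci(rc)
print(round(ci, 4))

# For comparison: dividing by sqrt(n) shrinks the SE far too much
se_c     <- unname(rc["S.D."]) / 2
se_naive <- unname(rc["S.D."]) / sqrt(unname(rc["n"]))
print(c(se_c = se_c, se_naive = se_naive))
```

With these inputs the interval is roughly 0.72 to 0.92, while the sqrt(n) version gives a standard error about five times smaller.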


Kohkichi wrote:


Hi,

I'm trying to calculate a 95% confidence interval for the C statistic of a
logistic regression model using rcorr.cens in rms package. I wrote a
brief function for this purpose as follows:

CstatisticCI<- function(x)   # x is object of rcorr.cens.
    {
      se<- x["S.D."]/sqrt(x["n"])
      Low95<- x["C Index"] - 1.96*se
      Upper95<- x["C Index"] + 1.96*se
      cbind(x["C Index"], Low95, Upper95)
    }

Then,

MyModel.lrm.rcorr<- rcorr.cens(x=predict(MyModel.lrm), S=df$outcome)
MyModel.lrm.rcorr

       C Index            Dxy           S.D.              n        missing     uncensored
     0.8222785      0.6445570      0.1047916    104.0000000      0.0000000    104.0000000
Relevant Pairs     Concordant      Uncertain
  3950.0000000   3248.0000000      0.0000000

CstatisticCI(x5factor_final.lrm.pen.rcorr)

                        Low95   Upper95
C Index 0.8222785 0.8021382 0.8424188

I'm not sure what "S.D." in the rcorr.cens object means. Is it the standard
deviation of "C Index" or the standard deviation of "Dxy"?
I thought it was the standard deviation of "C Index". Therefore, I wrote the
code above. Am I right?

Thank you in advance for any help.

--
Kohkichi Hosoda M.D.

      Department of Neurosurgery,
      Kobe University Graduate School of Medicine,

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context:
http://r.789695.n4.nabble.com/How-to-calculate-confidence-interval-of-C-statistic-by-rcorr-cens-tp3541709p3542163.html
Sent from the R help mailing list archive at Nabble.com.







--
*************************************************
  Department of Neurosurgery, Kobe University Graduate School of Medicine
  Kohkichi Hosoda

  7-5-1 Kusunoki-cho, Chuo-ku, Kobe 650-0017, Japan
    Phone: 078-382-5966
    Fax  : 078-382-5979
    E-mail address
        Office: khos...@med.kobe-u.ac.jp
        Home  : khos...@venus.dti.ne.jp
