Re: [R] Difference between R and SAS in Corcordance index in ordinal logistic regression

Olivier Collignon Thu, 24 Jan 2013 07:20:24 -0800

Dear Dr Harrell,
Thank you very much for your answer. I also tried to compute the C index 
by hand on these data using the mean probabilities, and I found 0.968, as you 
just showed.
I now understand why I saw a slight difference from the output of lrm, and I am 
convinced that this result is correct.
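
For anyone who wants to repeat the hand calculation, here is a minimal sketch in
base R. It counts concordant pairs by brute force over all pairs with different
observed outcomes and gives tied predictions half credit. That is the usual
definition of C, but it is an assumption here and not a copy of what lrm or
rcorr.cens do internally. The function name c_index and the toy vectors are
hypothetical; they are not the data from this thread.

```r
# Brute-force concordance index for an ordinal outcome (illustrative sketch).
# x: predicted score (e.g. the linear predictor); y: observed ordinal response.
c_index <- function(x, y) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  concordant <- 0
  tied <- 0
  relevant <- 0
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (y[i] == y[j]) next              # equal outcomes: pair is not usable
      relevant <- relevant + 1
      s <- sign(x[i] - x[j]) * sign(y[i] - y[j])
      if (s > 0) {
        concordant <- concordant + 1      # prediction and outcome agree in order
      } else if (s == 0) {
        tied <- tied + 1                  # tied predictions get half credit below
      }
    }
  }
  (concordant + 0.5 * tied) / relevant
}

# Hypothetical toy data, small enough to check by hand
y <- c(1, 1, 2, 2, 3)
x <- c(0.2, 0.9, 0.5, 1.4, 2.0)
c_toy <- c_index(x, y)        # 7 concordant pairs out of 8 usable pairs
dxy_toy <- 2 * (c_toy - 0.5)  # Somers' Dxy derived from C
```

The same relationship holds for the figures in the quoted rcorr.cens output:
699 / 722 = 0.968 and 2 * (0.968 - 0.5) = 0.936.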


I read in the SAS documentation that PROC LOGISTIC also does some binning 
(BINWIDTH option):

http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_logistic_sect010.htm

But I cannot explain why the difference between the two packages is so large, 
especially since the class probabilities are the same.
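
To make the check on the class probabilities concrete, here is a rough sketch of
how they follow from the coefficients in the lrm output quoted below, under both
sign conventions. The covariate values SJ = 3 and TJ = 4 are hypothetical and
chosen only for illustration, and the "SAS-style" part simply assumes the same
estimates with all signs flipped, as described in my first message.

```r
# Coefficients as printed in the quoted lrm output
alpha <- c(-0.6161, -6.5949, -16.2358)   # intercepts for y>=2, y>=3, y>=4
beta  <- c(SJ = 1.4341, TJ = 0.5312)

# Hypothetical covariate values, for illustration only
x <- c(SJ = 3, TJ = 4)
lp <- sum(beta * x)                      # linear predictor without intercept

# R/rms convention: P(Y >= k | X), k = 1..4
p_ge <- c(1, plogis(alpha + lp))
p_class_r <- p_ge - c(p_ge[-1], 0)       # P(Y = k | X)

# SAS-style convention (assumed): P(Y <= k | X) with all signs flipped
p_le <- c(plogis(-alpha - lp), 1)
p_class_sas <- p_le - c(0, p_le[-4])     # P(Y = k | X)

# Both conventions give the same class probabilities
same_probs <- isTRUE(all.equal(p_class_r, p_class_sas))
```

Since the two conventions give identical P(Y=k|X), the sign flip alone does not
change the class probabilities.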

Do you think it could be because the mean probabilities are computed 
differently?

Thank you for your help and best regards,
OC


> Date: Thu, 24 Jan 2013 05:28:13 -0800
> From: f.harr...@vanderbilt.edu
> To: r-help@r-project.org
> Subject: Re: [R] Difference between R and SAS in Corcordance index in ordinal 
> logistic regression
> 
> lrm does some binning to make the calculations faster.  The exact calculation
> is obtained by running
> 
> f <- lrm(...)
> rcorr.cens(predict(f), DA)
> 
> which results in:
> 
>        C Index            Dxy           S.D.              n        missing 
>     0.96814404     0.93628809     0.03808336    32.00000000     0.00000000 
>     uncensored Relevant Pairs     Concordant      Uncertain 
>    32.00000000   722.00000000   699.00000000     0.00000000 
> 
> I.e., C=.968 instead of .963.  But this is even farther away than the value
> from SAS you reported.
> 
> If you don't believe the rcorr.cens result, create a tiny example and do the
> calculations by hand.
> Frank
> 
> 
> blackscorpio81 wrote
> > Dear R users,
> > 
> > Please allow me to ask for your help.
> > I am currently using Frank Harrell Jr's package "rms" to fit an ordinal
> > logistic regression model with proportional odds. To assess the model's
> > predictive ability, the C concordance index is displayed and equals 0.963.
> > 
> > This is the code I used with the attached data,
> > data.csv <http://r.789695.n4.nabble.com/file/n4656409/data.csv>:
> > 
> > > require(rms)
> > > a <- read.csv2("/data.csv", row.names = 1, na.strings = c("", " "), dec = ".")
> > > lrm(DA ~ SJ + TJ, data = a)
> > 
> > Logistic Regression Model
> > 
> > lrm(formula = DA~SJ+TJ, data = a)
> > 
> > Frequencies of Responses
> > 
> >  1  2  3  4 
> >  6 13  9  4 
> > 
> >                       Model Likelihood     Discrimination    Rank Discrim.
> >                          Ratio Test            Indexes          Indexes
> > Obs            32    LR chi2      53.14    R2       0.875    C       0.963
> > max |deriv| 6e-06    d.f.             2    g        8.690    Dxy     0.925
> >                      Pr(> chi2) <0.0001    gr    5942.469    gamma   0.960
> >                                            gp       0.486    tau-a   0.673
> >                                            Brier    0.022
> > 
> > 
> >           Coef     S.E.    Wald Z  Pr(>|Z|)
> > y>=2   -0.6161   0.6715   -0.92    0.3589
> > y>=3   -6.5949   2.3750   -2.78    0.0055
> > y>=4  -16.2358   5.3737   -3.02    0.0025
> > SJ      1.4341   0.5180    2.77    0.0056
> > TJ      0.5312   0.2483    2.14    0.0324
> > 
> > I wanted to compare the results with SAS. I found the same slopes and
> > intercepts but with opposite signs, which is expected since R models the
> > probabilities P(Y>=k|X) whereas SAS models the probabilities P(Y<=k|X)
> > (see the attached PDF, page 2, table "Association des probabilités prédites
> > et des réponses observées", i.e. "Association of predicted probabilities
> > and observed responses").
> > SAS_Report_-_Logistic_Regression.pdf
> > <http://r.789695.n4.nabble.com/file/n4656409/SAS_Report_-_Logistic_Regression.pdf>
> >   
> > 
> > I chose the order for levels.
> > 
> > I checked that the corresponding probabilities P(Y=k|X) are the same
> > in both packages. But I can't understand why in SAS the C index drops
> > from 0.963 down to 0.332.
> > 
> > I have read a lot about this, and it seems to me that the two packages
> > use slightly different techniques to compute the C index; it is
> > nevertheless surprising to me to observe such a shift in the results.
> > 
> > Does anyone have a clue about this?
> > Thank you very much for your help
> > Blackscorpio
> 
> 
> 
> 
> 
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Difference-between-R-and-SAS-in-Corcordance-index-in-ordinal-logistic-regression-tp4656409p4656508.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
                                          

