Hi Werner,

AICs of nested models are compared on an additive scale, not on a 
multiplicative scale.  So, you have to think about how much the AIC decreases 
when you add the new variable, not the factor by which it is reduced.  
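
As a rough illustration of why only the difference matters, here is a small R sketch. The data set (mtcars) and the two formulas are stand-ins chosen for convenience, not your model. Rescaling the response shifts both AICs by the same constant, so the difference between the nested models stays put while their ratio changes.

```r
# Illustration only: mtcars and these formulas are assumptions, not the original data.
fit0 <- lm(mpg ~ wt, data = mtcars)          # smaller model
fit1 <- lm(mpg ~ wt + qsec, data = mtcars)   # adds one parameter

delta_aic <- AIC(fit1) - AIC(fit0)           # compare on the additive scale
lrt <- as.numeric(2 * (logLik(fit1) - logLik(fit0)))

# With one extra parameter, delta_aic equals 2 - lrt.
c(delta_aic = delta_aic, two_minus_lrt = 2 - lrt)

# Same models with the response multiplied by 10: both AICs shift equally.
g0 <- lm(I(mpg * 10) ~ wt, data = mtcars)
g1 <- lm(I(mpg * 10) ~ wt + qsec, data = mtcars)

c(diff_original = delta_aic,
  diff_rescaled = AIC(g1) - AIC(g0),         # unchanged
  ratio_original = AIC(fit1) / AIC(fit0),
  ratio_rescaled = AIC(g1) / AIC(g0))        # not the same as the original ratio
```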

If you are doing a stepwise selection based on AIC, then the p-value approach 
and AIC approach are related.  In the AIC approach, you include a new variable 
or delete an existing variable when doing so lowers the AIC.  For a variable 
that contributes a single parameter, the AIC goes down exactly when the 
likelihood ratio test, LRT, (a.k.a. F-test in linear regression) statistic 
exceeds 2.  That cutoff corresponds roughly to a p-value of 0.15, i.e. 
entering a variable if the p-value for the LRT is less than about 0.15, and 
deleting it otherwise.
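
The correspondence between the AIC and the LRT p-value can be checked numerically. The sketch below assumes the usual chi-square approximation for the LRT; the helper name aic_equiv_p is made up for this example. Note that in a multinomial logit model a single numeric predictor adds one coefficient per non-reference outcome category, so the number of extra parameters k is often larger than 1.

```r
# Hypothetical helper: the LRT p-value at which adding k parameters
# leaves the AIC unchanged (LRT statistic equal to the penalty 2 * k).
aic_equiv_p <- function(k) pchisq(2 * k, df = k, lower.tail = FALSE)

aic_equiv_p(1)   # about 0.157
aic_equiv_p(2)   # about 0.135
```

A variable with an LRT p-value above this cutoff raises the AIC, and one below it lowers the AIC.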

Of course, the big issue is that the sampling properties of stepwise model 
selection procedures are extremely difficult to characterize. Resampling and 
cross-validation approaches can help address this problem. Another more 
principled approach to model selection is to use regularization methods (e.g. 
ridge, lasso).  But there is no free lunch.  In regularization methods, one has 
to decide on the degree of regularization.
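
To make "deciding on the degree of regularization" concrete, here is a minimal ridge sketch using lm.ridge from the MASS package, with the penalty chosen by generalized cross-validation (GCV). The data set, the predictors, and the lambda grid are all assumptions picked for illustration, and GCV is only one of several ways to choose the penalty.

```r
library(MASS)  # provides lm.ridge

# Illustration only: mtcars, these predictors and this grid are assumptions.
lambdas <- seq(0, 20, by = 0.1)
ridge <- lm.ridge(mpg ~ wt + qsec + hp + disp, data = mtcars, lambda = lambdas)

# Pick the penalty with the smallest GCV score.
best <- which.min(ridge$GCV)
best_lambda <- ridge$lambda[best]
best_lambda

# Coefficients (on the original scale) at the chosen penalty.
best_coef <- coef(ridge)[best, ]
best_coef
```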

I hope I have successfully convinced you about the perils and pitfalls of model 
selection.  

Best,
Ravi.
____________________________________________________________________

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


----- Original Message -----
From: Werner Wernersen <pensterfuz...@yahoo.de>
Date: Saturday, June 13, 2009 10:52 am
Subject: Re: [R] Insignificant variable improves AIC (multinom)?
To: Peter Flom <peterflomconsult...@mindspring.com>, r-h...@stat.math.ethz.ch


> > > Hi,
> > >
> > > I am trying to specify a multinomial logit model using the multinom
> > > function from the nnet package. Now I add another independent variable
> > > and it halves the AIC as given by summary(multinom()). But when I call
> > > Anova(multinom()) from the car package, it tells me that this added
> > > variable is insignificant (Pr(>Chisq)=0.39). Thus, the improved AIC
> > > suggests to keep the variable but the Anova suggests to drop it.
> > >
> > > I am sure this is due to my lack of understanding of these models but
> > > could someone help me out with a pointer what my mistake is?
> >
> > I am not sure why you would expect the same answer from AIC and p-value.
> > They are different questions.  AIC attempts to answer a question about
> > overall model fit.  p-value for a particular variable attempts to answer
> > whether that particular coefficient could be due to chance if the
> > population value of the parameter was 0.
> >
> > One way these could give different answers is if the new variable
> > affected the parameter estimates for the other parameters.
> >
> > It's yet another exemplar of the problems with using p-values for model
> > selection
> >
> > HTH
> >
> > Peter
> >
> > Peter L. Flom, PhD
> > Statistical Consultant
> > www DOT peterflomconsulting DOT com
>  
>  That was very enlightening. I have to read up on model selection. The 
> thought I have to get my head around is that the added variable helps 
> explaining the observed variability in the data and thus should be 
> retained in the model. But since the coefficient is insignificant, I 
> cannot interpret it and if I use this equation for predictions then I 
> add a "random" value since I cannot reject that the coefficient is 
> actually zero instead of what I estimated.
>  
>  One just never sees someone presenting regression coefficients which 
> are not significant although model selection procedures are often 
> based on the AIC...
>  
>  Have a good weekend,
>    Werner

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
