I think the main problem with the stepwise variable selection method is the use of any 'significance level' at all. Even in 'normal' circumstances the interpretation of a p-value is topsy-turvy. You can only imagine what happens to that interpretation during variable selection: you no longer know what the significance level means, if it means anything at all.

smita
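To make the point concrete, here is a minimal sketch of the kind of simulation suggested further down the thread, written in plain Python (standard library only) so it is self-contained. It is not a stepwise procedure: it is a one-predictor caricature in which a coefficient is "kept" only when its t statistic passes a significance cutoff. The true slope (0.2), sample size (50), number of replications and the cutoff of 2.01 (roughly the two-sided 5% critical value for 48 degrees of freedom) are all assumptions chosen for illustration.

```python
import random
import statistics


def fit_slope(x, y):
    """Return (slope, t statistic) for a least-squares fit of y on x with intercept."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = (sse / (n - 2) / sxx) ** 0.5  # standard error of the slope
    return slope, slope / se


def simulate(true_slope=0.2, n=50, reps=2000, t_crit=2.01, seed=1):
    """Compare the average slope over all fits with the average over 'significant' fits."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    all_slopes, kept_slopes = [], []
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [true_slope * xi + rng.gauss(0, 1) for xi in x]
        slope, t = fit_slope(x, y)
        all_slopes.append(slope)
        if abs(t) > t_crit:  # the 'selection' step: keep only significant slopes
            kept_slopes.append(slope)
    return (statistics.fmean(all_slopes),
            statistics.fmean(kept_slopes),
            len(kept_slopes) / reps)


mean_all, mean_kept, keep_rate = simulate()
print(f"true slope:                  0.2")
print(f"mean slope, all fits:        {mean_all:.3f}")
print(f"mean slope, 'selected' fits: {mean_kept:.3f}")
print(f"fraction selected:           {keep_rate:.3f}")
```

Because you are the creator of the data, you know the true slope. Averaged over every fit the estimate should sit close to it, while the average over only the fits that passed the cutoff should sit further from zero. That is the sense in which coefficients that survive a significance filter are biased away from zero, and why the p-values reported after selection no longer mean what they appear to mean.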

--- Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:

> Xiaohui Chen wrote:
> > step or stepAIC functions do the job. You can opt to use BIC by changing
> > the multiplication of penalty.
> >
> > I think AIC and BIC are not only limited to compare two pre-defined
> > models, they can be used as model search criteria. You could enumerate
> > the information criteria for all possible models if the size of full
> > model is relatively small. But this is not generally scaled to practical
> > high-dimensional applications. Hence, it is often only possible to find
> > a 'best' model of a local optimum, e.g. measured by AIC/BIC.
>
> Sure you can use them that way, and they may perform better than other
> measures, but the resulting model will be highly biased (regression
> coefficients biased away from zero). AIC and BIC were not designed to
> be used in this fashion originally. Optimizing AIC or BIC will not
> produce well-calibrated models as does penalizing a large model.
>
> >
> > On the other way around, I wouldn't like to say the over-penalization of
> > BIC. Instead, I think AIC is usually underpenalizing larger models in
> > terms of the positive probability of incorporating irrelevant variables
> > in linear models.
>
> If you put some constraints on the process (e.g., if using AIC to find
> the optimum penalty in penalized maximum likelihood estimation), AIC
> works very well and BIC results in far too much shrinkage
> (underfitting). If using a dangerous process such as stepwise variable
> selection, the more conservative BIC may be better in some sense, worse
> in others. The main problem with stepwise variable selection is the use
> of significance levels for entry below 1.0 and especially below 0.1.
>
> Frank
>
> >
> > X
> >
> > Frank E Harrell Jr wrote:
> >> Smita Pakhale wrote:
> >>> Hi Maria,
> >>>
> >>> But why do you want to use forwards or backwards
> >>> methods? These all are 'backward' methods of modeling.
> >>> Try using AIC or BIC. BIC is much better than AIC.
> >>> And, you do not have to believe me or any one else on
> >>> this.
> >>
> >> How does that help? BIC gives too much penalization in certain
> >> contexts; both AIC and BIC were designed to compare two pre-specified
> >> models. They were not designed to fix problems of stepwise variable
> >> selection.
> >>
> >> Frank
> >>
> >>>
> >>> Just make a small data set with a few variables with
> >>> known relationship amongst them. With this simulated
> >>> data set, use all your modeling methods: backwards,
> >>> forwards, AIC, BIC etc and then see which one gives
> >>> you an answer closest to the truth. The beauty of using
> >>> a simulated dataset is that you 'know' the truth, as
> >>> you are the 'creator' of it!
> >>>
> >>> smita
> >>>
> >>> --- Charilaos Skiadas <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> A google search for "logistic regression with
> >>>> stepwise forward in r" returns the following post:
> >>>>
> >>>> https://stat.ethz.ch/pipermail/r-help/2003-December/043645.html
> >>>>
> >>>> Haris Skiadas
> >>>> Department of Mathematics and Computer Science
> >>>> Hanover College
> >>>>
> >>>> On May 28, 2008, at 7:01 AM, Maria wrote:
> >>>>
> >>>>> Hello,
> >>>>> I am just about to install R and was wondering about a few things.
> >>>>> I have only worked in Matlab because I wanted to do a logistic
> >>>>> regression. However Matlab does not do logistic regression with
> >>>>> stepwise forward method. Therefore I thought about testing R. So my
> >>>>> question is
> >>>>> can I do logistic regression with stepwise forward in R?
> >>>>> Thanks /M
> >>>> ______________________________________________
> >>>
> >>
> >
>
> --
> Frank E Harrell Jr   Professor and Chair           School of Medicine
>                      Department of Biostatistics   Vanderbilt University
>
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.