But you are still left with the problem of choosing the regularization parameter, i.e., how much to shrink the coefficients. In other words, there is no free ride.

Ravi.

____________________________________________________________________
Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvarad...@jhmi.edu
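One common answer to Ravi's question is to let cross-validation pick the amount of shrinkage. A minimal sketch with the glmnet package, on simulated data (the variable names, sizes, and coefficients below are purely illustrative):

library(glmnet)   ## assumes the glmnet package is installed

set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] - 0.5 * x[, 2] + rnorm(n)   ## only two truly active predictors

cvfit <- cv.glmnet(x, y, alpha = 1, nfolds = 10)  ## lasso; ridge would be alpha = 0
cvfit$lambda.min                ## lambda minimizing cross-validated error
cvfit$lambda.1se                ## largest lambda within 1 SE of that minimum
coef(cvfit, s = "lambda.1se")   ## coefficients at the more heavily shrunken choice

The lambda.1se rule trades a little apparent fit for extra shrinkage; it is a convention, not a free ride, which is exactly Ravi's point.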
----- Original Message -----
From: Frank E Harrell Jr <f.harr...@vanderbilt.edu>
Date: Saturday, July 4, 2009 9:26 am
Subject: Re: [R] is AIC always 100% in evaluating a model?
To: Tal Galili <tal.gal...@gmail.com>
Cc: r-help@r-project.org, Ben Bolker <bol...@ufl.edu>

> Tal Galili wrote:
> > Hi Ben,
> > I just wished to give a small remark about your claim: "it's best not
> > to consider hypothesis testing (statistical significance) and AIC in
> > the same analysis."
> >
> > In the case of forward selection with an orthogonal design matrix, it
> > can be shown that AIC amounts to using a P-to-enter rule of about
> > 0.16. For further reference, see page 3 of "A Simple Forward Selection
> > Procedure Based on False Discovery Rate Control" by Yoav Benjamini and
> > Yulia Gavrilov.
> >
> > Cheers,
> > Tal Galili
>
> Tal,
>
> That is not limited to orthogonal designs. When used for
> one-variable-at-a-time variable selection, AIC is just a restatement of
> the P-value, and as such it doesn't solve the severe problems with
> stepwise variable selection other than forcing us to use slightly more
> sensible alpha values. As an aside, some statisticians try to deal with
> multiplicity problems caused by stepwise variable selection by making
> alpha smaller than 0.05. This increases bias by giving variables whose
> effects are estimated with error a greater relative chance of being
> selected. Alpha typically needs to be 0.5 or greater to avoid problems
> with stepwise variable selection.
>
> AIC was designed to compare two pre-specified models.
>
> Variable selection does not compete well with shrinkage methods that
> simultaneously model all potential predictors.
>
> Frank
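Frank's equivalence can be checked directly in base R: when one model has exactly one more parameter than another, AIC prefers the larger model exactly when the likelihood-ratio statistic exceeds 2, i.e., when the LRT p-value falls below about 0.157. A small sketch on simulated data (the data and model names here are arbitrary):

pchisq(2, df = 1, lower.tail = FALSE)   ## 0.1573: the implicit "P to enter" of AIC

set.seed(1)
d  <- data.frame(y = rnorm(50), x = rnorm(50))
f0 <- lm(y ~ 1, data = d)               ## smaller, pre-specified model
f1 <- lm(y ~ x, data = d)               ## adds exactly one parameter
lrt <- as.numeric(2 * (logLik(f1) - logLik(f0)))
AIC(f1) - AIC(f0)                       ## equals 2 - lrt, so negative iff lrt > 2
pchisq(lrt, df = 1, lower.tail = FALSE) ## the p-value AIC implicitly thresholds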
> On Sat, Jul 4, 2009 at 1:46 AM, Ben Bolker <bol...@ufl.edu> wrote:
> >> alexander russell-2 wrote:
> >>> Hello,
> >>> It's clear, generally speaking, when an independent variable can be
> >>> ruled out; on the other hand, using AIC in R with bbmle, if one
> >>> finds a better AIC value for a model without a given independent
> >>> variable than for the same model with it, can we say that the
> >>> independent variable is not likely to be significant (in the
> >>> ordinary sense!)?
> >>>
> >>> That is, having made a lot of models from a data set, if the best
> >>> two are, say, 78.2 and 79.3 without and with a second independent
> >>> variable respectively, should we judge the influence of the 2nd IV
> >>> to be insignificant?
> >>> regards,
> >>> -shfets
> >>
> >> Without meaning to sound snarky, it's best not to consider hypothesis
> >> testing (statistical significance) and AIC in the same analysis.
> >> If you want to decide whether predictor variables have a significant
> >> effect on a response, you should consider their effect in the full
> >> model, via a Wald test, likelihood ratio test, etc. If you want to
> >> find the model with the best expected predictive capability (i.e.,
> >> lowest expected Kullback-Leibler distance), you should use AIC.
> >>
> >> Burnham and Anderson, among others, say this repeatedly.
> >>
> >> In general, for a one-parameter difference, hypothesis testing is
> >> "more conservative" than AIC (e.g., the critical log-likelihood
> >> difference for a p-value of 0.05 under the LRT is 1.92, while the
> >> log-likelihood difference required to say that a model is expected
> >> to have better predictive capability/lower AIC is 1) -- but since
> >> they are designed to answer such different questions, it's not even
> >> a fair comparison.
> >>
> >> Ben Bolker
>
> --
> Frank E Harrell Jr    Professor and Chair           School of Medicine
>                       Department of Biostatistics   Vanderbilt University
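Both of Ben's cutoffs can be reproduced in base R with one line each:

qchisq(0.95, df = 1) / 2                ## 1.9207: log-likelihood gain needed for p < 0.05 under the LRT
## The AIC rule needs a gain of just 1 (the 2k penalty for k = 1, halved);
## the alpha level that cutoff implies:
pchisq(2, df = 1, lower.tail = FALSE)   ## 0.1573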