The Inferno awaits me -- but I cannot resist a comment (but DO look at Frank's website).
There is a deep and disconcerting dissonance here. Scientists are (naturally) interested in getting at mechanisms, and so want to know which of the variables "count" and which do not. But statistical analysis -- **any** statistical analysis -- cannot tell you that. All statistical analysis can do is build models that give good predictions (and only over the range of the data). The models you get depend **both** on the way Nature works **and** on the peculiarities of your data (which is what Frank referred to in his comment on data reduction).

In fact, it is highly likely that with your data there are many alternative prediction equations, built from different collections of covariates, that perform essentially equally well. Sometimes it is otherwise, typically when prospective, carefully designed studies are performed -- there is a reason that the FDA insists on clinical trials, after all (and reasons why such studies are difficult and expensive to do!).

The belief that "data mining" (as it is known in the polite circles that Frank obviously eschews) is an effective (and even automated!) tool for discovering how Nature works is a misconception, but one that for many reasons is enthusiastically promoted. If you are looking only to predict, it may do; but you are deceived if you hope for Truth. Can you get hints? -- well maybe, maybe not. Chaos beckons.

I think many -- maybe even most -- statisticians rue the day that stepwise regression was invented, and certainly that it has been marketed as a tool for winnowing out the "important" few variables from the blizzard of "irrelevant" background noise.

Pogo was right: "We have met the enemy and he is us."

(As I said, the Inferno awaits...)

Cheers to all,
Bert Gunter

DEFINITELY MY OWN OPINIONS HERE!
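
An illustrative aside on the claim above that different collections of covariates can predict essentially equally well. This is only a minimal sketch under invented assumptions, not anything from the thread: the unobserved driver `z`, the two noisy covariates `x1` and `x2`, the noise levels, and the helper names `fit_simple` and `r_squared` are all hypothetical. It simulates a response driven by `z`, fits two single-covariate least-squares lines, and compares how well each predicts.

```python
import random
import statistics


def fit_simple(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(x, y, a, b):
    """Fraction of the variance in y explained by the line a + b*x."""
    my = statistics.fmean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


rng = random.Random(2008)  # fixed seed so the run is repeatable
n = 1000

# Hypothetical setup: z is the real (unobserved) driver of y;
# x1 and x2 are two measured covariates, each a noisy reading of z.
z = [rng.gauss(0, 1) for _ in range(n)]
x1 = [zi + rng.gauss(0, 0.3) for zi in z]
x2 = [zi + rng.gauss(0, 0.3) for zi in z]
y = [2.0 * zi + rng.gauss(0, 1) for zi in z]

# Two competing prediction equations, each using a different covariate.
r2_x1 = r_squared(x1, y, *fit_simple(x1, y))
r2_x2 = r_squared(x2, y, *fit_simple(x2, y))

print(f"R^2 using x1 only: {r2_x1:.3f}")
print(f"R^2 using x2 only: {r2_x2:.3f}")
```

In this setup the two fits should come out close to each other, yet neither `x1` nor `x2` is the mechanism; which one a selection procedure happens to keep reflects the accidents of the sample, not how the simulated "Nature" works.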
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of David Winsemius
Sent: Saturday, September 27, 2008 5:34 PM
To: Darin Brooks
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [R] FW: logistic regression

It's more a statement that it expresses a statistical perspective very succinctly, somewhat like a Zen koan. Frank's book, "Regression Modeling Strategies", has entire chapters on reasoned approaches to your question. His website also has quite a bit of material free for the taking.

--
David Winsemius
Heritage Laboratories

On Sep 27, 2008, at 7:24 PM, Darin Brooks wrote:

> Glad you were amused.
>
> I assume that "booking this as a fortune" means that this was an idiotic way
> to model the data?
>
> MARS? Boosted Regression Trees? Any of these a better choice to extract
> significant predictors (from a list of about 44) for a measured dependent
> variable?
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Ted Harding
> Sent: Saturday, September 27, 2008 4:30 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [R] FW: logistic regression
>
> On 27-Sep-08 21:45:23, Dieter Menne wrote:
>> Frank E Harrell Jr <f.harrell <at> vanderbilt.edu> writes:
>>
>>> Estimates from this model (and especially standard errors and P-values)
>>> will be invalid because they do not take into account the stepwise
>>> procedure above that was used to torture the data until they confessed.
>>>
>>> Frank
>>
>> Please book this as a fortune.
>>
>> Dieter
>
> Seconded!
> Ted.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
> Fax-to-email: +44 (0)870 094 0861
> Date: 27-Sep-08                                       Time: 23:30:19
> ------------------------------ XFMail ------------------------------
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.