The regression book by John Fox, http://socserv.mcmaster.ca/jfox/Books/Companion/index.html, has a section on regression diagnostics, and everything is done in R, which might make it particularly suitable.
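
To make the point about residuals from the quoted thread concrete, here is a minimal sketch. It is my own illustration, not taken from Fox's book; the data are simulated and the names x, y, fit and res are made up for the example. The response is clearly skewed, yet the residuals from lm() should look close to normal, and the residuals are what the diagnostics should be run on.

```r
# Hypothetical simulated data: a skewed predictor gives a skewed response,
# even though the errors themselves are normal.
set.seed(1)
x <- rexp(200, rate = 1)        # right-skewed predictor
y <- 2 + 3 * x + rnorm(200)     # linear relationship with normal errors

fit <- lm(y ~ x)                # ordinary least-squares fit
res <- residuals(fit)           # the quantities the assumptions refer to

op <- par(mfrow = c(1, 3))      # three panels side by side
hist(y, main = "Response y")    # skewed, not bell-shaped
qqnorm(y, main = "Q-Q plot of y")
qqline(y)
qqnorm(res, main = "Q-Q plot of residuals")
qqline(res)
par(op)                         # restore the previous plot layout
```

If the residual Q-Q plot departs strongly from the line, that would be the time to think about a transformation or a different model, rather than before fitting.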
On Sat, Feb 14, 2009 at 6:48 PM, Jason Rupert <jasonkrup...@yahoo.com> wrote:
> Many thanks to Greg L. Snow and David Winsemius for their responses.
>
> First off I can safely say I don't know enough statistics to be dangerous,
> but hopefully I will get to that point :)
>
> Regarding the goal - ultimately I would like to use linear regression
> (I am constrained to using linear regression at this point) for my data. I
> thought the requirements for using linear regression were the following (I
> pulled this list from
> www.utexas.edu/courses/schwab/sw318_spring_2004/SolvingProblems/Class27_RegressionNCorrHypoTest.ppt):
>
> The assumptions required for utilizing a regression equation are the same as
> the assumptions for the test of significance of a correlation coefficient.
> Both variables are interval level.
> Both variables are normally distributed.
> The relationship between the two variables is linear.
> The variance of the values of the dependent variable is uniform for all
> values of the independent variable (equality of variance).
>
> Thus, I was going to attempt to (1) identify which distribution my data most
> closely represents, (2) transform my data so that it is normal, and (3) then
> use linear regression on the data.
>
> However, if
> "The assumptions of most regression methods is that the *errors* need to
> have the desired relationship between means and variance, and not that the
> dependent variable be "normal". Many times the apparent non-normality will
> be "explained" or "captured" by the regression model."
>
> Does this mean I can just "do" linear regression without transforming my data
> and it will be okay?
>
> Note that I was using "lm" from R to access the errors; however, I have not
> had an opportunity to do much analysis of those results to determine if they
> are Gaussian or not.
>
> I guess I am going to try to track down the following documents:
> (1) Statistical Distributions (Paperback)
> by Merran Evans, Nicholas Hastings, Brian Peacock
> ISBN-10: 0471371246
> ISBN-13: 978-0471371243
>
> (2) Regression Modeling Strategies (Hardcover)
> by Frank E. Harrell Jr.
> ISBN-10: 0387952322
> ISBN-13: 978-0387952321
>
> Maybe electronic versions of those documents are available. My wife is
> already giving me a hard time about the volume of books around.
>
> Thank you again for all your feedback and insights.
>
>
> --- On Fri, 2/13/09, David Winsemius <dwinsem...@comcast.net> wrote:
>
> From: David Winsemius <dwinsem...@comcast.net>
> Subject: Re: [R] Website, book, paper, etc. that shows example plots of
> distributions?
> To: jasonkrup...@yahoo.com
> Cc: "Gabor Grothendieck" <ggrothendi...@gmail.com>, R-help@r-project.org
> Date: Friday, February 13, 2009, 9:10 AM
>
> This is probably the right time to issue a warning about the error of making
> transformations on the dependent variable before doing your analysis. The
> classic error that newcomers to statistics commit is to decide that they
> want to "make their data normal". The assumptions of most regression methods
> is that the *errors* need to have the desired relationship between means and
> variance, and not that the dependent variable be "normal". Many times
> the apparent non-normality will be "explained" or "captured"
> by the regression model. Other methods of modeling non-linear dependence are
> also available.
>
> I found Harrell's book "Regression Modeling Strategies" to be an
> excellent source for alternatives. My copy of V&R's MASS is only the
> second edition, but chapters 5 & 6 in that edition on linear models also had
> examples of using QQ plots on residuals. Checking that text's website I see
> that chapter 6 at least is probably similar.
> They include the scripts from their chapters along with the MASS package
> (installed as part of the VR bundle).
> My copy is entitled "ch06.R" and resides in the scripts subdirectory:
> /Library/Frameworks/R.framework/Versions/2.8/Resources/library/MASS/scripts/ch06.R
>
> --David Winsemius
>
>
> On Feb 13, 2009, at 8:11 AM, Jason Rupert wrote:
>
>> Thank you very much. Thank you again regarding the suggestion below. I
>> will give that a shot and I guess I've got my work cut out for me. I
>> counted 45 different distributions.
>>
>> Is the best way to get a QQ plot of each to generate a data set for each
>> distribution, use the qqplot function to get a QQ plot of that
>> distribution, and then compare it with the QQ plot of my data?
>>
>> As you can tell I am not a trained statistician, so any guidance or
>> suggested further reading is greatly appreciated.
>>
>> I am pretty sure my data is not normally distributed, based on some of
>> the empirical "Goodness of Fit" tests and on comparing the QQ plot of my
>> data against the QQ plot of a normal distribution with the same number of
>> points. I guess the next step is to figure out which distribution my data
>> most closely matches.
>>
>> Also, I guess I could fool around and take the log, sqrt, etc. of my
>> data and see if it will then more closely resemble a normal distribution.
>>
>> Thank you again for assisting this novice data analyst who is trying to
>> gain a better understanding of the techniques using this powerful software
>> package.
>>
>>
>> --- On Fri, 2/13/09, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
>> From: Gabor Grothendieck <ggrothendi...@gmail.com>
>> Subject: Re: [R] Website, book, paper, etc. that shows example plots of
>> distributions?
>> To: jasonkrup...@yahoo.com
>> Cc: R-help@r-project.org
>> Date: Friday, February 13, 2009, 5:43 AM
>>
>> You can readily create a dynamic display using qqplot and similar
>> functions in conjunction with either the playwith or TeachingDemos packages.
>>
>> For example, to investigate the effect of the shape parameter in the skew
>> normal distribution on its qqplot relative to the normal distribution:
>>
>> library(playwith)
>> library(sn)
>> playwith(qqnorm(rsn(100, shape = shape)),
>>     parameters = list(shape = seq(-3, 3, .1)))
>>
>> Now move the slider located at the bottom of the window that
>> appears and watch the plot change in response to changing
>> the shape value.
>>
>> You can find more distributions here:
>> http://cran.r-project.org/web/views/Distributions.html
>>
>> On Thu, Feb 12, 2009 at 1:04 PM, Jason Rupert <jasonkrup...@yahoo.com>
>> wrote:
>>> By any chance is anyone aware of a website, book, paper, etc. or
>>> combinations of those sources that show plots of different distributions?
>>>
>>> After reading a pretty good whitepaper I became aware of the benefit of
>>> doing Q-Q plots and histograms to help assess a distribution.
>>> The whitepaper is called:
>>> "Univariate Analysis and Normality Test Using SAS, Stata, and SPSS",
>>> Hun Myoung Park, (c) 2002-2008 The Trustees of Indiana University
>>>
>>> Unfortunately the white paper does not provide an extensive amount of
>>> example distributions plotted using Q-Q plots and histograms, so I am
>>> curious if there is a "portfolio"-type website or other whitepaper that
>>> shows examples of various types of distributions.
>>>
>>> It would be helpful to see a bunch of Q-Q plots and their associated
>>> histograms to get an idea of how the distribution looks in comparison
>>> against the Gaussian.
>>>
>>> I think seeing the plot really helps.
>>>
>>> Thank you for any insights.
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.