Hi,
I'm a very new R user and I hope this is not too basic (I tried to find the answers elsewhere but could not).

I have 12 response variables (species growth rates) and two environmental factors, and I want to test whether they are related. The sample size is quite small (7 < n < 12, depending on the species). I ran a Shapiro-Wilk test (shapiro.test) on each response variable, and 10 of the 12 showed no significant departure from normality.

I fitted generalized linear models in R with glm() and selected models by automatic backward stepwise selection (stepAIC from the MASS package), taking as the starting model the one with the additive effects of both factors. I did this for six of the species growth rates; for the other six I tested the effect of just one factor ("x2", see below) using glm() alone.

My data object is called "data", and this is the script for the first species (sp1):

    GLM1 <- glm(growth.sp1 ~ x1 + x2, family = gaussian, data)
    MOD.SELECTION <- stepAIC(GLM1, trace = TRUE)
    summary(MOD.SELECTION)

Below is an example of one of these analyses, followed by my questions (I hope not to be too long-winded!):

    > sp1.starting.model<-glm(sp1~x1+x2,family=gaussian, data)
    > sp1.back<-stepAIC(sp1.starting.model, trace=TRUE)
    Start:  AIC=63.6
    sp1 ~ x1 + x2

           Df Deviance    AIC
    - x2    1   73.490 61.801
    <none>      72.278 63.602
    - x1    1  122.659 67.949

    Step:  AIC=61.8
    gpf ~ x1

           Df Deviance    AIC
    <none>      73.490 61.801
    - x1    1  126.400 66.309

    > summary(sp1.back)

    Call:
    glm(formula = sp1 ~ x1, family = gaussian, data = data)

    Deviance Residuals:
         Min        1Q    Median        3Q       Max
    -5.04833  -1.15233  -0.06802   0.81325   5.11464

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -7.62399    3.11127  -2.450   0.0342 *
    x1           0.20595    0.07675   2.683   0.0230 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for gaussian family taken to be 7.348965)

        Null deviance: 126.40  on 11  degrees of freedom
    Residual deviance:  73.49  on 10  degrees of freedom
      (1 observation deleted due to missingness)
    AIC: 61.801

    Number of Fisher Scoring iterations: 2

THE QUESTIONS:

1) Can I trust that this statistical relation exists? That is, is there a way to find out the power of this test in R?

2) I always used "family=gaussian" because my response variables include negative values and I could not handle them any other way. In particular, I was not able to use a negative binomial family, as I previously did in SAS, because of the negative values in the response (as R "told" me when I tried).

3) How should I interpret the dispersion value R gives me (in the case reported it was "7.3")? That is, what range of values (if one exists) should I expect for the result to be reliable with "family=gaussian"? (I'm not interested in prediction, only in finding a statistical relation.)

Thank you very much in advance,
Best wishes
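
A minimal reproducible sketch of the workflow described above, run on simulated data since the real data set is not attached. The data frame `sim`, its values, and the coefficients used to generate it are invented for illustration only; the calls themselves (shapiro.test, glm, stepAIC) are the ones named in the post. The last lines check how the reported dispersion is obtained for a gaussian family, which may help with question 3.

```r
# Simulated stand-in for the real data (all numbers below are made up).
library(MASS)  # provides stepAIC()

set.seed(1)
n <- 12
sim <- data.frame(
  x1 = rnorm(n, mean = 40, sd = 10),
  x2 = rnorm(n, mean = 5, sd = 2)
)
# Hypothetical response: depends on x1 only, plus gaussian noise
sim$sp1 <- -7 + 0.2 * sim$x1 + rnorm(n, sd = 2.5)

# Normality check on the response, as described in the post
sw <- shapiro.test(sim$sp1)
print(sw)

# Starting model with additive effects, then backward selection by AIC
start.model <- glm(sp1 ~ x1 + x2, family = gaussian, data = sim)
back.model <- stepAIC(start.model, direction = "backward", trace = FALSE)
print(summary(back.model))

# For family = gaussian, the reported dispersion is the residual
# variance estimate: residual deviance / residual degrees of freedom.
disp <- summary(back.model)$dispersion
disp.by.hand <- deviance(back.model) / df.residual(back.model)
print(c(reported = disp, by.hand = disp.by.hand))
```

Applied to the output shown above, the same arithmetic gives 73.49 / 10 = 7.349, which matches the reported dispersion of 7.348965. For a gaussian family the dispersion is therefore on the squared scale of the response rather than a quantity with a fixed reference range.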
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.