Simone Santoro wrote:
>> Simone Santoro wrote:
>>>
>>> I have 12 response variables (species growth rates) and two
>>> environmental factors that I want to test for a possible
>>> relationship.
>>>
>>> The sample size is quite small (7<n<12, depending on the
>>> species).
>>>
>>> I performed a Shapiro test (shapiro.test) to test the response
>>> variables for normality, and they were normally distributed in
>>> 10 of the 12 cases (i.e. 12 response variables).
>>>
>> The Shapiro test is probably not very powerful for such a small
>> data set -- i.e., the data could be non-normal (in fact it almost
>> certainly *is* non-normal) but the deviation is not detectable ...
>> Where do your growth rates come from? Can you make a guess at their
>> probable distribution?
>>
>>>> The growth rates are calculated as ΔX_t = X_{t+1} - X_t, where
>>>> X_t = log_e(N_t) and N_t is the population size at time t.
>>>> I use this rather than population size directly because in a
>>>> few cases the population size trend showed autocorrelation
>>>> (time lag = 1), whereas ΔX_t did not show autocorrelation and
>>>> served my purpose equally well: investigating whether "x1" or
>>>> "x2" affected the population dynamics of these species. I
>>>> would expect ΔX_t to be normally distributed.
This seems perfectly reasonable.

>> Why different procedures for different cases?
>>
>>>> I don't understand whether you are suggesting that I use
>>>> different procedures for different cases, or asking me why I
>>>> used different procedures for different cases. If the latter:
>>>> I didn't.

I thought you said you tried models containing both x1 and x2 for
6 of the cases and just x2 for the other 6. Maybe I was confused --
maybe you were stating the results.

>> You would probably be better off just doing summary() and looking
>> at the p-values of the two predictors (if you must ...)
>>
>> Why are you using AIC if you're interested in testing
>> relationships rather than prediction?
>>
>>>> So, having read the Whittingham et al. paper (thank you very
>>>> much) and your comments, I understand that I would be better
>>>> off using the "full" model (just two predictors) and taking
>>>> into account the p-values of that model (not using the
>>>> stepAIC procedure), is that right?

Yes.

>>> THE QUESTIONS:
>>>
>>> 1) Can I trust the existence of such a statistical relationship?
>>> I mean: is there a way to know the power of this test in R?
>>>
>> There are power tests in R, but I don't know if there are any
>> specifically for this case (two-predictor regression). Remember
>> that power applies to the probability of type II (falsely failing
>> to reject null hypothesis) errors.
>>
>>>> OK. On the other hand, I suppose that the small sample size
>>>> makes the existence of a statistical relationship between the
>>>> predictor and the response variable even more reliable,
>>>> doesn't it?

Actually, it means that you will only be able to detect large effects.
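A rough way to see this is to simulate it. The sketch below is only an illustration under assumed values: n = 10, two standard-normal predictors, residual SD of 1, and the two effect sizes are all made up, and the helper names sim_once() and sim_power() are hypothetical. It estimates how often lm() detects a true effect of x1, and how large the estimate looks on average in the runs where it came out significant.

```r
# Simulation-based power for lm(y ~ x1 + x2) with a small sample.
# All numbers (n, effect sizes, noise SD) are assumptions for illustration.
set.seed(1)

# One simulated data set: return the x1 estimate and its p-value
sim_once <- function(n, beta1, sigma = 1) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  y  <- beta1 * x1 + rnorm(n, sd = sigma)   # x2 has no true effect
  cf <- summary(lm(y ~ x1 + x2))$coefficients
  c(estimate = cf["x1", "Estimate"], p = cf["x1", "Pr(>|t|)"])
}

# Repeat many times: share of significant runs, and the mean absolute
# estimate among those significant runs
sim_power <- function(n, beta1, nsim = 2000, alpha = 0.05) {
  res <- replicate(nsim, sim_once(n, beta1))   # 2 x nsim matrix
  sig <- res["p", ] < alpha
  c(power = mean(sig),
    mean_abs_estimate_if_significant = mean(abs(res["estimate", sig])))
}

small_effect <- sim_power(n = 10, beta1 = 0.3)
large_effect <- sim_power(n = 10, beta1 = 1.5)
print(rbind(small_effect, large_effect))
```

With a true slope of 0.3, the runs that reach significance necessarily report a much larger slope than 0.3, because at n = 10 only large estimates can pass the test.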
If the estimated effects are larger than seem sensible, then they are
quite likely spurious:
http://www.stat.columbia.edu/~gelman/research/published/power4r.pdf

>>> 2) I decided always to use "family=gaussian" because my response
>>> variable also has negative values and I cannot handle them in
>>> any other way. In fact I was not able to use a "negative
>>> binomial" as link function, as I previously did in SAS, because
>>> of the negative values of the response variable (as R "told" me
>>> when I tried).
>>>
>> Is this a question? As above, glm() with gaussian family and
>> identity (default) link is equivalent to lm().
>>
>>>> Yes, now I understand, and I am embarrassed because I'm aware
>>>> it is a very basic statistical issue (I'm sorry!). But if I
>>>> strongly believe the response variable is normally
>>>> distributed, although the small sample size makes it difficult
>>>> to test its normality, can I use lm() without testing for
>>>> normality? In other words: can I assume on logical grounds
>>>> that the statistical population behind the sample is normally
>>>> distributed, and consequently use lm()?

I would say so.

  Ben Bolker
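For reference, a minimal sketch of the lm()/glm() equivalence and of the full two-predictor model discussed in this thread. The population counts and the covariates x1 and x2 are simulated placeholders, not the poster's data, and the data frame name env is hypothetical.

```r
# Hypothetical data: counts N over 11 years give 10 growth rates.
set.seed(42)
N  <- round(100 * exp(cumsum(rnorm(11, sd = 0.2))))  # made-up counts
dX <- diff(log(N))          # Delta X_t = log(N_{t+1}) - log(N_t)

# Made-up environmental covariates, one value per growth interval
env <- data.frame(dX = dX, x1 = rnorm(10), x2 = rnorm(10))

# Full model fitted both ways; negative responses are fine here
fit_lm  <- lm(dX ~ x1 + x2, data = env)
fit_glm <- glm(dX ~ x1 + x2, data = env, family = gaussian)  # identity link

# The two fits should give the same coefficient estimates
print(all.equal(coef(fit_lm), coef(fit_glm)))

# Estimates, standard errors, t statistics and p-values of the full model
print(summary(fit_lm)$coefficients)
```

The coefficient table from summary() is what you would read the p-values of x1 and x2 from, without any stepwise selection.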
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.