I answered a similar question just hours ago. Your question indicates an unfamiliarity with R, with regression analysis more generally, or with both.
Your anova indicates that the overall model is significant; that is, your model is an improvement over the null model with just an intercept. Thus, the anova output tells you that there are significant differences among the HR groups. The same is shown by the F-test and its p-value in the second model output (the summary.lm).

The question of why R does not show HR2, or just a single HR term, indicates a lack of familiarity with R or regression. The fact that there are six coefficient estimates for HR3 through HR8 (seven including the intercept) indicates that the HR variable is coded as a factor. Consequently, R estimates the model using a dummy variable for each level of HR except the baseline, which is coded zero on every dummy (this is basically an analysis of variance, not of covariance). By default, the baseline is the first level of a factor-coded variable, i.e. the smallest value (or the first in alphabetical order), which is HR=2 in your case. Since it is the baseline, there is no coefficient estimate for HR2; the intercept is the estimated mean of R in the HR=2 group. All other estimated coefficients give the difference between each of HR=3 through HR=8 and the group in which HR=2.

If you want to include HR as a numeric variable in the regression (so that a single slope for HR is estimated), you have to convert it with as.numeric. If HR is currently stored as a factor, use as.numeric(as.character(HR)), because as.numeric applied directly to a factor returns the internal level codes rather than the hour values. Note, however, that your results for the factor-coded HR variable indicate that the trend (if any) is not linear.

For a comparison of all groups against one another following an analysis of variance, I think there are other methods, such as the Bonferroni-Dunn test (as a post-hoc test).

Best,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ross Culloch
Sent: Thursday, August 13, 2009 4:46 PM
To: r-help@r-project.org
Subject: [R] lm coefficients output confusing

Hi all,

I have an issue with the lm() function regarding the listing of the coefficients.
My data are below, showing a list of hours (HR) relating to the time spent resting (R) by an individual animal. Simply, I want to fit an lm() and pass it to anova() to see if there is a significant difference in resting between hours.

   HR         R
1   2 0.6666667
2   2 0.4666667
3   2 0.8000000
4   2 0.6333333
5   2 0.7333333
6   2 0.8000000
7   2 0.8666667
8   2 0.7857143
9   2 0.7826087
10  2 0.6666667
11  2 0.9166667
12  2 0.6666667
13  3 0.5294118
14  3 0.8541667
15  3 0.4583333
16  3 0.5882353
17  3 0.9347826
18  3 0.7878788
19  3 0.7857143
20  3 0.6944444
21  3 0.8333333
22  3 0.7450980
23  3 0.9230769
24  3 0.7222222
25  4 0.6571429
26  4 0.7241379
27  4 0.7391304
28  4 0.6571429
29  4 0.8000000
30  4 0.9130435
31  4 0.7187500
32  4 0.8437500
33  4 0.9230769
34  4 0.8571429
35  4 0.8695652
36  4 0.8888889
37  5 0.3333333
38  5 0.5365854
39  5 0.6774194
40  5 0.7142857
41  5 0.6904762
42  5 0.5483871
43  5 0.5952381
44  5 0.4166667
45  5 0.5666667
46  5 0.5952381
47  5 0.7894737
48  5 0.7500000
49  6 0.6268657
50  6 0.7187500
51  6 0.5500000
52  6 0.7164179
53  6 0.7656250
54  6 0.5869565
55  6 0.7164179
56  6 0.7031250
57  6 0.7230769
58  6 0.7462687
59  6 0.9200000
60  6 0.8536585
61  7 0.6379310
62  7 0.5357143
63  7 0.5227273
64  7 0.8000000
65  7 0.6724138
66  7 0.7083333
67  7 0.7241379
68  7 0.6938776
69  7 0.6545455
70  7 0.7931034
71  7 0.7560976
72  7 0.8684211
73  8 0.6727273
74  8 0.6000000
75  8 0.8333333
76  8 0.8181818
77  8 0.7818182
78  8 0.7647059
79  8 0.5818182
80  8 0.5918367
81  8 0.7450980
82  8 0.7818182
83  8 0.8048780
84  8 0.8684211

The script I'm using and its output are as follows:

> anova(rdayml <- lm(R ~ HR, data=rdata2, na.action=na.exclude))
Analysis of Variance Table

Response: R
          Df  Sum Sq Mean Sq F value  Pr(>F)
HR         6 0.25992 0.04332  3.1762 0.00774 **
Residuals 77 1.05021 0.01364
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> summary(rdayml <- lm(R ~ HR,data=rdata2))

Call:
lm(formula = R ~ HR, data = rdata2)

Residuals:
      Min        1Q    Median        3Q       Max
-0.279725 -0.065416  0.005593  0.077486  0.201070

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.732082   0.033713  21.715   <2e-16 ***
HR3          0.005976   0.047678   0.125   0.9006
HR4          0.067232   0.047678   1.410   0.1625
HR5         -0.130935   0.047678  -2.746   0.0075 **
HR6         -0.013152   0.047678  -0.276   0.7834
HR7         -0.034807   0.047678  -0.730   0.4676
HR8          0.004971   0.047678   0.104   0.9172
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1168 on 77 degrees of freedom
Multiple R-squared: 0.1984, Adjusted R-squared: 0.1359
F-statistic: 3.176 on 6 and 77 DF, p-value: 0.00774

What I really don't understand is why the lm summary lists the hour numbers in the coefficients of the lm, as opposed to just reading HR. On top of that, if R does display the data like this, then I don't understand why it omits hour 2.

If I can get this to work correctly, can I use the p value to determine which of the hours is significantly different from the others, so that in this example hour 5 is significantly different? Or is it just a case of using the p value from the anova to determine that there is a significant difference between hours (in this case) and using a plot to determine which hour(s) are likely to be the cause?

Any help or advice would be most useful!

Best wishes,

Ross

--
View this message in context: http://www.nabble.com/lm-coefficients-output-confusing-tp24958398p24958398.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
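
For readers of the archive, here is a minimal sketch of the points made in the reply at the top of this thread: how a factor-coded HR produces one dummy coefficient per level except the baseline, how the baseline can be changed, and how HR can be entered as a numeric variable instead. The data frame toy and every name in it are made up for illustration (three hours, three observations each); it is not the poster's rdata2. The pairwise.t.test() call with a Bonferroni adjustment is only one possible way to compare all groups against one another and is not necessarily the exact post-hoc test the reply had in mind.

```r
# Hypothetical toy data, NOT the poster's rdata2: three hours, three values each
toy <- data.frame(
  HR = factor(rep(c(2, 3, 5), each = 3)),
  R  = c(0.60, 0.70, 0.80,   # HR = 2
         0.65, 0.75, 0.85,   # HR = 3
         0.40, 0.50, 0.60)   # HR = 5
)

# HR as a factor: the first level (2) is the baseline and gets no coefficient
fit_factor  <- lm(R ~ HR, data = toy)
coef_names  <- names(coef(fit_factor))
group_means <- tapply(toy$R, toy$HR, mean)

# The intercept is the baseline group mean; HR3 and HR5 are differences from it
print(coef(fit_factor))
print(group_means)

# Choosing another baseline: relevel() moves the given level to the front
toy$HR_ref5 <- relevel(toy$HR, ref = "5")
fit_ref5    <- lm(R ~ HR_ref5, data = toy)

# HR as a numeric variable: a single slope instead of one dummy per level.
# as.numeric() on a factor returns level codes (1, 2, 3), so convert
# through as.character() to recover the hour values (2, 3, 5).
toy$HR_num  <- as.numeric(as.character(toy$HR))
fit_numeric <- lm(R ~ HR_num, data = toy)

# One possible all-pairs comparison with Bonferroni-adjusted p-values
pw <- pairwise.t.test(toy$R, toy$HR, p.adjust.method = "bonferroni")
print(pw$p.value)
```

With the factor coding, the coefficient table only ever compares each hour with the baseline hour; the pairwise table is what addresses "which hours differ from which", at the cost of a multiplicity adjustment.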