I have a variable which is roughly age categories in decades. In the original data, it came in coded: > str(xxx) 'data.frame': 58271 obs. of 29 variables: $ issuecat : Factor w/ 5 levels "0 - 39","40 - 49",..: 1 1 1 1... snip
I then defined issuecat as ordered: > xxx$issuecat<-as.ordered(xxx$issuecat) When I include issuecat in a glm model, the result makes me think I have asked R for a linear+quadratic+cubic+quartic polynomial fit. The results are not terribly surprising under that interpretation, but I was hoping for only a linear term (which I was taught to called a "test of trend"), at least as a starting point. > age.mdl<-glm(actual~issuecat,data=xxx,family="poisson") > summary(age.mdl) Call: glm(formula = actual ~ issuecat, family = "poisson", data = xxx) Deviance Residuals: Min 1Q Median 3Q Max -0.3190 -0.2262 -0.1649 -0.1221 5.4776 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -4.31321 0.04865 -88.665 <2e-16 *** issuecat.L 2.12717 0.13328 15.960 <2e-16 *** issuecat.Q -0.06568 0.11842 -0.555 0.579 issuecat.C 0.08838 0.09737 0.908 0.364 issuecat^4 -0.02701 0.07786 -0.347 0.729 This also means my advice to a another poster this morning may have been misleading. I have tried puzzling out what I don't understand by looking at indices or searching in MASSv2, the Blue Book, Thompson's application of R to Agresti's text, and the FAQ, so far without success. What I would like to achieve is having the lowest age category be a reference category (with the intercept being the log-rate) and each succeeding age category be incremented by 1. The linear estimate would be the log(risk-ratio) for increasing ages. I don't want the higher order polynomial estimates. Am I hoping for too much? -- David Winsemius using R 2.6.1 in WinXP ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.