You **really** should work with a local statistician. Remote statistical advice (this is not really about R) from even well-meaning helpers unfamiliar with your work is very risky. For example, I would suggest making all sorts of plots (statistical summaries alone are wholly inadequate and potentially quite misleading), but exactly what to plot, how to interpret what the plots show, and what to do next would depend on both the subject matter background (how the study was conducted and what sorts of mechanisms are expected, for example) and what the plots revealed.

Like the gangster movies (used to) say: just a friendly warning ... :)

-- Bert Gunter
Genentech

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andreas Nord
Sent: Monday, August 25, 2008 9:22 AM
To: r-help@r-project.org
Subject: [R] lmer4 and variable selection

Dear list,

I am currently working with a rather large data set on body temperature regulation in wintering birds. My original model contains quite a few explanatory variables, but I do not (of course) wish to keep them all in my final model. I've fitted the following model to the data:

> temp.lme1<-lmer(T.B~tarsus+wing+weight+factor(age)+factor(sex)+fat+minsunset
    +day1oct+day1oct.2+minnight+ave.day+minnight.1+T.A+ave.night.1+(1|ID)+(1|sign),
    data=bodytemp.df)

where T.B equals body temperature; explanatories are a number of biometric measures (tarsus, wing, weight, fat, age, sex) and various measures of ambient temperature (ave.day, minnight.1, minnight, ave.night.1, T.A) and time/date (minsunset, day1oct, day1oct.2). Random factors are ID (individuals were sampled 1 to 3 times) and sign (person performing measurements; 2 levels).

Model output looks like this:

> summary(temp.lme1)
Linear mixed model fit by REML
Formula: T.B ~ tarsus + wing + weight + factor(age) + factor(sex) + fat +
    minsunset + day1oct + day1oct.2 + minnight + ave.day + minnight.1 +
    T.A + ave.night.1 + (1 | ID) + (1 | sign)
   Data: bodytemp.df
   AIC BIC logLik deviance REMLdev
 557.8 614 -260.9      441   521.8
Random effects:
 Groups   Name        Variance   Std.Dev.
 ID       (Intercept) 1.0399e-01 0.32247096
 sign     (Intercept) 6.2663e-08 0.00025033
 Residual             8.0162e-01 0.89533134
Number of obs: 167, groups: ID, 124; sign, 2

Fixed effects:
                 Estimate Std. Error t value
(Intercept)     4.124e+01  4.104e+00  10.049
tarsus         -5.925e-02  5.801e-02  -1.021
wing           -6.252e-02  4.984e-02  -1.254
weight          1.499e-01  1.446e-01   1.037
factor(age)2K+  1.981e-01  1.651e-01   1.200
factor(sex)M    9.232e-02  2.146e-01   0.430
fat            -2.297e-02  8.150e-02  -0.282
minsunset      -1.104e-03  1.043e-03  -1.058
day1oct        -4.247e-03  2.879e-02  -0.148
day1oct.2       5.087e-05  1.560e-04   0.326
minnight       -5.987e-02  7.022e-02  -0.853
ave.day         1.128e-01  1.582e-01   0.713
minnight.1     -9.590e-02  1.684e-01  -0.570
T.A            -4.855e-02  5.185e-02  -0.936
ave.night.1     1.420e-01  2.477e-01   0.573

Correlation of Fixed Effects:
            (Intr) tarsus   wing weight f()2K+ fct()M    fat mnsnst day1ct dy1c.2 mnnght ave.dy mnng.1    T.A
tarsus      -0.851
wing        -0.870  0.966
weight       0.071 -0.417 -0.411
factr(g)2K+  0.211 -0.248 -0.241  0.219
factor(sx)M  0.573 -0.499 -0.526 -0.179  0.105
fat         -0.037  0.046  0.052 -0.264 -0.152  0.045
minsunset   -0.177 -0.144 -0.122  0.214 -0.101 -0.027 -0.045
day1oct     -0.261 -0.051 -0.052 -0.117 -0.145  0.140  0.131  0.515
day1oct.2    0.257  0.050  0.051  0.121  0.141 -0.149 -0.125 -0.484 -0.993
minnight    -0.074  0.249  0.216 -0.271 -0.032 -0.043  0.022  0.022 -0.168  0.231
ave.day     -0.025  0.070  0.050  0.001  0.045 -0.022  0.046 -0.363 -0.120  0.041 -0.415
minnight.1   0.304 -0.081 -0.045  0.069  0.129  0.012 -0.054 -0.349 -0.636  0.644  0.023  0.052
T.A          0.049 -0.043  0.018  0.130  0.040 -0.164 -0.065 -0.317 -0.288  0.249 -0.598  0.267  0.143
ave.night.1 -0.234  0.004 -0.015 -0.030 -0.110  0.016  0.031  0.493  0.614 -0.586  0.105 -0.524 -0.863 -0.243

At this point, I want to go on selecting the variables with the most explanatory power to come up with a final model. However, I'm not sure how to do this, because (not being a trained statistician) I'm used to having p-values to guide me. Similarly, I would like to be able to report the relative "importance" of variables in some way but, as is apparent from a number of threads, p-values seem to be the least preferred option when it comes to lmer.
I've read about the mcmcsamp() function, but I'm not entirely sure how to use it or how to interpret the output.

Any advice would be most appreciated.

Kind regards,
Andreas Nord

--
View this message in context: http://www.nabble.com/lmer4-and-variable-selection-tp19146850p19146850.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.