Dear list, the following came up in an introductory class. Please help me understand the -1 (or 0+) syntax in formulae: why do the numerator degrees of freedom, F-statistics, etc. differ between the models lm(y ~ x1) and lm(y ~ x0 + x1 - 1) if x0 is simply a vector of ones?
Example:

N <- 40
x0 <- rep(1,N)
x1 <- 1:N
vare <- N/8
set.seed(4)
e <- rnorm(N, 0, vare^2)
X <- cbind(x0, x1)
beta <- c(.4, 1)
y <- X %*% beta + e

summary(lm(y ~ x1))
# [...]
# Residual standard error: 20.92 on 38 degrees of freedom
# Multiple R-squared: 0.1151, Adjusted R-squared: 0.09182
# F-statistic: 4.943 on 1 and 38 DF, p-value: 0.03222

summary(lm(y ~ x0 + x1 - 1)) # or summary(lm(y ~ 0 + x0 + x1))
# [...]
# Residual standard error: 20.92 on 38 degrees of freedom
# Multiple R-squared: 0.6888, Adjusted R-squared: 0.6724
# F-statistic: 42.05 on 2 and 38 DF, p-value: 2.338e-10

Thanks in advance,
Jochen

----
Jochen Laubrock, Dept. of Psychology, University of Potsdam,
Karl-Liebknecht-Strasse 24-25, 14476 Potsdam, Germany
phone: +49-331-977-2346, fax: +49-331-977-2793
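P.S. A rough sketch of where the difference might come from, reusing the data from the example above. It assumes that summary.lm() measures the total sum of squares around mean(y) when the formula contains an intercept and around zero when the intercept is removed, and that it counts x0 as an ordinary regressor in the second model (hence 2 numerator df instead of 1). The names fit_int, fit_noint, tss_centered, tss_uncentered and the other intermediate variables are only illustrative.

```r
# Same simulated data as in the example above
N <- 40
x0 <- rep(1, N)
x1 <- 1:N
vare <- N/8
set.seed(4)
e <- rnorm(N, 0, vare^2)
y <- as.vector(cbind(x0, x1) %*% c(.4, 1) + e)

fit_int   <- lm(y ~ x1)           # implicit intercept
fit_noint <- lm(y ~ x0 + x1 - 1)  # explicit column of ones, intercept removed

# Both models span the same column space, so the residual
# sum of squares and the residual df are the same.
rss    <- sum(resid(fit_int)^2)
df_res <- fit_int$df.residual

# Assumption: centered total SS with intercept, uncentered without.
tss_centered   <- sum((y - mean(y))^2)
tss_uncentered <- sum(y^2)

r2_int   <- 1 - rss / tss_centered
r2_noint <- 1 - rss / tss_uncentered

# Numerator df: 1 (x1 only) versus 2 (x0 and x1).
f_int   <- ((tss_centered   - rss) / 1) / (rss / df_res)
f_noint <- ((tss_uncentered - rss) / 2) / (rss / df_res)

# Compare the hand-computed values with what summary() reports.
all.equal(r2_int,   summary(fit_int)$r.squared)
all.equal(r2_noint, summary(fit_noint)$r.squared)
all.equal(f_int,    summary(fit_int)$fstatistic[["value"]])
all.equal(f_noint,  summary(fit_noint)$fstatistic[["value"]])
```

If the four comparisons return TRUE, the two summaries describe the same fit but test against different null models: "y equals its mean" in the first case and "y equals zero" in the second.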