[Rd] Small quirks in summary.(g)lm docs

Milan Bouchet-Valat Tue, 19 Feb 2013 08:32:37 -0800

Hi!

In R 3.0.0 from current SVN, ?summary.lm says:
> Value [...]
> df degrees of freedom, a 3-vector (p, n-p, p*), the last
>    being the number of non-aliased coefficients.


?summary.glm says:
> df a 3-vector of the rank of the model and the number of residual 
>    degrees of freedom, plus number of non-aliased coefficients.

It seems to me that the description is reversed: p is the number of
non-aliased coefficients (the rank of the model), and p* is the total
number of coefficients, aliased ones included. I do not have reference
books at hand to check what is intended, but see this example:

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group + I(group != "Ctl"))
lm.D9

Call:
lm(formula = weight ~ group + I(group != "Ctl"))

Coefficients:
          (Intercept)               groupTrt  I(group != "Ctl")TRUE  
                5.032                 -0.371                     NA  

summary(lm.D9)$df
[1]  2 18  3

sum(!summary(lm.D9)$aliased)
[1] 2


The same is true with glm().
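
For anyone who wants to check the glm() case, here is a minimal sketch
using the same data as above. The object names glm.D9 and s are only
illustrative, and the default gaussian family is assumed:

```r
# Same data as in the lm() example above
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)

# Gaussian glm() with a deliberately redundant (aliased) term
glm.D9 <- glm(weight ~ group + I(group != "Ctl"))
s <- summary(glm.D9)

s$df                    # the 3-vector under discussion
glm.D9$rank             # rank of the model
glm.D9$df.residual      # residual degrees of freedom
length(coef(glm.D9))    # all coefficients, aliased ones included
sum(!s$aliased)         # non-aliased coefficients only
```

If the reading above is right, the first element of s$df should match
the rank and the last should match length(coef(glm.D9)).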


Also, ?summary.lm seems to lack a note that is present
in ?summary.glm:
> Aliased coefficients are omitted in the returned object but (as from
> R 1.8.0) restored by the print method.

This is apparently true of summary.lm too:

summary(lm.D9)

Call:
lm(formula = weight ~ group + I(group != "Ctl"))

Residuals:
    Min      1Q  Median      3Q     Max 
-1.0710 -0.4938  0.0685  0.2462  1.3690 

Coefficients: (1 not defined because of singularities)
                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)             5.0320     0.2202  22.850 9.55e-15 ***
groupTrt               -0.3710     0.3114  -1.191    0.249    
I(group != "Ctl")TRUE       NA         NA      NA       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.6964 on 18 degrees of freedom
Multiple R-squared: 0.07308,    Adjusted R-squared: 0.02158 
F-statistic: 1.419 on 1 and 18 DF,  p-value: 0.249 

summary(lm.D9)$coefficients
            Estimate Std. Error  t value     Pr(>|t|)
(Intercept)    5.032  0.2202177 22.85012 9.547128e-15
groupTrt      -0.371  0.3114349 -1.19126 2.490232e-01


Attached is a patch that makes these changes, assuming I am not mistaken
(the wording can certainly be improved).


Regards

Index: src/library/stats/man/summary.glm.Rd
===================================================================
--- src/library/stats/man/summary.glm.Rd	(revision 62006)
+++ src/library/stats/man/summary.glm.Rd	(working copy)
@@ -89,7 +89,8 @@
   \item{dispersion}{either the supplied argument or the inferred/estimated
     dispersion if the latter is \code{NULL}.}
   \item{df}{a 3-vector of the rank of the model and the number of
-    residual degrees of freedom, plus number of non-aliased coefficients.}
+    residual degrees of freedom, plus number of coefficients (including
+    aliased ones).}
   \item{cov.unscaled}{the unscaled (\code{dispersion = 1}) estimated covariance
     matrix of the estimated coefficients.}
   \item{cov.scaled}{ditto, scaled by \code{dispersion}.}
Index: src/library/stats/man/summary.lm.Rd
===================================================================
--- src/library/stats/man/summary.lm.Rd	(revision 62006)
+++ src/library/stats/man/summary.lm.Rd	(working copy)
@@ -37,6 +37,9 @@
   coefficients, standard errors, etc. and additionally gives
   \sQuote{significance stars} if \code{signif.stars} is \code{TRUE}.
 
+  Aliased coefficients are omitted in the returned object but restored
+  by the \code{print} method.
+
   Correlations are printed to two decimal places (or symbolically): to
   see the actual correlations print \code{summary(object)$correlation}
   directly.
@@ -58,8 +61,9 @@
     error
     \deqn{\hat\sigma^2 = \frac{1}{n-p}\sum_i{w_i R_i^2},}{\sigma^2 = 1/(n-p) Sum(w[i] R[i]^2),}
     where \eqn{R_i}{R[i]} is the \eqn{i}-th residual, \code{residuals[i]}.}
-  \item{df}{degrees of freedom, a 3-vector \eqn{(p, n-p, p*)}, the last
-    being the number of non-aliased coefficients.}
+  \item{df}{degrees of freedom, a 3-vector \eqn{(p, n-p, p*)}, the first
+    being the number of non-aliased coefficients, the last being the total
+    number of coefficients.}
   \item{fstatistic}{(for models including non-intercept terms)
     a 3-vector with the value of the F-statistic with
     its numerator and denominator degrees of freedom.}

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
