FWIW p. 206 of the White Book gives the following for names(binomial()): family, names, link, inverse, deriv, initialize, variance, deviance, weight.
So $aic wasn't there In The Beginning. I haven't done any more archaeology to try to figure out when/by whom it was first introduced ... Section 6.3.3, on extending families, doesn't give any other relevant info. A patch for src/library/stats/man/family.Rd below: please check what I've said about $aic and $mu.eta to make sure they're correct! I can submit this to the r bug list if preferred. ---- Index: family.Rd =================================================================== --- family.Rd (revision 74904) +++ family.Rd (working copy) @@ -31,7 +31,7 @@ \arguments{ \item{link}{a specification for the model link function. This can be a name/expression, a literal character string, a length-one character - vector or an object of class + vector, or an object of class \code{"\link[=make.link]{link-glm}"} (such as generated by \code{\link{make.link}}) provided it is not specified \emph{via} one of the standard names given next. @@ -45,7 +45,7 @@ the \code{Gamma} family the links \code{inverse}, \code{identity} and \code{log}; the \code{poisson} family the links \code{log}, \code{identity}, - and \code{sqrt} and the \code{inverse.gaussian} family the links + and \code{sqrt}; and the \code{inverse.gaussian} family the links \code{1/mu^2}, \code{inverse}, \code{identity} and \code{log}. @@ -105,8 +105,8 @@ \note{ The \code{link} and \code{variance} arguments have rather awkward semantics for back-compatibility. The recommended way is to supply - them is as quoted character strings, but they can also be supplied - unquoted (as names or expressions). In addition, they can also be + them as quoted character strings, but they can also be supplied + unquoted (as names or expressions). Additionally, they can be supplied as a length-one character vector giving the name of one of the options, or as a list (for \code{link}, of class \code{"link-glm"}). The restrictions apply only to links given as @@ -130,10 +130,18 @@ \item{dev.resids}{function giving the deviance residuals as a function of \code{(y, mu, wt)}.} \item{aic}{function giving the AIC value if appropriate (but \code{NA} - for the quasi- families). See \code{\link{logLik}} for the assumptions - made about the dispersion parameter.} - \item{mu.eta}{function: derivative \code{function(eta)} - \eqn{d\mu/d\eta}.} + for the quasi- families). More precisely, this function + returns \eqn{-2 L + 2 s}, where \eqn{L} is the log-likelihood and \eqn{s} + is the number of estimated scale parameters; the penalty term for the + location parameters is added elsewhere. + See \code{\link{logLik}} for the assumptions + made about the dispersion parameter.} + \item{mu.eta}{function: derivative of the inverse-link function + with respect to the linear predictor. If the inverse-link + function is \eqn{\mu=g^{-1}(\eta)}{mu=ginv(eta)} + where \eqn{eta}{\eta} is the value + of the linear predictor, then this function returns + \eqn{d(g^{-1})/d\eta=d\mu/d\eta}{d(ginv(eta))/d(eta)=d(mu)/d(eta)}.} \item{initialize}{expression. This needs to set up whatever data objects are needed for the family as well as \code{n} (needed for AIC in the binomial family) and \code{mustart} (see \code{\link{glm}}).} @@ -224,8 +232,8 @@ ## which case use an offset of 0 in the corresponding formula ## to get the null deviance right. -## Binomial with identity link: often not a good idea. -\dontrun{binomial(link = make.link("identity"))} +## Binomial with identity link: often computationally and conceptually difficult +\dontrun{binomial(link = "identity")} ## tests of quasi x <- rnorm(100) @@ -236,7 +244,7 @@ glm(y ~ x, family = quasi(variance = "mu^2", link = "log")) \dontrun{glm(y ~ x, family = quasi(variance = "mu^3", link = "log")) # fails} y <- rbinom(100, 1, plogis(x)) -# needs to set a starting value for the next fit +# need to set a starting value for the next fit glm(y ~ x, family = quasi(variance = "mu(1-mu)", link = "logit"), start = c(0,1)) } \keyword{models} On Mon, Jun 4, 2018 at 10:46 AM, Martin Maechler <maech...@stat.math.ethz.ch> wrote: >>>>>> Ben Bolker >>>>>> on Sun, 3 Jun 2018 17:33:18 -0400 writes: > > > Is it generally known/has it been previously discussed here that the > > $aic() component in GLM-family objects (e.g. results of binomial(), > > poisson(), etc.) does not as implemented actually return the AIC, but > > rather -2*log-likelihood + 2*(model_has_scale_parameter) ? > > This rings a faint bell from the last millennium with me, > and the following "fortune" may contain the answer implicitly : > > ------------------------------------------------------------ > > if(!require("fortunes")) install.packages("fortunes") > > fortune("bug compatib") > > For quite a while, bug-for-bug compatibility with S-PLUS v 3.x was > considered > important to allow people to port their packages between systems. > -- Peter Dalgaard > R-help (February 2009) > > > ------------------------------------------------------------ > > Ideally, readers who still have access to a version of S-PLUS / S+ > or who have read and internalized or (even co-written !) > "The white book", notably Ch.6, may be able to shed a historic > light on this. > > I note that the white book's Appendix B with function help > pages, has a page ?family.object, accessible here > https://sites.oxy.edu/lengyel/M150/Sueselbeck/helpfiles/family.object.html > > which does *not* mention a <fam>$dev.resid() component, but instead > allows to use <fam>$residuals(*, residuals=TRUE) > get the > " > vector of deviance residual, whose weighted sum of > squares is the deviance > " > > Given the above, and also the ?glm entry > > || Author(s): > || > || The original R implementation of ‘glm’ was written by Simon Davies > || working for Ross Ihaka at the University of Auckland, but has > || since been extensively re-written by members of the R Core team. > || > || The design was inspired by the S function of the same name > || described in Hastie & Pregibon (1992). > > actually suggest that it may be hard nowadays to find the > original "design specs" that Simon and Ross had used at the > time, and also that they only were _inspired_ by the white book > chapter 6 (= Hastie & Pregibon (1992)). > > > > > Can anyone in this forum gauge how a documentation patch > > would be received? > > It depends on further answers to your questions (i.e, this > thread), but I'd currently say "gratefully". > I'd expect it would be a patch mainly to > src/library/stats/man/family.Rd > > Note that help(AIC) has a non-small 'Details' section, but > indeed it does not mention the family(*)$aic function. > > > This behaviour does not seem to be documented in ?family (or anywhere > > else I can find), which says: > > > aic: function giving the AIC value if appropriate (but ‘NA’ for > > the quasi- families). See ‘logLik’ for the assumptions made > > about the dispersion parameter. > > > > For a demonstration that e.g. binomial()$aic() is really -2*log L and > > not the AIC, see: > > > > https://github.com/wch/r-source/blob/trunk/src/library/stats/R/family.R#L317 > > > This document > > <https://github.com/lme4/lme4/blob/master/misc/notes/deviance.rmd> > > explicates the details a bit more ('L' denotes log-likelihood): > > > * family()$aic computes $-2L$, which glm.fit translates to an AIC by > > adding $2k$ and storing it in model$aic > > * logLik.default retrieves model$aic and converts it back to a > > log-likelihood > > * stats:::AIC.default retrieves the log-likelihood and converts it > > back to an AIC (!) > > * family()$dev.resid() computes the squared deviance residuals > > * stats:::residuals.glm retrieves these values and takes the signed > > square root > > > cheers > > Ben Bolker ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel