On Tue, 30 Apr 2013, Paul Johnson wrote:

Greetings to r-help land.

I've run into some program crashes and I've traced them back to methods() behavior after the package gdata is loaded. Here is a minimal reproducible example. This seems buggy to me. How about you?

dat <- data.frame(x = rnorm(100), y = rnorm(100))
lm1 <- lm(y ~ x, data = dat)

The two lines above are not really needed. The problem is just the behaviour of methods(class = "lm") before and after loading gdata:

methods(class = "lm")

## OK so far

library(gdata)
methods(class = "lm")
## epic fail

The reason is that nobs.lm is found twice by methods(): first in "stats" and then in "gdata". And because methods() builds a data.frame with row names corresponding to all methods, this gives an error because row names in data frames have to be unique.
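The failure mode can be reproduced without gdata. The following is only a minimal sketch: the vector method_names is a made-up stand-in for whatever methods() collects internally, and the data.frame() call merely imitates the one shown in the error message.

```r
# Hypothetical stand-in for the method names that methods() collects;
# nobs.lm appears twice, as if found in both stats and gdata.
method_names <- c("logLik.lm", "nobs.lm", "nobs.lm")

# data.frame() refuses duplicated row names, so capture the error message.
res <- tryCatch(
  data.frame(visible = rep.int(FALSE, length(method_names)),
             row.names = method_names),
  error = function(e) conditionMessage(e)
)
res
```

Here res holds the error message rather than a data frame, and the message names nobs.lm as the duplicated row name.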

I guess it would be good to safeguard the methods() function against situations like this. If methods are found more than once, one could run unique() on the result or, alternatively, keep the duplicates but add information about which namespace each one comes from.
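Both safeguards are easy to sketch on a toy table. This is only an illustration of the idea, not a patch for methods(): the data frame found and its columns name and from are hypothetical.

```r
# Hypothetical search result: each method name together with the
# namespace it was found in.
found <- data.frame(
  name = c("logLik.lm", "nobs.lm", "nobs.lm"),
  from = c("stats", "stats", "gdata"),
  stringsAsFactors = FALSE
)

# Option 1: drop repeated names, keeping only the first hit.
dedup <- found[!duplicated(found$name), ]

# Option 2: keep every hit, but make the row labels unique by
# prefixing each name with its namespace.
labelled <- found
rownames(labelled) <- paste(found$from, found$name, sep = ":::")
```

Option 1 silently hides the gdata copy, while option 2 keeps it visible, which seems more useful when tracking down a masking problem like this one.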

Additionally, it would probably be good if "gdata" changed its current behavior of copying the nobs generic and the nobs.lm method. gdata does this so that it can provide a modified nobs.default and a nobs.data.frame built on top of that nobs.default. But the price is that all the other methods registered with the stats::nobs generic are not found anymore:

R> methods(nobs)
[1] nobs.default* nobs.glm*     nobs.lm*      nobs.logLik*  nobs.nls*

... and after loading gdata ...

R> methods(nobs)
[1] nobs.data.frame* nobs.default*    nobs.lm*

Hence the glm/logLik/nls methods are only found if the user explicitly calls stats::nobs. An artificially constructed example is:

R> m <- lm(dist ~ speed, data = cars)
R> nobs(logLik(m))
[1] 1
R> stats:::nobs(logLik(m))
[1] 50

I haven't checked how much use gdata makes of the modified nobs.default outside nobs.data.frame. If it isn't used in other crucial places, I would probably recommend omitting nobs/nobs.default/nobs.lm from the gdata namespace and just registering nobs.data.frame with stats::nobs.
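As a rough sketch of that recommendation: the method body below is a hypothetical stand-in and not gdata's actual code, and the NAMESPACE directives in the comment are what I would assume a package needs for this arrangement.

```r
# Hypothetical data-frame method (not gdata's implementation):
# count the non-missing values in each column.
nobs.data.frame <- function(object, ...) {
  vapply(object, function(col) sum(!is.na(col)), integer(1))
}

# In a package this would go together with the NAMESPACE directives
#   importFrom(stats, nobs)
#   S3method(nobs, data.frame)
# so that no second nobs generic is created.

df <- data.frame(a = c(1, NA, 3), b = 1:3)
df_counts <- stats::nobs(df)        # dispatches to the method defined above

m <- lm(dist ~ speed, data = cars)
ll_count <- stats::nobs(logLik(m))  # the methods from stats stay reachable
```

Because the single stats::nobs generic is kept, the data-frame method and the existing glm/logLik/nls methods are all found through the same dispatch.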



## OUTPUT.

dat <- data.frame(x = rnorm(100), y = rnorm(100))
lm1 <- lm(y ~ x, data = dat)

methods(class = "lm")
[1] add1.lm*           alias.lm*          anova.lm           case.names.lm*
[5] confint.lm*        cooks.distance.lm* deviance.lm*       dfbeta.lm*
[9] dfbetas.lm*        drop1.lm*          dummy.coef.lm*     effects.lm*
[13] extractAIC.lm*     family.lm*         formula.lm*        hatvalues.lm
[17] influence.lm*      kappa.lm           labels.lm*         logLik.lm*
[21] model.frame.lm     model.matrix.lm    nobs.lm*           plot.lm
[25] predict.lm         print.lm           proj.lm*           qr.lm*
[29] residuals.lm       rstandard.lm       rstudent.lm        simulate.lm*
[33] summary.lm         variable.names.lm* vcov.lm*

  Non-visible functions are asterisked

library(gdata)
gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.

gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.

Attaching package: 'gdata'

The following object is masked from 'package:stats':

   nobs

The following object is masked from 'package:utils':

   object.size

methods(class = "lm")
Error in data.frame(visible = rep.int(FALSE, n2), from = rep.int(msg,  :
 duplicate row.names: nobs.lm

sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C                 LC_NAME=C
[9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] gdata_2.12.0.2

loaded via a namespace (and not attached):
[1] gtools_2.7.1 tcltk_3.0.0  tools_3.0.0


gdata is one of my favorite packages, so it's worth the effort to get to the
bottom of this.

--
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org               http://quant.ku.edu




