>>>>> Ben Bolker <bbol...@gmail.com>
>>>>>     on Thu, 8 Mar 2018 09:42:40 -0500 writes:

> Meant to respond to this but forgot.
> I didn't write a new terms() function -- I added an attribute to the
> terms() (a vector of the names of the constructed model matrix), thus
> preserving the information at the point when it was available.
> I do agree that it would be preferable to have an upstream fix ...

did anybody ever propose a small patch to the upstream sources ?
-- including a REPREX (or 2: one for lme4, one for survival)

I'm open to look at one .. not for the next few days, though.

Martin

> On Thu, Mar 8, 2018 at 9:39 AM, Therneau, Terry M., Ph.D. via R-devel
> <r-devel@r-project.org> wrote:
>> Ben,
>>
>> Looking at your notes, it appears that your solution is to write your
>> own terms() function for lme.  It is easy to verify that the
>> "varnames.fixed" attribute is not returned by the usual terms function.
>>
>> Then I also need to write my own terms function for the survival and
>> coxme packages?  Because of the need to treat strata() terms in a
>> special way I manipulate the formula/terms in nearly every routine.
>>
>> Extrapolating: every R package that tries to examine formulas and
>> partition them into bits needs its own terms function?  This does not
>> look like a good solution to me.
>>
>> On 03/07/2018 07:39 AM, Ben Bolker wrote:
>>>
>>> I knew I had seen this before but couldn't previously remember where.
>>> https://github.com/lme4/lme4/issues/441 ... I initially fixed with
>>> gsub(), but (pushed by Martin Maechler to do better) I eventually
>>> fixed it by storing the original names of the model frame (without
>>> backticks) as an attribute for later retrieval:
>>>
>>> https://github.com/lme4/lme4/commit/56416fc8b3b5153df7df5547082835c5d5725e89.
>>>
>>>
>>> On Wed, Mar 7, 2018 at 8:22 AM, Therneau, Terry M., Ph.D. via R-devel
>>> <r-devel@r-project.org> wrote:
>>>>
>>>> Thanks to Bill Dunlap for the clarification.
>>>> On follow-up it turns out that this will be an issue for many if not
>>>> most of the routines in the survival package: a lot of them look at
>>>> the terms structure and make use of the dimnames of
>>>> attr(terms, 'factors'), which also keeps the unneeded backquotes.
>>>> Others use the term.labels attribute.  To dodge this I will need to
>>>> create a fixterms() routine which I call at the top of every single
>>>> routine in the library.
>>>>
>>>> Is there a chance for a fix at a higher level?
>>>>
>>>> Terry T.
>>>>
>>>>
>>>> On 03/05/2018 03:55 PM, William Dunlap wrote:
>>>>>
>>>>> I believe this has to do with terms() making "term.labels" (hence the
>>>>> dimnames of "factors") with deparse(), so that the backquotes are
>>>>> included for non-syntactic names.  The backquotes are not in the
>>>>> column names of the input data.frame (nor model frame) so you get a
>>>>> mismatch when subscripting the data.frame or model.frame with
>>>>> elements of terms()$term.labels.
>>>>>
>>>>> I think you can avoid the problem by adding right after
>>>>>      ll <- attr(Terms, "term.labels")
>>>>> the line
>>>>>      ll <- gsub("^`|`$", "", ll)
>>>>>
>>>>> E.g.,
>>>>>
>>>>>   > d <- data.frame(check.names=FALSE, y=1/(1:5), `b$a$d`=sin(1:5)+2, `x y z`=cos(1:5)+2)
>>>>>   > Terms <- terms( y ~ log(`b$a$d`) + `x y z` )
>>>>>   > m <- model.frame(Terms, data=d)
>>>>>   > colnames(m)
>>>>>   [1] "y"            "log(`b$a$d`)" "x y z"
>>>>>   > attr(Terms, "term.labels")
>>>>>   [1] "log(`b$a$d`)" "`x y z`"
>>>>>   > ll <- attr(Terms, "term.labels")
>>>>>   > gsub("^`|`$", "", ll)
>>>>>   [1] "log(`b$a$d`)" "x y z"
>>>>>
>>>>> It is a bit of a mess.
>>>>>
>>>>>
>>>>> Bill Dunlap
>>>>> TIBCO Software
>>>>> wdunlap tibco.com
>>>>>
>>>>> On Mon, Mar 5, 2018 at 12:55 PM, Therneau, Terry M., Ph.D.
>>>>> via R-devel <r-devel@r-project.org> wrote:
>>>>>
>>>>>     A user reported a problem with the survdiff function and the use
>>>>>     of variables that contain a space.  Here is a simple example.
>>>>>     The same issue occurs in survfit for the same reason.
>>>>>
>>>>>     lung2 <- lung
>>>>>     names(lung2)[1] <- "in st"   # old name is inst
>>>>>     survdiff(Surv(time, status) ~ `in st`, data=lung2)
>>>>>     Error in `[.data.frame`(m, ll) : undefined columns selected
>>>>>
>>>>>     In the body of the code the program wants to send all of the
>>>>>     right-hand side variables forward to the strata() function.  The
>>>>>     code looks more or less like this, where m is the model frame
>>>>>
>>>>>     Terms <- terms(m)
>>>>>     index <- attr(Terms, "term.labels")
>>>>>     if (length(index) ==0)  X <- rep(1L, n)  # no covariates
>>>>>     else X <- strata(m[index])
>>>>>
>>>>>     For the variable with a space in the name the term.label is
>>>>>     "`in st`", and the subscript fails.
>>>>>
>>>>>     Is this intended behaviour or a bug?  The issue is that the name
>>>>>     of this column in the model frame does not have the backticks,
>>>>>     while the terms structure does have them.
>>>>>
>>>>>     Terry T.
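
The mismatch described above (backquoted term labels versus unquoted model
frame column names) can be worked around without a regular expression. The
following is only a sketch in base R, not a fix taken from survival, lme4, or
the R sources. It relies on the documented default of deparse(), which adds
backquotes when deparsing calls but not when deparsing a bare symbol, and it
assumes that this is what produces the unquoted names seen in colnames(m) in
Bill Dunlap's example. The name frame_names() is hypothetical.

```r
# Hypothetical helper (not part of stats, survival, or lme4): parse each term
# label and deparse it again with deparse()'s default backtick setting, so
# that a bare symbol such as `x y z` loses its backquotes while a call such
# as log(`b$a$d`) is left as it was.
frame_names <- function(labels) {
  vapply(labels, function(lab) {
    expr <- parse(text = lab, keep.source = FALSE)[[1L]]
    paste(deparse(expr, width.cutoff = 500L), collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

# Data and formula taken from Bill Dunlap's example in this thread.
d <- data.frame(check.names = FALSE, y = 1/(1:5),
                `b$a$d` = sin(1:5) + 2, `x y z` = cos(1:5) + 2)
Terms <- terms(y ~ log(`b$a$d`) + `x y z`)
m <- model.frame(Terms, data = d)

# Same steps as in the survdiff snippet, with the labels cleaned first.
index <- frame_names(attr(Terms, "term.labels"))
X <- m[index]   # subscript the model frame by the cleaned labels
```

Unlike gsub("^`|`$", "", ll), this does not strip a leading or trailing
backquote from a label that is a larger expression, for example an
interaction of two non-syntactic names. Such labels are not single columns of
the model frame in any case and would still need separate handling.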

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel