Currently exists("someName", where=someDataFrame) reports whether "someName" is a column of the data.frame 'someDataFrame', and the 'where=' may be omitted. If we have an environment we use exists("someName", envir=someEnvironment). It might be nice to continue using exists() instead of introducing a new function has(), although, since we want the same syntax to work for environments, data.frames, tbl_dfs, data.tables, etc., we may need the new function.
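
To make the "same syntax for everything" idea concrete, here is a minimal sketch of what such a function might look like. The name has() is only the one floated in this thread, and the split on is.environment() is an assumption for illustration; none of this is an existing base R function.

```r
# Hypothetical helper: 'has' is a name suggested in this thread, not base R.
has <- function(x, name) {
  if (is.environment(x)) {
    # Look only in this environment, not in its enclosing environments.
    exists(name, envir = x, inherits = FALSE)
  } else {
    # Exact comparison against names(x), so there is no partial matching.
    name %in% names(x)
  }
}

df <- data.frame(xyz = 1)
env <- new.env()
assign("xyz", 1, envir = env)

has(df, "xyz")   # TRUE
has(df, "x")     # FALSE
has(env, "xyz")  # TRUE
has(env, "x")    # FALSE
!is.null(df$x)   # TRUE, because `$` partially matches "x" to "xyz"
```

Since tbl_dfs and data.tables also carry a names attribute, the non-environment branch should cover them without any class-specific code, though I have only thought that through, not benchmarked it.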

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Jun 28, 2016 at 4:08 AM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 27/06/2016 10:15 PM, Lenth, Russell V wrote:
>
>> Hadley's note on partial matching has me scared the most concerning the
>> is.null() coding. So the need for a hasName() (or whatever) function seems
>> all the more compelling, and that it be in base R. Perhaps it should be
>> generic, with a default method that searches in the names attribute,
>> potentially extensible to other classes.
>
> I am thinking of putting it in, but if I do the definition will be
> equivalent to the one-liner down below. That's already slower than the
> is.null() test; making it generic would slow it down too much.
>
> Duncan Murdoch
>
>> Thanks so much, several of you, for your positive and helpful responses.
>>
>> Russ
>>
>> -----Original Message-----
>> From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
>> Sent: Monday, June 27, 2016 12:50 PM
>> To: Hadley Wickham <h.wick...@gmail.com>; Lenth, Russell V <russell-le...@uiowa.edu>
>> Cc: r-package-devel@r-project.org
>> Subject: Re: [R-pkg-devel] Absent variables and tibble
>>
>> On 27/06/2016 1:09 PM, Hadley Wickham wrote:
>>
>>> The other thing you need to be aware of if you're using the other
>>> approach is partial matching:
>>>
>>>   df <- data.frame(xyz = 1)
>>>   is.null(df$x)
>>>   #> [1] FALSE
>>>
>>> Duncan - I think that argues for including a has_name() (hasName() ?)
>>> function in base R. Is that something you'd consider?
>>
>> Yes, I'd consider it. I think hasName() would be more consistent with
>> other has*() functions in the R sources.
>>
>> I guess the implementation should be defined to be equivalent to
>>
>>   hasName <- function(x, name)
>>     name %in% names(x)
>>
>> though it would make sense to make a faster internal implementation;
>> !is.null(df$x) is quite a bit faster than "x" %in% names(df).
>>
>> Duncan Murdoch
>>
>>> Hadley
>>>
>>> On Mon, Jun 27, 2016 at 10:05 AM, Lenth, Russell V
>>> <russell-le...@uiowa.edu> wrote:
>>>
>>>> Thanks, Hadley. I do understand why you'd want more careful checking.
>>>>
>>>> If you're going to provide a variable-existing function, may I suggest
>>>> a short name like 'has'? I.e., has(x, var) returns TRUE if x has var in it.
>>>>
>>>> Thanks
>>>>
>>>> Russ
>>>>
>>>> On Jun 27, 2016, at 9:47 AM, Hadley Wickham <h.wick...@gmail.com> wrote:
>>>>
>>>>> On Mon, Jun 27, 2016 at 9:03 AM, Duncan Murdoch
>>>>> <murdoch.dun...@gmail.com> wrote:
>>>>>
>>>>>> On 27/06/2016 9:22 AM, Lenth, Russell V wrote:
>>>>>>
>>>>>>> My package 'lsmeans' is now suddenly broken because of a new
>>>>>>> provision in the 'tibble' package (loaded by 'dplyr' 0.5.0), whereby
>>>>>>> the "[[" and "$" methods for 'tbl_df' objects - as documented -
>>>>>>> throw an error if a variable is not found.
>>>>>>>
>>>>>>> The problem is that my code uses tests like this:
>>>>>>>
>>>>>>>   if (is.null (x$var)) {...}
>>>>>>>
>>>>>>> to see whether 'x' has a variable 'var'. Obviously, I can work
>>>>>>> around this using
>>>>>>>
>>>>>>>   if (!("var" %in% names(x))) {...}
>>>>>>>
>>>>>>> but (a) I like the first version better, in terms of the code
>>>>>>> being understandable; and (b) isn't there a long history whereby
>>>>>>> we can expect a NULL result when accessing an absent member of a
>>>>>>> list (and hence a data.frame)? (c) the code base for 'lsmeans'
>>>>>>> has about 50 instances of such tests.
>>>>>>>
>>>>>>> Anyway, I wonder if a lot of other package developers test for
>>>>>>> absent variables in that first way; if so, they too are in for a
>>>>>>> rude awakening if their users provide a tbl_df instead of a
>>>>>>> data.frame. And what is considered the best practice for testing
>>>>>>> absence of a list member? Apparently, not either of the above;
>>>>>>> and because of (c), I want to do these many tedious corrections only
>>>>>>> once.
>>>>>>>
>>>>>>> Thanks for any light you can shed.
>>>>>>
>>>>>> This is why CRAN asks that people test reverse dependencies.
>>>>>
>>>>> Which we did do - the problem is that this is actually caused by a
>>>>> recursive reverse dependency (lsmeans -> dplyr -> tibble), and we
>>>>> didn't correctly anticipate how much pain this would cause.
>>>>>
>>>>>> I think the most defensive thing you can do is to write a small
>>>>>> function
>>>>>>
>>>>>>   name_missing <- function(x, name)
>>>>>>     !(name %in% names(x))
>>>>>>
>>>>>> and use name_missing(x, "var") in your tests. (Pick your own name
>>>>>> to make your code understandable if you don't like my choice.)
>>>>>>
>>>>>> You could suggest to the tibble maintainers that they add a
>>>>>> function like this.
>>>>>
>>>>> We're definitely going to add this.
>>>>>
>>>>> And I think we'll make df[["var"]] return NULL too, so at least
>>>>> there's one easy way to opt out.
>>>>>
>>>>> The motivation for this change was that returning NULL + recycling
>>>>> rules means it's very easy for errors to silently propagate. But I
>>>>> think this approach might be somewhat too aggressive - I hadn't
>>>>> considered that people use `is.null()` to check for missing columns.
>>>>>
>>>>> We'll try and get an update to tibble out soon after useR.
>>>>> Thoughts on what we should do are greatly appreciated.
>>>>>
>>>>> Hadley
>>>>>
>>>>> --
>>>>> http://hadley.nz
>
> ______________________________________________
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel