To clarify, the ?na.omit docs say that "na.omit returns the object with incomplete cases removed." If we take is.na to be the definition of "incomplete cases", then a list element that is a scalar NA is incomplete. As for the data.frame method, in my opinion it is highly confusing and inconsistent for na.omit to keep rows with incomplete cases in list columns but drop them in columns that are atomic vectors:

> (f.num <- data.frame(num=c(1,NA,2)))
  num
1   1
2  NA
3   2
> is.na(f.num)
       num
[1,] FALSE
[2,]  TRUE
[3,] FALSE
> na.omit(f.num)
  num
1   1
3   2
> (f.list <- data.frame(list=I(list(1,NA,2))))
  list
1    1
2   NA
3    2
> is.na(f.list)
      list
[1,] FALSE
[2,]  TRUE
[3,] FALSE
> na.omit(f.list)
  list
1    1
2   NA
3    2

On Sat, Aug 14, 2021 at 5:15 PM Gabriel Becker <gabembec...@gmail.com> wrote:

> I understand what is.na does, the issue I have is that its task is not
> equivalent to the conceptual task na.omit is doing, in my opinion, as
> illustrated by what the data.frame method does.
>
> Thus what I was getting at above about it not being clear that
> lst[is.na(lst)] being the correct thing for na.omit to do
>
> ~G
>
> On Sat, Aug 14, 2021, 1:49 PM Toby Hocking <tdho...@gmail.com> wrote:
>
>> Some relevant information from ?is.na: the behavior for lists is
>> documented,
>>
>>      For is.na, elementwise the result is false unless that element
>>      is a length-one atomic vector and the single element of that
>>      vector is regarded as NA or NaN (note that any is.na method
>>      for the class of the element is ignored).
>>
>> Also there are other functions anyNA and is.na<- which are consistent
>> with is.na. That is, anyNA only returns TRUE if the list has an element
>> which is a scalar NA. And is.na<- sets list elements to logical NA to
>> indicate missingness.
>>
>> On Fri, Aug 13, 2021 at 1:10 AM Hugh Parsonage <hugh.parson...@gmail.com> wrote:
>>
>> > The data.frame method deliberately skips non-atomic columns before
>> > invoking is.na(x) so I think it is fair to assume this behaviour is
>> > intentional and assumed.
>> >
>> > Not so clear to me that there is a sensible answer for list columns.
>> > (List columns seem to collide with the expectation that in each
>> > variable every observation will be of the same type)
>> >
>> > Consider your list L as
>> >
>> > L <- list(NULL, NA, c(NA, NA))
>> >
>> > Seems like every observation could have a claim to be 'missing' here.
>> > Concretely, if a data.frame had a list column representing the lat-lon
>> > of an observation, we might only be able to represent missing values
>> > like c(NA, NA).
>> >
>> > On Fri, 13 Aug 2021 at 17:27, Iñaki Ucar <iu...@fedoraproject.org> wrote:
>> > >
>> > > On Thu, 12 Aug 2021 at 22:20, Gabriel Becker <gabembec...@gmail.com> wrote:
>> > > >
>> > > > Hi Toby,
>> > > >
>> > > > This definitely appears intentional, the first expression of
>> > > > stats:::na.omit.default is
>> > > >
>> > > >     if (!is.atomic(object))
>> > > >         return(object)
>> > >
>> > > I don't follow your point. This only means that the *default* method
>> > > is not intended for non-atomic cases, but it doesn't mean there
>> > > shouldn't exist a method for lists.
>> > >
>> > > > So it is explicitly just returning the object in non-atomic cases,
>> > > > which includes lists. I was not involved in this decision (obviously)
>> > > > but my guess is that it is due to the fact that what constitutes an
>> > > > observation "being complete" is unclear in the list case. What should
>> > > >
>> > > > na.omit(list(5, NA, c(NA, 5)))
>> > > >
>> > > > return? Just the first element, or the first and the last? It seems,
>> > > > at least to me, unclear. A small change to the documentation to add
>> > > > "atomic
>> > >
>> > > > is.na(list(5, NA, c(NA, 5)))
>> > > [1] FALSE  TRUE FALSE
>> > >
>> > > Following Toby's argument, it's clear to me: the first and the last.
>> > >
>> > > Iñaki
>> > >
>> > > > (in the sense of is.atomic returning \code{TRUE})" in front of
>> > > > "vectors" or similar where what types of objects are supported seems
>> > > > justified, though, imho, as the current documentation is either
>> > > > ambiguous or technically incorrect, depending on what we take
>> > > > "vector" to mean.
>> > > >
>> > > > Best,
>> > > > ~G
>> > > >
>> > > > On Wed, Aug 11, 2021 at 10:16 PM Toby Hocking <tdho...@gmail.com> wrote:
>> > > >
>> > > > > Also, the na.omit method for data.frame with list column seems to
>> > > > > be inconsistent with is.na,
>> > > > >
>> > > > > > L <- list(NULL, NA, 0)
>> > > > > > str(f <- data.frame(I(L)))
>> > > > > 'data.frame': 3 obs. of  1 variable:
>> > > > >  $ L:List of 3
>> > > > >   ..$ : NULL
>> > > > >   ..$ : logi NA
>> > > > >   ..$ : num 0
>> > > > >   ..- attr(*, "class")= chr "AsIs"
>> > > > > > is.na(f)
>> > > > >          L
>> > > > > [1,] FALSE
>> > > > > [2,]  TRUE
>> > > > > [3,] FALSE
>> > > > > > na.omit(f)
>> > > > >    L
>> > > > > 1
>> > > > > 2 NA
>> > > > > 3  0
>> > > > >
>> > > > > On Wed, Aug 11, 2021 at 9:58 PM Toby Hocking <tdho...@gmail.com> wrote:
>> > > > >
>> > > > > > na.omit is documented as "na.omit returns the object with
>> > > > > > incomplete cases removed." and "At present these will handle
>> > > > > > vectors," so I expected that when it is used on a list, it
>> > > > > > should return the same thing as if we subset via is.na;
>> > > > > > however I observed the following,
>> > > > > >
>> > > > > > > L <- list(NULL, NA, 0)
>> > > > > > > str(L[!is.na(L)])
>> > > > > > List of 2
>> > > > > >  $ : NULL
>> > > > > >  $ : num 0
>> > > > > > > str(na.omit(L))
>> > > > > > List of 3
>> > > > > >  $ : NULL
>> > > > > >  $ : logi NA
>> > > > > >  $ : num 0
>> > > > > >
>> > > > > > Should na.omit be fixed so that it returns a result that is
>> > > > > > consistent with is.na? I assume that is.na is the canonical
>> > > > > > definition of what should be considered a missing value in R.

	[[alternative HTML version deleted]]

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel