>>>>> Steven Nydick <swnyd...@gmail.com> >>>>> on Wed, 9 May 2018 13:25:11 +0000 writes:
> I do not have access to the bug reporting system. If somebody can get me
> access, I can create a formal bug report.

> The latter issues seem like duplicates of:
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=12572 (with slightly
> different output), but as that bug was reported nearly 10 years ago, it
> might be worth creating an update under R version 3. I could not find the
> first issue when searching the bug reports (which I ran into when trying to
> parse JSON files), which is why I posted on r-devel.

Indeed, thanks a lot Steven (and Duncan!),

I've found the following:

1. The first issue is a new bug, in R "only" since version 3.4.0,
   i.e., it worked up to R 3.3.3. Duncan's patch basically fixes it.
   I've found that the C code there can be simplified and
   deconvoluted, and after that, I will commit what is basically
   Duncan Murdoch's bug fix.

2. The second set of issues is indeed an entirely different bug,
   and I would say it actually points to a "design problem" of the
   whole thing. The C code in islistfactor() talks about arbitrary
   trees with all leaves factors, whereas the R code -- in the case
   where islistfactor() is TRUE -- actually only deals correctly
   with simple trees, namely those of depth exactly 1. Those are
   the ones you typically get from, e.g., lapply(), and so this old
   design bug is triggered relatively rarely.

Last but not least: I have created an account for you, Steven, on
the bugzilla site.

Given we have holidays till the weekend and private duties of mine,
I won't get to more for now.

Best
Martin Maechler

> On Tue, May 8, 2018 at 7:51 PM Duncan Murdoch <murdoch.dun...@gmail.com>
> wrote:

>> On 08/05/2018 4:50 PM, Steven Nydick wrote:
>> > It also does the same thing if the factor is not on the first level of
>> > the list, which seems to be due to the fact that the islistfactor is
>> > recursive, but if a list is a list-factor, the first level lists are
>> > coerced into character strings.
>> >
>> > > x <- list(list(factor(LETTERS[1])))
>> > > unlist(x)
>> > Error in as.character.factor(x) : malformed factor
>> >
>> > However, if one of the factors is at the top level, and one is nested,
>> > then the result is:
>> >
>> > > x <- list(list(factor(LETTERS[1])), factor(LETTERS[2]))
>> > > unlist(x)
>> >
>> > [1] <NA> B
>> > Levels: B
>> >
>> > ... which does not seem to me to be desired behavior.
>>
>> The patch I suggested doesn't help with either of these. I'd suggest
>> collecting examples, and posting a bug report to bugs.r-project.org.
>>
>> Duncan Murdoch
>>
>> >
>> > On Tue, May 8, 2018 at 2:22 PM Duncan Murdoch <murdoch.dun...@gmail.com
>> > <mailto:murdoch.dun...@gmail.com>> wrote:
>> >
>> > On 08/05/2018 2:58 PM, Duncan Murdoch wrote:
>> > > On 08/05/2018 1:48 PM, Steven Nydick wrote:
>> > >> Reproducible example:
>> > >>
>> > >> x <- list(list(list(), list()))
>> > >> unlist(x)
>> > >>
>> > >> *> Error in as.character.factor(x) : malformed factor*
>> > >
>> > > The error comes from the line
>> > >
>> > >     structure(res, levels = lv, names = nm, class = "factor")
>> > >
>> > > which is called because unlist() thinks that some entry is a factor,
>> > > with NULL levels and NULL names. It's not legal for a factor to have
>> > > NULL levels. Probably it should never get here; the earlier test
>> > >
>> > >     if (.Internal(islistfactor(x, recursive))) {
>> > >
>> > > should have been false, and then the result would have been
>> > >
>> > >     .Internal(unlist(x, recursive, use.names))
>> > >
>> > > (with both recursive and use.names being TRUE), which returns NULL.
>> >
>> > And the problem is in the islistfactor function in src/main/apply.c,
>> > which looks like this:
>> >
>> > static Rboolean islistfactor(SEXP X)
>> > {
>> >     int i, n = length(X);
>> >
>> >     switch(TYPEOF(X)) {
>> >     case VECSXP:
>> >     case EXPRSXP:
>> >         if(n == 0) return NA_LOGICAL;
>> >         for(i = 0; i < LENGTH(X); i++)
>> >             if(!islistfactor(VECTOR_ELT(X, i))) return FALSE;
>> >         return TRUE;
>> >         break;
>> >     }
>> >     return isFactor(X);
>> > }
>> >
>> > One of those deeply nested lists is length 0, so at the lowest level it
>> > returns NA_LOGICAL. But then it does C-style logical testing on the
>> > results. I think to C NA_LOGICAL counts as true, so at the next level
>> > up we get the wrong answer.
>> >
>> > A fix would be to rewrite it like this:
>> >
>> > static Rboolean islistfactor(SEXP X)
>> > {
>> >     int i, n = length(X);
>> >     Rboolean result = NA_LOGICAL, childresult;
>> >     switch(TYPEOF(X)) {
>> >     case VECSXP:
>> >     case EXPRSXP:
>> >         for(i = 0; i < LENGTH(X); i++) {
>> >             childresult = islistfactor(VECTOR_ELT(X, i));
>> >             if(childresult == FALSE) return FALSE;
>> >             else if(childresult == TRUE) result = TRUE;
>> >         }
>> >         return result;
>> >         break;
>> >     }
>> >     return isFactor(X);
>> > }
>> >
>> >
>> > --
>> > Steven Nydick
>> > PhD, Quantitative Psychology
>> > M.A., Psychology
>> > M.S., Statistics
>> > --
>> > "Beware of the man who works hard to learn something, learns it, and
>> > finds himself no wiser than before, Bokonon tells us. He is full of
>> > murderous resentment of people who are ignorant without having come by
>> > their ignorance the hard way."
>> > -Kurt Vonnegut

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
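
The point about C-style logical testing on NA_LOGICAL can be sketched
outside of R. What follows is only a minimal standalone illustration, not
the code in src/main/apply.c: Node, Kind, NA_LGL, islistfactor_old(),
islistfactor_new() and the two example trees are hypothetical stand-ins for
SEXP, TYPEOF(), NA_LOGICAL and the two versions of islistfactor() quoted
above. It assumes, as in R's headers, that the logical NA is INT_MIN, which
is nonzero and therefore passes a "!" test as if it were true.

```c
#include <limits.h>
#include <stddef.h>

/* Stand-ins for R internals; assumptions made for this sketch only. */
#define NA_LGL INT_MIN          /* R stores a logical NA as INT_MIN (nonzero) */

typedef enum { KIND_LIST, KIND_FACTOR, KIND_OTHER } Kind;

typedef struct Node {
    Kind kind;
    size_t n;                   /* number of children (lists only) */
    const struct Node **elt;    /* children (lists only) */
} Node;

/* Same logic as the quoted original: an empty list yields NA, and the
   parent tests each child's result with '!', so NA is treated as true. */
int islistfactor_old(const Node *x)
{
    if (x->kind == KIND_LIST) {
        if (x->n == 0) return NA_LGL;
        for (size_t i = 0; i < x->n; i++)
            if (!islistfactor_old(x->elt[i])) return 0;
        return 1;
    }
    return x->kind == KIND_FACTOR;
}

/* Same logic as the quoted fix: compare each child's result explicitly
   against 0 and 1, so NA is propagated instead of being read as true. */
int islistfactor_new(const Node *x)
{
    int result = NA_LGL;
    if (x->kind == KIND_LIST) {
        for (size_t i = 0; i < x->n; i++) {
            int child = islistfactor_new(x->elt[i]);
            if (child == 0) return 0;
            else if (child == 1) result = 1;
        }
        return result;
    }
    return x->kind == KIND_FACTOR;
}

/* Analogue of list(list(list(), list())) */
const Node empty1 = { KIND_LIST, 0, NULL };
const Node empty2 = { KIND_LIST, 0, NULL };
const Node *inner_elts[] = { &empty1, &empty2 };
const Node inner = { KIND_LIST, 2, inner_elts };
const Node *outer_elts[] = { &inner };
const Node nested_empty = { KIND_LIST, 1, outer_elts };

/* Analogue of list(list(factor(LETTERS[1]))) */
const Node leaf = { KIND_FACTOR, 0, NULL };
const Node *mid_elts[] = { &leaf };
const Node mid = { KIND_LIST, 1, mid_elts };
const Node *top_elts[] = { &mid };
const Node nested_factor = { KIND_LIST, 1, top_elts };
```

Called from a main() of your own, islistfactor_old(&nested_empty) evaluates
to 1, which corresponds to the wrong TRUE behind the "malformed factor"
error in the first example, whereas islistfactor_new(&nested_empty)
evaluates to NA_LGL. Both versions return 1 for nested_factor, which is
consistent with the remark that the patch does not help with the nested
factor examples.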