Follow-up (inline) on my comment about a potential issue in `[<-.Date`. On Mon, May 27, 2019 at 9:31 AM Michael Chirico <michaelchiri...@gmail.com> wrote: > > Yes, thanks for following up on thread here. And thanks again for clearing > things up, your email was a finger snap of clarity on the whole issue. > > I'll add that actually it was data.table's code at fault on the storage > conversion -- note that if you use an arbitrary sub-class 'foo' with no > methods defined, it'll stay integer. > > That's because [<- calls as.Date and then as.Date.IDate, and that method > (ours) has as.numeric(); earlier I had recognized that if we commented that > line, the issue was "fixed" but I still wasn't understanding the root cause. > > My last curiosity on this issue will be in my follow-up thread. > > Mike C > > On Mon, May 27, 2019, 10:25 PM Joshua Ulrich <josh.m.ulr...@gmail.com> wrote: >> >> On Sun, May 26, 2019 at 6:47 AM Joshua Ulrich <josh.m.ulr...@gmail.com> >> wrote: >> > >> > On Sun, May 26, 2019 at 4:06 AM Michael Chirico >> > <michaelchiri...@gmail.com> wrote: >> > > >> > > Have finally managed to come up with a fix after checking out sys.calls() >> > > from within the as.Date.IDate debugger, which shows something like: >> > > >> > > [[1]] rbind(DF, DF) >> > > [[2]] rbind(deparse.level, ...) >> > > [[3]] `[<-`(`*tmp*`, ri, value = 18042L) >> > > [[4]] `[<-.Date`(`*tmp*`, ri, value = 18042L) >> > > [[5]] as.Date(value) >> > > [[6]] as.Date.IDate(value) >> > > >> > > I'm not sure why [<- is called, I guess the implementation is to assign >> > > to >> > > the output block by block? Anyway, we didn't have a [<- method. And >> > > [<-.Date looks like: >> > > >> > > value <- unclass(as.Date(value)) # <- converts to double >> > > .Date(NextMethod(.Generic), oldClass(x)) # <- restores 'IDate' class >> > > >> > > So we can fix our bug by defining a [<- class; the question that I still >> > > don't see answered in documentation or source code is, why/where is [<- >> > > called, exactly? >> > > >> > Your rbind(DF, DF) call dispatches to base::rbind.data.frame(). The >> > `[<-` call is this line: >> > value[[jj]][ri] <- if (is.factor(xij)) as.vector(xij) else xij >> > >> > That's where the storage.mode changes from integer to double. >> > >> > debug: value[[jj]][ri] <- if (is.factor(xij)) as.vector(xij) else xij >> > Browse[2]> >> > debug: xij >> > Browse[2]> storage.mode(xij) >> > [1] "integer" >> > Browse[2]> value[[jj]][ri] >> > [1] "2019-05-26" >> > Browse[2]> storage.mode(value[[jj]][ri]) >> > [1] "integer" >> > Browse[2]> >> > debug: if (!is.null(nm <- names(xij))) names(value[[jj]])[ri] <- nm >> > Browse[2]> storage.mode(value[[jj]][ri]) >> > [1] "double" >> > >> To be clear, I don't think this is a bug in rbind() or >> rbind.data.frame(). The confusion is that rbind.data.frame() calls >> `[<-` for each column of the data.frame, and there is no `[<-.IDate` >> method. So the parent class method is dispatched, which converts the >> storage mode to double. >> >> Someone may argue that this is an issue with `[<-.Date`, and that it >> shouldn't convert the storage.mode from integer to double.
I don't think this is an issue. The storage mode isn't converted if the replacement is the same storage mode. For example: R> x <- .Date(1:5) R> storage.mode(x) [1] "integer" R> x[1L] <- .Date(0L) R> storage.mode(x) [1] "integer" R> x[1L] <- .Date(0) R> storage.mode(x) [1] "double" >> > >> > > Mike C >> > > >> > > On Sun, May 26, 2019 at 1:16 PM Michael Chirico >> > > <michaelchiri...@gmail.com> >> > > wrote: >> > > >> > > > Debugging this issue: >> > > > >> > > > https://github.com/Rdatatable/data.table/issues/2008 >> > > > >> > > > We have custom class 'IDate' which inherits from 'Date' (it just forces >> > > > integer storage for efficiency, hence, I). >> > > > >> > > > The concatenation done by rbind, however, breaks this and returns a >> > > > double: >> > > > >> > > > library(data.table) >> > > > DF = data.frame(date = as.IDate(Sys.Date())) >> > > > storage.mode(rbind(DF, DF)$date) >> > > > # [1] "double" >> > > > >> > > > This is specific to base::rbind (data.table's rbind returns an integer >> > > > as >> > > > expected); in ?rbind we see: >> > > > >> > > > The method dispatching is not done via UseMethod(), but by C-internal >> > > > dispatching. Therefore there is no need for, e.g., rbind.default. >> > > > The dispatch algorithm is described in the source file >> > > > (ā.../src/main/bind.cā) as >> > > > 1. For each argument we get the list of possible class memberships from >> > > > the class attribute. >> > > > 2. *We inspect each class in turn to see if there is an applicable >> > > > method.* >> > > > 3. If we find an applicable method we make sure that it is identical to >> > > > any method determined for prior arguments. If it is identical, we >> > > > proceed, >> > > > otherwise we immediately drop through to the default code. >> > > > >> > > > It's not clear what #2 means -- an applicable method *for what*? >> > > > Glancing >> > > > at the source code would suggest it's looking for rbind.IDate: >> > > > >> > > > https://github.com/wch/r-source/blob/trunk/src/main/bind.c#L1051-L1063 >> > > > >> > > > const char *generic = ((PRIMVAL(op) == 1) ? "cbind" : "rbind"); // >> > > > should >> > > > be rbind here >> > > > const char *s = translateChar(STRING_ELT(classlist, i)); // iterating >> > > > over >> > > > the classes, should get to IDate first >> > > > sprintf(buf, "%s.%s", generic, s); // should be rbind.IDate >> > > > >> > > > but adding this method (or even exporting it) is no help [ simply >> > > > defining >> > > > rbind.IDate = function(...) as.IDate(NextMethod()) ] >> > > > >> > > > Lastly, it appears that as.Date.IDate is called, which is causing the >> > > > type >> > > > conversion: >> > > > >> > > > debug(data.table:::as.Date.IDate) >> > > > rbind(DF, DF) # launches debugger >> > > > x >> > > > # [1] "2019-05-26" <-- singleton, so apparently applied to DF$date, not >> > > > c(DF$date, DF$date) >> > > > undebug(data.table:::as.Date.IDate) >> > > > >> > > > I can't really wrap my head around why as.Date is being called here, >> > > > and >> > > > even allowing that, why the end result is still the original class [ >> > > > class(rbind(DF, DF)$date) == c('IDate', 'Date') ] >> > > > >> > > > So, I'm beginning to think this might be a bug. Am I missing something? >> > > > >> > > >> > > [[alternative HTML version deleted]] >> > > >> > > ______________________________________________ >> > > R-devel@r-project.org mailing list >> > > https://stat.ethz.ch/mailman/listinfo/r-devel >> > >> > >> > >> > -- >> > Joshua Ulrich | about.me/joshuaulrich >> > FOSS Trading | www.fosstrading.com >> > R/Finance 2019 | www.rinfinance.com >> >> >> >> -- >> Joshua Ulrich | about.me/joshuaulrich >> FOSS Trading | www.fosstrading.com >> R/Finance 2019 | www.rinfinance.com -- Joshua Ulrich | about.me/joshuaulrich FOSS Trading | www.fosstrading.com R/Finance 2019 | www.rinfinance.com ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel