Inspired by discussion in "Need very fast application of 'diff' - ideas?" (around https://stat.ethz.ch/pipermail/r-help/2012-January/301873.html), I have another suggestion.
Suggestion 3: Make 'diff.default' run faster.

For the vector case (if suggestion 2 is not applied, or if unclassed input is treated specially), I found that, without resorting to C, a speedup may be gained by replacing

    r[-length(r):-(length(r)-lag+1L)]

with

    `length<-`(r, length(r)-lag)

Another way, based on a similar idea but triggering a warning, is as follows.

    {
        for (i in seq_len(differences))
            r <- r[i1] - r
        length(r) <- xlen - lag * differences
    }

Variables 'i1' and 'xlen' are as defined in function 'diff.default' in R.

--- On Tue, 29/1/13, Suharto Anggono Suharto Anggono <suharto_angg...@yahoo.com> wrote:

> From: Suharto Anggono Suharto Anggono <suharto_angg...@yahoo.com>
> Subject: Re: Suggestions for 'diff.default'
> To: r-de...@lists.r-project.org
> Date: Tuesday, 29 January, 2013, 10:32 AM
>
> --- On Mon, 28/1/13, Suharto Anggono Suharto Anggono <suharto_angg...@yahoo.com> wrote:
>
> > From: Suharto Anggono Suharto Anggono <suharto_angg...@yahoo.com>
> > Subject: Suggestions for 'diff.default'
> > To: r-de...@lists.r-project.org
> > Date: Monday, 28 January, 2013, 5:31 PM
> >
> > I have suggestions for function 'diff.default' in R.
> >
> > Suggestion 1: If the input is matrix, always return matrix, even if empty.
> >
> > What happens in R 2.15.2:
> >
> > > rbind(1:2) # matrix
> >      [,1] [,2]
> > [1,]    1    2
> > > diff(rbind(1:2)) # not matrix
> > integer(0)
> > > sessionInfo()
> > R version 2.15.2 (2012-10-26)
> > Platform: i386-w64-mingw32/i386 (32-bit)
> >
> > locale:
> > [1] LC_COLLATE=English_United States.1252
> > [2] LC_CTYPE=English_United States.1252
> > [3] LC_MONETARY=English_United States.1252
> > [4] LC_NUMERIC=C
> > [5] LC_TIME=English_United States.1252
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > The documentation for 'diff' says, "If 'x' is a matrix then the difference operations are carried out on each column separately."
> > If the result is empty, I expect that the result still has as many columns as the input.
> >
> > Suggestion 2: Make 'diff.default' applicable more generally by
> > (a) not performing 'unclass';
> > (b) generalizing (changing)
> >     ismat <- is.matrix(x)
> > to become
> >     ismat <- length(dim(x)) == 2L
> >
> > If suggestion 1 is to be applied, if 'unclass' is not wanted (point (a) in suggestion 2 is also to be applied),
> >
> >     if (lag * differences >= xlen)
> >         return(x[0L])
> >
> > can be changed to
> >
> >     if (lag * differences >= xlen)
> >         return(
> >             if (ismat) x[0L, , drop = FALSE] - x[0L, , drop = FALSE] else
> >             x[0L] - x[0L])
> >
> > It will handle class where subtraction (minus) operation change class.
>
> Sorry, I wasn't careful enough. To obtain the correct class for the result, differencing should be done as many times as specified by argument 'differences'.
>
> I consider the case of
>     diff(as.POSIXct(c("2012-01-01", "2012-02-01"), tz="UTC"), d=2)
> versus
>     diff(diff(as.POSIXct(c("2012-01-01", "2012-02-01"), tz="UTC")))
> To be safe, maybe just compute as usual, even when it is known that the end result will be empty. It can be done like this.
>
>     empty <- integer()
>     if (ismat)
>         for (i in seq_len(differences))
>             r <- if (lag >= nrow(r))
>                 r[empty, , drop = FALSE] - r[empty, , drop = FALSE] else
>                 ...
>     else
>         for (i in seq_len(differences))
>             r <- if (lag >= length(r))
>                 r[empty] - r[empty] else
>                 ...
>
> If that way is used, 'xlen' is no longer needed.
>
> > Otherwise, if 'unclass' is wanted, maybe the handling of empty result can be moved to be after 'unclass', to be consistent with non-empty result.
> >
> > If point (a) in suggestion 2 is applied, 'diff.default' can handle input of class "Date" and "POSIXt". If, in addition, point (b) in suggestion 2 is also applied, 'diff.default' can handle data frame as input.
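
To make suggestion 3 at the top of this message concrete, here is a minimal vector-only sketch of the two variants. It is not the actual source of 'diff.default'. The function names 'diff_vec_trunc' and 'diff_vec_recycle' are made up for illustration, and the sketch assumes a plain (unclassed) vector with lag * differences < length(x).

```r
# Vector-only sketches of the differencing loop. Both function names are
# hypothetical and neither is the real 'diff.default'.
# Assumes a plain vector 'x' with lag * differences < length(x).

# Variant 1: drop the last 'lag' elements with `length<-` instead of
# negative indexing.
diff_vec_trunc <- function(x, lag = 1L, differences = 1L) {
    r <- x
    i1 <- -seq_len(lag)
    for (i in seq_len(differences))
        r <- r[i1] - `length<-`(r, length(r) - lag)
    r
}

# Variant 2: keep 'r' at full length inside the loop (recycling fills the
# tail with values that are discarded later) and truncate once at the end.
# The recycling warning is suppressed here only to keep the demo quiet.
diff_vec_recycle <- function(x, lag = 1L, differences = 1L) {
    r <- x
    xlen <- length(x)
    i1 <- -seq_len(lag)
    for (i in seq_len(differences))
        r <- suppressWarnings(r[i1] - r)
    length(r) <- xlen - lag * differences
    r
}

# Both variants should agree with base R's diff() on a plain vector.
x <- c(2, 3, 5, 7, 11, 13, 17)
ref <- diff(x, lag = 2L, differences = 2L)
stopifnot(identical(diff_vec_trunc(x, 2L, 2L), ref),
          identical(diff_vec_recycle(x, 2L, 2L), ref))
```

Wrapping each call in system.time() on a long vector gives a rough timing comparison. I give no numbers here because they depend on the machine and the R version.
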

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel