Splus's rle() also grouped NA's (separately from NaN's):

% Splus
TIBCO Software Inc. Confidential Information
Copyright (c) 1988-2008 TIBCO Software Inc. ALL RIGHTS RESERVED.
TIBCO Spotfire S+ Version 8.1.1 for Linux 2.6.9-34.EL, 32-bit : 2008
> dput(rle(c(11,11,NA,NA,NA,NaN,14,14,14,14)))
list("lengths" = c(2, 3, 1, 4)
, "values" = c(11., NA, NaN, 14.)
)
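
For comparison, here is a minimal sketch of how the same grouping could be reproduced in current R. The name rle_group_na is hypothetical (it is not part of base R, and it is not the proposed group.nas argument). It assumes an atomic vector, treats consecutive NAs as one run, and keeps NaNs in a separate run, as in the Splus output above.

```r
# Hypothetical helper, not part of base R: run-length encoding in which
# consecutive NAs form one run and consecutive NaNs form another.
rle_group_na <- function(x) {
  stopifnot(is.atomic(x))
  n <- length(x)
  if (n == 0L)
    return(structure(list(lengths = integer(), values = x), class = "rle"))

  # Classify each element: 0 = ordinary value, 1 = NA, 2 = NaN.
  # is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
  kind <- is.na(x) + is.nan(x)

  a  <- x[-n];    b  <- x[-1L]     # each element and its successor
  ka <- kind[-n]; kb <- kind[-1L]

  # A run ends where the kind changes, or where two ordinary values differ.
  # FALSE & NA is FALSE, so the comparison a != b never leaks an NA here.
  differs <- (ka != kb) | (ka == 0L & kb == 0L & a != b)

  ends <- c(which(differs), n)     # index of the last element of each run
  structure(list(lengths = diff(c(0L, ends)), values = x[ends]),
            class = "rle")
}

x <- c(11, 11, NA, NA, NA, NaN, 14, 14, 14, 14)
r <- rle_group_na(x)
r$lengths   # 2 3 1 4
r$values    # 11 NA NaN 14
```

Because the result carries class "rle", inverse.rle(r) should reconstruct x, so nothing is lost by collapsing the missing values into a single run.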

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Aug 25, 2020 at 10:57 PM Gabriel Becker <gabembec...@gmail.com> wrote:

> Hi All,
>
> A twitter user, Mike fc (@coolbutuseless) mentioned today that he was
> surprised that repeated NAs weren't treated as a run by the rle function.
>
> Now I know why they are not. NAs represent values which could be the same
> or different from each other if they were known, so from a purely conceptual
> standpoint there is no way to tell whether they are the same and thus
> constitute a run or not.
>
> This conceptual strictness isn't universally observed, though, because we
> get the following:
>
> > unique(c(1, 2, 3, NA, NA, NA))
> [1]  1  2  3 NA
>
> Which means that rle(sort(x))$values is not guaranteed to be the same as
> unique(x), which is a little strange (though likely of little practical
> impact).
>
> Personally, to me it also seems that, from a purely data-compression
> standpoint, it would be valid to collapse those missing values into a run
> of missing, as it reduces size in-memory/on disk without losing any
> information.
>
> Now none of this is to say that I suggest the default behavior be changed
> (that would surely disrupt some non-trivial amount of existing code) but
> what do people think of a group.nas argument which defaults to FALSE
> controlling the behavior?
>
> As a final point, there is some precedent here (though obviously not at all
> binding), as Bioconductor's Rle functionality does group NAs.
>
> Best,
> ~G
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel