@Kylie happy to collaborate on it if you're interested.

~G

On Wed, Jul 24, 2019 at 10:43 AM Gabriel Becker <gabembec...@gmail.com> wrote:

> I can work on this. Thanks Luke.
>
> ~G
>
> On Wed, Jul 24, 2019 at 8:25 AM Tierney, Luke <luke-tier...@uiowa.edu>
> wrote:
>
>> If one of you wanted to try to create a patch to support ALTREP
>> generic vectors here are some notes:
>>
>> The main challenge I am aware of (there might be others): Allowing
>> DATAPTR to return a writable pointer would be too dangerous because
>> the GC write barrier needs to see all mutations. So it would be best
>> if Dataptr and Dataptr_or_null methods were not allowed to be
>> defined. The default methods in altrep.c should do the right thing.
>>
>> A reasonable name for the abstract class would be 'altlist'.
>>
>> 'altrep' methods that a class can provide:
>>
>>     Unserialize or UnserializeEX
>>     Serialized_state
>>     Duplicate or DuplicateEx
>>     Coerce
>>     Inspect
>>     Length
>>
>> 'altvec' methods a class should provide:
>>
>>     Extract_subset
>>     not Dataptr
>>     not Dataptr_or_null
>>
>> 'altlist' specific methods:
>>
>>     Elt
>>     Set_elt
>>
>> Best,
>>
>> luke
>>
>> On Tue, 23 Jul 2019, Gabriel Becker wrote:
>>
>> > Hi Kylie,
>> >
>> > Is it a list with only numerics in it? (I only see REALSXPs there, but
>> > obviously inspect isn't showing all of them). If so, you could load it up
>> > into one big vector and then also keep partitioning information around.
>> > Bioconductor does this (see ?IRanges::CompressedList ). The potential
>> > benefit here being that the underlying large vector could then be a big
>> > out-of-memory altrep. How helpful this would be depends somewhat on what
>> > you want to do with it, of course, but it is something that comes to mind.
>> >
>> > Also, I would expect some overhead but that seems like a lot (without
>> > having done super much in the way of benchmarking). What exactly is
>> > as.altrep doing?
>> >
>> > Best,
>> > ~G
>> >
>> > On Tue, Jul 23, 2019 at 9:54 AM Michael Lawrence via R-devel <
>> > r-devel@r-project.org> wrote:
>> >
>> >> Hi Kylie,
>> >>
>> >> As an alternative in the short term, you could consider deriving from
>> >> S4Vectors' List class, implementing the getListElement() method to
>> >> lazily create the objects.
>> >>
>> >> Michael
>> >>
>> >> On Tue, Jul 23, 2019 at 9:09 AM Bemis, Kylie <k.be...@northeastern.edu>
>> >> wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> I was wondering if there were any plans for ALTREP lists (VECSXP)?
>> >>>
>> >>> It seems to me that they could be supported in a similar way to how
>> >>> ALTSTRING works, with Elt() and Set_elt() methods, or would there be some
>> >>> problems with that I’m not seeing due to lists not being atomic vectors?
>> >>>
>> >>> I was taking an approach of converting each list element (of a
>> >>> file-based list data structure) to an ALTREP representation to build up an
>> >>> “ALTREP list”.
>> >>>
>> >>> This seems fine for shorter lists with large elements, but I noticed
>> >>> that for longer lists with smaller elements, this could be far more
>> >>> time-consuming than simply reading the entire list into memory and
>> >>> returning a non-ALTREP list:
>> >>>
>> >>>> x
>> >>> <34840 length> matter_list :: out-of-memory list
>> >>> (1.1 MB real | 543.3 MB virtual)
>> >>>
>> >>>> system.time(y <- as.list(x))
>> >>>    user  system elapsed
>> >>>   1.116   2.175   5.053
>> >>>
>> >>>> system.time(z <- as.altrep(x))
>> >>>    user  system elapsed
>> >>>  36.295   4.717  41.216
>> >>>
>> >>>> .Internal(inspect(y))
>> >>> @108255000 19 VECSXP g1c7 [MARK,NAM(7)] (len=34840, tl=0)
>> >>>   @7f9044d9fc00 14 REALSXP g1c7 [MARK] (len=1129, tl=0) 404.093,404.096,404.099,404.102,404.105,...
>> >>>   @7f9044d25e00 14 REALSXP g1c7 [MARK] (len=890, tl=0) 409.924,409.927,409.931,409.934,409.937,...
>> >>>   @7f9044da6000 14 REALSXP g1c7 [MARK] (len=1878, tl=0) 400.3,400.303,400.306,400.309,400.312,...
>> >>>   @7f9031a6b000 14 REALSXP g1c7 [MARK] (len=2266, tl=0) 402.179,402.182,402.185,402.188,402.191,...
>> >>>   @7f9031a77a00 14 REALSXP g1c7 [MARK] (len=1981, tl=0) 403.021,403.024,403.027,403.03,403.033,...
>> >>>   ...
>> >>>
>> >>>> .Internal(inspect(z))
>> >>> @108210000 19 VECSXP g1c7 [MARK,NAM(7)] (len=34840, tl=0)
>> >>>   @7f904eea7660 14 REALSXP g1c0 [MARK,NAM(7)] matter vector (mode=4, len=1129, mem=0)
>> >>>   @7f9050347498 14 REALSXP g1c0 [MARK,NAM(7)] matter vector (mode=4, len=890, mem=0)
>> >>>   @7f904d286b20 14 REALSXP g1c0 [MARK,NAM(7)] matter vector (mode=4, len=1878, mem=0)
>> >>>   @7f904fd38820 14 REALSXP g1c0 [MARK,NAM(7)] matter vector (mode=4, len=2266, mem=0)
>> >>>   @7f904c75ce90 14 REALSXP g1c0 [MARK,NAM(7)] matter vector (mode=4, len=1981, mem=0)
>> >>>   ...
>> >>>
>> >>> In this situation, it would be much faster and simpler for me to return
>> >>> a theoretical ALTREP list that serves SEXP elements on-demand, similar to
>> >>> how ALTSTRING seems to be implemented.
>> >>>
>> >>> I don’t know how many other people would get a use out of ALTREP lists,
>> >>> but I certainly would.
>> >>>
>> >>> Are there any plans for this?
>> >>>
>> >>> Thanks!
>> >>>
>> >>> ~~~
>> >>> Kylie Ariel Bemis
>> >>> Khoury College of Computer Sciences
>> >>> Northeastern University
>> >>> kuwisdelu.github.io<https://kuwisdelu.github.io>
>> >>>
>> >>> ______________________________________________
>> >>> R-devel@r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> >>
>> >> --
>> >> Michael Lawrence
>> >> Scientist, Bioinformatics and Computational Biology
>> >> Genentech, A Member of the Roche Group
>> >> Office +1 (650) 225-7760
>> >> micha...@gene.com
>>
>> --
>> Luke Tierney
>> Ralph E. Wareham Professor of Mathematical Sciences
>> University of Iowa                  Phone:             319-335-3386
>> Department of Statistics and        Fax:               319-335-3017
>>    Actuarial Science
>> 241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
>> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
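
For readers following the design discussed in this thread, here is a rough sketch of the shape Luke's notes describe: a list that serves elements on demand through Elt, routes every mutation through Set_elt, and deliberately offers no Dataptr. It is plain C and does not use R's headers or the ALTREP API. Every name in it (lazy_list, backing_store, ll_elt, ll_set_elt, and so on) is hypothetical and chosen only for illustration. The flat array plus offsets stands in for a file-backed store, loosely following the "one big vector plus partitioning information" idea mentioned above, and the n_writes counter is only a toy analogue of the GC write barrier seeing all mutations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a file-backed store: all values in one flat
 * array; element i occupies [offsets[i], offsets[i + 1]). */
typedef struct {
    const double *flat;
    const size_t *offsets;  /* n + 1 entries */
    size_t n;
} backing_store;

/* One materialized list element (plays the role of a REALSXP). */
typedef struct { double *values; size_t len; } element;

typedef struct lazy_list lazy_list;

/* Method table: Length, Elt and Set_elt only. There is no Dataptr slot,
 * so callers can never obtain a writable pointer to the element array. */
typedef struct {
    size_t (*length)(const lazy_list *x);
    const element *(*elt)(lazy_list *x, size_t i);
    int (*set_elt)(lazy_list *x, size_t i, const double *values, size_t len);
} lazy_list_methods;

struct lazy_list {
    const lazy_list_methods *methods;
    backing_store store;
    element **cache;        /* cache[i] is NULL until element i is needed */
    size_t n_materialized;  /* number of elements built so far */
    size_t n_writes;        /* every mutation passes through set_elt */
};

/* Allocate an element holding a private copy of len values. */
static element *element_new(const double *values, size_t len) {
    element *e = malloc(sizeof *e);
    if (e == NULL) return NULL;
    e->values = malloc((len ? len : 1) * sizeof *e->values);
    if (e->values == NULL) { free(e); return NULL; }
    if (len > 0) memcpy(e->values, values, len * sizeof *e->values);
    e->len = len;
    return e;
}

static void element_free(element *e) {
    if (e != NULL) { free(e->values); free(e); }
}

static size_t ll_length(const lazy_list *x) { return x->store.n; }

/* Elt: build element i from the backing store on first access, then reuse. */
static const element *ll_elt(lazy_list *x, size_t i) {
    if (i >= x->store.n) return NULL;
    if (x->cache[i] == NULL) {
        size_t start = x->store.offsets[i];
        size_t len = x->store.offsets[i + 1] - start;
        x->cache[i] = element_new(x->store.flat + start, len);
        if (x->cache[i] != NULL) x->n_materialized++;
    }
    return x->cache[i];
}

/* Set_elt: the only way to change an element, so each write is observed. */
static int ll_set_elt(lazy_list *x, size_t i, const double *values, size_t len) {
    if (i >= x->store.n) return -1;
    element *e = element_new(values, len);
    if (e == NULL) return -1;
    if (x->cache[i] == NULL) x->n_materialized++;
    element_free(x->cache[i]);
    x->cache[i] = e;
    x->n_writes++;
    return 0;
}

static const lazy_list_methods LAZY_LIST_METHODS = { ll_length, ll_elt, ll_set_elt };

/* Create a lazy list over caller-owned flat data; nothing is materialized. */
lazy_list *lazy_list_new(const double *flat, const size_t *offsets, size_t n) {
    assert(offsets != NULL);
    lazy_list *x = malloc(sizeof *x);
    if (x == NULL) return NULL;
    x->cache = calloc(n ? n : 1, sizeof *x->cache);
    if (x->cache == NULL) { free(x); return NULL; }
    x->methods = &LAZY_LIST_METHODS;
    x->store.flat = flat;
    x->store.offsets = offsets;
    x->store.n = n;
    x->n_materialized = 0;
    x->n_writes = 0;
    return x;
}

void lazy_list_free(lazy_list *x) {
    if (x == NULL) return;
    for (size_t i = 0; i < x->store.n; i++) element_free(x->cache[i]);
    free(x->cache);
    free(x);
}
```

A caller would create the list with lazy_list_new(flat, offsets, n), read elements with x->methods->elt(x, i), replace them with x->methods->set_elt(x, i, values, len), and release everything with lazy_list_free(x). Creating the list costs one allocation regardless of its length, which is the property that matters for long lists of small elements like the 34840-element example in the original post. An actual patch would of course work with SEXPs and R's own class-registration machinery, not these structs.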