On Thu, May 4, 2023 at 3:02 PM Serguei Sokol via R-devel <r-devel@r-project.org> wrote: > > Le 03/05/2023 à 01:25, Henrik Bengtsson a écrit : > > Along the lines of calling R_CheckUserInterrupt() only onces in a while: > > > >> OTOH, in the past we have had to *disable* R_CheckUserInterrupt() > >> in parts of R's code because it was too expensive, > >> {see current src/main/{seq.c,unique.c} for a series of commented-out > >> R_CheckUserInterrupt() for such speed-loss reasons} > > First, here are links to these two files viewable online: > > > > * https://github.com/wch/r-source/blob/trunk/src/main/seq.c > > > > * https://github.com/wch/r-source/blob/trunk/src/main/unique.c > > > > When not commented out, R_CheckUserInterrupt() would have been called > > every 1,000,000 times per: > > > > /* interval at which to check interrupts */ > > #define NINTERRUPT 1000000 > > > > and > > > > if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt() > > > > in each iteration. That '(i+1) % NINTERRUPT == 0' expression can be > > quite expensive too. > I vaguely remember a hack that was mentioned on this list as close to > 0-cost. It looked something like: > > iint = NINTERRUPT; > for (...) { > if (--iint == 0) { > R_CheckUserInterrupt(); > iint = NINTERRUPT; > } > } > > Best, > Serguei.
> > Alternatively, one can increment a counter and reset to zero after > > calling R_CheckUserInterrupt(); I think that's equally performant. Yes, that's the one, e.g. Tomas K migrated some "modulo" ones in R-devel to this one yesterday (https://github.com/wch/r-source/commit/1ca6c6c6246629c6a98a526a2906595e5cfcd45e). /Henrik > > > However, if we change the code to use NINTERRUPT > > = 2^k where k = {1, 2, ...}, say > > > > #define NINTERRUPT 1048576 > > > > the compiler would optimize the condition to use "the modulo of powers > > of 2 can alternatively be expressed as a bitwise AND operation" > > (Thomas Lumley, 2015-06-15). The speedup is quite impressive, cf. > > <https://www.jottr.org/2015/06/05/checkuserinterrupt/>. > > Alternatively, one can increment a counter and reset to zero after > > calling R_CheckUserInterrupt(); I think that's equally performant. > > > > Regarding making serialize() / unserialize() interruptible: I think > > can be a good idea since we work with larger objects these days. > > However, if we implement this, we probably have to consider what > > happens when an interrupt happens. For example, transfers between a > > client and a server are no longer atomic at this level, which means we > > might end up in a corrupt state. This may, for instance, happen to > > database transactions, and PSOCK parallel worker communication. A > > quick fix would be to use base::suspendInterrupts(), but better would > > of course be to handle interrupts gracefully. > > > > My $.02 + $0.02 > > > > /Henrik > > > > On Tue, May 2, 2023 at 3:56 PM Jeroen Ooms <jeroeno...@gmail.com> wrote: > >> On Tue, May 2, 2023 at 3:29 PM Martin Maechler > >> <maech...@stat.math.ethz.ch> wrote: > >>>>>>>> Ivan Krylov > >>>>>>>> on Tue, 2 May 2023 14:59:36 +0300 writes: > >>> > В Sat, 29 Apr 2023 00:00:02 +0000 > >>> > Dario Strbenac via R-devel <r-devel@r-project.org> пишет: > >>> > >>> >> Could save.image() be redesigned so that it promptly responds to > >>> >> Ctrl+C? It prevents the command line from being used for a number > >>> of > >>> >> hours if the contents of the workspace are large. > >>> > >>> > This is ultimately caused by serialize() being non-interruptible. A > >>> > relatively simple way to hang an R session for a long-ish time > >>> would > >>> > therefore be: > >>> > >>> > f <- xzfile(nullfile(), 'a+b') > >>> > x <- rep(0, 1e9) # approx. 8 gigabytes, adjust for your RAM size > >>> > serialize(x, f) > >>> > close(f) > >>> > >>> > This means that calling R_CheckUserInterrupt() between saving > >>> > individual objects is not enough: R also needs to check for > >>> interrupts > >>> > while saving sufficiently long vectors. > >>> > >>> > Since the serialize() infrastructure is carefully written to avoid > >>> > resource leaks on allocation failures, it looks relatively safe to > >>> > liberally sprinkle R_CheckUserInterrupt() where it makes sense to > >>> do > >>> > so, i.e. once per WriteItem() (which calls itself recursively and > >>> > non-recursively) and once per every downstream for loop iteration. > >>> > Valgrind doesn't show any new leaks if I apply the patch, interrupt > >>> > serialize() and then exit. R also passes make check after the > >>> applied > >>> > patch. > >>> > >>> > Do these changes make sense, or am I overlooking some other > >>> problem? > >>> > >>> Thank you, Ivan! > >>> > >>> They do make sense... but : > >>> > >>> OTOH, in the past we have had to *disable* R_CheckUserInterrupt() > >>> in parts of R's code because it was too expensive, > >>> {see current src/main/{seq.c,unique.c} for a series of commented-out > >>> R_CheckUserInterrupt() for such speed-loss reasons} > >>> > >>> so adding these may need a lot of care when we simultaneously > >>> want to remain efficient for "morally valid" use of serialization... > >>> where we really don't want to pay too much of a premium. > >> Alternatively, one could consider making R throttle or debounce calls > >> to R_CheckUserInterrupt such that a repeated calls within x time are > >> ignored, cf: https://www.freecodecamp.org/news/javascript-debounce-example/ > >> > >> The reasoning being that it may be difficult for (contributed) code to > >> determine when/where it is appropriate to check for interrupts, given > >> varying code paths and cpu speed. Maybe it makes more sense to call > >> R_CheckUserInterrupt frequently wherever it is safe to do so, and let > >> R decide if reasonable time has elapsed to actually run the (possibly > >> expensive) ui check again. > >> > >> Basic example: https://github.com/r-devel/r-svn/pull/125/files > >> > >> > >> > >> > >>> {{ saving the whole user workspace is not "valid" in that sense > >>> in my view. I tell all my (non-beginner) Rstudio-using > >>> students they should turn *off* the automatic saving and > >>> loading at session end / beginning; and for reproducibility > >>> only saveRDS() [or save()] *explicitly* a few precious > >>> objects .. > >>> }} > >>> > >>> Again, we don't want to punish people who know what they are > >>> doing, just because other R users manage to hang their R session > >>> by too little thinking ... > >>> > >>> Your patch adds 15 such interrupt checking calls which may > >>> really be too much -- I'm not claiming they are: with our > >>> recursive objects it's surely not very easy to determine the > >>> "minimally necessary" such calls. > >>> > >>> In addition, we may still consider adding an extra optional > >>> argument, say `check.interrupt = TRUE` > >>> which we may default to TRUE when save.image() is called > >>> but e.g., not when serialize() is called.. > >>> > >>> Martin > >>> > >>> > -- > >>> > Best regards, > >>> > Ivan > >>> > x[DELETED ATTACHMENT external: > >>> Rd_IvanKrylov_interrupt-serialize.patch, text/x-patch] > >>> > ______________________________________________ > >>> > R-devel@r-project.org mailing list > >>> > https://stat.ethz.ch/mailman/listinfo/r-devel > >>> > >>> ______________________________________________ > >>> R-devel@r-project.org mailing list > >>> https://stat.ethz.ch/mailman/listinfo/r-devel > >> ______________________________________________ > >> R-devel@r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-devel > > ______________________________________________ > > R-devel@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > -- > Serguei Sokol > Ingenieur de recherche INRAE > > Cellule Mathématiques > TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504 > 135 Avenue de Rangueil > 31077 Toulouse Cedex 04 > > tel: +33 5 61 55 98 49 > email: so...@insa-toulouse.fr > http://www.toulouse-biotechnology-institute.fr/en/technology_platforms/mathematics-cell.html > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel