Re: [Rd] [External] Possible ALTREP bug
By the way, where is the documentation for INTEGER_ELT, REAL_ELT, etc? I looked in Writing R Extensions and R Internals but I did not see any mention. REAL_ELT is briefly mentioned on https://svn.r-project.org/R/branches/ALTREP/ALTREP.html Would it be possible to please add some mention of them to Writing R Extensions? - how many of these _ELT functions are there? INTEGER, REAL, ... ? - in what version of R were they introduced? - I guess input types are always SEXP and int? - What are the output types for each? On Fri, May 28, 2021 at 5:16 PM wrote: > Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly new it may > be possible to check that places where they are used allow for them to > allocate. I have fixed the one that got caught by Gabor's example, and > a rchk run might be able to pick up others if rchk knows these could > allocate. (I may also be forgetting other places where the _ELt > methods are used.) Fixing all call sites for REAL, INTEGER, etc, was > never realistic so there GC has to be suspended during the method > call, and that is done in the dispatch mechanism. > > The bigger problem is jumps from inside things that existing code > assumes will not do that. Catching those jumps is possible but > expensive; doing anything sensible if one is caught is really not > possible. > > Best, > > luke > > On Fri, 28 May 2021, Gabriel Becker wrote: > > > Hi Jim et al, > > Just to hopefully add a bit to what Luke already answered, from what I am > > recalling looking back at that bioconductor thread Elt methods are used > in > > places where there are hard implicit assumptions that no garbage > collection > > will occur (ie they are called on things that aren't PROTECTed), and > beyond > > that, in places where there are hard assumptions that no error (longjmp) > > will occur. I could be wrong, but I don't know that suspending garbage > > collection would protect from the second one. Ie it is possible that an > > error *ever* being raised from R code that implements an elt method could > > cause all hell to break loose. > > > > Luke or Tomas Kalibera would know more. > > > > I was disappointed that implementing ALTREPs in R code was not in the > cards > > (it was in my original proposal back in 2016 to the DSC) but I trust Luke > > that there are important reasons we can't safely allow that. > > > > Best, > > ~G > > > > On Fri, May 28, 2021 at 8:31 AM Jim Hester > wrote: > > From reading the discussion on the Bioconductor issue tracker it > > seems like > > the reason the GC is not suspended for the non-string ALTREP Elt > > methods is > > primarily due to performance concerns. > > > > If this is the case perhaps an additional flag could be added to > > the > > `R_set_altrep_*()` functions so ALTREP authors could indicate if > > GC should > > be halted when that particular method is called for that > > particular ALTREP > > class. > > > > This would avoid the performance hit (other than a boolean > > check) for the > > standard case when no allocations are expected, but allow > > authors to > > indicate that R should pause GC if needed for methods in their > > class. > > > > On Fri, May 28, 2021 at 9:42 AM wrote: > > > > > integer and real Elt methods are not expected to allocate. You > > would > > > have to suspend GC to be able to do that. This currently can't > > be done > > > from package code. > > > > > > Best, > > > > > > luke > > > > > > On Fri, 28 May 2021, Gábor Csárdi wrote: > > > > > > > I have found some weird SEXP corruption behavior with > > ALTREP, which > > > > could be a bug. (Or I could be doing something wrong.) > > > > > > > > I have an integer ALTREP vector that calls back to R from > > the Elt > > > > method. When this vector is indexed in a lapply(), its first > > element > > > > gets corrupted. Sometimes it's just a type change to > > logical, but > > > > sometimes the corruption causes a crash. > > > > > > > > I saw this on macOS from R 3.5.3 to 4.2.0. I created a small > > package > > > > that demonstrates this: > > https://github.com/gaborcsardi/redfish > > > > > > > > The R callback in this package calls > > `loadNamespace("Matrix")`, but > > > > the same crash happens for other packages as well, and > > sometimes it > > > > also happens if I don't load any packages at all. (But that > > example > > > > was much more complicated, so I went with the package > > loading.) > > > > > > > > It is somewhat random, and sometimes turning off the JIT > > avoids the > > > > crash, but not always. > > > > > > > > Hopefully I am just doing something wrong in the ALTREP code > > (see > > > > > >
Re: [Rd] [External] Possible ALTREP bug
*_ELT accessor functions are described in "vector accessor functions" in Writing R extensions. https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Vector-accessor-functions On Wed, Jun 16, 2021 at 4:22 PM Toby Hocking wrote: > By the way, where is the documentation for INTEGER_ELT, REAL_ELT, etc? I > looked in Writing R Extensions and R Internals but I did not see any > mention. > REAL_ELT is briefly mentioned on > https://svn.r-project.org/R/branches/ALTREP/ALTREP.html > Would it be possible to please add some mention of them to Writing R > Extensions? > - how many of these _ELT functions are there? INTEGER, REAL, ... ? > - in what version of R were they introduced? > - I guess input types are always SEXP and int? > - What are the output types for each? > > On Fri, May 28, 2021 at 5:16 PM wrote: > > > Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly new it may > > be possible to check that places where they are used allow for them to > > allocate. I have fixed the one that got caught by Gabor's example, and > > a rchk run might be able to pick up others if rchk knows these could > > allocate. (I may also be forgetting other places where the _ELt > > methods are used.) Fixing all call sites for REAL, INTEGER, etc, was > > never realistic so there GC has to be suspended during the method > > call, and that is done in the dispatch mechanism. > > > > The bigger problem is jumps from inside things that existing code > > assumes will not do that. Catching those jumps is possible but > > expensive; doing anything sensible if one is caught is really not > > possible. > > > > Best, > > > > luke > > > > On Fri, 28 May 2021, Gabriel Becker wrote: > > > > > Hi Jim et al, > > > Just to hopefully add a bit to what Luke already answered, from what I > am > > > recalling looking back at that bioconductor thread Elt methods are used > > in > > > places where there are hard implicit assumptions that no garbage > > collection > > > will occur (ie they are called on things that aren't PROTECTed), and > > beyond > > > that, in places where there are hard assumptions that no error > (longjmp) > > > will occur. I could be wrong, but I don't know that suspending garbage > > > collection would protect from the second one. Ie it is possible that an > > > error *ever* being raised from R code that implements an elt method > could > > > cause all hell to break loose. > > > > > > Luke or Tomas Kalibera would know more. > > > > > > I was disappointed that implementing ALTREPs in R code was not in the > > cards > > > (it was in my original proposal back in 2016 to the DSC) but I trust > Luke > > > that there are important reasons we can't safely allow that. > > > > > > Best, > > > ~G > > > > > > On Fri, May 28, 2021 at 8:31 AM Jim Hester > > wrote: > > > From reading the discussion on the Bioconductor issue tracker it > > > seems like > > > the reason the GC is not suspended for the non-string ALTREP Elt > > > methods is > > > primarily due to performance concerns. > > > > > > If this is the case perhaps an additional flag could be added to > > > the > > > `R_set_altrep_*()` functions so ALTREP authors could indicate if > > > GC should > > > be halted when that particular method is called for that > > > particular ALTREP > > > class. > > > > > > This would avoid the performance hit (other than a boolean > > > check) for the > > > standard case when no allocations are expected, but allow > > > authors to > > > indicate that R should pause GC if needed for methods in their > > > class. > > > > > > On Fri, May 28, 2021 at 9:42 AM wrote: > > > > > > > integer and real Elt methods are not expected to allocate. You > > > would > > > > have to suspend GC to be able to do that. This currently can't > > > be done > > > > from package code. > > > > > > > > Best, > > > > > > > > luke > > > > > > > > On Fri, 28 May 2021, Gábor Csárdi wrote: > > > > > > > > > I have found some weird SEXP corruption behavior with > > > ALTREP, which > > > > > could be a bug. (Or I could be doing something wrong.) > > > > > > > > > > I have an integer ALTREP vector that calls back to R from > > > the Elt > > > > > method. When this vector is indexed in a lapply(), its first > > > element > > > > > gets corrupted. Sometimes it's just a type change to > > > logical, but > > > > > sometimes the corruption causes a crash. > > > > > > > > > > I saw this on macOS from R 3.5.3 to 4.2.0. I created a small > > > package > > > > > that demonstrates this: > > > https://github.com/gaborcsardi/redfish > > > > > > > > > > The R callback in this package calls > > > `loadNamespace("Matrix")`, but > > > > > the same crash happens for other packages as well,
Re: [Rd] [External] Possible ALTREP bug
The usual quote applies: "use the source, Luke": $ grep _ELT *.h | sort Rdefines.h:#define SET_ELEMENT(x, i, val) SET_VECTOR_ELT(x, i, val) Rinternals.h: The function STRING_ELT is used as an argument to arrayAssign even Rinternals.h:#define VECTOR_ELT(x,i)((SEXP *) DATAPTR(x))[i] Rinternals.h://SEXP (STRING_ELT)(SEXP x, R_xlen_t i); Rinternals.h:Rbyte (RAW_ELT)(SEXP x, R_xlen_t i); Rinternals.h:Rbyte ALTRAW_ELT(SEXP x, R_xlen_t i); Rinternals.h:Rcomplex (COMPLEX_ELT)(SEXP x, R_xlen_t i); Rinternals.h:Rcomplex ALTCOMPLEX_ELT(SEXP x, R_xlen_t i); Rinternals.h:SEXP (STRING_ELT)(SEXP x, R_xlen_t i); Rinternals.h:SEXP (VECTOR_ELT)(SEXP x, R_xlen_t i); Rinternals.h:SEXP ALTSTRING_ELT(SEXP, R_xlen_t); Rinternals.h:SEXP SET_VECTOR_ELT(SEXP x, R_xlen_t i, SEXP v); Rinternals.h:double (REAL_ELT)(SEXP x, R_xlen_t i); Rinternals.h:double ALTREAL_ELT(SEXP x, R_xlen_t i); Rinternals.h:int (INTEGER_ELT)(SEXP x, R_xlen_t i); Rinternals.h:int (LOGICAL_ELT)(SEXP x, R_xlen_t i); Rinternals.h:int ALTINTEGER_ELT(SEXP x, R_xlen_t i); Rinternals.h:int ALTLOGICAL_ELT(SEXP x, R_xlen_t i); Rinternals.h:void ALTCOMPLEX_SET_ELT(SEXP x, R_xlen_t i, Rcomplex v); Rinternals.h:void ALTINTEGER_SET_ELT(SEXP x, R_xlen_t i, int v); Rinternals.h:void ALTLOGICAL_SET_ELT(SEXP x, R_xlen_t i, int v); Rinternals.h:void ALTRAW_SET_ELT(SEXP x, R_xlen_t i, Rbyte v); Rinternals.h:void ALTREAL_SET_ELT(SEXP x, R_xlen_t i, double v); Rinternals.h:void ALTSTRING_SET_ELT(SEXP, R_xlen_t, SEXP); Rinternals.h:void SET_INTEGER_ELT(SEXP x, R_xlen_t i, int v); Rinternals.h:void SET_LOGICAL_ELT(SEXP x, R_xlen_t i, int v); Rinternals.h:void SET_REAL_ELT(SEXP x, R_xlen_t i, double v); Rinternals.h:void SET_STRING_ELT(SEXP x, R_xlen_t i, SEXP v); So the indexing is with R_xlen_t and they return the value itself as one would expect. Cheers, Simon > On Jun 17, 2021, at 2:22 AM, Toby Hocking wrote: > > By the way, where is the documentation for INTEGER_ELT, REAL_ELT, etc? I > looked in Writing R Extensions and R Internals but I did not see any > mention. > REAL_ELT is briefly mentioned on > https://svn.r-project.org/R/branches/ALTREP/ALTREP.html > Would it be possible to please add some mention of them to Writing R > Extensions? > - how many of these _ELT functions are there? INTEGER, REAL, ... ? > - in what version of R were they introduced? > - I guess input types are always SEXP and int? > - What are the output types for each? > > On Fri, May 28, 2021 at 5:16 PM wrote: > >> Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly new it may >> be possible to check that places where they are used allow for them to >> allocate. I have fixed the one that got caught by Gabor's example, and >> a rchk run might be able to pick up others if rchk knows these could >> allocate. (I may also be forgetting other places where the _ELt >> methods are used.) Fixing all call sites for REAL, INTEGER, etc, was >> never realistic so there GC has to be suspended during the method >> call, and that is done in the dispatch mechanism. >> >> The bigger problem is jumps from inside things that existing code >> assumes will not do that. Catching those jumps is possible but >> expensive; doing anything sensible if one is caught is really not >> possible. >> >> Best, >> >> luke >> >> On Fri, 28 May 2021, Gabriel Becker wrote: >> >>> Hi Jim et al, >>> Just to hopefully add a bit to what Luke already answered, from what I am >>> recalling looking back at that bioconductor thread Elt methods are used >> in >>> places where there are hard implicit assumptions that no garbage >> collection >>> will occur (ie they are called on things that aren't PROTECTed), and >> beyond >>> that, in places where there are hard assumptions that no error (longjmp) >>> will occur. I could be wrong, but I don't know that suspending garbage >>> collection would protect from the second one. Ie it is possible that an >>> error *ever* being raised from R code that implements an elt method could >>> cause all hell to break loose. >>> >>> Luke or Tomas Kalibera would know more. >>> >>> I was disappointed that implementing ALTREPs in R code was not in the >> cards >>> (it was in my original proposal back in 2016 to the DSC) but I trust Luke >>> that there are important reasons we can't safely allow that. >>> >>> Best, >>> ~G >>> >>> On Fri, May 28, 2021 at 8:31 AM Jim Hester >> wrote: >>> From reading the discussion on the Bioconductor issue tracker it >>> seems like >>> the reason the GC is not suspended for the non-string ALTREP Elt >>> methods is >>> primarily due to performance concerns. >>> >>> If this is the case perhaps an additional flag could be added to >>> the >>> `R_set_altrep_*()` functions so ALTREP authors could indicate if >>> GC should >>> be halted when that particular method is called for that >>> particular ALTREP >>> class. >>> >>> This would avoid the performance hit (other than a boolean >>>