Sorry for post a lot of things, for the first part of code, I copied my C++ iter macro by mistake(and you can see an explicit type casting). Here is the macro definition from R_exts/Itermacros.h
#define ITERATE_BY_REGION_PARTIAL(sx, px, idx, nb, etype, vtype, \ strt, nfull, expr) do { \ * const** etype *px = DATAPTR_OR_NULL(sx); * \ if (px != NULL) { \ R_xlen_t __ibr_n__ = strt + nfull; \ R_xlen_t nb = __ibr_n__; \ for (R_xlen_t idx = strt; idx < __ibr_n__; idx += nb) { \ expr \ } \ } \ else ITERATE_BY_REGION_PARTIAL0(sx, px, idx, nb, etype, vtype, \ strt, nfull, expr); \ } while (0) Best, Jiefei On Mon, Sep 23, 2019 at 3:12 PM Wang Jiefei <szwj...@gmail.com> wrote: > Hi Gabriel, > > I have tried the macro and found a small issue, it seems like the macro is > written in C and does an implicit type conversion(const void * to const int > *), see below. While it is allowed in C, C++ seems not happy with it. Is it > possible to add an explicit type casting so that it can be compatible with > both language? > > > #define ITERATE_BY_REGION_PARTIAL(sx, px, idx, nb, etype, vtype, \ > > strt, nfull, expr) do { \ > > *const etype *px = (const** etype *)DATAPTR_OR_NULL(sx); * > \ > > if (px != NULL) { \ > > R_xlen_t __ibr_n__ = strt + nfull; \ > > R_xlen_t nb = __ibr_n__; \ > > for (R_xlen_t idx = strt; idx < __ibr_n__; idx += nb) { \ > > expr \ > > } \ > > } \ > > else ITERATE_BY_REGION_PARTIAL0(sx, px, idx, nb, etype, > vtype, \ > > strt, nfull, expr); \ > > } while (0) > > > Also, I notice that the element type(etype) and vector type(vtype) has > to be specified in the macro. Since the SEXP is the first argument in the > macro, it seems redundant to define etype and vtype for they have to match > the type of the SEXP. I'm wondering if this is intentional? Will there be a > type-free macro in R in the future? Here is a simple type-free macro I'm > using. > > #define type_free_iter(sx, ptr, ind, nbatch,expr)\ > > switch(TYPEOF(sx)){\ > > case INTSXP:\ > > ITERATE_BY_REGION(sx, ptr, ind, nbatch, int, INTEGER, expr);\ > > break; \ > > case REALSXP:\ > > ITERATE_BY_REGION(sx, ptr, ind, nbatch, double, REAL, expr);\ > > break; \ > > case LGLSXP:\ > > ITERATE_BY_REGION(sx, ptr, ind, nbatch, int, LOGICAL, expr);\ > > break; \ > > default:\ > > Rf_error("Unknow data type\n"); \ > > break; \ > > } > > > > // [[Rcpp::export]] > > double sillysum(SEXP x) { > > double s = 0.0; > > type_free_iter(x, ptr, ind, nbatch, > > { > > for (int i = 0; i < nbatch; i++) { s = s + ptr[i]; } > > }); > > return s; > > } > > > > > Best, > > Jiefei > > On Wed, Aug 28, 2019 at 2:32 PM Wang Jiefei <szwj...@gmail.com> wrote: > >> Thank you, Gabriel. The loop macro is very helpful. It is also exciting >> to see that there are lots of changes in ALTREP in R devel version. I >> really appreciate your help! >> >> Best, >> Jiefei >> >> On Wed, Aug 28, 2019 at 7:37 AM Gabriel Becker <gabembec...@gmail.com> >> wrote: >> >>> Jiefei, >>> >>> I've been meaning to write up something about this so hopefully this >>> will be an impetus for me to actually do that, but until then, responses >>> inline. >>> >>> >>> On Tue, Aug 27, 2019, 7:22 PM Wang Jiefei <szwj...@gmail.com> wrote: >>> >>>> Hi devel team, >>>> >>>> I'm working on C/C++ level ALTREP compatibility for a package. The >>>> package >>>> previously used pointers to access the data of a SEXP, so it would not >>>> work >>>> for some ALTREP objects which do not have a pointer. I plan to rewrite >>>> the >>>> code and use functions like get_elt, get_region, and get_subset to >>>> access >>>> the values of a vector, so I have a few questions for ALTREP: >>>> >>>> 1. Since an ALTREP do not have to define all of the above >>>> functions(element, region, subset), is there any way to check which >>>> function has been defined for an ALTREP class? I did a search on >>>> RInternal.h and altrep.c but did not find a solution for it. If not, >>>> will >>>> it be added in the future? >>>> >>> >>> Element and region are guaranteed to always be defined and work (for >>> altrep and non-altrep INTSXP, REALSXP, LGLSXPs, etc, we currently don't >>> have region for STRSXP or VECSXP, I believe). If the altrep class does not >>> provide them then default methods will be used, which may be inefficient in >>> some cases but will work. Subset is currently a forward looking stub, but >>> once implimented, that will also be guaranteed to work for all valid ALTREP >>> classes. >>> >>> >>>> >>>> 2. Given the diversity of ALTREP classes, what is the best way to loop >>>> over >>>> an ALTREP object? I hope there can be an all-in-one function which can >>>> get >>>> the values from a vector as long as at least one of the above functions >>>> has >>>> been defined, so package developers would not be bothered by tons of >>>> `if-else` statement if they want their package to work with ALTREP. >>>> Since >>>> it seems like there is no such function exist, what could be the best >>>> way >>>> to do the loop under the current R version? >>>> >>> >>> The best way to loop over all SEXPs, which supports both altrep and >>> nonaltrep objects is, with the ITERATE_BY_REGION (which has been in R for a >>> number of released versions, at least since 3.5.0 I think) and the much >>> newer (devel only) ITERATE_BY_REGION_PARTIAL macros defined in >>> R_exts/Itermacros.h >>> >>> The meaning of the arguments is as follows for ITERATE_BY_REGION_PARTIAL >>> are as follows (ITERATE_BY_REGION is the same except no strt, and nfull). >>> >>> >>> - sx - C level variable name of the SEXP to iterate over >>> - px - variable name to use for the pointer populated with data from >>> a region of sx >>> - idx - variable name to use for the "outer", batch counter in the >>> for loop. This will contain the 0-indexed start position of the batch >>> you're currently processing >>> - nb - variable name to use for the current batch size. This will >>> always either be GET_REGION_BUFFSIZE (512), or the number of elements >>> remaining in the vector, whichever is smaller >>> - etype - element (C) type, e.g., int, double, of the data >>> - vtype - vector (access API) type, e.g, INTEGER, REAL >>> - strt - the 0-indexed position in the vector to start iterating >>> - nfull - the total number oif elements to iterate over from the >>> vector >>> - expr - the code to process a single batch (Which will do things to >>> px, typically) >>> >>> >>> So code to perform badly implemented not good idea summing of REALSXP >>> data might look like >>> >>> double sillysum(SEXP x) { >>> >>> double s = 0.0; >>> >>> ITERATE_BY_REGION(x, ptr, ind, nbatch, double, REAL, >>> { >>> >>> for(int i = 0; i < nbatch; i++) { s = s + ptr[i];} >>> }) >>> >>> return s; >>> } >>> >>> For meatier examples of ITERATE_BY_REGION's use in practice you can grep >>> the R sources. I know it is used in the implementations of the various >>> C-level summaries (summary.c), print and formatting functions, and anyNA. >>> >>> Some things to remember >>> >>> - If you have an inner loop like the one above, your total position >>> in the original vector is ind + i >>> - ITERATE_BY_REGION always processes the whole vector, if you need >>> to only do part of it yo'll either need custom breaking for both inner >>> and >>> outer loopsl, or in R-devel you can use ITERATE_BY_REGION_PARTIAL >>> - Don't use the variants ending in 0, all they do is skip over >>> things that are a good idea in the case of non-altreps (and some very >>> specific altreps). >>> >>> Hope that helps. >>> >>> Best, >>> ~G >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>> Best, >>>> Jiefei >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> R-devel@r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-devel >>>> >>> [[alternative HTML version deleted]] ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel