[R-pkg-devel] using portable simd instructions
Hello R-package-devel,

I recently got inspired by the RcppSimdJson package to try out SIMD registers. It works fantastically on my computer, but I struggle to find information on how to make it portable. It doesn't help in this case that R and Rcpp make including C++ code so easy that I have never had to learn about cmake and compiler flags. I would appreciate any help, including of the type: "go read instructions at ...".

I use RcppArmadillo and Rcpp. I currently include the following header:

    #include <immintrin.h>

The functions in immintrin that I use are:

    _mm256_loadu_pd
    _mm256_set1_pd
    _mm256_mul_pd
    _mm256_fmadd_pd
    _mm256_storeu_pd

and I define up to four __m256d registers. From information found online (not sure where anymore) I constructed the following Makevars file:

    CXX_STD = CXX14
    PKG_CPPFLAGS = -I../inst/include -mfma -msse4.2 -mavx
    PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS)
    PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)

(I also use OpenMP, which has always worked fine; I just included all lines for completeness.)

Rcheck gives me two notes:

    ─ using R version 4.3.2 (2023-10-31 ucrt)
    ─ using platform: x86_64-w64-mingw32 (64-bit)
    ─ R was compiled by
        gcc.exe (GCC) 12.3.0
        GNU Fortran (GCC) 12.3.0

    ❯ checking compilation flags used ... NOTE
      Compilation used the following non-portable flag(s):
        '-mavx' '-mfma' '-msse4.2'

    ❯ checking C++ specification ... NOTE
      Specified C++14: please drop specification unless essential

But as far as I understand, the flags are necessary, at least in GCC. How can I make this portable and CRAN-acceptable?

kind regards,
Jesse

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
Re: [R-pkg-devel] How to store large data to be used in an R package?
On 25 March 2024 at 11:12, Jairo Hidalgo Migueles wrote:
| I'm reaching out to seek some guidance regarding the storage of relatively
| large data, ranging from 10-40 MB, intended for use within an R package.
| Specifically, this data consists of regression and random forest models
| crucial for making predictions within our R package.
|
| Initially, I attempted to save these models as internal data within the
| package. While this approach maintains functionality, it has led to a
| package size exceeding 20 MB. I'm concerned that this would complicate
| submitting the package to CRAN in the future.
|
| I would greatly appreciate any suggestions or insights you may have on
| alternative methods or best practices for efficiently storing and accessing
| this data within our R package.

Brooke and I wrote a paper on one way of addressing it via a 'data' package accessible via an Additional_repositories: entry supported by a drat repo. See

   https://journal.r-project.org/archive/2017/RJ-2017-026/index.html

for the paper, which contains a nice slow walkthrough of all the details.

Dirk

--
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
Re: [R-pkg-devel] Check results on r-devel-windows claiming error but tests seem to pass?
Avi,

That was a hiccup and is now taken care of. When discussing this (off-line) with Jeroen we (rightly) suggested that keeping an eye on

   https://contributor.r-project.org/svn-dashboard/

is one possibility to keep track while we have no status alert system from CRAN. I too was quite confused because a new upload showed errors, and win-builder for r-devel just swallowed any uploads.

Cheers, Dirk

--
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
Re: [R-pkg-devel] using portable simd instructions
On 26 March 2024 at 10:53, jesse koops wrote:
| How can I make this portable and CRAN-acceptable?

By writing (or borrowing?) some hardware detection via either configure / autoconf or cmake. This is no different than other tasks decided at install-time.

Start with 'Writing R Extensions', as always, and work your way up from there. And if memory serves there are already a few other packages with SIMD at CRAN, so you can also try to take advantage of the search for a 'token' (here: 'SIMD') at the (unofficial) CRAN mirror at GitHub:

   https://github.com/search?q=org%3Acran%20SIMD&type=code

Hth, Dirk

--
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
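The source side of such an install-time decision can be kept small. What follows is a minimal sketch, not a tested recipe: it assumes GCC or clang, which predefine `__AVX__` and `__FMA__` once the corresponding instruction sets are enabled, so the intrinsics from the original question are only compiled when a configure script has decided to add the flags. The function name `axpy` and its signature are hypothetical.

```cpp
#include <cstddef>

// Only pull in the intrinsics header when the compiler was told (for example
// by flags that a configure script added) that AVX and FMA may be used.
#if defined(__AVX__) && defined(__FMA__)
#include <immintrin.h>
#endif

// Hypothetical kernel: y[i] += a * x[i] for i in [0, n).
void axpy(double a, const double* x, double* y, std::size_t n) {
    std::size_t i = 0;
#if defined(__AVX__) && defined(__FMA__)
    // Vector path: four doubles per iteration, unaligned loads and stores.
    const __m256d va = _mm256_set1_pd(a);
    for (; i + 4 <= n; i += 4) {
        __m256d vx = _mm256_loadu_pd(x + i);
        __m256d vy = _mm256_loadu_pd(y + i);
        vy = _mm256_fmadd_pd(va, vx, vy);  // va * vx + vy
        _mm256_storeu_pd(y + i, vy);
    }
#endif
    // Scalar path: handles the tail, or everything when AVX/FMA are disabled.
    for (; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Without the flags the same file compiles to the plain loop, so Makevars no longer has to hard-code `-mavx -mfma`. This only settles what the compiler may emit; whether the machine that later runs the binary supports AVX is a separate question.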
Re: [R-pkg-devel] using portable simd instructions
On 3/26/24 10:53, jesse koops wrote:
> Hello R-package-devel,
>
> I recently got inspired by the RcppSimdJson package to try out SIMD
> registers. It works fantastically on my computer, but I struggle to find
> information on how to make it portable. It doesn't help in this case
> that R and Rcpp make including C++ code so easy that I have never had to
> learn about cmake and compiler flags. I would appreciate any help,
> including of the type: "go read instructions at ...".
>
> I use RcppArmadillo and Rcpp. I currently include the following header:
>
>     #include <immintrin.h>
>
> The functions in immintrin that I use are:
>
>     _mm256_loadu_pd
>     _mm256_set1_pd
>     _mm256_mul_pd
>     _mm256_fmadd_pd
>     _mm256_storeu_pd
>
> and I define up to four __m256d registers. From information found online
> (not sure where anymore) I constructed the following Makevars file:
>
>     CXX_STD = CXX14
>     PKG_CPPFLAGS = -I../inst/include -mfma -msse4.2 -mavx
>     PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS)
>     PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)
>
> (I also use OpenMP, which has always worked fine; I just included all
> lines for completeness.)
>
> Rcheck gives me two notes:
>
>     ─ using R version 4.3.2 (2023-10-31 ucrt)
>     ─ using platform: x86_64-w64-mingw32 (64-bit)
>     ─ R was compiled by
>         gcc.exe (GCC) 12.3.0
>         GNU Fortran (GCC) 12.3.0
>
>     ❯ checking compilation flags used ... NOTE
>       Compilation used the following non-portable flag(s):
>         '-mavx' '-mfma' '-msse4.2'
>
>     ❯ checking C++ specification ... NOTE
>       Specified C++14: please drop specification unless essential
>
> But as far as I understand, the flags are necessary, at least in GCC.
> How can I make this portable and CRAN-acceptable?

I think the best way for portability is to use a higher-level library that has already done the low-level business of maintaining multiple versions of the code (with multiple instruction sets) and choosing the one appropriate for the current CPU. It could be, say, LAPACK, BLAS, or OpenMP, depending on the problem at hand.

In some cases, code can be rewritten so that the compiler can vectorize it better, using the level of vectorized instructions that have been enabled.

Unconditionally using GCC-specific or architecture-specific options in packages would certainly not be portable. Even on Windows, R is now used also with clang and on aarch64, so one should not assume a concrete compiler and architecture.

Please note also that GCC on Windows has a bug due to which AVX2 instructions cannot be used reliably: the compiler doesn't always properly align local variables on the stack when emitting these. See [1,2] for more information.

Best
Tomas

[1] https://stat.ethz.ch/pipermail/r-sig-windows/2024q1/000113.html
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

> kind regards,
> Jesse
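To illustrate the remark about rewriting code so the compiler can vectorize it, here is a minimal sketch without intrinsics or special flags. It computes a matrix-vector product for a column-major matrix (the layout R and Armadillo use), with the loops ordered so that the inner loop walks contiguous memory. The function name `matvec_colmajor` is hypothetical, and whether a given compiler actually vectorizes the loop depends on its version and optimization level.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: y = A * x, where A is n_rows x n_cols stored
// column-major in a flat vector (element (i, j) lives at A[i + j * n_rows]).
std::vector<double> matvec_colmajor(const std::vector<double>& A,
                                    const std::vector<double>& x,
                                    std::size_t n_rows,
                                    std::size_t n_cols) {
    std::vector<double> y(n_rows, 0.0);
    double* yp = y.data();
    for (std::size_t j = 0; j < n_cols; ++j) {
        const double* col = A.data() + j * n_rows;  // start of column j
        const double xj = x[j];                     // hoisted loop invariant
        // Inner loop: unit stride, fixed trip count, no branches or calls.
        // Loops of this shape are the usual candidates for auto-vectorization.
        for (std::size_t i = 0; i < n_rows; ++i) {
            yp[i] += col[i] * xj;
        }
    }
    return y;
}
```

Iterating rows in the outer loop instead would make the inner loop stride through memory by `n_rows` elements and turn it into a floating-point reduction, which compilers generally will not vectorize without relaxed floating-point settings.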
Re: [R-pkg-devel] using portable simd instructions
Hi Jesse,

What I've done is to use a mix of compile-time detection of compiler SIMD support and run-time detection of SIMD hardware support. At package load, SIMD-specific versions of functions are installed in a symbol table. It's not perfect and it can be hard to support evolving platforms, especially now that ARM is more prevalent. However, it does allow for distribution on CRAN as it uses only autoconf, POSIX make, and no specific compiler.

At compile time:

1. Use a configure script to detect the platform and any SIMD instructions supported by the compiler. This is also the time to identify the compiler flags necessary to enable instruction sets. Unlike what the existing autoconf macros do, you can ignore whether or not the host system supports the instruction sets (with the exception when compiling with Solaris Studio - it won't let you load a binary with instructions not supported by the host, even if they cannot be executed).

2. Use makefiles to conditionally compile different versions of the functions you want, one for each level of instruction set supported by the compiler, using the flags detected above. They all should be in different files with different symbols. For example: partition_sse2.c defines partition_sse2(), partition_avx.c defines partition_avx(), etc., while partition.c defines partition_c() - a fall-back compiled without any SIMD instructions. Note that echoing compilations with SIMD flags will trigger a check warning, as those units are not inherently portable. That is addressed below.

At run time:

1. On package load, detect what instruction sets are supported by the host. On x86 machines, this usually involves a call to cpuid.

2. For the maximum level of instruction set supported by the host, install the relevant symbol for each function into a symbol table. Using the example above, a header defines an external function pointer partition(), which gets set to one of the SIMD-specific implementations.
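The run-time half of the steps above can be sketched roughly as follows. This is a single-file illustration under stated assumptions, not the dbarts code: the toy signature (count the elements at or below a cut) is invented, `partition_avx` is a plain C++ stand-in for a unit that would really live in its own file built with the detected flags, `partition_ptr` plays the role of the partition() pointer, and detection uses the GCC/clang x86 builtins `__builtin_cpu_init` and `__builtin_cpu_supports` instead of a hand-written cpuid call.

```cpp
#include <cstddef>

// The CPU-feature builtins used below exist in GCC and clang on x86 only.
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_CPU_BUILTINS 1
#endif

// Hypothetical signature shared by all implementations.
using partition_fn = std::size_t (*)(const double*, std::size_t, double);

// Portable fall-back: counts the elements of x that are <= cut.
std::size_t partition_c(const double* x, std::size_t n, double cut) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (x[i] <= cut) ++count;
    }
    return count;
}

// Stand-in for the AVX build. In a real package this would be a separate
// file compiled with AVX flags; here it just forwards to the fall-back.
std::size_t partition_avx(const double* x, std::size_t n, double cut) {
    return partition_c(x, n, cut);
}

// The "symbol table": one pointer that callers use instead of a fixed name.
partition_fn partition_ptr = partition_c;

// Run once at load time; returns the name of the implementation installed.
const char* install_partition() {
#ifdef HAVE_X86_CPU_BUILTINS
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx")) {
        partition_ptr = partition_avx;
        return "avx";
    }
#endif
    partition_ptr = partition_c;
    return "c";
}
```

In an R package the natural place to call something like `install_partition()` would be the package's `R_init_<pkgname>` routine, so the choice is made once when the shared library is loaded and every later call goes through the pointer.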
In setting that up, I found Agner Fog's notes on CPU dispatching to be extremely helpful. They can be found here: https://www.agner.org/optimize. I use this strategy in the dbarts package, the code for which is here: https://github.com/vdorie/dbarts.

Best,
Vince

On Tue, Mar 26, 2024 at 10:45 AM Dirk Eddelbuettel wrote:
>
> On 26 March 2024 at 10:53, jesse koops wrote:
> | How can I make this portable and CRAN-acceptable?
>
> By writing (or borrowing?) some hardware detection via either configure /
> autoconf or cmake. This is no different than other tasks decided at
> install-time.
>
> Start with 'Writing R Extensions', as always, and work your way up from
> there. And if memory serves there are already a few other packages with
> SIMD at CRAN, so you can also try to take advantage of the search for a
> 'token' (here: 'SIMD') at the (unofficial) CRAN mirror at GitHub:
>
>    https://github.com/search?q=org%3Acran%20SIMD&type=code
>
> Hth, Dirk
Re: [R-pkg-devel] Check results on r-devel-windows claiming error but tests seem to pass?
On 26 March 2024 at 09:37, Dirk Eddelbuettel wrote:
|
| Avi,
|
| That was a hiccup and is now taken care of. When discussing this (off-line)
| with Jeroen we (rightly) suggested that keeping an eye on

Typo, as usual, "he (rightly) suggested". My bad. D.

|
|    https://contributor.r-project.org/svn-dashboard/
|
| is one possibility to keep track while we have no status alert system from
| CRAN. I too was quite confused because a new upload showed errors, and
| win-builder for r-devel just swallowed any uploads.
|
| Cheers, Dirk

--
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org