I work with problems that have rather large data requirements -- typically
a bunch of multi-gigabyte arrays. Given how generous R is with memory, the
only way for me to work with R has been to use big.matrix objects from the
bigmemory package. One thing that is somewhat missing is interoperability
of these big matrices with C++ and possibly Java. What I mean by that is an
API that would allow reading and writing file-backed matrices from C++, and
ideally Java, without being called from R. The ability to save Armadillo
matrices into file-backed matrices and load them back into Armadillo would
be another very useful thing. This would allow really smooth cooperation
between various pieces of software. I would prefer to avoid using RInside
for this.
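To illustrate, here is a minimal sketch of what I have in mind on the C++
side: memory-map the backing file and wrap it in an Armadillo matrix without
copying. It assumes the backing file is plain column-major doubles with no
header; the dimensions (which bigmemory keeps in its .desc descriptor file)
and the file name "data.bin" are made up for the example.

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <armadillo>

    int main() {
        // Assumed dimensions; in practice these would be read from the .desc file.
        const std::size_t nrow = 1000, ncol = 500;
        const std::size_t bytes = nrow * ncol * sizeof(double);

        // "data.bin" stands in for the file-backed matrix's backing file.
        int fd = open("data.bin", O_RDWR);
        if (fd == -1) return 1;

        void* addr = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (addr == MAP_FAILED) { close(fd); return 1; }

        // Wrap the mapped memory without copying (copy_aux_mem = false, strict = true).
        arma::mat M(static_cast<double*>(addr), nrow, ncol, false, true);

        M(0, 0) += 1.0;   // writes go straight to the mapped file

        munmap(addr, bytes);
        close(fd);
        return 0;
    }

Something along these lines, but done properly (reading the descriptor,
handling the different element types, and so on), is what I am after.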

I suppose I could hack the bigmemory C++ code a bit, compile it into a C++
shared library, and that would do. I could probably also hack it to work
with Armadillo matrices. However, I don't want to reinvent the wheel, and if
something like this already exists somewhere I would rather use it for the
moment. I am very much looking for suggestions. If there is truly nothing
like that and someone with C++ or especially Java development experience is
interested and wants to cooperate on this, please let me know too.

Best
Andre

NB. I suppose something like what I want -- access to the same on-disk data
from R, C++, and Java (and Python) -- already exists in the HDF5 world. I
don't know, however, how the performance of HDF5 compares with bigmemory
matrices, which I have come to like and appreciate a lot. If someone could
speak to the simplicity of use and performance of HDF5 versus bigmemory, it
would be very interesting.
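For comparison, reading a double matrix from an HDF5 file on the C++ side
would look roughly like the sketch below, using the HDF5 C API. The file
name "data.h5" and the dataset name "matrix" are made up; the same file
could be written or read from R, Python, or Java with their respective HDF5
bindings.

    #include <hdf5.h>
    #include <vector>
    #include <cstdio>

    int main() {
        hid_t file = H5Fopen("data.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
        if (file < 0) return 1;

        hid_t dset  = H5Dopen2(file, "matrix", H5P_DEFAULT);
        hid_t space = H5Dget_space(dset);

        // Query the dataset's dimensions, then read it whole into memory.
        hsize_t dims[2];
        H5Sget_simple_extent_dims(space, dims, nullptr);

        std::vector<double> buf(dims[0] * dims[1]);
        H5Dread(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf.data());

        std::printf("read %llu x %llu matrix\n",
                    (unsigned long long)dims[0], (unsigned long long)dims[1]);

        H5Sclose(space);
        H5Dclose(dset);
        H5Fclose(file);
        return 0;
    }

How this compares with bigmemory in speed and convenience for multi-gigabyte
matrices is exactly what I would like to hear about.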
