Hi everyone. In response to my previous message (Memory management issues), I've come up with the following patch against R 2.9.1.
To summarize the situation:

- We're hitting the memory barrier in our lab when running concurrent R processes, because of the large datasets we use.
- We want to avoid copying data back and forth between our R extension and R, in order to reduce overall memory usage.

There were some very useful suggestions on the list, but nothing optimal.

With this patch, I export two new functions from memory.c, R_RegisterObject and R_UnregisterObject, which make it possible to bypass allocVector. They accept a SEXP node (which must be allocated and initialized externally), protect it from collection by calling R_PreserveObject, and snap it into the oldest generation of the large-node class of the GC heap until the object is unregistered. Since these functions require knowledge of the inner workings of the SEXP object, they are exported only if USE_RINTERNALS is defined.

Using these two functions, we developed a simple R extension that loads data.frames directly from copy-on-write (COW) memory pages obtained with mmap(). The result is significant memory sharing between processes that use the same datasets, and instantaneous load times. This allowed us to write most of our code directly in R instead of resorting to C because of performance or memory constraints.

Could someone review the attached patch and spot any potential problems? Is a change like this likely to be integrated into the R sources? We would like to release our current R extension for anyone to use.

Thanks.
diff -rud R-2.9.1.Orig/include/Rinternals.h R-2.9.1/include/Rinternals.h
--- R-2.9.1.Orig/include/Rinternals.h	2009-08-06 17:54:14.802212121 +0200
+++ R-2.9.1/include/Rinternals.h	2009-08-13 16:25:27.357130186 +0200
@@ -811,6 +811,12 @@
 void R_PreserveObject(SEXP);
 void R_ReleaseObject(SEXP);
 
+#ifdef USE_RINTERNALS
+/* register external memory pages */
+void R_RegisterObject(SEXP);
+void R_UnregisterObject(SEXP);
+#endif
+
 /* Shutdown actions */
 void R_dot_Last(void);		/* in main.c */
 void R_RunExitFinalizers(void);	/* in memory.c */
diff -rud R-2.9.1.Orig/src/main/memory.c R-2.9.1/src/main/memory.c
--- R-2.9.1.Orig/src/main/memory.c	2009-08-06 17:54:14.426211791 +0200
+++ R-2.9.1/src/main/memory.c	2009-08-13 17:26:39.477168623 +0200
@@ -2450,6 +2450,37 @@
 }
 
 
+/* Allow to register an object allocated outside of the GC. The object must be
+   a valid SEXP, with a minimum of sizeof(SEXPREC_ALIGN) bytes of r/w memory.
+   The object is automatically added to the protection list in order to avoid
+   collection until the object is unregistered. The node is kept into the
+   oldest/largest heap generation. */
+
+void R_RegisterObject(SEXP object)
+{
+    int class = LARGE_NODE_CLASS;
+    int gen = NUM_OLD_GENERATIONS - 1;
+
+    SET_NODE_CLASS(object, class);
+    SET_NODE_GENERATION(object, gen);
+    SNAP_NODE(object, R_GenHeap[class].Old[gen]);
+    R_GenHeap[class].OldCount[gen]++;
+
+    R_PreserveObject(object);
+}
+
+void R_UnregisterObject(SEXP object)
+{
+    R_ReleaseObject(object);
+
+    UNSNAP_NODE(object);
+
+    int class = NODE_CLASS(object);
+    int gen = NODE_GENERATION(object);
+    R_GenHeap[class].OldCount[gen]--;
+}
+
+
 /* External Pointer Objects */
 SEXP R_MakeExternalPtr(void *p, SEXP tag, SEXP prot)
 {
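
For readers unfamiliar with the copy-on-write mapping mentioned in the message, here is a minimal standalone sketch in C. It is not the extension's code and does not touch R or the patched functions. The names map_file_cow and cow_demo and the temporary file path are hypothetical, chosen only for illustration. It assumes a POSIX system and shows that writes through a MAP_PRIVATE mapping stay local to the process while the file on disk is left untouched.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Map a whole file copy-on-write (hypothetical helper).
   Returns the mapping, or NULL on failure, and stores its length in *len.
   Pages are shared with other processes mapping the same file until this
   process writes to them; such writes never reach the file. */
void *map_file_cow(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t) st.st_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);
    close(fd);                    /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t) st.st_size;
    return p;
}

/* Write four doubles to a temporary file, map it copy-on-write, modify one
   element through the mapping, and check that the file still holds the
   original value. Returns 0 if everything behaved as expected, -1 otherwise. */
int cow_demo(void)
{
    char path[] = "/tmp/cowdemoXXXXXX";   /* illustrative location only */
    double src[4] = {1.0, 2.0, 3.0, 4.0}, back[4];

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, src, sizeof src) != (ssize_t) sizeof src) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    size_t len = 0;
    double *v = map_file_cow(path, &len);
    if (v == NULL) {
        unlink(path);
        return -1;
    }
    v[2] = 42.0;                  /* private write: only this process sees it */

    int ok = -1;
    fd = open(path, O_RDONLY);
    if (fd >= 0) {
        if (read(fd, back, sizeof back) == (ssize_t) sizeof back)
            ok = (len == sizeof src && v[2] == 42.0 && back[2] == 3.0) ? 0 : -1;
        close(fd);
    }

    munmap(v, len);
    unlink(path);
    return ok;
}
```

Calling cow_demo() from a main() is enough to try it. In the scheme described in the message, a mapping of this kind would presumably back the data area of an externally built SEXP, but how the extension lays out the SEXP header inside the mapped pages is not shown here.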