Thanks for this. Does anyone know how I can find out what those initial settings are from within R? Do I need to look at both of the environment variables R_NSIZE and R_VSIZE and also parse commandArgs()?
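Something along these lines is what I'm imagining (untested; `min_setting` is just a name I made up, the fallback defaults are the compiled-in R_NSIZE/R_VSIZE values from Defn.h quoted further down, and it ignores the 'K'/'M'/'G' suffixes and the Min_*/Max_* clamping that the startup code applies):

    ## Untested sketch: report a minimum-heap setting by checking the command
    ## line first, then the environment, then the compiled-in default.
    ## Ignores unit suffixes ('4M') and the Min_*/Max_* clamping done at startup.
    min_setting <- function(flag, envvar, default) {
        arg <- grep(sprintf("^%s=", flag), commandArgs(), value = TRUE)
        if (length(arg)) {
            sub(sprintf("^%s=", flag), "", arg[1])
        } else {
            Sys.getenv(envvar, unset = default)
        }
    }
    min_setting("--min-nsize", "R_NSIZE", "350000")    # cons cells
    min_setting("--min-vsize", "R_VSIZE", "6291456")   # bytes
    ## The current (possibly grown) heap sizes show up in the "gc trigger"
    ## column of gc(), but that is not the same as the startup minimums:
    gc()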
/Henrik

On Tue, Jan 20, 2015 at 1:42 AM, Martin Maechler
<maech...@stat.math.ethz.ch> wrote:
>>>>>> Peter Haverty <haverty.pe...@gene.com>
>>>>>>     on Mon, 19 Jan 2015 08:50:08 -0800 writes:
>
> > Hi All, this is a very important issue. It would be very
> > sad to leave most users unaware of a free speedup of this
> > size. These options don't appear in the R --help
> > output. They really should be added there.
>
> Indeed, I'd found that myself and added them there about
> 24 hours ago.
> ((I think they were accidentally dropped a while ago))
>
> > If the garbage collector is working very hard, might it
> > emit a note about better settings for these variables?
>
> > It's not really my place to comment on design philosophy,
> > but if there is a configure option for small-memory
> > machines, I would assume that would be sufficient for the
> > folks who are not on fairly current hardware.
>
> There are quite a few more issues with this,
> notably how the growth *steps* are done.
> That has been somewhat experimental and for that reason is
> _currently_ quite configurable via R_GC_* environment variables;
> see the code in src/main/memory.c.
>
> This is currently being discussed "privately" within R core.
> I'm somewhat confident that R 3.2.0 in April will have changes.
>
> And -- coming back to the beginning -- at least the "R-devel" version now
> shows
>
>   R --help | grep -e min-.size
>
>   --min-nsize=N   Set min number of fixed size obj's ("cons cells") to N
>   --min-vsize=N   Set vector heap minimum to N bytes; '4M' = 4 MegaB
>
> --
> Martin Maechler, ETH Zurich
>
> > On Sat, Jan 17, 2015 at 11:40 PM, Nathan Kurz <n...@verse.com> wrote:
>
> >> On Thu, Jan 15, 2015 at 3:55 PM, Michael Lawrence
> >> <lawrence.mich...@gene.com> wrote:
> >> > Just wanted to start a discussion on whether R could ship with more
> >> > appropriate GC parameters.
>
> >> I've been doing a number of similar measurements, and have come to the
> >> same conclusion. R is currently very conservative about memory usage,
> >> and this leads to unnecessarily poor performance on certain problems.
> >> Changing the defaults to sizes that are more appropriate for modern
> >> machines can often produce a 2x speedup.
>
> >> On Sat, Jan 17, 2015 at 8:39 AM, <luke-tier...@uiowa.edu> wrote:
> >> > Martin Morgan discussed this a year or so ago and as I recall bumped
> >> > up these values to the current defaults. I don't recall details about
> >> > why we didn't go higher -- maybe Martin does.
>
> >> I just checked, and it doesn't seem that any of the relevant values
> >> have been increased in the last ten years. Do you have a link to the
> >> discussion you recall, so we can see why the changes weren't made?
>
> >> > I suspect the main concern would be with small-memory machines in
> >> > student labs and less developed countries.
>
> >> While that is a reasonable concern, I'm doubtful there are many machines
> >> for which the current numbers are optimal. The current minimum size
> >> increases for the node and vector heaps are 40KB and 80KB respectively.
> >> The increment grows with the heap (min + 0.05 * heap), but this still
> >> means we do many more expensive garbage collections while growing than
> >> we need to. Paradoxically, the SMALL_MEMORY compile option (which is
> >> suggested for computers with up to 32MB of RAM) has slightly larger
> >> minimum increments, at 50KB and 100KB.
>
> >> I think we'd get significant benefit for most users by being less
> >> conservative about memory consumption.
> >> The exact sizes should be discussed, but with RAM costing about
> >> $10/GB it doesn't seem unreasonable to assume most machines running R
> >> have multiple GB installed, and those that don't will quite likely be
> >> running an OS that needs a custom-compiled binary anyway.
>
> >> I could be way off, but my suggestion would be a 10MB start with 1MB
> >> minimum increments for SMALL_MEMORY, a 100MB start with 10MB increments
> >> for NORMAL_MEMORY, and a 1GB start with 100MB increments for
> >> LARGE_MEMORY; that might be a reasonable spread.
>
> >> Or one could go even larger, noting that on most systems overcommitted
> >> memory is not a problem until it is used. Until we write to it, it
> >> doesn't actually use physical RAM, just virtual address space. Or we
> >> could stay small, but make it possible to programmatically increase
> >> the granularity from within R.
>
> >> For ease of reference, here are the relevant sections of code:
>
> >> https://github.com/wch/r-source/blob/master/src/include/Defn.h#L217
> >> (ripley last authored on Jan 26, 2000 / pd last authored on May 8, 1999)
> >> 217 #ifndef R_NSIZE
> >> 218 #define R_NSIZE 350000L
> >> 219 #endif
> >> 220 #ifndef R_VSIZE
> >> 221 #define R_VSIZE 6291456L
> >> 222 #endif
>
> >> https://github.com/wch/r-source/blob/master/src/main/startup.c#L169
> >> (ripley last authored on Jun 9, 2004)
> >> 157 Rp->vsize = R_VSIZE;
> >> 158 Rp->nsize = R_NSIZE;
> >> 166 #define Max_Nsize 50000000 /* about 1.4Gb 32-bit, 2.8Gb 64-bit */
> >> 167 #define Max_Vsize R_SIZE_T_MAX /* unlimited */
> >> 169 #define Min_Nsize 220000
> >> 170 #define Min_Vsize (1*Mega)
>
> >> https://github.com/wch/r-source/blob/master/src/main/memory.c#L335
> >> (luke last authored on Nov 1, 2000)
> >> 335 #ifdef SMALL_MEMORY
> >> 336 /* On machines with only 32M of memory (or on a classic Mac OS port)
> >> 337    it might be a good idea to use settings like these that are more
> >> 338    aggressive at keeping memory usage down. */
> >> 339 static double R_NGrowIncrFrac = 0.0, R_NShrinkIncrFrac = 0.2;
> >> 340 static int R_NGrowIncrMin = 50000, R_NShrinkIncrMin = 0;
> >> 341 static double R_VGrowIncrFrac = 0.0, R_VShrinkIncrFrac = 0.2;
> >> 342 static int R_VGrowIncrMin = 100000, R_VShrinkIncrMin = 0;
> >> 343 #else
> >> 344 static double R_NGrowIncrFrac = 0.05, R_NShrinkIncrFrac = 0.2;
> >> 345 static int R_NGrowIncrMin = 40000, R_NShrinkIncrMin = 0;
> >> 346 static double R_VGrowIncrFrac = 0.05, R_VShrinkIncrFrac = 0.2;
> >> 347 static int R_VGrowIncrMin = 80000, R_VShrinkIncrMin = 0;
> >> 348 #endif
>
> >> static void AdjustHeapSize(R_size_t size_needed)
> >> {
> >>     R_size_t R_MinNFree = (R_size_t)(orig_R_NSize * R_MinFreeFrac);
> >>     R_size_t R_MinVFree = (R_size_t)(orig_R_VSize * R_MinFreeFrac);
> >>     R_size_t NNeeded = R_NodesInUse + R_MinNFree;
> >>     R_size_t VNeeded = R_SmallVallocSize + R_LargeVallocSize +
> >>         size_needed + R_MinVFree;
> >>     double node_occup = ((double) NNeeded) / R_NSize;
> >>     double vect_occup = ((double) VNeeded) / R_VSize;
> >>
> >>     if (node_occup > R_NGrowFrac) {
> >>         R_size_t change = (R_size_t)(R_NGrowIncrMin + R_NGrowIncrFrac * R_NSize);
> >>         if (R_MaxNSize >= R_NSize + change)
> >>             R_NSize += change;
> >>     }
> >>     else if (node_occup < R_NShrinkFrac) {
> >>         R_NSize -= (R_NShrinkIncrMin + R_NShrinkIncrFrac * R_NSize);
> >>         if (R_NSize < NNeeded)
> >>             R_NSize = (NNeeded < R_MaxNSize) ? NNeeded : R_MaxNSize;
> >>         if (R_NSize < orig_R_NSize)
> >>             R_NSize = orig_R_NSize;
> >>     }
> >>
> >>     if (vect_occup > 1.0 && VNeeded < R_MaxVSize)
> >>         R_VSize = VNeeded;
> >>     if (vect_occup > R_VGrowFrac) {
> >>         R_size_t change = (R_size_t)(R_VGrowIncrMin + R_VGrowIncrFrac * R_VSize);
> >>         if (R_MaxVSize - R_VSize >= change)
> >>             R_VSize += change;
> >>     }
> >>     else if (vect_occup < R_VShrinkFrac) {
> >>         R_VSize -= R_VShrinkIncrMin + R_VShrinkIncrFrac * R_VSize;
> >>         if (R_VSize < VNeeded)
> >>             R_VSize = VNeeded;
> >>         if (R_VSize < orig_R_VSize)
> >>             R_VSize = orig_R_VSize;
> >>     }
> >>
> >>     DEBUG_ADJUST_HEAP_PRINT(node_occup, vect_occup);
> >> }
>
> >> Rp->nsize is overridden at startup by the environment variable R_NSIZE
> >> if Min_Nsize <= $R_NSIZE <= Max_Nsize, and Rp->vsize is overridden at
> >> startup by the environment variable R_VSIZE if Min_Vsize <= $R_VSIZE
> >> <= Max_Vsize. These are then used to set the global variables R_NSize
> >> and R_VSize, with R_SetMaxVSize(Rp->max_vsize).
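PS. In case it is useful for reproducing the kind of speedup discussed above before any defaults change: a rough, untested way to compare GC effort is to run the same workload in two child R processes, once with the compiled-in defaults and once with larger --min-nsize/--min-vsize. The toy expression, the particular sizes, and the assumption that "R" is on the PATH are all just placeholders for illustration.

    ## Rough, untested comparison of GC effort under default vs. larger minimums.
    ## gcinfo(TRUE) prints a line at every collection, so fewer lines means
    ## less GC work; system.time() shows the overall effect on the workload.
    workload <- 'gcinfo(TRUE); print(system.time(x <- replicate(1e4, runif(1e3)))); print(gc())'

    ## Compiled-in defaults:
    system2("R", c("--vanilla", "--slave", "-e", shQuote(workload)))

    ## Larger startup minimums (values picked arbitrarily for illustration):
    system2("R", c("--vanilla", "--slave",
                   "--min-nsize=10000000", "--min-vsize=500M",
                   "-e", shQuote(workload)))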