Greetings! This is just a discussion post on where things stand. Please feel free to skip whatever you wish, but any feedback is of course helpful.
Bob makes the excellent point that we should design things to make one process run as fast as possible, and worry about other jobs as little as possible. Given my experiments thus far, it looks like this approach might win out in any case. Bob, I hope you are pleased by this :-). That said, attempting to use all of physical ram conflicts openly with running multiple jobs, so something must be done, even if minimal.

The minimal solution is this environment variable: GCL_MEM_MULTIPLE=0.125 will multiply the physical ram seen by each process by this value. So make -j 8 GCL_MEM_MULTIPLE=0.125 is the logical approach, though one might do better by raising the 0.125 somewhat, as all jobs won't use all of that memory anyway. On the plus side, each process decides when to start gc independently. On the minus side, big jobs will bear a larger gc load than they would have to in theory.

The other approach is this environment variable: GCL_MULTIPROCESS_MEMORY_POOL=t, which (only) when set will maintain a shared locked file, /tmp/gcl_pool, containing the summed resident set size of all participating processes, and use that sum as the value to compare against physical ram when deciding we're full enough to start gc. This is working, and one can see (via top) how big jobs are afforded more ram. Paradoxically, it may or may not improve the overall regression time. We'll know more here soon.

There are two environment variables which jointly determine the gc threshold. GCL_GC_PAGE_THRESH (default 0.75) means we will not start gc until the data size is at least 0.75 of physical ram. This could be set to 1.0, and perhaps logically should be, but remember that GCL is constantly calling gcc in a subprocess, which can be a memory hog leading to a swap storm. Alas, at this point I know of no way to manage the memory use of gcc, so this value is a heuristic. GCL_GC_ALLOCATION_THRESH (default 0.125) means we will not gc until we have allocated, since the last gc, one eighth of physical ram.
This is an alternative solution to the problem of rebalancing maxpages, whereby a job could load up on cons for a long time, leave a tiny array allocation, then start allocating arrays when there is no more physical ram to expand into. Recall that the variable si::*optimize-maximum-pages* would attempt to collect gc statistics and rebalance these maxpage limits based on the actual demand. This is OK as a workaround, but it does require that you start collecting statistics before it's 'too late' and you've already allocated most of physical ram. The real problem is that gc cost is proportional to heap size and live heap size, and triggering on an unrelated quantity (suballocation of a given data element size) makes no real sense. Earlier in the 2.6.13 series, we found that simply scaling the maxpages to physical ram at the outset was a big win, but then again, all we had to scale by was the current allocation in the saved image, which makes no real sense either.

So in short: when si::*optimize-maximum-pages* is set, GCL will now ignore maxpage settings as a gc trigger and use the above thresholds instead. When it is unset, GCL will use minimal maxpage expansion via its traditional algorithm and trigger (frequent) gc when these maxpage limits are hit, without any attempt to collect statistics to expand/rebalance them. This latter mode is to be used when preparing a small image to be saved to disk, e.g. at acl2 build time.

My concern is that there appear to be too many variables here. At a minimum, we need a 'small image to be saved to disk' mode, a 'use as much ram as possible for speed' mode, and some mechanism to reduce the ram used when running multiple jobs. But in principle the last three environment variables could be removed and replaced with constants.

Version_2_6_13pre14a is built and installed at ut, and has been undergoing testing since last night. It looks solid so far. Thoughts most appreciated.
Take care,

Robert Boyer <[email protected]> writes:

>> This seems closest in the spirit to sol-gc.
>
> As best I can guess, Acl2 is headed towards not using sol-gc in CCL in
> the 7.1 release of Acl2.
>
> It's not my place to speak, and those who know may say that any problem
> with sol-gc may have been, who really knows, that it was using
> interrupts of the gc and that was too dangerous to do. Interrupts
> should scare the crap out of anyone.
>
> But Sol's main idea I think was to allocate a hell of a lot of memory,
> all of the memory, for the heap to free space after a gc in order to
> keep gc costs as low as possible for this one process. And to hell with
> any other processes except this one.
>
> Camm,
>
> I think that your objective should be for j=1 speed and not j=8 at all.
> The ordinary user almost all of the time is using j=1, and as far as I
> know, only people like Matt regularly use j=8, and that only for
> regression testing before they release a new version of Acl2.
>
> Just my two cents worth. I would certainly go with whatever Matt
> advises, rather than with what I advise.
>
> Bob
>
> On Mon, May 4, 2015 at 11:51 AM, Camm Maguire <[email protected]> wrote:

--
Camm Maguire                                            [email protected]
==========================================================================
"The earth is but one country, and mankind its citizens."  -- Baha'u'llah
