Greetings,

I noticed that all the viewers using tcmalloc (mesh viewers under Linux
and Windows, since those use SSE2 and need memory-aligned malloc()
and new() calls that their standard library does not provide) suffer
from a serious problem: they never release memory back to the system.
After visiting a crowded place and camming around a lot, the viewer
can occupy 2.5 GB of memory, and even after you TP out to a skybox
with almost no objects/textures and no avatar around, the viewer
retains the full amount of allocated memory for itself.

Worse, should you manage to keep the viewer from crashing for a full
hour or so, its allocated memory gets so badly fragmented that it
slows to a crawl and finally crashes, even in quiet sims.

Bao Linden recently worked on private memory pools to work around these
issues, but so far, despite his hard work, the result is less than
satisfactory: the memory is still never released to the system, and the
viewers using private memory pools crash every few minutes after issuing
a warning:
"LLPluginProcessParent::poll: apr_pollset_poll failed with status 4"

Well, be happy, since I found an easy workaround for these problems
while working on the Cool VL Viewer v1.26.1 (the mesh branch).

tcmalloc is actually supposed to release back to the system the memory
freed by the application using it, but it does so only after a certain
number of memory blocks have been freed. There is an environment
variable that you can set (TCMALLOC_RELEASE_RATE) to adjust the "rate"
at which tcmalloc will release the freed blocks back to the system.
In fact, this is not really a rate but a divisor: the number of freed
blocks is divided by the rate (when it is non-zero; a rate of 0 means
"never release memory") and the result is compared to a threshold. If
the result is below the threshold, the freed blocks are released.
The documentation for tcmalloc says that "Reasonable rates are in the
range [0,10]", but even with a rate of 10, you never get the viewer to
release more than a couple of hundred megabytes out of 2+ GB of
allocated memory. It occurred to me that the algorithm tcmalloc uses is
simply crippled!

The good news is that if you pass an "unreasonable" rate, tcmalloc
will finally release memory (the more "unreasonable" the rate, the more
memory is released). With a rate of 10000 (yes, ten thousand), you
get the viewer to release everything when it doesn't need it any more,
which matches the behaviour of tcmalloc-less viewers.

Since the Windows builds don't use a wrapper script to launch the
viewer, it is however best to hardcode this new rate as the default
one in tcmalloc itself. This is what I did for the Cool VL Viewer
and it works like a charm. There is only one line to change in
the tcmalloc source, in src/page_heap.cc:
DEFINE_double(tcmalloc_release_rate,
    EnvToDouble("TCMALLOC_RELEASE_RATE", 10000.0),    // <--- HERE
    "Rate at which we release unused memory to the system.  "
    "Zero means we never release memory back to the system.  "
    "Increase this flag to return memory faster; decrease it "
    "to return memory slower.  Reasonable rates are in the "
    "range [0,10]");

Now, the viewer runs rock solid (just like the non-mesh, tcmalloc-less
version) and uses very reasonable amounts of memory. It also doesn't
suffer from memory fragmentation any more, since that is transparently
taken care of by the OS (via the page table and the PMMU of the CPU,
something neither tcmalloc nor Bao's private memory pool can do since
these are userspace code).
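
Note that with the one-line change in src/page_heap.cc, the
TCMALLOC_RELEASE_RATE environment variable should still take precedence
when it is set, so Linux wrapper scripts can keep overriding the rate.
Here is a minimal sketch of that lookup. It assumes that EnvToDouble
simply falls back to its second argument when the variable is unset;
env_to_double_sketch() and effective_release_rate() are hypothetical
names of mine for illustration, not tcmalloc functions:

```cpp
#include <cstdlib>

// Hypothetical helper (NOT part of tcmalloc): returns the environment
// variable `name` parsed as a double, or `fallback` when it is unset.
// This mirrors the assumed behaviour of tcmalloc's EnvToDouble.
double env_to_double_sketch(const char* name, double fallback) {
    const char* value = std::getenv(name);
    if (value == nullptr) {
        return fallback;  // variable not set: use the compiled-in default
    }
    return std::strtod(value, nullptr);
}

// Release rate the allocator would start with once the default is
// patched to 10000 (hypothetical wrapper, for illustration only).
double effective_release_rate() {
    return env_to_double_sketch("TCMALLOC_RELEASE_RATE", 10000.0);
}
```

With the variable unset, effective_release_rate() yields the patched
default; exporting TCMALLOC_RELEASE_RATE=5 before launching would yield
5 instead.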

For what it is worth...

Henri.
_______________________________________________
Policies and (un)subscribe information available here:
http://wiki.secondlife.com/wiki/OpenSource-Dev
Please read the policies before posting to keep unmoderated posting privileges
