On 08/19/2015 03:07 AM, Haomai Wang wrote:
On Wed, Aug 19, 2015 at 1:36 PM, Somnath Roy <[email protected]> wrote:
Mark,
Thanks for verifying this. Nice report!
Since there is a big difference in memory consumption with jemalloc, I would 
say recovery performance data, or client performance data taken during 
recovery, would be helpful.


The RSS memory usage in the report is per OSD, I guess (really?). It
can't be ignored, since it's a significant improvement in memory usage.

Do you mean with tcmalloc? I think it's a tough decision. For jemalloc, 300MB more RSS per OSD does add up (about 18GB for 60 OSDs). On the other hand, memory is such a small fraction of the overall cost of systems like this that it might be worth switching over anyway. In the 4K write tests it's pretty clear that even with the 128MB thread cache, TCMalloc is suffering while jemalloc appears to still have headroom left. It's possible that bumping the thread cache even higher might help TCMalloc close the gap, though. It's also possible that jemalloc might have worse memory behavior under recovery scenarios, as we discussed at the hackathon (and as Somnath mentioned above), so I think we probably need to run those tests.
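
For anyone who wants to experiment with that, here's a minimal sketch (untested against our configs) of raising the thread cache cap at runtime through the gperftools MallocExtension interface. This assumes a tcmalloc build new enough to expose the "tcmalloc.max_total_thread_cache_bytes" numeric property; it's the same knob the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable controls:

// Sketch: raise TCMalloc's aggregate thread cache cap at runtime.
// Assumes the "tcmalloc.max_total_thread_cache_bytes" property is
// available in the linked gperftools.
#include <gperftools/malloc_extension.h>
#include <cstdio>

int main() {
  const size_t cache_bytes = 128UL << 20;  // 128MB, as in the tests above
  MallocExtension* ext = MallocExtension::instance();
  if (ext->SetNumericProperty("tcmalloc.max_total_thread_cache_bytes",
                              cache_bytes)) {
    size_t actual = 0;
    ext->GetNumericProperty("tcmalloc.max_total_thread_cache_bytes", &actual);
    printf("thread cache cap is now %zu bytes\n", actual);
  } else {
    printf("property not supported by this tcmalloc\n");
  }
  return 0;
}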


Thanks & Regards
Somnath

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Mark Nelson
Sent: Tuesday, August 18, 2015 9:46 PM
To: ceph-devel
Subject: Ceph Hackathon: More Memory Allocator Testing

Hi Everyone,

One of the goals at the Ceph Hackathon last week was to examine how to improve 
Ceph Small IO performance.  Jian Zhang presented findings showing a dramatic 
improvement in small random IO performance when Ceph is used with jemalloc.  
His results build upon Sandisk's original findings that the default thread 
cache values are a major bottleneck in TCMalloc 2.1.  To further verify these 
results, we sat down at the Hackathon and configured the new performance test 
cluster that Intel generously donated to the Ceph community laboratory to run 
through a variety of tests with different memory allocator configurations.  
I've since written up the results of those tests in PDF form for folks who are 
interested.

The results are located here:

http://nhm.ceph.com/hackathon/Ceph_Hackathon_Memory_Allocator_Testing.pdf

I want to be clear that many other folks have done the heavy lifting here.  
These results are simply a validation of the many tests that other folks have 
already done.  Many thanks to Sandisk and others for figuring this out as it's 
a pretty big deal!

Side note:  Very little tuning was done for these tests beyond swapping the 
memory allocator and setting a couple of quick-and-dirty Ceph tunables. It's 
quite possible that higher IOPS will be achieved as we really start digging 
into the cluster and learning what the bottlenecks are.
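
As an aside, if you're preloading an allocator rather than rebuilding (a hypothetical setup, not necessarily how these tests were run), a quick way to sanity-check which allocator a process actually picked up is to probe for jemalloc's mallctl symbol:

// Sketch: detect whether jemalloc is the live allocator in this process.
// jemalloc exports mallctl(); tcmalloc and glibc malloc do not, so finding
// it via dlsym() is a reasonable (if heuristic) indicator.
#include <dlfcn.h>   // RTLD_DEFAULT needs _GNU_SOURCE (g++ defines it);
#include <cstdio>    // link with -ldl on older glibc

int main() {
  void* sym = dlsym(RTLD_DEFAULT, "mallctl");
  printf("allocator looks like: %s\n",
         sym ? "jemalloc (mallctl found)" : "not jemalloc (mallctl absent)");
  return 0;
}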

Thanks,
Mark
