Hi,
I think I have finally been able to figure out what is happening. First, here
are the steps to reproduce it on any kernel (it is not specific to 3.16/3.18 as
I said earlier):
1. On an SSD, create a big journal partition, say > 20 GB or so.
2. Keep all the filestore/journal parameters at their defaults other than the
following; set these values in your conf file. This is just to make sure journal
writes are not throttled and can run far ahead of the backend writes.
filestore_queue_max_ops = 5000000
filestore_queue_max_bytes = 1000000000000
filestore_queue_committing_max_ops = 5000000
filestore_queue_committing_max_bytes = 1000000000000
3. Run any release, say Firefly/Giant/Hammer, and create a single-OSD cluster,
giving the rest of the SSD as the data partition.
4. Run, say, an fio rbd random-write workload with 16K block size, QD 64,
numjobs=8 (a sample job file is below).
5. Run 'dstat -m' and watch how the used memory keeps rising.
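For reference, the fio job I run looks roughly like the following (a sketch;
the pool and image names are placeholders for whatever you created in step 3):

    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=testimage
    rw=randwrite
    bs=16k
    iodepth=64
    numjobs=8
    time_based=1
    runtime=600

    [rbd-randwrite]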
Now, I saw this behavior with glibc malloc/tcmalloc/jemalloc, as I communicated
earlier. But I didn't wait long enough earlier to see whether the memory comes
down when the IO is stopped :-)
I saw that in the case of tcmalloc it does not come down, *but* in the case of
jemalloc it *does come down* to the old level, in the following way. I didn't go
back to glibc malloc again, but it should be releasing as well:
1. If I stop the IO, journal writes stop, but the backend flash keeps catching
up and the memory comes down accordingly. This is expected: all the transactions
pile up in the workqueue while the journal is way ahead, but the moment they are
processed the transactions are deleted and the memory usage drops.
2. Once the journal is full, it starts throttling and the overall IO rate comes
down. The backend flash now has the opportunity to catch up and thus the memory
is released.
None of the above happens in the case of tcmalloc; there it is not releasing the
memory at all. Digging through some of the documentation, I found that this can
happen and that there is a flag to control the release rate, but I had no luck
after changing that either. I didn't invest much time on that, though.
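For reference, the knob I tried is the release rate, which gperftools exposes as
an environment variable; the command line below is just a sketch of starting the
OSD by hand:

    # Higher values make tcmalloc return freed pages to the OS more
    # aggressively (the default is 1.0); it is read by the process at startup.
    TCMALLOC_RELEASE_RATE=10 ceph-osd -i 0 -c /etc/ceph/ceph.conf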
What I saw the next time I ran IO is that, in the case of tcmalloc, the memory
does not rise at the beginning; probably it is reusing that memory, and it
starts rising again after some time. But I doubt this behavior is good.
So, this could be another argument for removing tcmalloc as Ceph's default
allocator and moving to jemalloc.
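For anyone who wants to repeat the allocator comparison, this is roughly how the
allocator is switched at build time on a hammer-era autotools tree (flag names
from memory, so please verify against './configure --help'):

    # default build links gperftools tcmalloc
    ./autogen.sh && ./configure && make
    # link jemalloc instead of tcmalloc
    ./configure --without-tcmalloc --with-jemalloc && make
    # plain glibc malloc
    ./configure --without-tcmalloc && make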
Thanks, Greg, for asking me to take another look at tcmalloc; otherwise I was
kind of out of options :-)
Regards
Somnath
-----Original Message-----
From: Somnath Roy
Sent: Wednesday, July 01, 2015 4:58 PM
To: 'Gregory Farnum'
Cc: [email protected]
Subject: RE: Probable memory leak in Hammer write path ?
Thanks Greg!
Yeah, I will double check. But I built the code without tcmalloc (with glibc
malloc) and it was also showing similar behavior.
Thanks & Regards
Somnath
-----Original Message-----
From: Gregory Farnum [mailto:[email protected]]
Sent: Wednesday, July 01, 2015 9:07 AM
To: Somnath Roy
Cc: [email protected]
Subject: Re: Probable memory leak in Hammer write path ?
On Mon, Jun 29, 2015 at 4:39 PM, Somnath Roy <[email protected]> wrote:
> Greg,
> Updating to the new kernel updates the gcc version too. The recent kernel
> also changes the tcmalloc version, but 3.16 has the old tcmalloc and is
> still exhibiting the issue.
> Yes, the behavior is very confusing, and the compiler is the main variable I
> could think of from the application perspective.
> If you have a 3.16/3.19 kernel, you can reproduce this by following these
> steps.
>
> 1. Build ceph-hammer code base
>
> 2. Run with single OSD.
>
> 3. Create an image and run an fio rbd workload from a client (say 16K bs,
> 8 numjobs)
>
> 4. Run 'dstat -m' and observe the memory usage.
>
> What I am thinking of doing is to install ceph from ceph.com and see the
> behavior.
In addition to that, I'd look for any known bugs in the tcmalloc version you're
using on the leaky systems, and check the tcmalloc stats to see whether it is
holding a bunch of free memory which hasn't been released to the OS yet.
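(When the OSD is built with tcmalloc, those stats should be reachable through
the heap commands, roughly as below; commands as I recall them on hammer, so
verify on your build, and adjust the OSD id to your single-OSD setup:

    # dump tcmalloc's view of in-use vs. freed-but-unreleased memory
    ceph tell osd.0 heap stats
    # ask tcmalloc to hand freed pages back to the OS
    ceph tell osd.0 heap release
)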
-Greg