[https://issues.apache.org/jira/browse/KAFKA-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840533#comment-16840533]
John Roesler commented on KAFKA-8367:
-------------------------------------
Hi [~pavelsavov],
It looks like you might be using a RocksDBConfigSetter. Is that right?
[~ableegoldman] recently noticed that any objects created in that setter would
allocate off-heap memory that needs to be explicitly freed. She reported
https://issues.apache.org/jira/browse/KAFKA-8324 and created
https://cwiki.apache.org/confluence/display/KAFKA/KIP-453%3A+Add+close%28%29+method+to+RocksDBConfigSetter
to resolve it. It was committed in https://github.com/apache/kafka/pull/6697 .
Is there any chance you can build from the latest trunk and implement the close
method to see if it fixes this for you?
Thanks,
-John
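For reference, the pattern KIP-453 introduces looks roughly like the sketch below: any native RocksDB object created in setConfig() (a cache, filter, etc.) holds off-heap memory, and the new close() callback is where it gets released. The class name and cache size here are illustrative, not taken from the issue.

```java
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;

public class BoundedMemoryConfigSetter implements RocksDBConfigSetter {
    // Native RocksDB object: allocates off-heap memory that the JVM GC never frees.
    private Cache cache;

    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        cache = new LRUCache(4 * 1024 * 1024L); // 4 MiB block cache (illustrative)
        final BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
        tableConfig.setBlockCache(cache);
        options.setTableFormatConfig(tableConfig);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // New in KIP-453: invoked when the store closes, so native objects
        // created in setConfig() can release their off-heap memory here.
        cache.close();
    }
}
```

Before KIP-453 there was no hook where such objects could be freed, which is why a config setter that allocates them leaks off-heap memory on every store close/reopen (e.g. during rebalances).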
> Non-heap memory leak in Kafka Streams
> -------------------------------------
>
> Key: KAFKA-8367
> URL: https://issues.apache.org/jira/browse/KAFKA-8367
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 2.2.0
> Reporter: Pavel Savov
> Priority: Major
> Attachments: memory-prod.png, memory-test.png
>
>
> We have been observing a non-heap memory leak after upgrading to Kafka
> Streams 2.2.0 from 2.0.1. We suspect the source to be around RocksDB as the
> leak only happens when we enable stateful stream operations (utilizing
> stores). We are aware of *KAFKA-8323* and have created our own fork of 2.2.0
> and ported the fix scheduled for release in 2.2.1 to our fork. It did not
> stop the leak, however.
> We are seeing this memory leak both in our production environment, where the
> consumer group is auto-scaled in and out in response to changes in traffic
> volume, and in our test environment, where we have two consumers, no
> autoscaling, and relatively constant traffic.
> Below is some information I'm hoping will be of help:
> * RocksDB Config:
> Block cache size: 4 MiB
> Write buffer size: 2 MiB
> Block size: 16 KiB
> Cache index and filter blocks: true
> Manifest preallocation size: 64 KiB
> Max write buffer number: 3
> Max open files: 6144
>
> * Memory usage in production
> The attached graph (memory-prod.png) shows memory consumption for each
> instance as a separate line. The horizontal red line at 6 GiB is the memory
> limit.
> As illustrated in the attached graph from production, memory consumption in
> running instances goes up around autoscaling events (scaling the consumer
> group either in or out) and the associated rebalancing. It stabilizes until
> the next autoscaling event, but it never goes back down.
> An example of scaling out can be seen from around 21:00, when three new
> instances are started in response to a traffic spike.
> Just after midnight traffic drops and some instances are shut down. Memory
> consumption in the remaining running instances goes up.
> Memory consumption climbs again from around 6:00 AM as traffic increases and
> new instances are started until around 10:30 AM. It never drops until the
> cluster is restarted around 12:30.
>
> * Memory usage in test
> As illustrated by the attached graph (memory-test.png), we have a fixed number
> of two instances in our test environment and no autoscaling. Memory
> consumption rises linearly until it reaches the limit (around 2:00 AM on
> 5/13), at which point Mesos restarts the offending instances, or we restart
> the cluster manually.
>
> * No heap leaks observed
> * Window retention: 2 or 11 minutes (depending on operation type)
> * Issue not present in Kafka Streams 2.0.1
> * No memory leak for stateless stream operations (when no RocksDB stores are
> used)
>
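The RocksDB configuration listed in the report above corresponds roughly to a config setter like the following sketch (the class name is hypothetical; whether the block cache is actually created inside the setter, as shown here, determines whether KIP-453's close() hook is needed to free it):

```java
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;

public class ReportedRocksDBConfig implements RocksDBConfigSetter {
    private Cache cache;

    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        cache = new LRUCache(4 * 1024 * 1024L);           // block cache size: 4 MiB
        final BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
        tableConfig.setBlockCache(cache);
        tableConfig.setBlockSize(16 * 1024L);             // block size: 16 KiB
        tableConfig.setCacheIndexAndFilterBlocks(true);   // cache index/filter blocks
        options.setTableFormatConfig(tableConfig);
        options.setWriteBufferSize(2 * 1024 * 1024L);     // write buffer size: 2 MiB
        options.setMaxWriteBufferNumber(3);               // max write buffer number: 3
        options.setManifestPreallocationSize(64 * 1024L); // manifest prealloc: 64 KiB
        options.setMaxOpenFiles(6144);                    // max open files: 6144
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Free the cache's off-heap memory when the store closes (per KIP-453).
        cache.close();
    }
}
```

If the cache is allocated like this on every store open and never closed, each rebalance-driven store close/reopen would leak native memory, which matches the pattern described around the autoscaling events.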
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)