zhangpenggh opened a new issue, #32215: URL: https://github.com/apache/doris/issues/32215
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

doris-2.1.0-rc11

### What's Wrong?

Memory usage on the cluster's BE nodes stays high and does not drop. The BE log shows:

W20240314 10:40:19.775930 23679 mem_tracker_limiter.cpp:303] Process Memory Summary:
os physical memory 31.18 GB. process memory used 26.43 GB, limit 28.06 GB, soft limit 25.25 GB. sys available memory 3.27 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
Memory Tracker Summary:
Type=experimental, Used=0(0 B), Peak=0(0 B)
Type=clone, Used=0(0 B), Peak=0(0 B)
Type=schema_change, Used=0(0 B), Peak=0(0 B)
Type=compaction, Used=0(0 B), Peak=63.52 MB(66609586 B)
Type=load, Used=139.72 MB(146506471 B), Peak=940.94 MB(986652172 B)
Type=query, Used=18.96 GB(20361775651 B), Peak=22.97 GB(24662225761 B)
Type=global, Used=2.85 GB(3058973366 B), Peak=5.09 GB(5467455842 B)
Type=tc/jemalloc cache, Used=961.09 MB(1007771520 B), Peak=-1.00 B(-1 B)
Type=sum of all trackers, Used=22.89 GB(24575027008 B), Peak=-1.00 B(-1 B)
Type=process resident memory, Used=26.43 GB(28380532736 B), Peak=28.35 GB(30437425152 B)
Type=process virtual memory, Used=61.03 GB(65530044416 B), Peak=61.15 GB(65659797504 B)
MemTrackerLimiter Label=Orphan, Type=global, Limit=-1.00 B(-1 B), Used=-98.06 MB(-102819557 B), Peak=98.76 MB(103553375 B)
MemTracker Label=PageNoCache, Parent Label=Orphan, Used=0(0 B), Peak=14.60 MB(15312546 B)
MemTracker Label=IOBufBlockMemory, Parent Label=Orphan, Used=9.69 MB(10158080 B), Peak=26.85 MB(28155904 B)
MemTracker Label=VerticalSegmentWriter:Segment-0, Parent Label=Orphan, Used=5.47 KB(5602 B), Peak=5.47 KB(5602 B)
MemTracker Label=VerticalSegmentWriter:Segment-0, Parent Label=Orphan, Used=5.15 KB(5277 B), Peak=5.15 KB(5277 B)
MemTrackerLimiter Label=DataPageCache[size], Type=global, Limit=-1.00 B(-1 B), Used=2.39 GB(2561238533 B), Peak=4.33 GB(4653908957 B)
MemTrackerLimiter Label=IndexPageCache[size], Type=global, Limit=-1.00 B(-1 B), Used=482.65 MB(506093484 B), Peak=562.29 MB(589601318 B)
MemTrackerLimiter Label=PKIndexPageCache[size], Type=global, Limit=-1.00 B(-1 B), Used=78.03 MB(81818979 B), Peak=316.83 MB(332218969 B)
MemTrackerLimiter Label=PointQueryRowCache[size], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=SegmentCache[number], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=SchemaCache[number], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=CommonObjLRUCache[number], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=PointQueryLookupConnectionCache[size], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=InvertedIndexSearcherCache[size], Type=global, Limit=-1.00 B(-1 B), Used=4.00 MB(4194666 B), Peak=9.00 MB(9438051 B)
MemTrackerLimiter Label=InvertedIndexQueryCache[size], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=LastSuccessChannelCache[size], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=TabletSchemaCache[number], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=MowTabletVersionCache[number], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=CreateTabletRRIdxCache[number], Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
MemTrackerLimiter Label=MowDeleteBitmapAggCache[size], Type=global, Limit=-1.00 B(-1 B), Used=8.06 MB(8447261 B), Peak=8.06 MB(8447261 B)
MemTrackerLimiter Label=MemTableMemoryLimiter, Type=load, Limit=-1.00 B(-1 B), Used=139.72 MB(146506471 B), Peak=1.05 GB(1132129183 B)
MemTrackerLimiter Label=Query#Id=a597e9f3543f4f13-97d72a34fe743f9f, Type=query, Limit=3.00 GB(3221225472 B), Used=127.04 MB(133216043 B), Peak=127.04 MB(133216043 B)
MemTrackerLimiter Label=Query#Id=be38f48bce394e35-9b11b4fa5b03f30f, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=89105a6c18ad4807-a6a679953d96f4f2, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=53f84a373e86462f-8c1ba8f337ab2d8e, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=8448d6e3c98d49b0-b7e135a02e558107, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=83f3e57d3a67464f-b5213fdfdd9cb78e, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=64b9d2b6683a48e3-98897f8ec0d24ebb, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=7c73482277ea4571-b7d9f987e002fc9f, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=e55a9d8e938d454d-840a3c3477054aa2, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=345effe0c1a94406-942b8fcc085a2234, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=c8d55d942f5142b9-8839beb8a295af18, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=bb7ba679728b4b31-8ea0768ecdba6dd9, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=6e09104af4f44933-ae8b3750890692f9, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=Query#Id=4551cc209d344fac-ab1eb0210b5efeb2, Type=query, Limit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), Peak=126.39 MB(132534304 B)
MemTrackerLimiter Label=MemTableMemoryLimiter, Type=load, Limit=-1.00 B(-1 B), Used=139.72 MB(146506471 B), Peak=1.05 GB(1132129183 B)

The log lists a large number of queries. Looking up the audit log by query ID gives:

query_id: a597e9f3543f4f13-97d72a34fe743f9f
time: 2024-03-14 07:41:09.423
client_ip: xxxx
user: xxxx
catalog: internal
db: xxxxx
state: ERR
error_code: 1105
error_message: RpcException, msg: timeout when waiting for send fragments rpc, query timeout:900, left timeout for this operation:30, host: xxxxxxxx
query_time: 30006
scan_bytes: 871628800
scan_rows: 12354560
return_rows: 0
stmt_id: 112845
is_query: 1
frontend_ip: xxxxxxx
cpu_time_ms: 1231
sql_hash: e6a776cd33aac0f401980652b6eef1e9
sql_digest:
peak_memory_bytes: 133216043
workload_group: normal

All of these queries failed several hours earlier, yet they are still present in the current memory tracker.

After the node is restarted, memory usage returns to normal.

In addition, the queries failed because a few nodes were killed by the operating system for using too much memory. After systemctl restarted them automatically, `show backend` reported them as normal, but running a query then raised a query timeout error. Only a manual restart fixed this.

### What You Expected?

Fix the memory leak.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
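For anyone repeating the cross-check between the memory tracker dump and the audit log, the lookup could be scripted roughly as follows. This is only a sketch: the helper name `lingering_queries` and the regular expression are assumptions derived from the log excerpt in this issue, not from any Doris API, and the line format may differ in other versions.

```python
import re

# Pattern inferred from the "MemTrackerLimiter Label=Query#Id=..." entries
# quoted in this issue (an assumption, not an official log format).
# Whitespace after commas is matched loosely because the dump may be wrapped.
QUERY_TRACKER = re.compile(
    r"Label=Query#Id=(?P<query_id>[0-9a-f]+-[0-9a-f]+),\s*Type=query,\s*"
    r"Limit=[^,]*,\s*Used=[^(]*\((?P<used>-?\d+) B\)"
)


def lingering_queries(log_text):
    """Return {query_id: used_bytes} for each per-query tracker in the dump."""
    return {
        m["query_id"]: int(m["used"])
        for m in QUERY_TRACKER.finditer(log_text)
    }


# Three entries copied from the dump above; the load tracker must be skipped.
sample = (
    "MemTrackerLimiter Label=MemTableMemoryLimiter, Type=load, "
    "Limit=-1.00 B(-1 B), Used=139.72 MB(146506471 B), "
    "Peak=1.05 GB(1132129183 B) "
    "MemTrackerLimiter Label=Query#Id=a597e9f3543f4f13-97d72a34fe743f9f, "
    "Type=query, Limit=3.00 GB(3221225472 B), Used=127.04 MB(133216043 B), "
    "Peak=127.04 MB(133216043 B) "
    "MemTrackerLimiter Label=Query#Id=be38f48bce394e35-9b11b4fa5b03f30f, "
    "Type=query, \nLimit=3.00 GB(3221225472 B), Used=126.39 MB(132534304 B), "
    "Peak=126.39 MB(132534304 B)"
)

found = lingering_queries(sample)
for query_id, used in found.items():
    print(query_id, used)
```

The resulting query IDs can then be searched in the FE audit log. An ID whose audit record shows `state: ERR` with a timestamp hours before the dump, as for `a597e9f3543f4f13-97d72a34fe743f9f` above, points to a tracker that was not released.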