walterddr commented on code in PR #10108: URL: https://github.com/apache/pinot/pull/10108#discussion_r1067508840
########## pinot-query-runtime/src/main/java/org/apache/pinot/query/mailbox/GrpcMailboxService.java: ########## @@ -44,16 +48,17 @@ * </ul> */ public class GrpcMailboxService implements MailboxService<TransferableBlock> { + private static final Duration MAILBOX_CACHE_EXPIRY = Duration.ofMinutes(5); // channel manager private final ChannelManager _channelManager; private final String _hostname; private final int _mailboxPort; // maintaining a list of registered mailboxes. - private final ConcurrentHashMap<String, ReceivingMailbox<TransferableBlock>> _receivingMailboxMap = - new ConcurrentHashMap<>(); - private final ConcurrentHashMap<String, SendingMailbox<TransferableBlock>> _sendingMailboxMap = - new ConcurrentHashMap<>(); + private final Cache<String, ReceivingMailbox<TransferableBlock>> _receivingMailboxCache = + CacheBuilder.newBuilder().expireAfterAccess(MAILBOX_CACHE_EXPIRY.toMinutes(), TimeUnit.MINUTES).build(); + private final Cache<String, SendingMailbox<TransferableBlock>> _sendingMailboxCache = + CacheBuilder.newBuilder().expireAfterAccess(MAILBOX_CACHE_EXPIRY.toMinutes(), TimeUnit.MINUTES).build(); Review Comment: IMO 1. there's a very good concept we should apply first: the ownership of a mailbox should be handed over from this ConcurrentHashMap to the OpChain when it is being registered. - with this logic i think expiring the mailbox that are still kept in the concurrent hashmap is good. 2. If we are hitting OOM it is highly unlikely that it is due to the mailbox concurrent hashmap, but b/c the mailbox it is holding on (GRPC or InMemory is holding on to the cached data in the blocking queue but the OpChain that suppose to consume it has already gone. (e.g.. the object itself is small, but it can hold large data buffer) - with this in mind I think it is all the more reason to clarify the ownership issue first. CC @61yao and @agavra please share your thoughts as well. I know that operator resource management is a big item. But I was thinking if we have a suitable-sized MVP that we can nailed down and then we can continue to iterate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org