walterddr commented on code in PR #10108:
URL: https://github.com/apache/pinot/pull/10108#discussion_r1067508840


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/mailbox/GrpcMailboxService.java:
##########
@@ -44,16 +48,17 @@
  * </ul>
  */
 public class GrpcMailboxService implements MailboxService<TransferableBlock> {
+  private static final Duration MAILBOX_CACHE_EXPIRY = Duration.ofMinutes(5);
   // channel manager
   private final ChannelManager _channelManager;
   private final String _hostname;
   private final int _mailboxPort;
 
   // maintaining a list of registered mailboxes.
-  private final ConcurrentHashMap<String, ReceivingMailbox<TransferableBlock>> 
_receivingMailboxMap =
-      new ConcurrentHashMap<>();
-  private final ConcurrentHashMap<String, SendingMailbox<TransferableBlock>> 
_sendingMailboxMap =
-      new ConcurrentHashMap<>();
+  private final Cache<String, ReceivingMailbox<TransferableBlock>> 
_receivingMailboxCache =
+      
CacheBuilder.newBuilder().expireAfterAccess(MAILBOX_CACHE_EXPIRY.toMinutes(), 
TimeUnit.MINUTES).build();
+  private final Cache<String, SendingMailbox<TransferableBlock>> 
_sendingMailboxCache =
+      
CacheBuilder.newBuilder().expireAfterAccess(MAILBOX_CACHE_EXPIRY.toMinutes(), 
TimeUnit.MINUTES).build();

Review Comment:
   IMO 
   1. there's a very good concept we should apply first: the ownership of a 
mailbox should be handed over from this ConcurrentHashMap to the OpChain when 
it is being registered. 
       - with this logic i think expiring the mailbox that are still kept in 
the concurrent hashmap is good. 
   
   2. If we are hitting OOM it is highly unlikely that it is due to the mailbox 
concurrent hashmap, but b/c the mailbox it is holding on (GRPC or InMemory is 
holding on to the cached data in the blocking queue but the OpChain that 
suppose to consume it has already gone. (e.g.. the object itself is small, but 
it can hold large data buffer)
       - with this in mind I think it is all the more reason to clarify the 
ownership issue first. 
   
   CC @61yao and @agavra please share your thoughts as well. I know that 
operator resource management is a big item. But I was thinking if we have a 
suitable-sized MVP that we can nailed down and then we can continue to iterate. 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to