mcvsubbu commented on pull request #6778: URL: https://github.com/apache/pinot/pull/6778#issuecomment-917256835
I am not sure how caching segments that are NOT in deep store adds a lot of memory overhead. That should be the rare case, so in the most common case the memory overhead is zero. If deep store is unavailable for (say) 12 hours, then only the names of segments completed within that window will be stored, so again it is a very small overhead.

As for retrieving all ZK metadata all the time, it depends on the installation. If there are multiple realtime-only tables, each with segments spanning several days, it could start to become a problem. One of the bottlenecks we have identified in our cluster is large pulls of data from ZooKeeper.

I am fine with a first implementation that pulls all metadata from ZK and gets fixed later as needed, but please pull the metadata only under the config variable. On the other hand, the code is already done, so maybe it is best to leave it there.
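To make the trade-off concrete, here is a minimal sketch of the two ideas discussed in this comment: keeping only the names of segments that are not yet in deep store, and scanning ZK metadata only when a config flag allows it. Everything in it is an assumption for illustration. The class `PendingDeepStoreUploads`, the flag `scanZkMetadataEnabled`, and the supplier standing in for the ZooKeeper read are hypothetical names, not Pinot's API and not the code in this pull request.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical sketch; names and structure are not taken from Pinot. */
class PendingDeepStoreUploads {
    // Holds only segment names, so the set is empty in the common case
    // where every completed segment reached deep store.
    private final Set<String> pending = ConcurrentHashMap.newKeySet();

    // Hypothetical config flag gating the expensive metadata scan.
    private final boolean scanZkMetadataEnabled;

    PendingDeepStoreUploads(boolean scanZkMetadataEnabled) {
        this.scanZkMetadataEnabled = scanZkMetadataEnabled;
    }

    /** Record a segment that completed while deep store was unavailable. */
    void markPending(String segmentName) {
        pending.add(segmentName);
    }

    /** Drop a segment once its copy is safely in deep store. */
    void markUploaded(String segmentName) {
        pending.remove(segmentName);
    }

    /**
     * Seed the set from segment metadata, but only when the flag is on.
     * The supplier stands in for the ZooKeeper read and is never invoked
     * when the flag is off, so no metadata is pulled in that case.
     */
    void seedFromMetadata(Supplier<List<String>> segmentsMissingFromDeepStore) {
        if (!scanZkMetadataEnabled) {
            return;
        }
        pending.addAll(segmentsMissingFromDeepStore.get());
    }

    /** Read-only copy of the names currently awaiting upload. */
    Set<String> snapshot() {
        return Collections.unmodifiableSet(new HashSet<>(pending));
    }
}
```

With the flag off, the only entries are the ones added through `markPending`, which matches the point above that memory use tracks the length of the deep store outage rather than the total number of segments.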