kkrugler commented on issue #7791: URL: https://github.com/apache/pinot/issues/7791#issuecomment-978106616
Hi @Jackie-Jiang - yes, it would be more efficient to save the metadata when doing a generate-and-push job. But for our use case, we run Hadoop map-reduce jobs to build segments (highly parallelized) and save them to HDFS, and then separately run Pinot jobs to push the segments. So for that use case, we'd still want this optimization. Others on Slack have suggested saving the metadata files to a separate directory at build time, but in my mind that adds both complexity and the potential for data mismatch without much benefit.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org