cbalci commented on pull request #6530: URL: https://github.com/apache/incubator-pinot/pull/6530#issuecomment-772211536
Thanks for the reviews @Jackie-Jiang, @yupeng9.

> Since this is the first time we have queries across tables, I think it's a good time to discuss the policy. There are two options: join tables within the tenant, and join tables across tenants. Personally I prefer a default constraint that the tables to join are within the same tenant for better isolation. But given the broadcast join nature, the dimension table is in fact copied to all tenants. Nevertheless, I feel it's good to have this high-level consideration.

@yupeng9, small correction on the last part: currently, dimension tables are not copied to all tenants, but to all servers in a single tenant. I also agree with your preference to restrict joins to within a single tenant by default. Otherwise, besides weaker isolation, unnecessary memory usage would be another issue, since dim tables are loaded directly into heap.

I think @Jackie-Jiang's tag-based cross-tenant solution could be useful for cases where folks have set up different tenants for the REALTIME and OFFLINE servers of the same table and would like to share a dimension table.

@siddharthteotia

> Currently there is no broadcast happening right?

You're right, there is no data movement at query time. Dimension tables are distributed only at segment placement time, according to the segment assignment policy as updated in this PR.