egalpin commented on issue #10712: URL: https://github.com/apache/pinot/issues/10712#issuecomment-1693798924
> What is the use case that drives mapping multiple physical tables to the same logical table? Can you elaborate a bit?

Here are some example use cases that piqued my interest:

1. I have a use case where the data can be likened to user sessions, where a session can be either active or closed. I would like to have _3_ tables which together represent the total data: an upsert-enabled realtime table representing active sessions, plus a hybrid table to account for realtime ingestion of newly closed sessions as well as historical closed sessions. It isn't currently possible to query all of these tables at once, but it would be very nice to do so.
2. "Whale" or VIP tables, also "priority queue". Sometimes a customer or set of customers accounts for an outsized portion of the data, which Pinot's existing partitioning may not handle well. It would help to isolate that customer data in a separate table that is still queryable via a single table name, so that those issuing queries need no awareness of DB organization details to conditionally target the correct table.
3. User-managed time partitioning. Imagine a time series dataset. Having a collection of tables that each hold a given time period of data would be helpful operationally.

> Do the physical tables have the same schema?

Yes, I would guess so (as in the case of a hybrid table today). Or at the very least, mutually shared columns would have the same types. It might be ideal to support tables having a subset/superset of columns, but that's not a "must" feature for a v1 IMO.

> How does a given query (that may only have the logical tablename) choose between the physical tables to run the query in?

I believe that, at least initially, the query would strictly choose all physical tables with the same logical name. There might be ways to optimize that in the future, e.g. in the VIP tables example above, where we might be able to select only 1 of many physical tables based on some fact we know about the table architecture and query inputs. But I don't think that would be a requirement of an initial version.

My main priority would be the use case of being able to replace a table easily and seamlessly. That wouldn't require the ability to support multiple physical tables with the same logical name. That said, I can foresee making use of the ability to have multiple physical tables with the same logical name, so it would be nice to do it all in one go if feasible.
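
To make the resolution behaviour described in this comment concrete, here is a rough sketch in Python. It is not Pinot code: the `LogicalTableRegistry` class, its methods, and the table names are all hypothetical and exist only for illustration. It assumes the simplest possible model, where a logical name maps to an ordered list of physical table names, a query fans out to all of them, and replacing a table is a single swap in that list.

```python
class LogicalTableRegistry:
    """Hypothetical logical-name -> physical-tables mapping (illustration only)."""

    def __init__(self):
        # logical table name -> ordered list of physical table names
        self._tables = {}

    def register(self, logical, physical):
        """Attach a physical table to a logical name (no duplicates)."""
        tables = self._tables.setdefault(logical, [])
        if physical not in tables:
            tables.append(physical)

    def resolve(self, logical):
        """Initial-version behaviour: return every physical table under the name."""
        if logical not in self._tables:
            raise KeyError(f"unknown logical table: {logical}")
        # Return a copy so callers cannot mutate the registry's state.
        return list(self._tables[logical])

    def swap(self, logical, old_physical, new_physical):
        """Replace one physical table with another under the same logical name."""
        tables = self._tables[logical]
        tables[tables.index(old_physical)] = new_physical


# The session use case: three physical tables behind one logical name.
# All table names here are made up for the example.
registry = LogicalTableRegistry()
registry.register("sessions", "sessions_active_REALTIME")
registry.register("sessions", "sessions_closed_REALTIME")
registry.register("sessions", "sessions_closed_OFFLINE")

# A query against "sessions" would target all three physical tables.
targets = registry.resolve("sessions")
```

In this model the table-replacement use case needs only `swap`, which works even when a logical name has exactly one physical table. Any pruning of physical tables based on query inputs would be a later refinement of `resolve`.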