mcvsubbu commented on issue #10712: URL: https://github.com/apache/pinot/issues/10712#issuecomment-1732365406
Can you fill me in on what "to solve the hybrid table" means?

My suggestion would be to NOT change the hybrid table definition; keep it the same. The logical table binding should happen _before_ we branch between realtime/offline. So, the query comes into the broker, we look up whether a physical table is defined, substitute the physical table name(s), and then do further query processing. Each of the underlying physical tables could be hybrid, realtime-only, or offline-only.

In terms of allowed mapping, we should have something that enables mapping one logical table to one or more physical tables. If multiple physical tables are configured, then another config could say whether the code should pick _any_ or _all_ (@egalpin's requirement). Not sure if there is a need for specific additional (configurable) logic depending on which table is picked, but we can let that ride for now.

+1 on the brokers recognizing immediately when the mapping is changed. To that effect, maybe the mapping should be stored in zookeeper, away from TableConfig. Maybe it can be under PROPERTYSTORE/CONFIGS/CLUSTER? It is OK if the brokers do not set a watch (perhaps preferred that way). The mapping can be updated via a controller API, and the brokers informed by the controller.

Some other random thoughts:

- The logical table should not be a Helix resource (intuitively). Let me know if there is a problem with this, and we can discuss further.
- As a consequence, the logical table cannot have a `logicalTableName_REALTIME` physical table, ever.
- How will table metrics be emitted? Ideally, all table-level metrics should be emitted under the logical table name. The code may become a bit messy in places (emitting physical table, logical table, and global metrics).
- Operational tools need to be examined: if a logical table maps to a different physical table, then some of the table APIs should be modified to reflect that there is a different physical table. Not sure how this will work if there is more than one physical table.
- At least for a start, let us assume that all physical tables have the same schema. This can throw a wrench into having multiple copies of the same schema, since we insist now that the schema name is the same as the table name. Either the restriction should be relaxed, or some way provided so that schema changes are applied to all physical tables at the same time (e.g. a schema change is allowed only on the logical table).

Instead of thinking this through piecemeal, I strongly suggest we start writing a design doc, with at least the requirements part clearly identified.
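To make the broker-side lookup described above a bit more concrete, here is a rough Java sketch of resolving a logical table name to physical table name(s) before the realtime/offline branch, including the _any_ / _all_ choice. Everything in it is an assumption for illustration: `LogicalTableResolver`, `Mapping`, `Mode`, and the method names are hypothetical and do not exist in Pinot, and picking a random table for _any_ is just one possible policy. How the mapping gets from zookeeper to the broker is left out entirely.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch only; none of these names are part of Pinot. */
final class LogicalTableResolver {
  /** Whether a query on the logical table goes to one physical table or to all of them. */
  enum Mode { ANY, ALL }

  /** One logical table mapped to one or more physical tables (raw names, no type suffix). */
  static final class Mapping {
    final List<String> physicalTables;
    final Mode mode;

    Mapping(List<String> physicalTables, Mode mode) {
      if (physicalTables.isEmpty()) {
        throw new IllegalArgumentException("at least one physical table is required");
      }
      this.physicalTables = List.copyOf(physicalTables);
      this.mode = mode;
    }
  }

  // Broker-local copy of the mapping. Assumption: it is refreshed when the
  // controller informs the broker of a change, rather than via a watch.
  private final Map<String, Mapping> mappings = new ConcurrentHashMap<>();

  void updateMapping(String logicalTable, Mapping mapping) {
    mappings.put(logicalTable, mapping);
  }

  void removeMapping(String logicalTable) {
    mappings.remove(logicalTable);
  }

  /**
   * Returns the physical table name(s) to query for the given table name.
   * Intended to run before the realtime/offline branch, so each returned
   * table can still be hybrid, realtime-only, or offline-only.
   */
  List<String> resolve(String tableName) {
    Mapping mapping = mappings.get(tableName);
    if (mapping == null) {
      // Not a logical table: keep today's behavior and use the name as-is.
      return List.of(tableName);
    }
    if (mapping.mode == Mode.ALL) {
      return mapping.physicalTables;
    }
    // Mode.ANY: pick one table at random (an arbitrary policy for this sketch).
    int pick = ThreadLocalRandom.current().nextInt(mapping.physicalTables.size());
    return List.of(mapping.physicalTables.get(pick));
  }

  public static void main(String[] args) {
    LogicalTableResolver resolver = new LogicalTableResolver();
    resolver.updateMapping("events", new Mapping(List.of("events_v1", "events_v2"), Mode.ALL));
    System.out.println(resolver.resolve("events"));     // both physical tables
    System.out.println(resolver.resolve("otherTable")); // unmapped, returned unchanged
  }
}
```

Because an unmapped name is returned unchanged, existing tables would behave exactly as they do today, which fits the suggestion to leave the hybrid table definition alone.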