deepthi912 commented on PR #17047: URL: https://github.com/apache/pinot/pull/17047#issuecomment-3434366131
> Thanks Deepthi, I think excluding virtual columns is a reasonable solution for the `NATURAL JOIN` case where it definitely does not make sense to include those columns in the join condition at the very least. Ideally we'd want to make it so that the virtual columns are only excluded from the join condition but not from the schema altogether but it sounds like that's gonna be much trickier to do than expected? > > > Even if we try to extend the existing public methods that can be modified to take in common columns excluding virtual, the sources are created from Schema in the SqlValidatorImpl and it will still have virtual columns which will throw QueryValidationError: Index 66 out of bounds for length 54 at [exception](https://github.com/apache/calcite/blob/7a0564a7d0b2629d4537fcfd6879528bde378d9f/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L8009) > > Could you elaborate a bit on this root cause - what exactly is Calcite trying to do that is causing the `IndexOutOfBoundsException` issue? Also `Index 66 out of bounds for length 54 ` looks a bit strange to me considering we only have 4 built in virtual columns per table? Internally Calcite has this [private function] (https://github.com/apache/calcite/blob/7a0564a7d0b2629d4537fcfd6879528bde378d9f/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L669) which cannot be extended to implement our own logic to pass in right columns excluding virtual columns, secondly if I extend the [usingNames](https://github.com/apache/calcite/blob/7a0564a7d0b2629d4537fcfd6879528bde378d9f/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L595) to exclude virtual columns, that is not alone solving the purpose because the sources are still having virtual columns from the schema defined. Here we have 29 columns on left table and 6 columns on the right table including virtual columns. Since the sources virtual columns are not excluded, it is trying to continue the list and make it 29 + 6 = 35 for sourceSet, for sources even though I excluded the virtual column in the common list, it would count it as 29+6 -1 = 34 (0-33). Now when the reference to the index happens I think it tries to allocate 2x of the size and says 66 here for the case of extending a `usingNames` from `SqlValidatorImpl`. This is a bit hard to explain but debugging through calcite code will make much more sense. In conclusion, I think having an OPTION is a better way to allow user to have flexibility to choose those virtual columns or not without having to deal with the Calcite rules and stitching their errors or bugs caused because of Pinot rules or configs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
