xiangfu0 opened a new pull request, #16480:
URL: https://github.com/apache/pinot/pull/16480

   ## Summary
   
   This PR fixes the `CalciteSqlParser` error that occurs when executing multi-stage queries with SET statements, resolving issue #11823.
   
   ## Problem
   
   The original issue was a **ClassCastException** when 
`Connection.resolveTableName()` tried to parse multi-stage queries like:
   
   ```sql
   SET useMultistageEngine=true;
   SELECT stats.* FROM airlineStats stats LIMIT 10
   ```
   
   The error occurred because the method expected a single `SqlSelect` node but 
received a `SqlIdentifier` when parsing complex query structures.
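
   The failure mode can be reproduced in miniature. The sketch below uses stand-in classes (`SqlNodeStub`, `SelectStub`, `IdentifierStub`, all hypothetical and not Calcite's types) to contrast an unconditional downcast with a runtime type check. It illustrates the class of bug only and is not the code from `Connection.java`.

   ```java
   final class CastFailureSketch {
     // Stand-ins for parser node types; hypothetical, not Calcite's classes.
     static class SqlNodeStub { }

     static final class SelectStub extends SqlNodeStub {
       final String table;
       SelectStub(String table) { this.table = table; }
     }

     static final class IdentifierStub extends SqlNodeStub {
       final String name;
       IdentifierStub(String name) { this.name = name; }
     }

     // Assumes every node is a SELECT, so any other subtype fails at runtime.
     static String tableByBlindCast(SqlNodeStub node) {
       return ((SelectStub) node).table; // ClassCastException for IdentifierStub
     }

     // Checks the runtime type first and handles each shape explicitly.
     static String tableByTypeCheck(SqlNodeStub node) {
       if (node instanceof SelectStub) {
         return ((SelectStub) node).table;
       }
       if (node instanceof IdentifierStub) {
         return ((IdentifierStub) node).name;
       }
       return null; // unknown shape: the caller decides how to fall back
     }
   }
   ```

   Calling `tableByBlindCast` with an `IdentifierStub` throws `ClassCastException`, while `tableByTypeCheck` returns the identifier's name for the same input.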
   
   ## Solution
   
   ### 1. **Created TableNameExtractor Class**
   - Extracted table name resolution logic to a dedicated `TableNameExtractor` 
class
   - Implemented proper Calcite SQL AST traversal for complex query structures
   - Made the class public for better modularity and reusability
   
   ### 2. **Enhanced Multi-Stage Query Support**
   - **SET statements**: Properly handles queries with SET statements before 
the main query
   - **CTEs (Common Table Expressions)**: Extracts tables from WITH clauses and 
nested CTEs
   - **JOINs**: Supports INNER, LEFT, and RIGHT JOINs with proper table extraction
   - **Subqueries**: Handles subqueries in SELECT, FROM, and JOIN conditions
   - **Aliases**: Correctly identifies base table names from both explicit (AS) 
and implicit aliases
   - **ORDER BY**: Processes ORDER BY clauses including subqueries within them
   
   ### 3. **Comprehensive Test Coverage**
   - Merged `ConnectionResolveTableNameTest.java` and `ConnectionTest.java` 
into `TableNameExtractorTest.java`
   - Added **20 test cases** covering the following scenarios:
     - Multi-statement queries with SET statements
     - Complex JOINs with multiple tables and aliases
     - CTEs and nested CTEs
     - Subqueries with aliases
     - ORDER BY with subqueries
     - Error handling for invalid queries
   
   ### 4. **Backward Compatibility**
   - Preserves backward compatibility with existing single-stage query parsing
   - Falls back gracefully when queries cannot be parsed
   - No breaking changes to existing APIs
   
   ## Technical Details
   
   ### AST Traversal Strategy
   The new implementation uses a recursive AST visitor pattern that:
   - Tracks context (`_inFromClause` flag) to distinguish table references from 
column references
   - Manages CTE aliases (`_cteNames` set) to avoid extracting CTE names as 
table names
   - Handles all major SQL node types: `SqlSelect`, `SqlJoin`, `SqlWith`, 
`SqlBasicCall`, `SqlOrderBy`
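
   The traversal strategy above can be sketched on a toy AST. Everything in this block is an assumption made for illustration: the node classes (`Identifier`, `Alias`, `Join`, `Select`, `With`) are simplified stand-ins rather than Calcite's `SqlNode` hierarchy, `carriers` is a made-up second table, and only the `_inFromClause` and `_cteNames` names are taken from the description. It is not the code in `TableNameExtractor.java`.

   ```java
   import java.util.HashSet;
   import java.util.LinkedHashSet;
   import java.util.Set;

   /** Toy AST plus extractor. All node types are hypothetical stand-ins, not Calcite's. */
   final class TableNameSketch {
     interface Node { }

     static final class Identifier implements Node {
       final String name;
       Identifier(String name) { this.name = name; }
     }

     /** Covers both "target AS alias" and "target alias". */
     static final class Alias implements Node {
       final Node target;
       final String alias;
       Alias(Node target, String alias) { this.target = target; this.alias = alias; }
     }

     static final class Join implements Node {
       final Node left;
       final Node right;
       final Node condition;
       Join(Node left, Node right, Node condition) {
         this.left = left; this.right = right; this.condition = condition;
       }
     }

     static final class Select implements Node {
       final Node from;
       final Node where;
       Select(Node from, Node where) { this.from = from; this.where = where; }
     }

     /** A single CTE: WITH cteName AS (cteQuery) body. */
     static final class With implements Node {
       final String cteName;
       final Node cteQuery;
       final Node body;
       With(String cteName, Node cteQuery, Node body) {
         this.cteName = cteName; this.cteQuery = cteQuery; this.body = body;
       }
     }

     private final Set<String> _tables = new LinkedHashSet<>();
     private final Set<String> _cteNames = new HashSet<>();
     private boolean _inFromClause;

     static Set<String> extract(Node root) {
       TableNameSketch sketch = new TableNameSketch();
       sketch.visit(root);
       return sketch._tables;
     }

     // Visits a child with the FROM-context flag set, then restores the previous value.
     private void visitIn(Node node, boolean fromContext) {
       boolean saved = _inFromClause;
       _inFromClause = fromContext;
       visit(node);
       _inFromClause = saved;
     }

     private void visit(Node node) {
       if (node == null) {
         return;
       }
       if (node instanceof Identifier) {
         String name = ((Identifier) node).name;
         // Only identifiers in FROM position that are not CTE names count as tables.
         if (_inFromClause && !_cteNames.contains(name)) {
           _tables.add(name);
         }
       } else if (node instanceof Alias) {
         visit(((Alias) node).target); // the alias itself is never a table name
       } else if (node instanceof Join) {
         Join join = (Join) node;
         visit(join.left);
         visit(join.right);
         visitIn(join.condition, false); // identifiers in ON are columns
       } else if (node instanceof Select) {
         Select select = (Select) node;
         visitIn(select.from, true);
         visitIn(select.where, false); // a subquery here re-enters the Select branch
       } else if (node instanceof With) {
         With with = (With) node;
         _cteNames.add(with.cteName);
         visitIn(with.cteQuery, false);
         visitIn(with.body, false);
       }
     }

     public static void main(String[] args) {
       // WITH recent AS (SELECT ... FROM airlineStats WHERE ArrDelay ...)
       // SELECT ... FROM recent r JOIN carriers c ON r.Carrier ...
       Node cte = new Select(new Identifier("airlineStats"), new Identifier("ArrDelay"));
       Node from = new Join(new Alias(new Identifier("recent"), "r"),
           new Alias(new Identifier("carriers"), "c"), new Identifier("r.Carrier"));
       System.out.println(extract(new With("recent", cte, new Select(from, null))));
       // prints [airlineStats, carriers]
     }
   }
   ```

   Saving and restoring the flag around each child keeps a nested subquery from leaking its FROM context back into the enclosing WHERE or ON clause, and registering the CTE name before visiting the body is what keeps `recent` out of the result.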
   
   ### Key Code Changes
   - **`Connection.java`**: Simplified to delegate to 
`TableNameExtractor.resolveTableName()`
   - **`TableNameExtractor.java`**: New class with comprehensive table 
extraction logic
   - **`TableNameExtractorTest.java`**: Consolidated test suite with 20 test 
cases
   
   ## Testing
   
   ✅ **All 130 tests passing** in the pinot-java-client module  
   ✅ **20 specific table extraction tests** covering the reported issue and 
edge cases  
   ✅ **No regressions** in existing functionality  
   ✅ **Checkstyle compliant**
   
   ## Fixes
   
   Closes #11823
   
   ## Example
   
   **Before**: The query would throw `ClassCastException` and log errors  
   **After**: The query executes successfully with proper table name extraction
   
   ```sql
   SET useMultistageEngine=true;
   SELECT stats.* FROM airlineStats stats LIMIT 10
   ```
   
   Now correctly extracts `["airlineStats"]` for proper broker selection.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
