xiangfu0 opened a new pull request, #16480:
URL: https://github.com/apache/pinot/pull/16480
## Summary
This PR fixes the CalciteSQLParser error that occurs when executing
multi-stage queries with SET statements, resolving issue #11823.
## Problem
The original issue was a **ClassCastException** when
`Connection.resolveTableName()` tried to parse multi-stage queries like:
```sql
SET useMultistageEngine=true;
SELECT stats.* FROM airlineStats stats LIMIT 10
```
The error occurred because the method expected a single `SqlSelect` node but
received a `SqlIdentifier` when parsing complex query structures.
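The failure mode described above can be illustrated with a minimal sketch. The node classes below (`ToyNode`, `ToyIdentifier`, `ToySelect`) are hypothetical stand-ins, not Calcite types, and the exact cast in the old `Connection.resolveTableName()` code is not shown in this description; the sketch only demonstrates why an unchecked downcast breaks when the parser hands back a different node type, and how branching on the runtime type avoids it.

```java
import java.util.Collections;
import java.util.List;

// Toy stand-ins for parser nodes; these are NOT Calcite classes.
abstract class ToyNode { }

final class ToyIdentifier extends ToyNode {
  final String name;
  ToyIdentifier(String name) { this.name = name; }
}

final class ToySelect extends ToyNode {
  final ToyNode from;
  ToySelect(ToyNode from) { this.from = from; }
}

final class CastSketch {
  private CastSketch() { }

  // Fragile: assumes the FROM node is always another SELECT.
  // Throws ClassCastException when FROM is a plain identifier.
  static ToyNode unsafeInnerFrom(ToySelect select) {
    return ((ToySelect) select.from).from;
  }

  // Defensive: inspect the runtime type before casting.
  static List<String> safeTables(ToyNode node) {
    if (node instanceof ToyIdentifier) {
      return Collections.singletonList(((ToyIdentifier) node).name);
    }
    if (node instanceof ToySelect) {
      return safeTables(((ToySelect) node).from);
    }
    return Collections.emptyList(); // unknown or null node: nothing to report
  }

  public static void main(String[] args) {
    ToySelect query = new ToySelect(new ToyIdentifier("airlineStats"));
    System.out.println(safeTables(query)); // prints [airlineStats]
    try {
      unsafeInnerFrom(query);
    } catch (ClassCastException e) {
      System.out.println("unchecked cast failed");
    }
  }
}
```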
## Solution
### 1. **Created TableNameExtractor Class**
- Extracted table name resolution logic to a dedicated `TableNameExtractor`
class
- Implemented proper Calcite SQL AST traversal for complex query structures
- Made the class public for better modularity and reusability
### 2. **Enhanced Multi-Stage Query Support**
- **SET statements**: Properly handles queries with SET statements before
the main query
- **CTEs (Common Table Expressions)**: Extracts tables from WITH clauses and
nested CTEs
- **JOINs**: Supports INNER, LEFT, and RIGHT JOINs with proper table
extraction
- **Subqueries**: Handles subqueries in SELECT, FROM, and JOIN conditions
- **Aliases**: Correctly identifies base table names from both explicit (AS)
and implicit aliases
- **ORDER BY**: Processes ORDER BY clauses including subqueries within them
### 3. **Comprehensive Test Coverage**
- Merged `ConnectionResolveTableNameTest.java` and `ConnectionTest.java`
into `TableNameExtractorTest.java`
- Added **20 test cases** covering the following scenarios:
  - Multi-statement queries with SET statements
  - Complex JOINs with multiple tables and aliases
  - CTEs and nested CTEs
  - Subqueries with aliases
  - ORDER BY with subqueries
  - Error handling for invalid queries
### 4. **Backward Compatibility**
- Maintains backward compatibility with existing single-stage query
parsing
- Falls back gracefully when queries cannot be parsed
- No breaking changes to existing APIs
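One way to picture the graceful fallback is a wrapper that returns `null` instead of propagating a parse failure, leaving the caller free to choose a broker without table information. This is a sketch under that assumption: `FallbackSketch` and `resolveOrNull` are hypothetical names, and the description does not state what the actual fallback value is.

```java
import java.util.function.Function;

final class FallbackSketch {
  private FallbackSketch() { }

  // Runs the supplied extractor and returns null if it throws,
  // so that a query that cannot be parsed never fails the caller.
  // Assumption: null means "table names unknown".
  static String[] resolveOrNull(String query, Function<String, String[]> extractor) {
    try {
      return extractor.apply(query);
    } catch (RuntimeException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    String[] tables = resolveOrNull("SELECT 1", q -> {
      throw new IllegalStateException("cannot parse");
    });
    System.out.println(tables == null ? "fallback" : "resolved");
  }
}
```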
## Technical Details
### AST Traversal Strategy
The new implementation uses a recursive AST visitor pattern that:
- Tracks context (`_inFromClause` flag) to distinguish table references from
column references
- Manages CTE aliases (`_cteNames` set) to avoid extracting CTE names as
table names
- Handles all major SQL node types: `SqlSelect`, `SqlJoin`, `SqlWith`,
`SqlBasicCall`, `SqlOrderBy`
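The traversal strategy above can be sketched on a toy AST. Everything in this block is illustrative: the node classes (`Ident`, `Aliased`, `JoinNode`, `SelectNode`, `WithNode`) and `ToyTableCollector` are hypothetical and much simpler than the Calcite node types listed above, and only the two pieces of state named in this description (an in-FROM flag and a set of CTE names) are modeled.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Toy AST; NOT the Calcite node hierarchy.
abstract class Ast { }

final class Ident extends Ast {
  final String name;
  Ident(String name) { this.name = name; }
}

final class Aliased extends Ast {          // e.g. "airlineStats AS stats"
  final Ast operand;
  final String alias;
  Aliased(Ast operand, String alias) { this.operand = operand; this.alias = alias; }
}

final class JoinNode extends Ast {
  final Ast left;
  final Ast right;
  JoinNode(Ast left, Ast right) { this.left = left; this.right = right; }
}

final class SelectNode extends Ast {
  final Ast from;
  final Ast where;                         // may be null, a column, or a subquery
  SelectNode(Ast from, Ast where) { this.from = from; this.where = where; }
}

final class WithNode extends Ast {         // a single CTE followed by the main query
  final String cteName;
  final Ast cteQuery;
  final Ast body;
  WithNode(String cteName, Ast cteQuery, Ast body) {
    this.cteName = cteName; this.cteQuery = cteQuery; this.body = body;
  }
}

final class ToyTableCollector {
  private final Set<String> cteNames = new HashSet<>();
  private final Set<String> tables = new LinkedHashSet<>();
  private boolean inFromClause;

  static Set<String> collect(Ast root) {
    ToyTableCollector collector = new ToyTableCollector();
    collector.visit(root);
    return collector.tables;
  }

  private void visit(Ast node) {
    if (node instanceof Ident) {
      String name = ((Ident) node).name;
      // Only identifiers seen in a FROM clause are tables, and CTE names are skipped.
      if (inFromClause && !cteNames.contains(name)) {
        tables.add(name);
      }
    } else if (node instanceof Aliased) {
      visit(((Aliased) node).operand);     // the alias itself is never a table
    } else if (node instanceof JoinNode) {
      visit(((JoinNode) node).left);
      visit(((JoinNode) node).right);
    } else if (node instanceof SelectNode) {
      SelectNode select = (SelectNode) node;
      boolean saved = inFromClause;
      inFromClause = true;
      visit(select.from);
      inFromClause = false;                // identifiers in WHERE are columns
      visit(select.where);
      inFromClause = saved;
    } else if (node instanceof WithNode) {
      WithNode with = (WithNode) node;
      visit(with.cteQuery);                // tables used inside the CTE still count
      cteNames.add(with.cteName);
      visit(with.body);
    }
    // null and unknown node types are ignored
  }

  public static void main(String[] args) {
    Ast cte = new SelectNode(new Ident("airlineStats"), new Ident("Delayed"));
    Ast from = new JoinNode(new Aliased(new Ident("recent"), "r"),
        new Aliased(new Ident("carriers"), "c"));
    Ast body = new SelectNode(from, new SelectNode(new Ident("airports"), null));
    // prints [airlineStats, carriers, airports]
    System.out.println(collect(new WithNode("recent", cte, body)));
  }
}
```

In this sketch the CTE name is registered only after its defining query has been visited, so a CTE never hides a real table referenced inside its own definition; whether `TableNameExtractor` orders these steps the same way is not stated here.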
### Key Code Changes
- **`Connection.java`**: Simplified to delegate to
`TableNameExtractor.resolveTableName()`
- **`TableNameExtractor.java`**: New class with comprehensive table
extraction logic
- **`TableNameExtractorTest.java`**: Consolidated test suite with 20 test
cases
## Testing
✅ **All 130 tests passing** in the pinot-java-client module
✅ **20 specific table extraction tests** covering the reported issue and
edge cases
✅ **No regressions** in existing functionality
✅ **Checkstyle compliant**
## Fixes
Closes #11823
## Example
**Before**: The query would throw `ClassCastException` and log errors
**After**: The query executes successfully with proper table name extraction
```sql
SET useMultistageEngine=true;
SELECT stats.* FROM airlineStats stats LIMIT 10
```
Now correctly extracts `["airlineStats"]` for proper broker selection.