arunkumarucet opened a new pull request, #16704:
URL: https://github.com/apache/pinot/pull/16704

   ### **Summary**
   This PR adds `IntSumAggregationFunction`, a specialized aggregation 
function that speeds up `SUM` operations on `INT` columns by skipping the 
conversion to `DOUBLE` and using native integer arithmetic.
   
   ### **Problem**
   The existing `SUM` aggregation function converts all values to `DOUBLE` 
before processing, which introduces unnecessary overhead for `INT` columns. 
This is particularly impactful for queries that perform multiple `SUM` 
operations on integer data.
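
To make the difference concrete, here is a minimal sketch contrasting the two accumulation strategies described above. It is not Pinot's actual code: `SumPaths`, `sumViaDouble`, and `sumViaLong` are hypothetical names, and plain `int[]` arrays stand in for however Pinot hands column values to an aggregation function.

```java
// Hypothetical illustration only; not taken from the Pinot code base.
final class SumPaths {
  private SumPaths() {
  }

  // Generic path: each int is widened to double before it is added.
  static double sumViaDouble(int[] values) {
    double sum = 0.0;
    for (int v : values) {
      sum += (double) v;
    }
    return sum;
  }

  // Specialized path: ints are added directly into a 64-bit long,
  // so no int-to-double conversion happens per value.
  static long sumViaLong(int[] values) {
    long sum = 0L;
    for (int v : values) {
      sum += v;
    }
    return sum;
  }
}
```

Both methods return the same numeric result for small inputs; the second simply stays in integer arithmetic throughout.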
   
   ### **Solution**
   - **New Aggregation Function**: Created `IntSumAggregationFunction` that 
operates directly on `INT` values using `LONG` for accumulation to prevent 
overflow
   - **Type-Specific Optimization**: Eliminates the need for `INT` → `DOUBLE` 
conversion, reducing CPU cycles and memory allocations
   - **Proper Null Handling**: When null handling is enabled, returns `null` 
if no non-null values are processed
   - **Comprehensive Testing**: Added extensive test coverage following Pinot's 
established testing patterns
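
The accumulation and null-handling behavior listed above can be sketched roughly as follows. This is a simplified stand-in under stated assumptions, not the PR's implementation: `IntSumSketch` and its `sum` method are hypothetical, `java.util.BitSet` stands in for whatever null bitmap Pinot uses, and the behavior with null handling disabled (summing whatever value is stored in each slot, and returning `0` for empty input) is an assumption made for the example.

```java
import java.util.BitSet;

// Hypothetical illustration only; not taken from the Pinot code base.
final class IntSumSketch {
  private IntSumSketch() {
  }

  /**
   * Sums int values into a long accumulator.
   * With null handling enabled, positions flagged in `nulls` are skipped and
   * the result is null when no non-null value was seen.
   */
  static Long sum(int[] values, BitSet nulls, boolean nullHandlingEnabled) {
    long sum = 0L;          // 64-bit accumulator, so adding 32-bit ints cannot wrap at int range
    boolean sawValue = false;
    for (int i = 0; i < values.length; i++) {
      if (nullHandlingEnabled && nulls.get(i)) {
        continue;           // skip positions marked as null
      }
      sum += values[i];
      sawValue = true;
    }
    if (nullHandlingEnabled && !sawValue) {
      return null;          // no non-null input: SQL-style null result
    }
    return sum;
  }
}
```

For instance, calling `IntSumSketch.sum(new int[] {5, 7, 9}, nulls, true)` with position 1 marked null skips the `7`, while passing `false` ignores the bitmap and sums all three values.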
   
   ### **Changes Made**
   
   #### **Core Implementation**
   - **`IntSumAggregationFunction.java`**: New specialized aggregation function 
for INT columns
   - **`AggregationFunctionType.java`**: Added `INTSUM` enum constant
   - **`AggregationFunctionFactory.java`**: Registered the new function type
   
   #### **Test Coverage**
   - **`IntSumAggregationFunctionTest.java`**: Comprehensive test suite 
covering null handling, group by operations, and edge cases
   - **`AggregationFunctionFactoryTest.java`**: Added test for proper function 
creation
   - **`AggregationFunctionTypeTest.java`**: Added test for enum recognition
   
   ### **Benefits**
   1. **Performance Improvement**: Eliminates type conversion overhead for INT 
column aggregations
   2. **Memory Efficiency**: Reduces unnecessary object allocations
   3. **Correctness**: Proper null handling that matches SQL semantics
   4. **Maintainability**: Follows Pinot's established patterns and includes 
comprehensive test coverage
   
   ### **Usage**
   ```sql
   -- Use the new optimized function for INT columns
   SELECT INTSUM(ResolutionWidth) FROM hits;
   
   -- Regular SUM is unchanged and remains the function to use for non-INT columns
   SELECT SUM(ResolutionWidth) FROM hits;
   ```
   
   ### **Testing**
   - All tests pass successfully
   - Covers null handling scenarios (enabled/disabled)
   - Tests group by operations (single-value and multi-value)
   - Verifies proper function registration and recognition
   - Follows existing Pinot test patterns for consistency


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

