Jackie-Jiang commented on code in PR #8927:
URL: https://github.com/apache/pinot/pull/8927#discussion_r903173704


##########
pinot-core/src/main/java/org/apache/pinot/core/data/table/TableResizer.java:
##########
@@ -84,9 +84,20 @@ public TableResizer(DataSchema dataSchema, QueryContext queryContext) {
       _orderByValueExtractors[i] = getOrderByValueExtractor(orderByExpression.getExpression());
       comparators[i] = orderByExpression.isAsc() ? Comparator.naturalOrder() : Comparator.reverseOrder();
     }
+    // TODO: return a diff. comparator that does not handle nulls when nullHandlingEnabled is false.
     _intermediateRecordComparator = (o1, o2) -> {
       for (int i = 0; i < _numOrderByExpressions; i++) {
-        int result = comparators[i].compare(o1._values[i], o2._values[i]);
+        Object v1 = o1._values[i];
+        Object v2 = o2._values[i];
+        if (v1 == null) {
+          if (v2 != null) {
+            // The default null ordering is NULLS LAST, regardless of the ordering direction.
+            return 1;
+          }
+        } else if (v2 == null) {
+          return -1;
+        }
+        int result = comparators[i].compare(v1, v2);

Review Comment:
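   This can cause an NPE when both v1 and v2 are `null`: neither branch returns,
so `comparators[i].compare(v1, v2)` is called with two `null` values. Also, I
don't think it is wired correctly; otherwise some test should already fail.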
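
   A minimal sketch of a null-safe single-value comparison that keeps NULLS LAST
in both directions and treats two `null` values as equal. The class and method
names are hypothetical, not part of the PR; in the multi-column loop, a result
of 0 would mean "move on to the next order-by expression".

```java
import java.util.Comparator;

// Hypothetical helper for illustration only; not part of TableResizer.
final class NullsLastCompare {
  private NullsLastCompare() {
  }

  /** Compares two possibly-null values, ordering nulls last for both ASC and DESC. */
  @SuppressWarnings({"rawtypes", "unchecked"})
  static int compare(Object v1, Object v2, boolean asc) {
    if (v1 == null) {
      // Both null: equal. Only v1 null: v1 sorts after v2.
      return v2 == null ? 0 : 1;
    }
    if (v2 == null) {
      // Only v2 null: v1 sorts before v2.
      return -1;
    }
    // Both values are non-null here, so the natural/reverse comparator is safe.
    Comparator comparator = asc ? Comparator.naturalOrder() : Comparator.reverseOrder();
    return comparator.compare(v1, v2);
  }
}
```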



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/combine/GroupByOrderByCombineOperator.java:
##########
@@ -235,11 +235,15 @@ protected IntermediateResultsBlock mergeResults()
     }
 
     IndexedTable indexedTable = _indexedTable;
-    indexedTable.finish(false);
+    if (indexedTable != null) {

Review Comment:
   I see several extra `null` checks introduced. I'd like to understand why,
because most likely something is not wired correctly.



##########
pinot-core/src/main/java/org/apache/pinot/core/common/RowBasedBlockValueFetcher.java:
##########
@@ -43,6 +44,15 @@ public Object[] getRow(int docId) {
     return row;
   }
 
+  public RoaringBitmap getColumnNullBitmap(int colId) {

Review Comment:
   I don't think the changes in this file are required, since we don't read
`null` directly into the row. The nullBitmap is not in row format, so we should
read it directly from `BlockValSet` on the caller side.



##########
pinot-core/src/main/java/org/apache/pinot/core/data/table/TableResizer.java:
##########
@@ -84,9 +84,20 @@ public TableResizer(DataSchema dataSchema, QueryContext queryContext) {
       _orderByValueExtractors[i] = getOrderByValueExtractor(orderByExpression.getExpression());
       comparators[i] = orderByExpression.isAsc() ? Comparator.naturalOrder() : Comparator.reverseOrder();
     }
+    // TODO: return a diff. comparator that does not handle nulls when nullHandlingEnabled is false.

Review Comment:
   Please address this TODO, because the extra null checks can cause a
performance regression when null handling is disabled.
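
   One way this could look, as a sketch: choose the comparator once at
construction time, so the per-value null checks are only paid when null
handling is enabled. `RecordComparators`, `create`, and the use of plain
`Object[]` rows in place of `IntermediateRecord` are assumptions made for
illustration.

```java
import java.util.Comparator;

// Hypothetical factory for illustration; rows are plain Object[] here.
final class RecordComparators {
  private RecordComparators() {
  }

  @SuppressWarnings({"rawtypes", "unchecked"})
  static Comparator<Object[]> create(boolean[] asc, boolean nullHandlingEnabled) {
    int numOrderByExpressions = asc.length;
    Comparator[] comparators = new Comparator[numOrderByExpressions];
    for (int i = 0; i < numOrderByExpressions; i++) {
      comparators[i] = asc[i] ? Comparator.naturalOrder() : Comparator.reverseOrder();
    }
    if (!nullHandlingEnabled) {
      // Fast path: values are assumed non-null, so no null checks per comparison.
      return (o1, o2) -> {
        for (int i = 0; i < numOrderByExpressions; i++) {
          int result = comparators[i].compare(o1[i], o2[i]);
          if (result != 0) {
            return result;
          }
        }
        return 0;
      };
    }
    // Null-aware path: NULLS LAST regardless of the ordering direction.
    return (o1, o2) -> {
      for (int i = 0; i < numOrderByExpressions; i++) {
        Object v1 = o1[i];
        Object v2 = o2[i];
        if (v1 == null) {
          if (v2 == null) {
            // Both null: tie on this column, compare the next one.
            continue;
          }
          return 1;
        }
        if (v2 == null) {
          return -1;
        }
        int result = comparators[i].compare(v1, v2);
        if (result != 0) {
          return result;
        }
      }
      return 0;
    };
  }
}
```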



##########
pinot-common/src/main/java/org/apache/pinot/common/utils/DataTable.java:
##########
@@ -42,6 +42,8 @@ public interface DataTable {
 
   Map<Integer, String> getExceptions();
 
+  int getVersion();

Review Comment:
   (minor) Remove the override in `BaseDataTable`



##########
pinot-core/src/main/java/org/apache/pinot/core/common/datablock/RowDataBlock.java:
##########
@@ -53,23 +53,20 @@ public RowDataBlock(ByteBuffer byteBuffer)
   public RoaringBitmap getNullRowIds(int colId) {
     // _fixedSizeData stores two ints per col's null bitmap: offset, and length.
     int position = _numRows * _rowSizeInBytes + colId * Integer.BYTES * 2;
-    if (position >= _fixedSizeData.limit()) {
+    if (_fixedSizeData == null || position >= _fixedSizeData.limit()) {
       return null;
     }
 
     _fixedSizeData.position(position);
     int offset = _fixedSizeData.getInt();
     int bytesLength = _fixedSizeData.getInt();
-    RoaringBitmap nullBitmap;
     if (bytesLength > 0) {
       _variableSizeData.position(offset);
       byte[] nullBitmapBytes = new byte[bytesLength];
       _variableSizeData.get(nullBitmapBytes);
-      nullBitmap = ObjectSerDeUtils.ROARING_BITMAP_SER_DE.deserialize(nullBitmapBytes);
-    } else {
-      nullBitmap = new RoaringBitmap();
+      return ObjectSerDeUtils.ROARING_BITMAP_SER_DE.deserialize(nullBitmapBytes);
     }
-    return nullBitmap;
+    return new RoaringBitmap();

Review Comment:
   Not introduced in this PR, but maybe we should return `null` here



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/blocks/IntermediateResultsBlock.java:
##########
@@ -80,45 +83,69 @@ public IntermediateResultsBlock() {
   /**
    * Constructor for selection result.
    */
-  public IntermediateResultsBlock(DataSchema dataSchema, Collection<Object[]> selectionResult) {
+  public IntermediateResultsBlock(DataSchema dataSchema, Collection<Object[]> selectionResult,
+      boolean isNullHandlingEnabled) {
     _dataSchema = dataSchema;
     _selectionResult = selectionResult;
+    _isNullHandlingEnabled = isNullHandlingEnabled;
   }
 
   /**
    * Constructor for aggregation result.
    * <p>For aggregation only, the result is a list of values.
    * <p>For aggregation group-by, the result is a list of maps from group keys to aggregation values.
    */
-  public IntermediateResultsBlock(AggregationFunction[] aggregationFunctions, List<Object> aggregationResult) {
+  public IntermediateResultsBlock(AggregationFunction[] aggregationFunctions, List<Object> aggregationResult,
+      boolean isNullHandlingEnabled) {
     _aggregationFunctions = aggregationFunctions;
     _aggregationResult = aggregationResult;
+    _isNullHandlingEnabled = isNullHandlingEnabled;
+  }
+
+  /**
+   * Constructor for aggregation result.
+   * <p>For aggregation only, the result is a list of values.
+   * <p>For aggregation group-by, the result is a list of maps from group keys to aggregation values.
+   */
+  public IntermediateResultsBlock(AggregationFunction[] aggregationFunctions, List<Object> aggregationResult,
+      DataSchema dataSchema, boolean isNullHandlingEnabled) {
+    _aggregationFunctions = aggregationFunctions;
+    _aggregationResult = aggregationResult;
+    _dataSchema = dataSchema;
+    _isNullHandlingEnabled = isNullHandlingEnabled;
   }
 
   /**
    * Constructor for aggregation group-by order-by result with {@link AggregationGroupByResult}.
    */
   public IntermediateResultsBlock(AggregationFunction[] aggregationFunctions,
-      @Nullable AggregationGroupByResult aggregationGroupByResults, DataSchema dataSchema) {
+      @Nullable AggregationGroupByResult aggregationGroupByResults, DataSchema dataSchema,
+      boolean isNullHandlingEnabled) {
     _aggregationFunctions = aggregationFunctions;
     _aggregationGroupByResult = aggregationGroupByResults;
     _dataSchema = dataSchema;
+    _isNullHandlingEnabled = isNullHandlingEnabled;
   }
 
   /**
    * Constructor for aggregation group-by order-by result with {@link AggregationGroupByResult} and
    * with a collection of intermediate records.
    */
   public IntermediateResultsBlock(AggregationFunction[] aggregationFunctions,
-      Collection<IntermediateRecord> intermediateRecords, DataSchema dataSchema) {
+      Collection<IntermediateRecord> intermediateRecords, DataSchema dataSchema, boolean isNullHandlingEnabled) {
     _aggregationFunctions = aggregationFunctions;
     _dataSchema = dataSchema;
     _intermediateRecords = intermediateRecords;
+    _isNullHandlingEnabled = isNullHandlingEnabled;
   }
 
   public IntermediateResultsBlock(Table table) {
     _table = table;
-    _dataSchema = table.getDataSchema();
+    if (_table != null) {

Review Comment:
   Why do we need to add this check?



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/blocks/IntermediateResultsBlock.java:
##########
@@ -311,16 +343,50 @@ private DataTable getResultDataTable()
       throws IOException {
     DataTableBuilder dataTableBuilder = DataTableFactory.getDataTableBuilder(_dataSchema);
     ColumnDataType[] storedColumnDataTypes = _dataSchema.getStoredColumnDataTypes();
+    int numColumns = _dataSchema.size();
     Iterator<Record> iterator = _table.iterator();
-    while (iterator.hasNext()) {
-      Record record = iterator.next();
-      dataTableBuilder.startRow();
-      int columnIndex = 0;
-      for (Object value : record.getValues()) {
-        setDataTableColumn(storedColumnDataTypes[columnIndex], dataTableBuilder, columnIndex, value);
-        columnIndex++;
+    RoaringBitmap[] nullBitmaps = null;
+    if (_isNullHandlingEnabled) {
+      nullBitmaps = new RoaringBitmap[numColumns];
+      Object[] colDefaultNullValues = new Object[numColumns];
+      for (int colId = 0; colId < numColumns; colId++) {
+        if (storedColumnDataTypes[colId] != ColumnDataType.OBJECT) {
+          colDefaultNullValues[colId] = FieldSpec.getDefaultNullValue(FieldSpec.FieldType.METRIC,

Review Comment:
   Several data types are not supported as METRIC. We should allow setting 
`null` values in `setDataTableColumn()`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

