Copilot commented on code in PR #16027: URL: https://github.com/apache/pinot/pull/16027#discussion_r2132435490
########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/RecordReaderSegmentCreationDataSource.java: ########## @@ -89,6 +96,18 @@ public SegmentPreIndexStatsCollector gatherStats(StatsCollectorConfig statsColle } } + private Set<String> getNotNullColumns(Schema schema) { + Set<String> notNullColumns = new HashSet<>(); + for (FieldSpec fieldSpec : schema.getAllFieldSpecs()) { + String columnName = fieldSpec.getName(); + LOGGER.info("Column: {} is not nullable", columnName); + if (schema.isEnableColumnBasedNullHandling() && !fieldSpec.isNullable()) { Review Comment: Logging every column as not nullable is misleading; move this log inside the conditional that checks for non-nullable fields or adjust the message to reflect the actual condition. ```suggestion if (schema.isEnableColumnBasedNullHandling() && !fieldSpec.isNullable()) { LOGGER.info("Column: {} is not nullable", columnName); ``` ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/RecordReaderSegmentCreationDataSource.java: ########## @@ -100,4 +119,15 @@ public RecordReader getRecordReader() { return _recordReader; } + + private boolean skipRowIfNeeded(Set<String> notNullColumns, GenericRow row) { + for (String columnName : notNullColumns) { + if (row.getValue(columnName) == null) { + // If column is not nullable and not present in the row, skip the row Review Comment: The comment mentions 'not present' but this condition checks for a null value; consider clarifying it to 'value is null' so it accurately describes the check. ```suggestion // If column is not nullable and its value is null in the row, skip the row ``` ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java: ########## @@ -352,6 +361,17 @@ public void indexRow(GenericRow row) _docIdCounter++; } + private boolean skipRowIfNeeded(GenericRow row) { Review Comment: The skip logic is duplicated across multiple classes; consider extracting this into a shared utility method to reduce code duplication and improve maintainability. ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java: ########## @@ -352,6 +361,17 @@ public void indexRow(GenericRow row) _docIdCounter++; } + private boolean skipRowIfNeeded(GenericRow row) { + for (String columnName : _notNullColumns) { + if (row.getValue(columnName) == null) { + // If column is not nullable and not present in the row, skip the row + LOGGER.debug("Skipping row because the not-null column {} is not present in the row", columnName); Review Comment: Update the log message to reflect that the column's value is null rather than 'not present' to avoid confusion. ```suggestion LOGGER.debug("Skipping row because the not-null column {} has a null value in the row", columnName); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org