Copilot commented on code in PR #17113:
URL: https://github.com/apache/pinot/pull/17113#discussion_r2478052272
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/stats/MapColumnPreIndexStatsCollector.java:
##########
@@ -105,6 +110,18 @@ public void collect(Object entry) {
updatePartition(key);
}
}
+ if (keyStats instanceof StringColumnPreIndexStatsCollector) {
+ if (value instanceof String || value instanceof Number || value
instanceof Boolean) {
+ keyStats.collect(String.valueOf(value));
+ continue;
+ }
+ try {
+ keyStats.collect(JsonUtils.objectToString(value));
+ continue;
+ } catch (JsonProcessingException e) {
+ throw new RuntimeException(e);
Review Comment:
The RuntimeException wrapping JsonProcessingException lacks context about
which key or value caused the failure. Consider adding a descriptive message
like: `throw new RuntimeException(\"Failed to serialize value for key '\" + key
+ \"'\", e);`
```suggestion
throw new RuntimeException("Failed to serialize value for key '"
+ key + "': " + value, e);
```
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/stats/MapColumnPreIndexStatsCollector.java:
##########
@@ -51,14 +55,15 @@
* heterogeneous value types for a key are encountered will construct the Map
statistics it can be raised as a fault.
*/
public class MapColumnPreIndexStatsCollector extends
AbstractColumnStatisticsCollector {
+ private static final Logger LOGGER =
LoggerFactory.getLogger(MapColumnPreIndexStatsCollector.class);
Review Comment:
The logger is declared but not used elsewhere in this file. If it's intended
for the newly added warning at line 233, ensure the logger is being used
consistently. If the logger was already present and is used in other parts of
the class not shown in this diff, this comment can be disregarded.
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/stats/MapColumnPreIndexStatsCollector.java:
##########
@@ -208,18 +224,24 @@ private AbstractColumnStatisticsCollector
createKeyStatsCollector(String key, Ob
return new DoubleColumnPreIndexStatsCollector(key, config);
case BIG_DECIMAL:
return new BigDecimalColumnPreIndexStatsCollector(key, config);
+ case BOOLEAN:
case STRING:
+ case MAP:
+ case OBJECT:
return new StringColumnPreIndexStatsCollector(key, config);
default:
- throw new UnsupportedOperationException(String.format("MAP column does
not yet support '%s'", type));
+ LOGGER.warn("Unknown data type {} for key {} and value {}", type, key,
value);
+ return new StringColumnPreIndexStatsCollector(key, config);
}
}
+ /**
+ * Convert Map value data type to stored field type.
+ * Note that all unknown types are automatically converted to String type.
+ */
private static FieldSpec.DataType convertToDataType(PinotDataType ty) {
// TODO: I've been told that we already have a function to do this, so
find that function and replace this
Review Comment:
This TODO should be resolved or removed. If a duplicate function exists
elsewhere in the codebase, it should be used to avoid code duplication. If no
such function exists, remove the TODO comment.
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/defaultcolumn/BaseDefaultColumnHandler.java:
##########
@@ -1193,10 +1221,12 @@ private void
createDerivedColumnForwardIndexWithoutDictionary(String column, Fie
forwardIndexCreator.putBytesMV((byte[][]) outputValue);
break;
default:
- throw new IllegalStateException();
+ throw new IllegalStateException(
+ "Unsupported data type: " + fieldSpec.getDataType() + ", for
value: " + outputValue);
}
}
}
+ forwardIndexCreator.seal();
Review Comment:
The `forwardIndexCreator.seal()` call is added outside the conditional
blocks but should only be called once per creator instance. This placement
suggests it will always execute, but verify that this doesn't cause issues in
the `if (_isSingleValue)` branch where seal() may have already been called
within the block for certain data types.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]