Jackie-Jiang commented on code in PR #18368:
URL: https://github.com/apache/pinot/pull/18368#discussion_r3321232618
##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/ComplexFieldSpec.java:
##########
@@ -132,6 +172,46 @@ public static ComplexFieldSpec fromMapFieldSpec(MapFieldSpec mapFieldSpec) {
Map.of(KEY_FIELD, mapFieldSpec.getKeyFieldSpec(), VALUE_FIELD, mapFieldSpec.getValueFieldSpec()));
}
+ /**
+ * View over a {@link ComplexFieldSpec} whose {@code dataType} is {@link DataType#OPEN_STRUCT}.
+ * Exposes the per-key declared types (from {@code childFieldSpecs}) and the required fallback
+ * {@code defaultValueFieldSpec}.
+ */
+ public static class OpenStructFieldSpec {
Review Comment:
I don't think we need this special wrapper for `OpenStructFieldSpec`. It is
the same as `ComplexFieldSpec`, so we can just use that directly.
##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/ComplexFieldSpec.java:
##########
@@ -56,8 +58,17 @@ public final class ComplexFieldSpec extends FieldSpec {
public static final String KEY_FIELD = "key";
public static final String VALUE_FIELD = "value";
+ /// Default {@code defaultValueFieldSpec} used for {@link DataType#OPEN_STRUCT} columns when
+ /// the schema does not declare one explicitly. Keys with no per-key type hint are stored as
+ /// single-value STRING.
+ public static final FieldSpec DEFAULT_OPEN_STRUCT_VALUE_FIELD_SPEC =
+ new DimensionFieldSpec("default", DataType.STRING, true);
+
private final Map<String, FieldSpec> _childFieldSpecs;
+ @Nullable
+ private FieldSpec _defaultValueFieldSpec;
Review Comment:
What is the actual use of the default field spec? I'd imagine the data type
should always be auto-derived. For JSON data, storing every value as STRING
would be very inefficient, and it can even produce wrong results for numeric
comparisons.
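To illustrate the numeric-comparison concern, here is a minimal standalone Java sketch. It is not Pinot code: the class and method names are hypothetical, and it only assumes that values stored as STRING end up compared lexicographically, while values typed as LONG are compared by magnitude.

```java
public class StringVsNumericComparison {
  // Lexicographic comparison, as applies when values are kept as STRING.
  static int compareAsStrings(String left, String right) {
    return left.compareTo(right);
  }

  // Numeric comparison, as applies when the same values are typed as LONG.
  static int compareAsLongs(String left, String right) {
    return Long.compare(Long.parseLong(left), Long.parseLong(right));
  }

  public static void main(String[] args) {
    // "9" sorts after "10" lexicographically because '9' > '1'.
    System.out.println(compareAsStrings("9", "10") > 0); // true
    // As numbers, 9 is smaller than 10.
    System.out.println(compareAsLongs("9", "10") < 0); // true
  }
}
```

Under that assumption, a filter such as `value > 10` would wrongly match a stored "9" if the comparison were done on strings.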
##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/ComplexFieldSpec.java:
##########
@@ -83,6 +107,22 @@ public Map<String, FieldSpec> getChildFieldSpecs() {
return _childFieldSpecs;
}
+ /// Returns the {@code defaultValueFieldSpec} for OPEN_STRUCT columns, falling back to
+ /// {@link #DEFAULT_OPEN_STRUCT_VALUE_FIELD_SPEC} (single-value STRING) when the schema did
+ /// not declare one. Returns {@code null} for non-OPEN_STRUCT data types.
+ @JsonProperty("defaultValueFieldSpec")
Review Comment:
(minor) No need to add `@JsonProperty`. This method is not used for
serialization; serialization always goes through `toJsonObject()`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]