Jackie-Jiang commented on a change in pull request #6878: URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r626835206
########## File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java ########## @@ -538,6 +600,58 @@ public String convert(Object value, PinotDataType sourceType) { } }, + JSON { + @Override + public int toInt(Object value) { + throw new UnsupportedOperationException("Cannot convert value from JSON to INT"); + } + + @Override + public long toLong(Object value) { + throw new UnsupportedOperationException("Cannot convert value from JSON to LONG"); + } + + @Override + public float toFloat(Object value) { + throw new UnsupportedOperationException("Cannot convert value from JSON to FLOAT"); + } + + @Override + public double toDouble(Object value) { + throw new UnsupportedOperationException("Cannot convert value from JSON to DOUBLE"); + } + + @Override + public boolean toBoolean(Object value) { + throw new UnsupportedOperationException("Cannot convert value from JSON to BOOLEAN"); + } + + @Override + public Timestamp toTimestamp(Object value) { + throw new UnsupportedOperationException("Cannot convert value from JSON to TIMESTAMP"); + } + + @Override + public String toString(Object value) { + return value.toString(); + } + + @Override + public String toJson(Object value) { + return value.toString(); + } + + @Override + public byte[] toBytes(Object value) { + throw new UnsupportedOperationException("Cannot convert value from JSON to BYTES"); + } + + @Override + public String convert(Object value, PinotDataType sourceType) { + return sourceType.toString(value); Review comment: (Critical) ```suggestion return sourceType.toJson(value); ``` ########## File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java ########## @@ -538,6 +600,58 @@ public String convert(Object value, PinotDataType sourceType) { } }, + JSON { + @Override + public int toInt(Object value) { Review comment: We should parse the json and convert to value if the json is a value node, or use the same method as the STRING type because value node json can be read as plain string ########## File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java ########## @@ -986,7 +1126,7 @@ public PinotDataType getSingleValueType() { /** * Returns the {@link PinotDataType} for the given {@link FieldSpec} for data ingestion purpose. Returns object array * type for multi-valued types. - * TODO: Add MV support for BOOLEAN, TIMESTAMP, BYTES + * TODO: Add MV support for BOOLEAN, TIMESTAMP, BYTES, JSON Review comment: I don't think we need MV support for JSON as it can hold array ########## File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/columnminmaxvalue/ColumnMinMaxValueGenerator.java ########## @@ -89,7 +89,7 @@ private void addColumnMinMaxValueForColumn(String columnName) } PinotDataBuffer dictionaryBuffer = _segmentWriter.getIndexFor(columnName, ColumnIndexType.DICTIONARY); - DataType dataType = columnMetadata.getDataType(); + DataType dataType = columnMetadata.getDataType().getStoredType(); Review comment: Good catch ########## File path: pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java ########## @@ -285,7 +285,7 @@ void readIntValues(int[] docIds, int length, int[] valueBuffer) { _reader.readDictIds(docIds, length, dictIdBuffer, readerContext); _dictionary.readIntValues(dictIdBuffer, length, valueBuffer); } else { - switch (_reader.getValueType()) { + switch (_reader.getValueType().getStoredType()) { Review comment: We don't need stored type lookup here as the value type of reader is already stored type ########## File path: pinot-common/src/main/java/org/apache/pinot/common/function/FunctionUtils.java ########## @@ -31,6 +31,9 @@ private FunctionUtils() { } + // TODO: Do we need to create a class for handling JSON. It doesn't seem like this is needed since JSON type directly Review comment: Currently we use `String` to represent JSON. If we change that to use `JsonNode`, then we can add `JsonNode` here. One problem of adding `JsonNode` is that it is an abstract class, and won't match the actual class ########## File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java ########## @@ -748,6 +874,20 @@ public String toString(Object value) { return getSingleValueType().toString(toObjectArray(value)[0]); } + public String toJson(Object value) { + String jsonString = getSingleValueType().toString(toObjectArray(value)[0]); + + // Check if the string is valid JSON. + JsonNode jsonNode = null; + try { + jsonNode = JsonUtils.stringToJsonNode(jsonString); + } catch (IOException ioe) { + throw new UnsupportedOperationException("Cannot convert value from STRING to JSON"); + } + + return jsonNode.toString(); + } Review comment: For string value, we should try to parse it as json string; for other type value, we should try to parse it as general object: ```suggestion public String toJson(Object value) { if (value instanceof String) { try { return JsonUtils.stringToJsonNode((String) value).toString(); } catch (Exception e) { throw new RuntimeException("Caught exception while parsing string '" + value + "' as json", e); } } else { try { return JsonUtils.objectToString(value).toString(); } catch (Exception e) { throw new RuntimeException("Caught exception while serializing object '" + value + "' as json", e); } } } ``` ########## File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java ########## @@ -748,6 +874,20 @@ public String toString(Object value) { return getSingleValueType().toString(toObjectArray(value)[0]); } + public String toJson(Object value) { + String jsonString = getSingleValueType().toString(toObjectArray(value)[0]); + + // Check if the string is valid JSON. + JsonNode jsonNode = null; + try { + jsonNode = JsonUtils.stringToJsonNode(jsonString); + } catch (IOException ioe) { + throw new UnsupportedOperationException("Cannot convert value from STRING to JSON"); + } + + return jsonNode.toString(); + } Review comment: No need to override this method for different data types ########## File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java ########## @@ -633,6 +633,7 @@ public static void addColumnMetadataInfo(PropertiesConfiguration properties, Str properties.setProperty(getKeyFor(column, TOTAL_DOCS), String.valueOf(totalDocs)); DataType dataType = fieldSpec.getDataType(); properties.setProperty(getKeyFor(column, DATA_TYPE), String.valueOf(dataType)); + properties.setProperty(getKeyFor(column, STORAGE_FORMAT), V1Constants.Str.STORAGE_FORMAT_DEFAULT_VALUE); Review comment: Let's not add the storage format for now. I don't have a clear picture on how this field is going to be used. When we want to support another storage format, we can add it then. ########## File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/inv/json/BaseJsonIndexCreator.java ########## @@ -87,7 +89,15 @@ @Override public void add(String jsonString) throws IOException { - addFlattenedRecords(JsonUtils.flatten(JsonUtils.stringToJsonNode(jsonString))); + JsonNode jsonNode = JsonUtils.stringToJsonNode(jsonString); Review comment: We don't really need another validation here. It is already validated during the type conversion (PinotDataType) ########## File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/predicate/EqualsPredicateEvaluatorFactory.java ########## @@ -72,6 +72,7 @@ public static BaseRawValueBasedPredicateEvaluator newRawValueBasedEvaluator(EqPr case TIMESTAMP: return new LongRawValueBasedEqPredicateEvaluator(TimestampUtils.toMillisSinceEpoch(value)); case STRING: + case JSON: Review comment: Not sure if we want to allow EQ operation on `JSON` field, same for other predicates -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org