siddharthteotia commented on a change in pull request #7918:
URL: https://github.com/apache/pinot/pull/7918#discussion_r772589770



##########
File path: 
pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java
##########
@@ -210,14 +212,15 @@ public String getSegmentMetadata(
               columnLength = storedDataType.size();
             } else if (columnMetadata.hasDictionary()) {
               // For type of variable width (String, Bytes), if it's stored 
using dictionary encoding, set the
-              // columnLength as the max
-              // length in dictionary.
+              // columnLength as the max length in dictionary.
               columnLength = columnMetadata.getColumnMaxLength();
             } else if (storedDataType == DataType.STRING || storedDataType == 
DataType.BYTES) {
               // For type of variable width (String, Bytes), if it's stored 
using raw bytes, set the columnLength as
-              // the length
-              // of the max value.
-              columnLength = ((String) 
columnMetadata.getMaxValue()).getBytes(StandardCharsets.UTF_8).length;
+              // the length of the max value.
+              ByteArray maxValueByteArray =
+                  storedDataType == DataType.BYTES ? ((ByteArray) 
columnMetadata.getMaxValue())
+                      : BytesUtils.toByteArray((String) 
columnMetadata.getMaxValue());
+              columnLength = maxValueByteArray.length();

Review comment:
       @mqliang  This code change does not look correct to me. AFAIK, we want 
to fix couple of problems
   
   - Use of` * `to allow fetching metadata for all columns instead of 
individual column. Looks like this has been fixed by encoding `*` ? Please add 
tests for the same
   
   - A weird scenario where the use of particular column name (internal) fails 
the API call for some reason. That column type is STRING and stored type is 
also STRING (UTF-8 bytes essentially). I don't think this is related to the 
handling of BYTES / BYTE_ARRAY. While column of data types / stored types as 
BYTES and BYTE_ARRAY should also be handled, the problem we wanted to fix does 
not seem to be related to it or at least there is no evidence from debugging. I 
suggest debugging this more to understand the root cause and then come back on 
this part of the change




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to