siddharthteotia commented on a change in pull request #5746:
URL: https://github.com/apache/incubator-pinot/pull/5746#discussion_r460317334



##########
File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/recordtransformer/DataTypeTransformer.java
##########
@@ -87,6 +90,8 @@ public GenericRow transform(GenericRow record) {
         source = MULTI_VALUE_TYPE_MAP.get(values[0].getClass());
         if (source == null) {
           source = PinotDataType.OBJECT_ARRAY;
+        } else if (source == PinotDataType.HASHMAP) {

Review comment:
       This should be done conditionally. In other words, consider two columns:
   
   col1 - a primitive column defined as MV int/float/double/string (a simple 
array of primitive types that Pinot supports). Something like 
"CompaniesWorkedAt".
   ```
   [
        "val1",
        "val2",
        "val3"
   ]
   ```
   col2 - a complex column defined as MV. Something like addresses, which is an 
array of structs or an array of maps.
   
   ```
   [
       {
            "k1" : "v1",
            "k2":  "v2",
            "k3": "v3"
      }
   ]
   ```
   
   The second is a complex column, whereas the first is a standard primitive 
one. Now, AvroRecordExtractor and AvroUtils.convert() would have returned the 
second as an array of Map/HashMap, and they do the same for our sample data:
   
   ```
   "dimension_***" : [ {
       "item_id" : {
         "string" : "some data"
       }
     }, {
       "item_id" : {
         "string" : "some data"
       }
     }
   ]
   ```
   Since the schema for the "dimension_***" column indicates it is an array of 
primitives, we need to extract the actual value from each HashMap. In such 
cases the map size would also be 1, but that alone should not be the condition, 
since the map size could be 1 even if the object is actually a complex one. So 
we should convert from Map to primitive only if the schema says so.
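
A minimal sketch of the schema-driven check, in plain Java with no Pinot dependencies. The class name, the method name, and the `destIsPrimitiveMV` flag are hypothetical: the flag stands in for whatever the schema says about the destination column. The sketch also assumes each element reaches the transformer as a single-entry `Map` such as `{item_id=some data}`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Pinot code base.
final class MapUnwrapSketch {

  /**
   * Unwraps single-entry maps into their value, but only when the schema
   * declares the destination column as a multi-value primitive.
   * destIsPrimitiveMV is an assumed stand-in for a lookup on the schema.
   */
  static Object[] unwrapIfSchemaSaysPrimitive(Object[] values, boolean destIsPrimitiveMV) {
    if (!destIsPrimitiveMV) {
      // Complex column: keep the maps, even if a map happens to have size 1.
      return values;
    }
    Object[] result = new Object[values.length];
    for (int i = 0; i < values.length; i++) {
      Object value = values[i];
      if (value instanceof Map && ((Map<?, ?>) value).size() == 1) {
        // Primitive column: take the only value held by the map.
        result[i] = ((Map<?, ?>) value).values().iterator().next();
      } else {
        // Anything else is passed through unchanged.
        result[i] = value;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Object> element = new HashMap<>();
    element.put("item_id", "some data");
    Object[] values = {element};

    // Same input, different outcome depending on what the schema says.
    System.out.println(Arrays.toString(unwrapIfSchemaSaysPrimitive(values, true)));
    System.out.println(Arrays.toString(unwrapIfSchemaSaysPrimitive(values, false)));
  }
}
```

The point of the flag is that the map size is only checked after the schema has already said the column is primitive, so a size-1 map in a genuinely complex column is left alone.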



