siddharthteotia commented on a change in pull request #6918:
URL: https://github.com/apache/incubator-pinot/pull/6918#discussion_r632854615



##########
File path: pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/ComplexTypeTransformerTest.java
##########
@@ -237,4 +238,76 @@ public void testUnnestCollection() {
     next = itr.next();
     Assert.assertEquals("v2", next.getValue("array.a"));
   }
+
+  @Test
+  public void testConvertCollectionToString() {
+    // json convert inner collections
+    // {
+    //   "array":[
+    //      {
+    //         "array1":[
+    //            {
+    //               "b":"v1"
+    //            }
+    //         ]
+    //      }
+    //   ]
+    // }
+    // is converted to
+    // [{
+    //   "array.array1":"[
+    //            {
+    //               "b":"v1"
+    //            }
+    //         ]"
+    // }]
+    ComplexTypeTransformer transformer = new ComplexTypeTransformer(Arrays.asList("array"), ".");
+    GenericRow genericRow = new GenericRow();
+    Map<String, Object> map = new HashMap<>();
+    Object[] array1 = new Object[1];
+    array1[0] = ImmutableMap.of("b", "v1");
+    map.put("array1", array1);
+    Object[] array = new Object[1];
+    array[0] = map;
+    genericRow.putValue("array", array);
+    transformer.transform(genericRow);
+    Assert.assertNotNull(genericRow.getValue(GenericRow.MULTIPLE_RECORDS_KEY));
+    Collection<GenericRow> collection = (Collection<GenericRow>) genericRow.getValue(GenericRow.MULTIPLE_RECORDS_KEY);
+    GenericRow row = collection.iterator().next();
+    Assert.assertTrue(row.getValue("array.array1") instanceof String);
+
+    // primitive array not converted

Review comment:
       As an MV column. But here we are talking about an array inside a nested 
object, which doesn't get the same storage format and treatment (at least in the 
current design) as an MV column in the schema. 
   
   If I have something like 
   
   ```
   {
      name
      age
      salary
      array of organizationIds (primitive integers)
      array of my addresses (another nested struct)
   }
   ```
   
   Even though organizationIds is an array of primitive INTs, we currently 
don't store it the way we would if it were an independent MV column in the 
schema. So the collection unnesting rule in the design should apply to both 
collections in the same way. Similarly, if the user chooses not to unnest 
them, the behavior (the JSON string serialization rule) should be the same for 
both collections, regardless of whether one holds primitives and the other 
does not. 
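
To make the point concrete, here is a rough sketch of the uniform rule, assuming the record is a plain `Map<String, Object>` and collections arrive as `Object[]` or `Collection`. The class and method names (`CollectionToStringSketch`, `convertCollections`, `toJson`) are made up for illustration and are not the `ComplexTypeTransformer` code in this PR. The hand-rolled JSON writer only escapes quotes and backslashes, and primitive Java arrays such as `int[]` are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: this is NOT Pinot's ComplexTypeTransformer.
public class CollectionToStringSketch {

  // Minimal JSON writer for maps, collections, Object[] arrays, strings,
  // numbers, booleans and null. Control characters are not escaped.
  public static String toJson(Object value) {
    if (value == null) {
      return "null";
    }
    if (value instanceof Object[]) {
      return toJson(Arrays.asList((Object[]) value));
    }
    if (value instanceof Collection) {
      List<String> parts = new ArrayList<>();
      for (Object element : (Collection<?>) value) {
        parts.add(toJson(element));
      }
      return "[" + String.join(",", parts) + "]";
    }
    if (value instanceof Map) {
      List<String> parts = new ArrayList<>();
      for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
        parts.add(quote(String.valueOf(entry.getKey())) + ":" + toJson(entry.getValue()));
      }
      return "{" + String.join(",", parts) + "}";
    }
    if (value instanceof Number || value instanceof Boolean) {
      return value.toString();
    }
    return quote(value.toString());
  }

  // Wraps a string in double quotes, escaping backslashes and quotes.
  public static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  // Replaces every collection-valued field with its JSON string. The same
  // rule is applied whether the elements are primitives or nested structs.
  public static Map<String, Object> convertCollections(Map<String, Object> record) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (Map.Entry<String, Object> entry : record.entrySet()) {
      Object value = entry.getValue();
      boolean isCollection = value instanceof Collection || value instanceof Object[];
      result.put(entry.getKey(), isCollection ? toJson(value) : value);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Object> address = new LinkedHashMap<>();
    address.put("city", "Pune");

    Map<String, Object> person = new LinkedHashMap<>();
    person.put("name", "x");
    person.put("organizationIds", new Object[] {101, 102}); // primitives
    person.put("addresses", new Object[] {address});        // nested structs

    System.out.println(convertCollections(person));
  }
}
```

With this rule, `organizationIds` and `addresses` both come out as `String` values, so a reader of the flattened record does not need to know which of the two held primitives.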






