amogh-jahagirdar commented on code in PR #16501:
URL: https://github.com/apache/iceberg/pull/16501#discussion_r3282538671


##########
arrow/src/test/java/org/apache/iceberg/arrow/vectorized/TestArrowReader.java:
##########
@@ -388,6 +388,99 @@ public void testTimestampMillisAreReadCorrectly() throws 
Exception {
     assertThat(totalRowsRead).as("Should read all 
rows").isEqualTo(millisValues.size());
   }
 
+  /**
+   * Regression test: a decimal column whose Iceberg field carries an 
initialDefault/writeDefault
+   * must be readable by the vectorized reader. Before the fix to {@code
+   * VectorizedArrowReader#getPhysicalType}, allocating the vector copied the 
decimal-typed default
+   * onto the underlying physical type (int/long/fixed) and failed with {@code
+   * IllegalArgumentException: Cannot cast default value to ...}.
+   */
+  @Test
+  public void testDecimalWithDefaultIsReadByVectorizedReader() throws 
Exception {

Review Comment:
   So the short answer is that when the parquet schema is visited by the 
vectorized read builder and identifies there's a field not in the parquet file 
but in the table schema with a default , a Constant vector reader is created 
with the default value.  



##########
arrow/src/test/java/org/apache/iceberg/arrow/vectorized/TestArrowReader.java:
##########
@@ -388,6 +388,99 @@ public void testTimestampMillisAreReadCorrectly() throws 
Exception {
     assertThat(totalRowsRead).as("Should read all 
rows").isEqualTo(millisValues.size());
   }
 
+  /**
+   * Regression test: a decimal column whose Iceberg field carries an 
initialDefault/writeDefault

Review Comment:
   Is there a way to update the existing TestParquetVectorizedReads. That 
already implements a mixin `supportsDefaultValues()` and already runs through 
decimal type. I think the issue as you pointed out is specifically when it's 
not dictionary encoded (which for decimal I'd actually expect to generally be 
the case). Maybe in that class when we produce the writer there's a way to pass 
through options that disable dictionary encoding?  



##########
arrow/src/test/java/org/apache/iceberg/arrow/vectorized/TestArrowReader.java:
##########
@@ -388,6 +388,99 @@ public void testTimestampMillisAreReadCorrectly() throws 
Exception {
     assertThat(totalRowsRead).as("Should read all 
rows").isEqualTo(millisValues.size());
   }
 
+  /**
+   * Regression test: a decimal column whose Iceberg field carries an 
initialDefault/writeDefault

Review Comment:
   I'd also look at goldenFilesAndEncodings in that existing test class 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to