pvary commented on code in PR #12771:
URL: https://github.com/apache/iceberg/pull/12771#discussion_r2039250002


##########
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkParquetWriter.java:
##########
@@ -151,4 +152,27 @@ public void testFpp() throws IOException, NoSuchFieldException, IllegalAccessExc
       assertThat(fpp).isEqualTo(0.05);
     }
   }
+
+  @Test
+  public void testColumnStatsEnabled()
+      throws IOException, NoSuchFieldException, IllegalAccessException {
+    File testFile = File.createTempFile("junit", null, temp.toFile());
+    try (FileAppender<InternalRow> writer =
+        Parquet.write(Files.localOutput(testFile))
+            .schema(SCHEMA)
+            .set(PARQUET_COLUMN_STATS_ENABLED_PREFIX + "id_long", "false")
+            .createWriterFunc(
+                msgType ->
+                    SparkParquetWriters.buildWriter(SparkSchemaUtil.convert(SCHEMA), msgType))
+            .build()) {
+      // Using reflection to access the private 'props' field in ParquetWriter
+      Field propsField = writer.getClass().getDeclaredField("props");
+      propsField.setAccessible(true);
+      ParquetProperties props = (ParquetProperties) propsField.get(writer);
+      MessageType parquetSchema = ParquetSchemaUtil.convert(SCHEMA, "test");
+      ColumnDescriptor idlDescriptor = parquetSchema.getColumnDescription(new String[] {"id_long"});
+      // Default statisticsEnabled should be true and for column id_long, it is disabled.
+      assertThat(props.getStatisticsEnabled(idlDescriptor)).isEqualTo(false);
+    }
+  }

Review Comment:
   Why is this test in Spark?
   It tests a Parquet writer feature, so I think it belongs in TestParquet, or 
somewhere near it. There we would not need reflection to check the setting: a 
package-private method with a VisibleForTesting annotation would be enough.
   
   Also, it would be good to test the actual effect on the written Parquet 
files, if that is possible.
   
   I see that the previous bloom filter test lives here as well, but that only 
means it should have been placed there too. Maybe move it in a separate PR?
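
To illustrate the package-private accessor idea, here is a minimal, self-contained sketch. It is not the actual Iceberg `ParquetWriter`: the `SketchWriter` class, its `statisticsEnabled` method, and the locally declared `VisibleForTesting` annotation are hypothetical stand-ins (real code would use the annotation the project already depends on). It only shows how a test in the same package can read the setting without reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

public class VisibleForTestingSketch {

  // Hypothetical stand-in for a VisibleForTesting marker annotation.
  @Retention(RetentionPolicy.SOURCE)
  @Target({ElementType.METHOD, ElementType.FIELD})
  @interface VisibleForTesting {}

  // Hypothetical writer that keeps per-column statistics settings private.
  static final class SketchWriter {
    private final Map<String, Boolean> statsEnabledByColumn;

    SketchWriter(Map<String, Boolean> overrides) {
      this.statsEnabledByColumn = new HashMap<>(overrides);
    }

    // Package-private: callable from tests in the same package, no reflection needed.
    // Columns without an explicit override default to statistics enabled.
    @VisibleForTesting
    boolean statisticsEnabled(String column) {
      return statsEnabledByColumn.getOrDefault(column, true);
    }
  }

  public static void main(String[] args) {
    Map<String, Boolean> overrides = new HashMap<>();
    overrides.put("id_long", false);
    SketchWriter writer = new SketchWriter(overrides);

    // The overridden column is disabled; any other column keeps the default.
    if (writer.statisticsEnabled("id_long")) {
      throw new AssertionError("expected stats disabled for id_long");
    }
    if (!writer.statisticsEnabled("data")) {
      throw new AssertionError("expected stats enabled by default");
    }
    System.out.println("ok");
  }
}
```

A test placed in the same package as the writer could then assert on the accessor directly, which avoids `getDeclaredField` and `setAccessible` and fails at compile time if the method is renamed.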



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


