huaxingao commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1597722756
########## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkParquetWriter.java: ########## @@ -116,4 +128,27 @@ public void testCorrectness() throws IOException { assertThat(rows).as("Should not have extra rows").isExhausted(); } } + + @Test + public void testFpp() throws IOException, NoSuchFieldException, IllegalAccessException { + File testFile = File.createTempFile("junit", null, temp.toFile()); + try (FileAppender<InternalRow> writer = + Parquet.write(Files.localOutput(testFile)) + .schema(SCHEMA) + .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id", "true") + .set(PARQUET_BLOOM_FILTER_COLUMN_FPP_PREFIX + "id", "0.05") + .createWriterFunc( + msgType -> + SparkParquetWriters.buildWriter(SparkSchemaUtil.convert(SCHEMA), msgType)) + .build()) { + // Using reflection to access the private 'props' field in ParquetWriter + Field propsField = writer.getClass().getDeclaredField("props"); + propsField.setAccessible(true); + ParquetProperties props = (ParquetProperties) propsField.get(writer); + MessageType parquetSchema = ParquetSchemaUtil.convert(SCHEMA, "test"); + ColumnDescriptor descriptor = parquetSchema.getColumnDescription(new String[] {"id"}); + double fpp = props.getBloomFilterFPP(descriptor).getAsDouble(); + assertThat(fpp).isEqualTo(0.05); Review Comment: parquet-mr takes the `bloomFilterFPPs` in `ParquetProperties` and uses it to build the bloom filter. Checking `bloomFilterFPPs` in `ParquetProperties` is sufficient for verifying bloom filter fpp is set correctly in Iceberg. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org