Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-04-05 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001625405 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/DataFileServices.java: ## @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-04-05 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001607244 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-04-05 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001143364 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,450 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-04-04 Thread via GitHub
snazy commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2003301101 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/data/FlinkObjectModels.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-19 Thread via GitHub
pvary commented on PR #12298: URL: https://github.com/apache/iceberg/pull/12298#issuecomment-2736976799 Here are a few important open questions: 1. We should decide on the expected filtering behavior. Currently the filters are applied as best effort for the file format readers. We might d

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-19 Thread via GitHub
snazy commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2003295410 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileToObjectModelRegistry.java: ## @@ -0,0 +1,257 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-19 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2002938197 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-19 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2002920042 ## core/src/main/java/org/apache/iceberg/io/datafile/WriterBuilderBase.java: ## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-19 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2002913615 ## core/src/main/java/org/apache/iceberg/avro/Avro.java: ## @@ -99,92 +121,187 @@ public static WriteBuilder write(OutputFile file) { return new WriteBuilder(file)

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-19 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2002914040 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileToObjectModelRegistry.java: ## @@ -0,0 +1,257 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-19 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2002908440 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/data/FlinkObjectModels.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-19 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2002757412 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileToObjectModelRegistry.java: ## @@ -0,0 +1,257 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-19 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2002558500 ## core/src/main/java/org/apache/iceberg/avro/Avro.java: ## @@ -287,14 +405,17 @@ CodecFactory codec() { } } + @Deprecated Review Comment: Definitely dep

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001465756 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001145882 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,450 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
snazy commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001211823 ## core/src/main/java/org/apache/iceberg/avro/Avro.java: ## @@ -287,14 +405,17 @@ CodecFactory codec() { } } + @Deprecated Review Comment: Are these just

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001642637 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001631404 ## data/src/main/java/org/apache/iceberg/data/RegistryBasedFileWriterFactory.java: ## @@ -0,0 +1,191 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001644270 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001623231 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/DataFileServices.java: ## @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001637794 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001632538 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001631968 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001618838 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/DataFileServices.java: ## @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001616111 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Expr

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001614642 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Expr

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001612190 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Expr

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001610963 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Expr

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001610427 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Expr

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001609254 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnarBatchReader.java: ## @@ -43,7 +43,7 @@ * populated via delegated read calls to

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001604660 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001466783 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001472189 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001470065 ## core/src/main/java/org/apache/iceberg/io/datafile/DeleteFilter.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001467974 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001469432 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001467341 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001463969 ## core/src/main/resources/META-INF/services/org.apache.iceberg.io.datafile.DataFileServiceRegistry$WriterService: ## @@ -0,0 +1,20 @@ +# +# Licensed to the Apache Soft

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001462905 ## core/src/main/java/org/apache/iceberg/io/datafile/DataWriterBuilder.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-18 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r2001139816 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,420 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-17 Thread via GitHub
liurenjie1024 commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r165495 ## core/src/main/java/org/apache/iceberg/io/datafile/AppenderBuilder.java: ## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-17 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1999135766 ## core/src/main/java/org/apache/iceberg/io/datafile/AppenderBuilder.java: ## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-17 Thread via GitHub
liurenjie1024 commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1998644714 ## core/src/main/java/org/apache/iceberg/io/datafile/AppenderBuilder.java: ## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-17 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1998451396 ## core/src/main/java/org/apache/iceberg/io/datafile/AppenderBuilder.java: ## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996040055 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996026806 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996047949 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Exp

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996050564 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Exp

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996055858 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/DataFileServices.java: ## @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996102983 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996110305 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996027399 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996063073 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/DataFileServices.java: ## @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996049790 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Exp

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996038651 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996104117 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996058909 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/DataFileServices.java: ## @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
liurenjie1024 commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996663917 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-15 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996035290 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,420 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996046263 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnarBatchReader.java: ## @@ -43,7 +43,7 @@ * populated via delegated read calls t

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996104527 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996077906 ## data/src/main/java/org/apache/iceberg/data/RegistryBasedFileWriterFactory.java: ## @@ -0,0 +1,191 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996105772 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996106267 ## core/src/main/java/org/apache/iceberg/io/datafile/WriteBuilder.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
danielcweeks commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996093918 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996064834 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/DataFileServices.java: ## @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996048683 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Exp

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996047261 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -65,76 +59,32 @@ protected CloseableIterable newBatchIterable( Exp

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996031442 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
liurenjie1024 commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1995602495 ## core/src/main/java/org/apache/iceberg/io/datafile/AppenderBuilder.java: ## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996040462 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996038345 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996036485 ## core/src/main/java/org/apache/iceberg/io/datafile/DeleteFilter.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996033869 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996030623 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996028761 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1996029195 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-14 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1995629867 ## core/src/main/java/org/apache/iceberg/io/datafile/ReadBuilder.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-03-02 Thread via GitHub
pvary commented on PR #12298: URL: https://github.com/apache/iceberg/pull/12298#issuecomment-2691104513 @rdblue: Addressed most of your comments. Could you check if you agree with the general approach? If the community is satisfied with the approach, we might want to proceed by separating o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-24 Thread via GitHub
pvary commented on PR #12298: URL: https://github.com/apache/iceberg/pull/12298#issuecomment-2678012816 > While I think the goal here is a good one, the implementation looks too complex to be workable in its current form. I'm happy that we agree with the goals. I created a PR to start

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-24 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1967301718 ## core/src/main/resources/META-INF/services/org.apache.iceberg.io.datafile.DataFileServiceRegistry$WriterService: ## @@ -0,0 +1,20 @@ +# +# Licensed to the Apache Soft

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-24 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1967264015 ## core/src/main/java/org/apache/iceberg/avro/Avro.java: ## @@ -786,4 +831,51 @@ public AvroIterable build() { public static long rowCount(InputFile file) { re

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-24 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1967261354 ## core/src/main/java/org/apache/iceberg/io/datafile/AppenderBuilder.java: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
rdblue commented on PR #12298: URL: https://github.com/apache/iceberg/pull/12298#issuecomment-2675856645 While I think the goal here is a good one, the implementation looks too complex to be workable in its current form. The primary issue that we currently have is adapting object mode

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
rdblue commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1966319213 ## core/src/main/resources/META-INF/services/org.apache.iceberg.io.datafile.DataFileServiceRegistry$WriterService: ## @@ -0,0 +1,20 @@ +# +# Licensed to the Apache Sof

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
danielcweeks commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1966087640 ## core/src/main/java/org/apache/iceberg/avro/Avro.java: ## @@ -786,4 +831,51 @@ public AvroIterable build() { public static long rowCount(InputFile file) {

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
danielcweeks commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1966102494 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,420 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
danielcweeks commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1966074053 ## core/src/main/java/org/apache/iceberg/io/datafile/AppenderBuilder.java: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
danielcweeks commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1966063531 ## core/src/main/java/org/apache/iceberg/io/datafile/DataWriterBuilder.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
pvary commented on PR #12298: URL: https://github.com/apache/iceberg/pull/12298#issuecomment-2674653648 I will start to collect the differences here between the different writer types for reference: - Writer context is different between delete and data files. This contains TablePropertie

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1965491428 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,450 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1965239684 ## core/src/main/java/org/apache/iceberg/io/datafile/ReaderBuilder.java: ## @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1965269911 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,450 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1965268175 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,450 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
pvary commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1965265739 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,450 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] WIP: Interface based DataFile reader and writer API [iceberg]

2025-02-21 Thread via GitHub
liurenjie1024 commented on code in PR #12298: URL: https://github.com/apache/iceberg/pull/12298#discussion_r1965202207 ## core/src/main/java/org/apache/iceberg/io/datafile/DataFileServiceRegistry.java: ## @@ -0,0 +1,450 @@ +/* + * Licensed to the Apache Software Foundation (ASF)