Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-31 Thread via GitHub
aokolnychyi commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2628566860 Thanks, @huaxingao! Thanks for reviewing, @parthchandra! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-31 Thread via GitHub
aokolnychyi merged PR #9841: URL: https://github.com/apache/iceberg/pull/9841

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-31 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2628572897 @aokolnychyi Thank you so much for reviewing and merging this PR! Also thanks @parthchandra and @RussellSpitzer for reviewing!

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-31 Thread via GitHub
aokolnychyi commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2628434488 I am OK merging the change if we revert the default reader type and Comet experts approve the logic. I won't block this work because of the dependency on shaded APIs. We will n

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
parthchandra commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934904760 ## .baseline/checkstyle/checkstyle-suppressions.xml: ## @@ -48,4 +48,7 @@ + Review Comment: Thank you!

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934900306 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
parthchandra commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934889396 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934890980 ## .baseline/checkstyle/checkstyle-suppressions.xml: ## @@ -48,4 +48,7 @@ + Review Comment: I have created https://github.com/apache/datafusion-come

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934882377 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934882503 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934881973 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934875937 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934875784 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometVector.java: ## @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
parthchandra commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934545952 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
aokolnychyi commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2622487600 Sounds good. Other than what was mentioned in the review, the change looks good to me. We will have to adapt and run our JMH benchmarks as well. This can be done in a separate

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2622415260 Thanks a lot @aokolnychyi for your detailed review! I will 1. fix the shade problem 2. change the default to iceberg in the final version. I default to Comet only for testing

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
parthchandra commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934296900 ## .baseline/checkstyle/checkstyle-suppressions.xml: ## @@ -48,4 +48,7 @@ + Review Comment: The shaded imports can be removed. Comet has an API u

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-29 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1934285257 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933314814 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933314992 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933315559 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnarBatchReader.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Software

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933315889 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnarBatchReader.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Software

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933315384 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933315120 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometConstantColumnReader.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933314241 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933299355 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -27,6 +27,10 @@ private SparkSQLProperties() {} // Controls whether vecto

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933298067 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933273861 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933278905 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometConstantColumnReader.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933280739 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933281323 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnarBatchReader.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933281031 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnarBatchReader.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933279828 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometConstantColumnReader.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933275658 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933274637 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933272409 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933270313 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933266867 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933264512 ## spark/v3.4/spark-runtime/src/integration/java/org/apache/iceberg/spark/SmokeTest.java: ## @@ -44,7 +45,7 @@ public void dropTable() { // Run through our Doc's

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933263274 ## .baseline/checkstyle/checkstyle-suppressions.xml: ## @@ -48,4 +48,7 @@ + Review Comment: I wish Comet would offer an API that wraps around shad

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933257006 ## spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWriterV2.java: ## @@ -214,7 +215,7 @@ public void testWriteWithCaseSensitiveOption()

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933255066 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -27,6 +27,10 @@ private SparkSQLProperties() {} // Controls whether vec

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933253432 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatch.java: ## @@ -154,6 +172,17 @@ private boolean supportsParquetBatchReads(Types.NestedFi

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933166909 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -27,6 +27,10 @@ private SparkSQLProperties() {} // Controls whether vecto

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933168544 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933167720 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnarBatchReader.java: ## @@ -0,0 +1,198 @@ +/* + * Licensed to the Apache Software

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933166269 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933114601 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometVector.java: ## @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933115883 ## .baseline/checkstyle/checkstyle-suppressions.xml: ## @@ -48,4 +48,7 @@ + Review Comment: Comet shades arrow, protobuf and guava. ``` impor

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933109421 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933113150 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933112941 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933112611 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnarBatchReader.java: ## @@ -0,0 +1,198 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933111963 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnarBatchReader.java: ## @@ -0,0 +1,198 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933109243 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java: ## @@ -359,4 +359,12 @@ public boolean reportColumnStats() { .defaultValue(S

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933108566 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -27,6 +27,10 @@ private SparkSQLProperties() {} // Controls whether vec

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933105710 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933104797 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933104427 ## spark/v3.4/spark-runtime/src/integration/java/org/apache/iceberg/spark/SmokeTest.java: ## @@ -44,7 +45,7 @@ public void dropTable() { // Run through our Doc's G

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933101318 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933075636 ## spark/v3.4/spark-runtime/src/integration/java/org/apache/iceberg/spark/SmokeTest.java: ## @@ -44,7 +45,7 @@ public void dropTable() { // Run through our Doc's

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-28 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1933075325 ## .baseline/checkstyle/checkstyle-suppressions.xml: ## @@ -48,4 +48,7 @@ + Review Comment: I see there are imports of shaded classes in `CometCol

Re: [PR] Iceberg/Comet integration POC [iceberg]

2025-01-07 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1906083743 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnarBatchUtil.java: ## @@ -0,0 +1,226 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-12-29 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1899171737 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnVectorBuilder.java: ## @@ -46,8 +46,8 @@ public ColumnVector build(VectorHolder holde

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-12-19 Thread via GitHub
parthchandra commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2555961737 @aokolnychyi @RussellSpitzer @szehon-ho Is this getting closer to completion? asking for a friend :) Also, the Comet team is looking at providing complex type support in the re

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-12-18 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2552949729 @dpengpeng Thank you for your positive feedback! I’m glad to hear you’ve had good results with Comet on Spark. The current POC doesn't support writing Iceberg data yet, but we plan to a

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-12-18 Thread via GitHub
dpengpeng commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2552711280 @huaxingao I have used Comet on Spark and the results are very good. Now I see Iceberg and Comet working together, which is a great attempt. I look forward to Iceberg merging this POC c

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-12-17 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1889490436 ## spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWriterV2.java: ## @@ -214,7 +215,7 @@ public void testWriteWithCaseSensitiveOption() th

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-12-06 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2523905361 @aokolnychyi Could you please take a look again? I have changed the default to `Comet` to make sure all the tests run successfully with `Comet`. I will switch back to the regular iceber

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-11-28 Thread via GitHub
github-actions[bot] commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2506874150 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-10-28 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2443300032 @bmorck Thanks for your interest! Currently, this PR only enables the CometBatchReader for batch reading; it does not yet turn on Comet's native operators. In the next step, I will make

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-10-27 Thread via GitHub
bmorck commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2440275506 @huaxingao Very interested in this work and thanks a bunch for taking this on! Wanted to see if this PR addresses all changes needed on the iceberg side to integrate comet with iceb

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-10-21 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2427738422 @aokolnychyi We finally have a Comet binary release, and I've updated the PR to use it. Could you please take a look at the PR when you have time? Thanks a lot!

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-10-12 Thread via GitHub
findepi commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2408653233 thanks @huaxingao for this additional explanation. This makes sense. Can you please replicate that information in the PR description as well? thanks!

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-09-08 Thread via GitHub
manuzhang commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2337144632 @huaxingao I know that. What about alternatives? Is the Comet parquet reader the only native parquet reader implementation?

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-09-08 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2337062384 @manuzhang We have a native Parquet reader on the Comet side to take advantage of the performance gains on the native side.

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-09-08 Thread via GitHub
manuzhang commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2336715251 As a newbie to the native land, I'm wondering why we integrate with Comet parquet reader here. Any alternatives? Is there an "official" parquet reader?

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-09-04 Thread via GitHub
PaulLiang1 commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2330600257 > @PaulLiang1 Thanks! I'll check with my colleague tomorrow to find out where we are in the binary release process. got it, thanks for letting me know. please feel free to let us

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-09-03 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2327947629 @PaulLiang1 Thanks! I'll check with my colleague tomorrow to find out where we are in the binary release process.

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-09-03 Thread via GitHub
PaulLiang1 commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2327877261 Hey @huaxingao, we are really interested in this feature. How can we help get this integrated?

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-15 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1602698512 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/BatchReadConf.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-08 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1594829408 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/BatchReadConf.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-03 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1589730054 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-03 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1589715889 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java: ## @@ -184,7 +185,7 @@ public boolean orcVectorizationEnabled() { .parse();

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-03 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1589703297 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -27,6 +27,10 @@ private SparkSQLProperties() {} // Controls whether vec

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585770312 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SparkColumnarReaderFactory.java: ## @@ -28,10 +29,12 @@ class SparkColumnarReaderFactory implemen

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585769857 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatch.java: ## @@ -115,11 +115,11 @@ private String[][] computePreferredLocations() { public

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585769554 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -32,23 +32,27 @@ import org.apache.iceberg.orc.ORC; import org.apache.

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585769266 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometColumnReader.java: ## @@ -0,0 +1,163 @@ +/* + * Licensed to the Apache Software

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585767596 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnarBatchReader.java: ## @@ -74,48 +71,23 @@ public final ColumnarBatch read(ColumnarBa

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585767204 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585438330 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -32,23 +32,27 @@ import org.apache.iceberg.orc.ORC; import org.apach

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
aokolnychyi commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2086131659 Will check today.
