Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]

2024-04-18 Thread via GitHub
sdd commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1570115947 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]

2024-04-18 Thread via GitHub
sdd commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1570116819 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-18 Thread via GitHub
nastra commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1570117531 ## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ## @@ -2329,6 +2332,119 @@ public void multipleDiffsAgainstMultipleTablesLastFails() { assertTh

Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]

2024-04-18 Thread via GitHub
sdd commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1570118932 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-18 Thread via GitHub
nastra commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1570124615 ## core/src/main/java/org/apache/iceberg/rest/CatalogHandlers.java: ## @@ -117,6 +118,29 @@ public static ListNamespacesResponse listNamespaces( return ListNamespac

Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]

2024-04-18 Thread via GitHub
sdd commented on PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#issuecomment-2063172829 FAO @Fokko @marvinlanhenke @liurenjie1024: Comments addressed, ready for re-review -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] Backport: Exclude `docutils!=0.21` as a dependency (#615) [iceberg-python]

2024-04-18 Thread via GitHub
HonahX commented on PR #616: URL: https://github.com/apache/iceberg-python/pull/616#issuecomment-2063206552 Sorry I missed this one and just opened and merged a duplicate PR: #617 :scream:. I am too eager to finish the release.. Thanks for preparing the backport PR! I will close i

Re: [PR] Backport: Exclude `docutils!=0.21` as a dependency (#615) [iceberg-python]

2024-04-18 Thread via GitHub
HonahX closed pull request #616: Backport: Exclude `docutils!=0.21` as a dependency (#615) URL: https://github.com/apache/iceberg-python/pull/616

Re: [PR] Tests: Fix unstable test_timestamp_to_date due to timezone [iceberg-python]

2024-04-18 Thread via GitHub
HonahX merged PR #612: URL: https://github.com/apache/iceberg-python/pull/612

[I] Fix dependabot issue [iceberg-python]

2024-04-18 Thread via GitHub
Fokko opened a new issue, #618: URL: https://github.com/apache/iceberg-python/issues/618 ### Feature Request / Improvement @HonahX spotted this, looks like it cannot parse the `pyproject.toml` for an unknown reason: https://github.com/apache/iceberg-python/security/dependabot/16/

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-04-18 Thread via GitHub
javrasya commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-2063300863 Sorry for the inconvenience, didn't know how to run the checks locally, now I know and it is all good in my local. Pushed the changes. 🙏 @stevenzwu

Re: [PR] Backport: Exclude `docutils!=0.21` as a dependency (#615) [iceberg-python]

2024-04-18 Thread via GitHub
Fokko commented on PR #616: URL: https://github.com/apache/iceberg-python/pull/616#issuecomment-2063317152 @HonahX No problem, thanks for the RC3 👍

Re: [I] Fix dependabot issue [iceberg-python]

2024-04-18 Thread via GitHub
Fokko commented on issue #618: URL: https://github.com/apache/iceberg-python/issues/618#issuecomment-2063318138 After reverting https://github.com/Fokko/iceberg-python/commit/55cdb150de1a25919927f98eb7095f88a7f695e7 Dependabot started working again

[PR] Move Ruff configuration to separate config file [iceberg-python]

2024-04-18 Thread via GitHub
Fokko opened a new pull request, #619: URL: https://github.com/apache/iceberg-python/pull/619 Dependabot stopped working after merging: https://github.com/apache/iceberg-python/commit/b829737940c5cf35d169a784ef45a6bddbc4a645 This broke dependabot since it is now encountering is

[PR] Kafka Connect: Add kerberos authentication option [iceberg]

2024-04-18 Thread via GitHub
Dawnpool opened a new pull request, #10173: URL: https://github.com/apache/iceberg/pull/10173 Hello, I am making the same PR as the one in the [original repository](https://github.com/tabular-io/iceberg-kafka-connect/pull/236) because I was told that it is being moved to this core reposi

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-04-18 Thread via GitHub
pvary merged PR #9464: URL: https://github.com/apache/iceberg/pull/9464

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-04-18 Thread via GitHub
pvary commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-2063424953 Thanks for the PR @javrasya and @stevenzwu for the review! @javrasya: Please port the changes to v1.19, and v1.17 Thanks, Peter

[I] Changes in describe behaviour of a table break partition info? [iceberg]

2024-04-18 Thread via GitHub
brysd opened a new issue, #10174: URL: https://github.com/apache/iceberg/issues/10174 ### Apache Iceberg version 1.4.1 ### Query engine Spark ### Please describe the bug 🐞 Related to #6290 we build upon the spark DESCRIBE statement to retrieve the partition

Re: [PR] Hive: turn off the stats gathering when iceberg.hive.keep.stats is false [iceberg]

2024-04-18 Thread via GitHub
pvary commented on PR #10148: URL: https://github.com/apache/iceberg/pull/10148#issuecomment-2063505375 Thanks @deniskuzZ for the info! Good to know that it will not hurt the Hive integration, to have this PR in. @stargrey102: Could we create a test which actually checks that the stat

Re: [I] Timestamp/Day transform returns Date as required type while days is actually stored integer [iceberg]

2024-04-18 Thread via GitHub
zinking commented on issue #10159: URL: https://github.com/apache/iceberg/issues/10159#issuecomment-2063528864 > @zinking You may check [this comment](https://github.com/apache/iceberg/issues/279#issuecomment-519620975) for the background. thanks

Re: [I] Timestamp/Day transform returns Date as required type while days is actually stored integer [iceberg]

2024-04-18 Thread via GitHub
zinking closed issue #10159: Timestamp/Day transform returns Date as required type while days is actually stored integer URL: https://github.com/apache/iceberg/issues/10159

Re: [I] ORC file format support [iceberg-python]

2024-04-18 Thread via GitHub
samuforvia commented on issue #20: URL: https://github.com/apache/iceberg-python/issues/20#issuecomment-2063540742 Is there any update on this? Specifically looking for support regarding Hive Catalog, which is usually managed via Trino. However, feature to ingest some data using pyiceberg w

Re: [PR] Move Ruff configuration to separate config file [iceberg-python]

2024-04-18 Thread via GitHub
Fokko merged PR #619: URL: https://github.com/apache/iceberg-python/pull/619

[PR] Build: Bump idna from 3.6 to 3.7 [iceberg-python]

2024-04-18 Thread via GitHub
dependabot[bot] opened a new pull request, #620: URL: https://github.com/apache/iceberg-python/pull/620 Bumps [idna](https://github.com/kjd/idna) from 3.6 to 3.7. Release notes Sourced from [idna's releases](https://github.com/kjd/idna/releases). v3.7 What's Changed

Re: [PR] Build: Bump idna from 3.6 to 3.7 [iceberg-python]

2024-04-18 Thread via GitHub
Fokko merged PR #620: URL: https://github.com/apache/iceberg-python/pull/620

Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]

2024-04-18 Thread via GitHub
marvinlanhenke commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1570491035 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more c

[I] java.lang.NoClassDefFoundError: scala/jdk/CollectionConverters$ [iceberg]

2024-04-18 Thread via GitHub
celltobig opened a new issue, #10175: URL: https://github.com/apache/iceberg/issues/10175 ### Apache Iceberg version 1.5.0 (latest release) ### Query engine Spark ### Please describe the bug 🐞 use sparkSesson.SQL run error maybe only spark3.3 has org.

Re: [I] java.lang.NoClassDefFoundError: scala/jdk/CollectionConverters$ [iceberg]

2024-04-18 Thread via GitHub
celltobig commented on issue #10175: URL: https://github.com/apache/iceberg/issues/10175#issuecomment-2063661245 ![Uploading image.png…]()

Re: [PR] WIP: View Spec implementation [iceberg-rust]

2024-04-18 Thread via GitHub
c-thiel commented on PR #331: URL: https://github.com/apache/iceberg-rust/pull/331#issuecomment-2063797958 @Fokko that is a hard topic. The idealist in me would like to eventually see something like Substrait being used. Adoption of the project is very slow across engines though.

Re: [I] NPE During RewriteDataFiles Action with Nessie [iceberg]

2024-04-18 Thread via GitHub
ajantha-bhat commented on issue #10110: URL: https://github.com/apache/iceberg/issues/10110#issuecomment-2063857060 closing as it was a user error and user confirmed that it worked after correcting it.

Re: [PR] [WIP] Integration with Datafusion [iceberg-rust]

2024-04-18 Thread via GitHub
simonvandel commented on code in PR #324: URL: https://github.com/apache/iceberg-rust/pull/324#discussion_r1570757993 ## crates/integrations/datafusion/src/catalog.rs: ## @@ -0,0 +1,67 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

Re: [PR] Add Files metadata table [iceberg-python]

2024-04-18 Thread via GitHub
syun64 commented on code in PR #614: URL: https://github.com/apache/iceberg-python/pull/614#discussion_r1570839749 ## pyiceberg/table/__init__.py: ## @@ -3537,6 +3537,58 @@ def update_partitions_map( schema=table_schema, ) +def files(self) -> "pa.Tabl

Re: [I] Integration tests performance degradation [iceberg-python]

2024-04-18 Thread via GitHub
Gowthami03B commented on issue #604: URL: https://github.com/apache/iceberg-python/issues/604#issuecomment-2063983365 Interesting, @kevinjqliu are you already working on this?

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-04-18 Thread via GitHub
javrasya commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-2064193035 Is that manually done @pvary or is there a utility in the repo to do that?

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1570983709 ## api/src/main/java/org/apache/iceberg/expressions/ExpressionUtil.java: ## @@ -564,6 +575,7 @@ private static String sanitizeDate(int days, int today) { return "(da

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1570983215 ## api/src/main/java/org/apache/iceberg/expressions/ExpressionUtil.java: ## @@ -600,6 +612,12 @@ private static String sanitizeString(CharSequence value, long now, int t

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1570986587 ## api/src/main/java/org/apache/iceberg/expressions/ExpressionUtil.java: ## @@ -600,6 +612,12 @@ private static String sanitizeString(CharSequence value, long now, int t

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1570986131 ## api/src/main/java/org/apache/iceberg/transforms/Bucket.java: ## @@ -54,6 +54,7 @@ static & SerializableFunction> B get( return (B) new BucketInteger(numBucke

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1570986975 ## api/src/main/java/org/apache/iceberg/transforms/Timestamps.java: ## @@ -89,17 +153,21 @@ public SerializableFunction bind(Type type) { @Override public boolea

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1570987468 ## api/src/main/java/org/apache/iceberg/transforms/Timestamps.java: ## @@ -112,11 +180,11 @@ public boolean satisfiesOrderOf(Transform other) { } if (other in

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1570988298 ## api/src/main/java/org/apache/iceberg/transforms/TransformUtil.java: ## @@ -62,6 +62,14 @@ static String humanTimestampWithoutZone(Long timestampMicros) { return

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1570989064 ## api/src/main/java/org/apache/iceberg/transforms/Timestamps.java: ## @@ -31,54 +32,117 @@ import org.apache.iceberg.util.DateTimeUtil; import org.apache.iceberg.util.

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on PR #9008: URL: https://github.com/apache/iceberg/pull/9008#issuecomment-2064277011 @rdblue Thanks for the review! Please have another look.

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-04-18 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1570989849 ## api/src/test/java/org/apache/iceberg/expressions/TestMiscLiteralConversions.java: ## @@ -99,8 +117,10 @@ public void testInvalidBooleanConversions() { Types.D

Re: [PR] Spark 3.5: Spark action to compute the partition stats [iceberg]

2024-04-18 Thread via GitHub
ajantha-bhat commented on PR #9437: URL: https://github.com/apache/iceberg/pull/9437#issuecomment-2064341275 @aokolnychyi: Please find the new PR that just works on local algorithm. https://github.com/apache/iceberg/pull/10176 > - We should focus on the local implem

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-18 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1571017091 ## core/src/main/java/org/apache/iceberg/rest/CatalogHandlers.java: ## @@ -117,6 +118,29 @@ public static ListNamespacesResponse listNamespaces( return ListNamespa

Re: [PR] Add Files metadata table [iceberg-python]

2024-04-18 Thread via GitHub
kevinjqliu commented on code in PR #614: URL: https://github.com/apache/iceberg-python/pull/614#discussion_r1571023200 ## tests/conftest.py: ## @@ -2060,7 +2060,7 @@ def spark() -> "SparkSession": .config("spark.sql.catalog.hive.warehouse", "s3://warehouse/hive/")

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-04-18 Thread via GitHub
ajantha-bhat commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1571024420 ## data/src/main/java/org/apache/iceberg/data/GeneratePartitionStats.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-04-18 Thread via GitHub
ajantha-bhat commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1571025684 ## data/src/main/java/org/apache/iceberg/data/PartitionStatsWriterUtil.java: ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-04-18 Thread via GitHub
ajantha-bhat commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1571030511 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,213 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Kevinjqliu/poc parallelize tests [iceberg-python]

2024-04-18 Thread via GitHub
Fokko commented on PR #598: URL: https://github.com/apache/iceberg-python/pull/598#issuecomment-2064376042 Why do I still have to approve your runs? :D

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-04-18 Thread via GitHub
ajantha-bhat commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1571032953 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,213 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-04-18 Thread via GitHub
ajantha-bhat commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1571033881 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,213 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [I] Integration tests performance degradation [iceberg-python]

2024-04-18 Thread via GitHub
kevinjqliu commented on issue #604: URL: https://github.com/apache/iceberg-python/issues/604#issuecomment-2064388651 Here's what I'm blocked on specifically. Parallelize this test `test_query_filter_appended_null`, ``` PYTEST_ARGS="-n auto -k test_query_filter_appended_null" /us

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-04-18 Thread via GitHub
ajantha-bhat commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1571037034 ## data/src/main/java/org/apache/iceberg/data/PartitionStatsWriterUtil.java: ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[PR] Build: Bump aiohttp from 3.9.3 to 3.9.4 [iceberg-python]

2024-04-18 Thread via GitHub
dependabot[bot] opened a new pull request, #621: URL: https://github.com/apache/iceberg-python/pull/621 Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.9.3 to 3.9.4. Release notes Sourced from [aiohttp's releases](https://github.com/aio-libs/aiohttp/releases). 3.9

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-04-18 Thread via GitHub
mas-chen commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-2064457196 @javrasya manually but you can use this as an example https://github.com/apache/iceberg/pull/9464 (see PR description)

Re: [PR] Add Files metadata table [iceberg-python]

2024-04-18 Thread via GitHub
Gowthami03B commented on code in PR #614: URL: https://github.com/apache/iceberg-python/pull/614#discussion_r1571063843 ## tests/conftest.py: ## @@ -2060,7 +2060,7 @@ def spark() -> "SparkSession": .config("spark.sql.catalog.hive.warehouse", "s3://warehouse/hive/")

Re: [PR] Add Files metadata table [iceberg-python]

2024-04-18 Thread via GitHub
kevinjqliu commented on code in PR #614: URL: https://github.com/apache/iceberg-python/pull/614#discussion_r1571070645 ## tests/conftest.py: ## @@ -2060,7 +2060,7 @@ def spark() -> "SparkSession": .config("spark.sql.catalog.hive.warehouse", "s3://warehouse/hive/")

Re: [PR] Flink: FlinkFileIO implementation [iceberg]

2024-04-18 Thread via GitHub
rodmeneses commented on code in PR #10151: URL: https://github.com/apache/iceberg/pull/10151#discussion_r1571082650 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkFileIO.java: ## @@ -0,0 +1,182 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Flink: FlinkFileIO implementation [iceberg]

2024-04-18 Thread via GitHub
rodmeneses commented on code in PR #10151: URL: https://github.com/apache/iceberg/pull/10151#discussion_r1571083046 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkFileIO.java: ## @@ -0,0 +1,182 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Flink: FlinkFileIO implementation [iceberg]

2024-04-18 Thread via GitHub
rodmeneses commented on code in PR #10151: URL: https://github.com/apache/iceberg/pull/10151#discussion_r1571086551 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkFileIO.java: ## @@ -0,0 +1,182 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Flink: FlinkFileIO implementation [iceberg]

2024-04-18 Thread via GitHub
rodmeneses commented on code in PR #10151: URL: https://github.com/apache/iceberg/pull/10151#discussion_r1571089828 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkFileIO.java: ## @@ -0,0 +1,182 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Flink: FlinkFileIO implementation [iceberg]

2024-04-18 Thread via GitHub
rodmeneses commented on code in PR #10151: URL: https://github.com/apache/iceberg/pull/10151#discussion_r1571092822 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkInputFile.java: ## @@ -0,0 +1,179 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Flink: FlinkFileIO implementation [iceberg]

2024-04-18 Thread via GitHub
rodmeneses commented on code in PR #10151: URL: https://github.com/apache/iceberg/pull/10151#discussion_r1571094073 ## flink/v1.18/flink/src/test/java/org/apache/iceberg/flink/FlinkFileIOTest.java: ## @@ -0,0 +1,212 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-04-18 Thread via GitHub
elkhand commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-2064558588 @javrasya Thanks for the PR. I think @mas-chen wanted to mention this PR: https://github.com/apache/iceberg/pull/9334 instead.

Re: [I] Fix dependabot issue [iceberg-python]

2024-04-18 Thread via GitHub
Fokko closed issue #618: Fix dependabot issue URL: https://github.com/apache/iceberg-python/issues/618

Re: [I] Fix dependabot issue [iceberg-python]

2024-04-18 Thread via GitHub
Fokko commented on issue #618: URL: https://github.com/apache/iceberg-python/issues/618#issuecomment-2064568680 Closing this, dependabot is back!

Re: [PR] Bump to Spark 3.4.3 [iceberg-python]

2024-04-18 Thread via GitHub
HonahX merged PR #622: URL: https://github.com/apache/iceberg-python/pull/622

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-18 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1571188458 ## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ## @@ -2329,6 +2332,119 @@ public void multipleDiffsAgainstMultipleTablesLastFails() { assertT

Re: [PR] Flink: port #9464 to v1.17 and v1.19 [iceberg]

2024-04-18 Thread via GitHub
elkhand commented on PR #10177: URL: https://github.com/apache/iceberg/pull/10177#issuecomment-2064958765 cc: @stevenzwu @pvary will appreciate your review on https://github.com/apache/iceberg/pull/9464 backport into Flink 1.17 and Flink 1.19. cc: @javrasya

Re: [PR] Add Files metadata table [iceberg-python]

2024-04-18 Thread via GitHub
geruh commented on code in PR #614: URL: https://github.com/apache/iceberg-python/pull/614#discussion_r1571235918 ## pyiceberg/table/__init__.py: ## @@ -3537,6 +3537,58 @@ def update_partitions_map( schema=table_schema, ) +def files(self) -> "pa.Table

Re: [PR] Add Files metadata table [iceberg-python]

2024-04-18 Thread via GitHub
geruh commented on code in PR #614: URL: https://github.com/apache/iceberg-python/pull/614#discussion_r1571237708 ## pyiceberg/table/__init__.py: ## @@ -3537,6 +3537,58 @@ def update_partitions_map( schema=table_schema, ) +def files(self) -> "pa.Table

Re: [PR] Build: Bump aiohttp from 3.9.3 to 3.9.4 [iceberg-python]

2024-04-18 Thread via GitHub
Fokko commented on PR #621: URL: https://github.com/apache/iceberg-python/pull/621#issuecomment-2064985759 @dependabot rebase

Re: [PR] Add Files metadata table [iceberg-python]

2024-04-18 Thread via GitHub
Fokko commented on code in PR #614: URL: https://github.com/apache/iceberg-python/pull/614#discussion_r1571241435 ## tests/conftest.py: ## @@ -2060,7 +2060,7 @@ def spark() -> "SparkSession": .config("spark.sql.catalog.hive.warehouse", "s3://warehouse/hive/") .

Re: [PR] Add Files metadata table [iceberg-python]

2024-04-18 Thread via GitHub
geruh commented on code in PR #614: URL: https://github.com/apache/iceberg-python/pull/614#discussion_r1571245762 ## tests/integration/test_inspect_table.py: ## @@ -445,3 +445,65 @@ def check_pyiceberg_df_equals_spark_df(df: pa.Table, spark_df: DataFrame) -> Non df = t

Re: [PR] Build: Bump aiohttp from 3.9.3 to 3.9.4 [iceberg-python]

2024-04-18 Thread via GitHub
Fokko merged PR #621: URL: https://github.com/apache/iceberg-python/pull/621

Re: [I] Iceberg may occur data duplication when use flink to write data to iceberg and commit failed [iceberg]

2024-04-18 Thread via GitHub
pvary closed issue #10165: Iceberg may occur data duplication when use flink to write data to iceberg and commit failed URL: https://github.com/apache/iceberg/issues/10165

Re: [I] Iceberg may occur data duplication when use flink to write data to iceberg and commit failed [iceberg]

2024-04-18 Thread via GitHub
pvary commented on issue #10165: URL: https://github.com/apache/iceberg/issues/10165#issuecomment-2065036455 Could you please describe the exact si

Re: [PR] Fix dependency with `deptry` [iceberg-python]

2024-04-18 Thread via GitHub
Fokko commented on code in PR #534: URL: https://github.com/apache/iceberg-python/pull/534#discussion_r1571256522 ## pyproject.toml: ## @@ -72,6 +72,10 @@ gcsfs = { version = ">=2023.1.0,<2024.1.0", optional = true } psycopg2-binary = { version = ">=2.9.6", optional = true } s

Re: [PR] Docs: Update features for Hive 4.0 [iceberg]

2024-04-18 Thread via GitHub
pvary commented on code in PR #10162: URL: https://github.com/apache/iceberg/pull/10162#discussion_r1571271577 ## docs/docs/hive.md: ## @@ -34,6 +34,32 @@ Iceberg compatibility with Hive 2.x and Hive 3.1.2/3 supports the following feat !!! warning DML operations work only

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571283635 ## spark/v3.4/build.gradle: ## @@ -70,8 +70,11 @@ project(":iceberg-spark:iceberg-spark-${sparkMajorVersion}_${scalaVersion}") { exclude group: 'io.netty

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571286633 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -196,6 +201,40 @@ private Duration toDuration(String time) { } }

Re: [PR] Docs: Update features for Hive 4.0 [iceberg]

2024-04-18 Thread via GitHub
pvary commented on code in PR #10162: URL: https://github.com/apache/iceberg/pull/10162#discussion_r1571289437 ## docs/docs/hive.md: ## @@ -431,12 +466,120 @@ ALTER TABLE t SET TBLPROPERTIES ('storage_handler'='org.apache.iceberg.mr.hive.H During the migration the data files a

Re: [PR] Flink: port #9464 to v1.17 and v1.19 [iceberg]

2024-04-18 Thread via GitHub
stevenzwu merged PR #10177: URL: https://github.com/apache/iceberg/pull/10177

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571293170 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/BaseColumnBatchLoader.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571295057 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/BaseColumnBatchLoader.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571301509 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometColumnReader.java: ## @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache Soft

Re: [PR] Flink: backport PR #9464 for being able to serialize splits with bigger payload [iceberg]

2024-04-18 Thread via GitHub
javrasya closed pull request #10178: Flink: backport PR #9464 for being able to serialize splits with bigger payload URL: https://github.com/apache/iceberg/pull/10178

Re: [PR] Flink: backport PR #9464 for being able to serialize splits with bigger payload [iceberg]

2024-04-18 Thread via GitHub
javrasya commented on PR #10178: URL: https://github.com/apache/iceberg/pull/10178#issuecomment-2065211514 Closing since this was already done (Kudos to @elkhand) https://github.com/apache/iceberg/pull/10177

Re: [PR] Flink: port #9464 to v1.17 and v1.19 [iceberg]

2024-04-18 Thread via GitHub
javrasya commented on PR #10177: URL: https://github.com/apache/iceberg/pull/10177#issuecomment-2065212817 Thank you for helping @elkhand 🙏

[PR] Verify release quality of life improvements [iceberg-python]

2024-04-18 Thread via GitHub
kevinjqliu opened a new pull request, #626: URL: https://github.com/apache/iceberg-python/pull/626 Minor edits to the "Verify Release" instructions. Added link to "verify release" in the release email. Here's what the "Cast the vote" section looks like in Markdown ![Screenshot 20

Re: [PR] Verify release quality of life improvements [iceberg-python]

2024-04-18 Thread via GitHub
kevinjqliu commented on code in PR #626: URL: https://github.com/apache/iceberg-python/pull/626#discussion_r1571442239 ## mkdocs/docs/verify-release.md: ## @@ -105,15 +105,17 @@ make test To run the full integration tests: Review Comment: I did not fully understand L87-89.

[PR] Build: Bump adlfs from 2024.2.0 to 2024.4.1 [iceberg-python]

2024-04-18 Thread via GitHub
dependabot[bot] opened a new pull request, #627: URL: https://github.com/apache/iceberg-python/pull/627 Bumps [adlfs](https://github.com/fsspec/adlfs) from 2024.2.0 to 2024.4.1. Release notes: sourced from [adlfs's releases](https://github.com/fsspec/adlfs/releases). 2024.4.1

[PR] Build: Bump pyarrow from 15.0.0 to 15.0.2 [iceberg-python]

2024-04-18 Thread via GitHub
dependabot[bot] opened a new pull request, #628: URL: https://github.com/apache/iceberg-python/pull/628 Bumps [pyarrow](https://github.com/apache/arrow) from 15.0.0 to 15.0.2. Commits [e03105e](https://github.com/apache/arrow/commit/e03105efc38edca4ca429bf967a17b4d0fbebe40) MINO
