Re: [I] Parallel Table.append [iceberg-python]

2024-05-13 Thread via GitHub
Fokko closed issue #428: Parallel Table.append URL: https://github.com/apache/iceberg-python/issues/428 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [I] Parallel Table.append [iceberg-python]

2024-05-13 Thread via GitHub
Fokko commented on issue #428: URL: https://github.com/apache/iceberg-python/issues/428#issuecomment-2106799664 Fixed in #444

Re: [I] Faster ingestion from Parquet [iceberg-python]

2024-05-13 Thread via GitHub
Fokko closed issue #346: Faster ingestion from Parquet URL: https://github.com/apache/iceberg-python/issues/346

Re: [PR] Build: Bump nessie from 0.81.1 to 0.82.0 [iceberg]

2024-05-13 Thread via GitHub
ajantha-bhat commented on PR #10318: URL: https://github.com/apache/iceberg/pull/10318#issuecomment-2106836122 Flink flaky test: `TestIcebergSourceFailoverWithWatermarkExtractor > testBoundedWithSavepoint FAILED`

Re: [PR] Build: Bump nessie from 0.81.1 to 0.82.0 [iceberg]

2024-05-13 Thread via GitHub
ajantha-bhat commented on PR #10318: URL: https://github.com/apache/iceberg/pull/10318#issuecomment-2106836296 @dependabot rebase

Re: [PR] Build: Bump nessie from 0.81.1 to 0.82.0 [iceberg]

2024-05-13 Thread via GitHub
dependabot[bot] commented on PR #10318: URL: https://github.com/apache/iceberg/pull/10318#issuecomment-2106836353 Sorry, only users with push access can use that command.

Re: [I] Add documentation for views [iceberg]

2024-05-13 Thread via GitHub
nastra closed issue #9846: Add documentation for views URL: https://github.com/apache/iceberg/issues/9846

Re: [PR] Implement BoundPredicateVisitor trait for ManifestFilterVisitor [iceberg-rust]

2024-05-13 Thread via GitHub
marvinlanhenke commented on code in PR #367: URL: https://github.com/apache/iceberg-rust/pull/367#discussion_r1598017270 ## crates/iceberg/src/expr/visitors/manifest_evaluator.rs: ## @@ -103,98 +106,245 @@ impl BoundPredicateVisitor for ManifestFilterVisitor<'_> { refe

Re: [PR] Concurrent table scans [iceberg-rust]

2024-05-13 Thread via GitHub
marvinlanhenke commented on code in PR #373: URL: https://github.com/apache/iceberg-rust/pull/373#discussion_r1598053461 ## crates/iceberg/src/scan.rs: ## @@ -189,66 +195,20 @@ impl TableScan { self.case_sensitive, )?; -let mut partition_filter_ca

Re: [I] Using the Iceberg catalog in your file system [iceberg]

2024-05-13 Thread via GitHub
nastra commented on issue #10326: URL: https://github.com/apache/iceberg/issues/10326#issuecomment-2106958655 @911432 can you please elaborate what the goal here is? Everything you described is already possible today.

Re: [I] catalog issue [iceberg]

2024-05-13 Thread via GitHub
nastra commented on issue #10324: URL: https://github.com/apache/iceberg/issues/10324#issuecomment-2106963864 > But if I tried to run a session with a different catalog such as demo_cat it's able to show all databases and table referred to localcat catalog. Do you have example output

Re: [PR] Build: Bump nessie from 0.81.1 to 0.82.0 [iceberg]

2024-05-13 Thread via GitHub
Fokko commented on PR #10318: URL: https://github.com/apache/iceberg/pull/10318#issuecomment-2107445552 https://github.com/dependabot rebase

Re: [PR] Build: Bump software.amazon.awssdk:bom from 2.25.45 to 2.25.50 [iceberg]

2024-05-13 Thread via GitHub
Fokko merged PR #10323: URL: https://github.com/apache/iceberg/pull/10323

Re: [PR] Build: Bump nessie from 0.81.1 to 0.82.0 [iceberg]

2024-05-13 Thread via GitHub
ajantha-bhat commented on PR #10318: URL: https://github.com/apache/iceberg/pull/10318#issuecomment-2107509238 Failure is independent of this change ``` * What went wrong: Execution failed for task ':iceberg-gcp:compileTestJava'. > Could not resolve all files for configuration

Re: [PR] Build: Bump nessie from 0.81.1 to 0.82.0 [iceberg]

2024-05-13 Thread via GitHub
ajantha-bhat commented on PR #10318: URL: https://github.com/apache/iceberg/pull/10318#issuecomment-2107510067 https://github.com/dependabot rebase

Re: [PR] Build: Bump nessie from 0.81.1 to 0.82.0 [iceberg]

2024-05-13 Thread via GitHub
dependabot[bot] commented on PR #10318: URL: https://github.com/apache/iceberg/pull/10318#issuecomment-2107510177 Sorry, only users with push access can use that command.

Re: [I] Using the Iceberg catalog in your file system [iceberg]

2024-05-13 Thread via GitHub
911432 commented on issue #10326: URL: https://github.com/apache/iceberg/issues/10326#issuecomment-2107631861 I would like to store the query engine as a container image and the iceberg table and iceberg catalog as a file system. Let's take this [spark page](https://iceberg.apache.org/do

Re: [PR] Add create_namespace_if_not_exists method [iceberg-python]

2024-05-13 Thread via GitHub
ndrluis commented on PR #725: URL: https://github.com/apache/iceberg-python/pull/725#issuecomment-2107649058 @syun64 Thank you for your review. I have made the requested change!

Re: [PR] Support creating tags by adding `set_ref_snapshot` API [iceberg-python]

2024-05-13 Thread via GitHub
syun64 commented on PR #728: URL: https://github.com/apache/iceberg-python/pull/728#issuecomment-2107718666 Yes - even if it's small, I think it would still be good to have a unit test that verifies the behavior of the proposed table and transaction API. There are some tests in https:/

Re: [I] Using the Iceberg catalog in your file system [iceberg]

2024-05-13 Thread via GitHub
911432 commented on issue #10326: URL: https://github.com/apache/iceberg/issues/10326#issuecomment-2107762849 I know `Spark.sql.catalog.hadoop_prod.uri` doesn't seem to exist. Similarly, for s3 and files, I hope `Spark.sql.catalog..warehouse` is sufficient even without `Spark.sql.catalog

Re: [I] Using the Iceberg catalog in your file system [iceberg]

2024-05-13 Thread via GitHub
911432 commented on issue #10326: URL: https://github.com/apache/iceberg/issues/10326#issuecomment-2107764935 I know `Spark.sql.catalog.hadoop_prod.uri` doesn't seem to exist. Similarly, for s3 and file, I hope `Spark.sql.catalog..warehouse` is sufficient even without `Spark.sql.catalog.

Re: [PR] Support partial deletes [iceberg-python]

2024-05-13 Thread via GitHub
syun64 commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1598626280 ## pyiceberg/table/__init__.py: ## @@ -443,6 +471,74 @@ def overwrite( for data_file in data_files: update_snapshot.append_dat

Re: [PR] Spark 3.5: Add validation to SparkConfParser [iceberg]

2024-05-13 Thread via GitHub
manuzhang commented on code in PR #10315: URL: https://github.com/apache/iceberg/pull/10315#discussion_r1598686177 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -227,7 +225,26 @@ public ThisT tableProperty(String name) { return sel

Re: [PR] feat: support append data file and add e2e test [iceberg-rust]

2024-05-13 Thread via GitHub
ZENOTME commented on code in PR #349: URL: https://github.com/apache/iceberg-rust/pull/349#discussion_r1598713751 ## crates/iceberg/src/transaction.rs: ## @@ -121,6 +166,270 @@ impl<'a> Transaction<'a> { } } +/// FastAppendAction is a transaction action for fast append d

Re: [PR] feat: support append data file and add e2e test [iceberg-rust]

2024-05-13 Thread via GitHub
ZENOTME commented on code in PR #349: URL: https://github.com/apache/iceberg-rust/pull/349#discussion_r1598744515 ## crates/iceberg/src/transaction.rs: ## @@ -121,6 +166,270 @@ impl<'a> Transaction<'a> { } } +/// FastAppendAction is a transaction action for fast append d

Re: [PR] feat: support append data file and add e2e test [iceberg-rust]

2024-05-13 Thread via GitHub
ZENOTME commented on PR #349: URL: https://github.com/apache/iceberg-rust/pull/349#issuecomment-2108162712 Hi, I have tried to fix this PR. Some things may not be fixed well now: 1. https://github.com/apache/iceberg-rust/pull/349#discussion_r1580444775 I'm not sure whether my understa

Re: [PR] Support partial deletes [iceberg-python]

2024-05-13 Thread via GitHub
syun64 commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1598823358 ## pyiceberg/table/__init__.py: ## @@ -2931,14 +3161,52 @@ def _deleted_entries(self) -> List[ManifestEntry]: return [] -class OverwriteFiles(_MergingS

Re: [PR] Concurrent table scans [iceberg-rust]

2024-05-13 Thread via GitHub
sdd commented on code in PR #373: URL: https://github.com/apache/iceberg-rust/pull/373#discussion_r1598826218 ## crates/iceberg/src/scan.rs: ## @@ -302,13 +262,147 @@ impl TableScan { arrow_reader_builder.build().read(self.plan_files().await?) } +} + +#[derive(De

Re: [PR] Add kevinjqliu to collaborators [iceberg-python]

2024-05-13 Thread via GitHub
amogh-jahagirdar merged PR #729: URL: https://github.com/apache/iceberg-python/pull/729

Re: [PR] Concurrent table scans [iceberg-rust]

2024-05-13 Thread via GitHub
sdd commented on code in PR #373: URL: https://github.com/apache/iceberg-rust/pull/373#discussion_r1598841588 ## crates/iceberg/src/scan.rs: ## @@ -189,66 +195,20 @@ impl TableScan { self.case_sensitive, )?; -let mut partition_filter_cache = Parti

Re: [I] Add runtime module to enable concurrent load of manifest files. [iceberg-rust]

2024-05-13 Thread via GitHub
sdd commented on issue #124: URL: https://github.com/apache/iceberg-rust/issues/124#issuecomment-2108489909 Using `try_for_each_concurrent` here rather than just spawning in a for loop will allow us to tune the concurrency as it accepts a max concurrent tasks argument. I'd advocate for a dat

Re: [PR] Concurrent table scans [iceberg-rust]

2024-05-13 Thread via GitHub
sdd commented on code in PR #373: URL: https://github.com/apache/iceberg-rust/pull/373#discussion_r1598867675 ## crates/iceberg/src/scan.rs: ## @@ -302,13 +262,147 @@ impl TableScan { arrow_reader_builder.build().read(self.plan_files().await?) } +} + +#[derive(De

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-13 Thread via GitHub
rodmeneses commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1598876459 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java: ## @@ -0,0 +1,774 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [I] Snowflke can't read tables generated by pyicberg ? [iceberg-python]

2024-05-13 Thread via GitHub
sfc-gh-dhuo commented on issue #723: URL: https://github.com/apache/iceberg-python/issues/723#issuecomment-2108525436 Normally the abfss URI is expected to look like this, according to [azure docs](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction-abfs-uri)

Re: [PR] AWS: Change S3FileIO to use SHA1 based checksums [iceberg]

2024-05-13 Thread via GitHub
singhpk234 commented on PR #10293: URL: https://github.com/apache/iceberg/pull/10293#issuecomment-2108558359 > What do you think @muddyfish @singhpk234? Sounds good, @amogh-jahagirdar!

Re: [PR] AWS: Fix S3FileIO tests failing on ListObjects for Express buckets [iceberg]

2024-05-13 Thread via GitHub
singhpk234 commented on PR #10292: URL: https://github.com/apache/iceberg/pull/10292#issuecomment-2108568399 > According to this PR https://github.com/apache/iceberg/pull/7914, it doesn't seem that delete_orphan_files supports S3FileIO. If/when it does, it still might not work with S3 Expre

Re: [PR] Parquet: Remove redundant reading of file metadata when determining starting positions of row groups [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #10328: URL: https://github.com/apache/iceberg/pull/10328#discussion_r1598902060 ## parquet/src/main/java/org/apache/iceberg/parquet/ReadConf.java: ## @@ -186,27 +186,14 @@ private Map generateOffsetToStartPos(Schema schema) { ret

Re: [PR] Parquet: Remove redundant reading of file metadata when determining starting positions of row groups [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #10328: URL: https://github.com/apache/iceberg/pull/10328#discussion_r1598971225 ## parquet/src/main/java/org/apache/iceberg/parquet/ReadConf.java: ## @@ -186,27 +184,14 @@ private Map generateOffsetToStartPos(Schema schema) { ret

Re: [PR] Add EnumConfParser to SparkConfParser [iceberg]

2024-05-13 Thread via GitHub
aokolnychyi commented on code in PR #10311: URL: https://github.com/apache/iceberg/pull/10311#discussion_r1598997615 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -197,6 +201,34 @@ private Duration toDuration(String time) { } }

Re: [PR] Parquet: Remove redundant reading of file metadata when determining starting positions of row groups [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #10328: URL: https://github.com/apache/iceberg/pull/10328#discussion_r1598999097 ## parquet/src/main/java/org/apache/iceberg/parquet/ReadConf.java: ## @@ -186,27 +184,14 @@ private Map generateOffsetToStartPos(Schema schema) { ret

Re: [PR] Spark 3.5: Add validation to SparkConfParser [iceberg]

2024-05-13 Thread via GitHub
aokolnychyi commented on code in PR #10315: URL: https://github.com/apache/iceberg/pull/10315#discussion_r1598999270 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -227,7 +225,26 @@ public ThisT tableProperty(String name) { return s

Re: [PR] Add EnumConfParser to SparkConfParser [iceberg]

2024-05-13 Thread via GitHub
aokolnychyi commented on code in PR #10311: URL: https://github.com/apache/iceberg/pull/10311#discussion_r1599001671 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -70,6 +70,10 @@ public DurationConfParser durationConf() { return new

Re: [PR] Add EnumConfParser to SparkConfParser [iceberg]

2024-05-13 Thread via GitHub
aokolnychyi commented on code in PR #10311: URL: https://github.com/apache/iceberg/pull/10311#discussion_r1599002997 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java: ## @@ -302,14 +302,12 @@ public PlanningMode dataPlanningMode() { return LOC

Re: [PR] Parquet: Remove redundant reading of file metadata when determining starting positions of row groups [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar closed pull request #10328: Parquet: Remove redundant reading of file metadata when determining starting positions of row groups URL: https://github.com/apache/iceberg/pull/10328

Re: [PR] Add bloom filter fpp config [iceberg]

2024-05-13 Thread via GitHub
aokolnychyi merged PR #10149: URL: https://github.com/apache/iceberg/pull/10149

Re: [PR] Add bloom filter fpp config [iceberg]

2024-05-13 Thread via GitHub
huaxingao commented on PR #10149: URL: https://github.com/apache/iceberg/pull/10149#issuecomment-2108679771 Thanks @aokolnychyi @manuzhang

Re: [PR] Support special chars in S3URI [iceberg]

2024-05-13 Thread via GitHub
danielcweeks commented on PR #10283: URL: https://github.com/apache/iceberg/pull/10283#issuecomment-2108825075 @dimas-b I just put up https://github.com/apache/iceberg/pull/10329 to address the field name encoding. This should also address the quotes issue as well since it will be encoded.

Re: [PR] Add EnumConfParser to SparkConfParser [iceberg]

2024-05-13 Thread via GitHub
aokolnychyi merged PR #10311: URL: https://github.com/apache/iceberg/pull/10311

Re: [PR] Support special chars in S3URI [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar commented on PR #10283: URL: https://github.com/apache/iceberg/pull/10283#issuecomment-2108850538 > Overall, I think it's better to fail fast where interoperability is a concern as that's more important than supporting the full s3 key space. After seeing a variety of

Re: [PR] Add EnumConfParser to SparkConfParser [iceberg]

2024-05-13 Thread via GitHub
huaxingao commented on PR #10311: URL: https://github.com/apache/iceberg/pull/10311#issuecomment-2108854132 Thanks @aokolnychyi @nastra

Re: [PR] Url encode field names for partition paths [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #10329: URL: https://github.com/apache/iceberg/pull/10329#discussion_r1599127317 ## core/src/test/java/org/apache/iceberg/TestLocationProvider.java: ## @@ -285,4 +286,22 @@ public void testObjectStorageWithinTableLocation() { assertT

Re: [PR] Table commit retries based on table properties [iceberg-python]

2024-05-13 Thread via GitHub
syun64 commented on code in PR #330: URL: https://github.com/apache/iceberg-python/pull/330#discussion_r1598837560 ## tests/table/test_init.py: ## @@ -1125,3 +1128,122 @@ def test_serialize_commit_table_request() -> None: deserialized_request = CommitTableRequest.model_v

[PR] Build: Bump pypa/cibuildwheel from 2.17.0 to 2.18.0 [iceberg-python]

2024-05-13 Thread via GitHub
dependabot[bot] opened a new pull request, #730: URL: https://github.com/apache/iceberg-python/pull/730 Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.17.0 to 2.18.0. Release notes: Sourced from https://github.com/pypa/cibuildwheel/releases pypa/cibuildwhee

[PR] Build: Bump griffe from 0.44.0 to 0.45.0 [iceberg-python]

2024-05-13 Thread via GitHub
dependabot[bot] opened a new pull request, #731: URL: https://github.com/apache/iceberg-python/pull/731 Bumps [griffe](https://github.com/mkdocstrings/griffe) from 0.44.0 to 0.45.0. Release notes: Sourced from [griffe's releases](https://github.com/mkdocstrings/griffe/releases).

[PR] Build: Bump mkdocs-material from 9.5.21 to 9.5.22 [iceberg-python]

2024-05-13 Thread via GitHub
dependabot[bot] opened a new pull request, #732: URL: https://github.com/apache/iceberg-python/pull/732 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.21 to 9.5.22. Release notes: Sourced from https://github.com/squidfunk/mkdocs-material/releases mk

[PR] Build: Bump moto from 5.0.6 to 5.0.7 [iceberg-python]

2024-05-13 Thread via GitHub
dependabot[bot] opened a new pull request, #733: URL: https://github.com/apache/iceberg-python/pull/733 Bumps [moto](https://github.com/getmoto/moto) from 5.0.6 to 5.0.7. Changelog: Sourced from [moto's changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md). 5.0.

[PR] Spark3.4: Add support for enums in SparkConfParser [iceberg]

2024-05-13 Thread via GitHub
huaxingao opened a new pull request, #10330: URL: https://github.com/apache/iceberg/pull/10330 (no comment)

Re: [PR] Docs: Update vendor information for Cloudera [iceberg]

2024-05-13 Thread via GitHub
bartash commented on PR #10278: URL: https://github.com/apache/iceberg/pull/10278#issuecomment-2108929825 @nastra thanks for the review, could you push this in please when you get a chance? Thanks

Re: [PR] Support partial deletes [iceberg-python]

2024-05-13 Thread via GitHub
jqin61 commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1599183250 ## pyiceberg/table/__init__.py: ## @@ -2897,12 +2987,152 @@ def _commit(self) -> UpdatesAndRequirements: ), ( AssertTabl

Re: [PR] AWS: Retain Glue Catalog table description after updating Iceberg table [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #10199: URL: https://github.com/apache/iceberg/pull/10199#discussion_r1599192267 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueTableOperations.java: ## @@ -316,6 +316,11 @@ void persistGlueTable( .skipArchive(awsProp

Re: [PR] Url encode field names for partition paths [iceberg]

2024-05-13 Thread via GitHub
danielcweeks commented on code in PR #10329: URL: https://github.com/apache/iceberg/pull/10329#discussion_r1599193529 ## core/src/test/java/org/apache/iceberg/TestLocationProvider.java: ## @@ -285,4 +286,22 @@ public void testObjectStorageWithinTableLocation() { assertThat(

Re: [PR] AWS: Retain Glue Catalog table description after updating Iceberg table [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #10199: URL: https://github.com/apache/iceberg/pull/10199#discussion_r1599192463 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueTableOperations.java: ## @@ -316,6 +316,11 @@ void persistGlueTable( .skipArchive(awsProp

Re: [PR] Url encode field names for partition paths [iceberg]

2024-05-13 Thread via GitHub
dimas-b commented on code in PR #10329: URL: https://github.com/apache/iceberg/pull/10329#discussion_r1599203081 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -189,7 +189,7 @@ public String partitionToPath(StructLike data) { if (i > 0) { sb.ap

Re: [PR] Url encode field names for partition paths [iceberg]

2024-05-13 Thread via GitHub
dimas-b commented on code in PR #10329: URL: https://github.com/apache/iceberg/pull/10329#discussion_r1599208577 ## core/src/test/java/org/apache/iceberg/TestLocationProvider.java: ## @@ -285,4 +286,22 @@ public void testObjectStorageWithinTableLocation() { assertThat(parts

Re: [PR] Spark3.4: Add support for enums in SparkConfParser [iceberg]

2024-05-13 Thread via GitHub
huaxingao commented on PR #10330: URL: https://github.com/apache/iceberg/pull/10330#issuecomment-2109008119 cc @aokolnychyi

Re: [PR] Url encode field names for partition paths [iceberg]

2024-05-13 Thread via GitHub
dimas-b commented on code in PR #10329: URL: https://github.com/apache/iceberg/pull/10329#discussion_r1599215769 ## core/src/test/java/org/apache/iceberg/TestLocationProvider.java: ## @@ -285,4 +286,22 @@ public void testObjectStorageWithinTableLocation() { assertThat(parts

Re: [I] Snowflke can't read tables generated by pyicberg ? [iceberg-python]

2024-05-13 Thread via GitHub
djouallah commented on issue #723: URL: https://github.com/apache/iceberg-python/issues/723#issuecomment-2109038972 the URI is `'abfss://account_name.dfs.core.windows.net/data/iceberg_dwh/scada/metadata/snap-2728627078701324745-0-7c1d442e-7321-46f8-aa06-5d5f94cde607.avro'` btw it wor

Re: [PR] Support special chars in S3URI [iceberg]

2024-05-13 Thread via GitHub
dimas-b commented on PR #10283: URL: https://github.com/apache/iceberg/pull/10283#issuecomment-2109039866 > I still feel we want to discourage (if not disallow) special characters in paths due to cross compatibility issues. I made a comment under #10329 proposing an escaping method th

Re: [PR] Flink: refactor sink shuffling statistics collection [iceberg]

2024-05-13 Thread via GitHub
stevenzwu commented on PR #10331: URL: https://github.com/apache/iceberg/pull/10331#issuecomment-2109051590 Moved `DataStatistics` away from generic and use a type to distinguish btw Map and Sketch statistics. One main reason is to support auto migration/promotion of Map stats to Sketch if

Re: [PR] Flink: refactor sink shuffling statistics collection [iceberg]

2024-05-13 Thread via GitHub
stevenzwu commented on code in PR #10331: URL: https://github.com/apache/iceberg/pull/10331#discussion_r1599242761 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/AggregatedStatistics.java: ## @@ -19,53 +19,87 @@ package org.apache.iceberg.flink.sink.sh

Re: [PR] Flink: refactor sink shuffling statistics collection [iceberg]

2024-05-13 Thread via GitHub
stevenzwu commented on code in PR #10331: URL: https://github.com/apache/iceberg/pull/10331#discussion_r1599247256 ## flink/v1.19/build.gradle: ## @@ -66,6 +66,8 @@ project(":iceberg-flink:iceberg-flink-${flinkMajorVersion}") { exclude group: 'org.slf4j' } +imp

Re: [PR] Make proxy endpoint configurable for s3 Http clients [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #10332: URL: https://github.com/apache/iceberg/pull/10332#discussion_r1599249297 ## aws/src/main/java/org/apache/iceberg/aws/HttpClientProperties.java: ## @@ -52,6 +52,13 @@ public class HttpClientProperties implements Serializable { p

Re: [PR] Make proxy endpoint configurable for s3 Http clients [iceberg]

2024-05-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #10332: URL: https://github.com/apache/iceberg/pull/10332#discussion_r1599254560 ## aws/src/main/java/org/apache/iceberg/aws/HttpClientProperties.java: ## @@ -52,6 +52,13 @@ public class HttpClientProperties implements Serializable { p

Re: [PR] Flink: Maintenance - MonitorSource [iceberg]

2024-05-13 Thread via GitHub
stevenzwu commented on code in PR #10308: URL: https://github.com/apache/iceberg/pull/10308#discussion_r1599253290 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/SingleThreadedIteratorSource.java: ## @@ -0,0 +1,196 @@ +/* + * Licensed to the Apa

Re: [PR] Make proxy endpoint configurable for s3 Http clients [iceberg]

2024-05-13 Thread via GitHub
flyrain commented on code in PR #10332: URL: https://github.com/apache/iceberg/pull/10332#discussion_r1599301463 ## aws/src/main/java/org/apache/iceberg/aws/HttpClientProperties.java: ## @@ -52,6 +52,13 @@ public class HttpClientProperties implements Serializable { public sta

Re: [PR] Make proxy endpoint configurable for s3 Http clients [iceberg]

2024-05-13 Thread via GitHub
flyrain commented on PR #10332: URL: https://github.com/apache/iceberg/pull/10332#issuecomment-2109204549 Thanks @amogh-jahagirdar for the review. Resolved your comments. Would you like to take another look?

Re: [PR] AWS: Retain Glue Catalog table description after updating Iceberg table [iceberg]

2024-05-13 Thread via GitHub
aajisaka commented on code in PR #10199: URL: https://github.com/apache/iceberg/pull/10199#discussion_r1599418533 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueTableOperations.java: ## @@ -316,6 +316,11 @@ void persistGlueTable( .skipArchive(awsProperties.g

Re: [PR] AWS: Retain Glue Catalog table description after updating Iceberg table [iceberg]

2024-05-13 Thread via GitHub
aajisaka commented on code in PR #10199: URL: https://github.com/apache/iceberg/pull/10199#discussion_r1599418688 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -206,6 +208,15 @@ public void testUpdateTable() { .isEqualTo("EXTER

Re: [PR] AWS: Retain Glue Catalog table description after updating Iceberg table [iceberg]

2024-05-13 Thread via GitHub
aajisaka commented on code in PR #10199: URL: https://github.com/apache/iceberg/pull/10199#discussion_r1599419966 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueTableOperations.java: ## @@ -316,6 +316,11 @@ void persistGlueTable( .skipArchive(awsProperties.g

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599425277 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java: ## @@ -0,0 +1,780 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599426224 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkAggregator.java: ## @@ -0,0 +1,198 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599428479 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkAggregator.java: ## @@ -0,0 +1,198 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599429291 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkAggregator.java: ## @@ -0,0 +1,198 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599431142 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkCommittable.java: ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599432484 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkCommittableSerializer.java: ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599433519 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkCommitter.java: ## @@ -0,0 +1,439 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599434778 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkCommitter.java: ## @@ -0,0 +1,439 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599436116 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkCommitter.java: ## @@ -0,0 +1,439 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599438514 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/writer/IcebergSinkWriter.java: ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1599443353 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/sink/TestFlinkIcebergSinkV2.java: ## @@ -140,8 +140,8 @@ public void testCheckAndGetEqualityFieldIds() {

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-05-13 Thread via GitHub
pvary commented on PR #10179: URL: https://github.com/apache/iceberg/pull/10179#issuecomment-2109381711 Looks promising to me. @stevenzwu, could you please review too?
