Re: [I] Add apply interface in transaction [iceberg-rust]

2024-10-17 Thread via GitHub
liurenjie1024 commented on issue #596: URL: https://github.com/apache/iceberg-rust/issues/596#issuecomment-2419030826 Thanks @ZENOTME 's explanation. I think I've got your point, we need sth like `commit` in transaction action so that later transaction action could take into account previo

[PR] Spark 3.5: Display write metrics on SQL UI [iceberg]

2024-10-17 Thread via GitHub
manuzhang opened a new pull request, #11340: URL: https://github.com/apache/iceberg/pull/11340 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

Re: [PR] Spec: Support geo type [iceberg]

2024-10-17 Thread via GitHub
dmeaux commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1804508892 ## format/spec.md: ## @@ -483,6 +485,8 @@ Notes: 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the IEEE 754 `totalOrder` predicate. NaNs a

Re: [I] EPIC: Rust Based Compaction [iceberg-rust]

2024-10-17 Thread via GitHub
liurenjie1024 commented on issue #624: URL: https://github.com/apache/iceberg-rust/issues/624#issuecomment-2419144850 Hi, @camuel >Does anyone has any insights on how computation heavy is the compaction workload really? Like on a beefy machine what compaction rate will be possible?

Re: [I] Spark:read iceberg table data error [iceberg]

2024-10-17 Thread via GitHub
beyond-up commented on issue #11336: URL: https://github.com/apache/iceberg/issues/11336#issuecomment-2418877797 > @beyond-up can you share the full stack trace please? Usually there's some more info in other parts of the stack trace that show what went wrong I have found the cause of

Re: [I] Exploring Enhanced Compaction Support in Rust [iceberg-rust]

2024-10-17 Thread via GitHub
liurenjie1024 commented on issue #657: URL: https://github.com/apache/iceberg-rust/issues/657#issuecomment-2419089754 Thanks @amitgilad3 for raising this. I think compaction is a relatively complex topic, and we are somehow far from doing this. For example, we don't support reading deletion

Re: [PR] feat: Implement Decimal from/to bytes represents [iceberg-rust]

2024-10-17 Thread via GitHub
liurenjie1024 commented on code in PR #665: URL: https://github.com/apache/iceberg-rust/pull/665#discussion_r1804486414 ## crates/iceberg/src/spec/values.rs: ## @@ -3031,6 +3061,31 @@ mod tests { check_avro_bytes_serde(bytes, Datum::string("iceberg"), &PrimitiveType::S

Re: [PR] Config File Handling [iceberg-go]

2024-10-17 Thread via GitHub
alex-kar commented on PR #156: URL: https://github.com/apache/iceberg-go/pull/156#issuecomment-2419117570 @nastra Rebased onto the latest `main`.

[I] ERROR when executing UPDATE/DELETE queries in Iceberg 1.6.0: "Cannot add fieldId 1 as an identifier field" [iceberg]

2024-10-17 Thread via GitHub
a8356555 opened a new issue, #11341: URL: https://github.com/apache/iceberg/issues/11341 ### Apache Iceberg version 1.6.0 ### Query engine Spark ### Please describe the bug 🐞 Description: I'm encountering an issue when running UPDATE or DELETE queries aft

Re: [I] Spark:read iceberg table data error [iceberg]

2024-10-17 Thread via GitHub
nastra commented on issue #11336: URL: https://github.com/apache/iceberg/issues/11336#issuecomment-2419268189 @beyond-up so far the NPE seems to be coming from Spark itself, not from Iceberg. Do you have a small reproducible example?

Re: [PR] AWS: Fix S3InputStream retry policy [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on PR #11335: URL: https://github.com/apache/iceberg/pull/11335#issuecomment-2419600165 Thanks @edgarRd, I'll go ahead and merge. Thank you @singhpk234 for reviewing!

Re: [PR] AWS: Fix S3InputStream retry policy [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar merged PR #11335: URL: https://github.com/apache/iceberg/pull/11335

Re: [I] Spark:read iceberg table data error [iceberg]

2024-10-17 Thread via GitHub
nastra commented on issue #11336: URL: https://github.com/apache/iceberg/issues/11336#issuecomment-2419271110 Which exact Spark version are you using? A similar issue was reported in https://issues.apache.org/jira/browse/SPARK-39061 and was already fixed in Spark 3.3.1

Re: [PR] Build: Bump junit from 5.10.1 to 5.10.2 [iceberg]

2024-10-17 Thread via GitHub
dependabot[bot] commented on PR #9699: URL: https://github.com/apache/iceberg/pull/9699#issuecomment-2419278307 OK, I won't notify you again about this release, but will get in touch when a new version is available. You can also ignore all major, minor, or patch releases for a dependency by

Re: [PR] Core: Make CatalogHandlers.commit public [iceberg]

2024-10-17 Thread via GitHub
nastra closed pull request #9789: Core: Make CatalogHandlers.commit public URL: https://github.com/apache/iceberg/pull/9789

Re: [PR] Build: Bump junit from 5.10.1 to 5.10.2 [iceberg]

2024-10-17 Thread via GitHub
nastra closed pull request #9699: Build: Bump junit from 5.10.1 to 5.10.2 URL: https://github.com/apache/iceberg/pull/9699

Re: [I] Nessie Iceberg REST catalog and writing to localstack raises `OSError: When initiating multiple part upload` [iceberg-python]

2024-10-17 Thread via GitHub
smsmith97 commented on issue #1087: URL: https://github.com/apache/iceberg-python/issues/1087#issuecomment-2419628740 Yeah, that would be great @PetrasTYR, I am facing the same issue

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420232911 > @sikehish can you fix the CI lint issue? `make lint` should work > > There are other "good first issue"s, please take a look https://github.com/apache/iceberg-python/issue

Re: [PR] Spec: add variant type [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer commented on PR #10831: URL: https://github.com/apache/iceberg/pull/10831#issuecomment-2420252802 And an entry https://github.com/apache/iceberg/blob/main/format/spec.md#parquet

Re: [I] Document Custom FileIO [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on issue #1233: URL: https://github.com/apache/iceberg-python/issues/1233#issuecomment-2420246567 > ### Feature Request / Improvement > Add documentation for custom FileIO, similar to [custom catalog](https://py.iceberg.apache.org/configuration/#custom-catalog-implemen

Re: [PR] Spec: add variant type [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer commented on PR #10831: URL: https://github.com/apache/iceberg/pull/10831#issuecomment-2420248887 This needs some notes in `Partition Transforms` , I think explicitly we should disallow identity For Appendix B - We should define something or state explicitly we don't

Re: [I] flink:FlinkSink support dynamically changed schema [iceberg]

2024-10-17 Thread via GitHub
ottomata commented on issue #4190: URL: https://github.com/apache/iceberg/issues/4190#issuecomment-2420588912 Ah, thanks! FWIW, I think schema evolution support is worth the tradeoff of extra bytes per record :)

Re: [PR] Core: fix NPE with HadoopFileIO because FileIOParser doesn't serialize Hadoop configuration [iceberg]

2024-10-17 Thread via GitHub
stevenzwu merged PR #10926: URL: https://github.com/apache/iceberg/pull/10926

Re: [PR] Spark: Add RewriteTablePath action interface [iceberg]

2024-10-17 Thread via GitHub
szehon-ho commented on code in PR #10920: URL: https://github.com/apache/iceberg/pull/10920#discussion_r1805455886 ## api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java: ## @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] Core: fix NPE with HadoopFileIO because FileIOParser doesn't serialize Hadoop configuration [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on PR #10926: URL: https://github.com/apache/iceberg/pull/10926#issuecomment-2420612439 thanks @nastra @pvary @Fokko @rdblue @ashvina for the review

Re: [PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on PR #1234: URL: https://github.com/apache/iceberg-python/pull/1234#issuecomment-2420893458 hey @omkenge, thanks for the PR. What use case is this for?

Re: [I] Document Custom FileIO [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on issue #1233: URL: https://github.com/apache/iceberg-python/issues/1233#issuecomment-2420894695 Assigned to you. I think we can add it under [the `FileIO` section](https://py.iceberg.apache.org/configuration/#fileio) as something like "Custom FileIO Implementations"

Re: [I] It sometimes throws exception java.lang.AssertionError: assertion failed after upgrade to Iceberg 1.3.1 + Spark 3.4.1 [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] closed issue #9092: It sometimes throws exception java.lang.AssertionError: assertion failed after upgrade to Iceberg 1.3.1 + Spark 3.4.1 URL: https://github.com/apache/iceberg/issues/9092

Re: [I] It sometimes throws exception java.lang.AssertionError: assertion failed after upgrade to Iceberg 1.3.1 + Spark 3.4.1 [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9092: URL: https://github.com/apache/iceberg/issues/9092#issuecomment-2420904393 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Metrics for Manifest file caching [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9093: URL: https://github.com/apache/iceberg/issues/9093#issuecomment-2420904412 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] hive iceberg [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] closed issue #9094: hive iceberg URL: https://github.com/apache/iceberg/issues/9094

Re: [I] Core: checkpoint validation in BaseOverwriteFiles [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9718: URL: https://github.com/apache/iceberg/issues/9718#issuecomment-2420906840 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] How to insert overwrite with a single commit [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9720: URL: https://github.com/apache/iceberg/issues/9720#issuecomment-2420906919 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Dynamically support Spark native engine in Iceberg [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on PR #9721: URL: https://github.com/apache/iceberg/pull/9721#issuecomment-2420906960 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Metrics for Manifest file caching [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] closed issue #9093: Metrics for Manifest file caching URL: https://github.com/apache/iceberg/issues/9093

Re: [I] hive iceberg [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9094: URL: https://github.com/apache/iceberg/issues/9094#issuecomment-2420904428 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Improve read times and reduce size of metadata.json by storing schemas in external files [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9734: URL: https://github.com/apache/iceberg/issues/9734#issuecomment-2420907068 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Suppress duplicate OAuth token fetching in rest catalog client [iceberg-python]

2024-10-17 Thread via GitHub
github-actions[bot] closed issue #587: Suppress duplicate OAuth token fetching in rest catalog client URL: https://github.com/apache/iceberg-python/issues/587

Re: [I] Suppress duplicate OAuth token fetching in rest catalog client [iceberg-python]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #587: URL: https://github.com/apache/iceberg-python/issues/587#issuecomment-2420913318 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
gene-db commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805378000 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
gene-db commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805396309 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-10-17 Thread via GitHub
pvary commented on PR #10935: URL: https://github.com/apache/iceberg/pull/10935#issuecomment-2420515789 > First of all, we need to discuss the expected behavior: > > * Do we want to resolve equality deletes and map them into data files? Or should we add a new task and output the conte

Re: [PR] Task: Simulating OOM error during merge equality deletes [iceberg]

2024-10-17 Thread via GitHub
nicole-martinez closed pull request #11320: Task: Simulating OOM error during merge equality deletes URL: https://github.com/apache/iceberg/pull/11320

Re: [PR] Revert "Support wasb[s] paths in ADLSFileIO" [iceberg]

2024-10-17 Thread via GitHub
danielcweeks commented on PR #11344: URL: https://github.com/apache/iceberg/pull/11344#issuecomment-2420655723 I'm not actually opposed to the WASB path support, just concerned about the introduction of the URI class for parsing locations. Is it possible to just revert to the old parsing o

Re: [PR] Revert "Support wasb[s] paths in ADLSFileIO" [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer commented on PR #11344: URL: https://github.com/apache/iceberg/pull/11344#issuecomment-2420666877 @mrcnc Can chime in on that, but I think that's fine. Can we have some examples though to guard against these changes in the future? Strings that won't parse correctly?

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-17 Thread via GitHub
mrcnc commented on PR #11294: URL: https://github.com/apache/iceberg/pull/11294#issuecomment-2420675717 > Hi @mrcnc, @RussellSpitzer . To confirm this PR solution to 10127: any existing (or new) iceberg tables stored in azure with wasbs + .blob. will be interpreted interchangeably with abfs

[PR] Flink: make FLIP-27 default in SQL and mark the old FlinkSource as deprecated [iceberg]

2024-10-17 Thread via GitHub
stevenzwu opened a new pull request, #11345: URL: https://github.com/apache/iceberg/pull/11345 SQL config default change is only applied to 1.20. See the dev ML discussion [here](https://lists.apache.org/api/plain?thread=2r34z5drgkn1fqbvktwfzhr0fj39p3th).

Re: [PR] Arrow: Fix indexing in Parquet dictionary encoded values readers [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11247: URL: https://github.com/apache/iceberg/pull/11247#discussion_r1805583085 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/parquet/vectorized/TestParquetDictionaryEncodedVectorizedReads.java: ## @@ -93,4 +125,64 @@ p

Re: [PR] Arrow: Fix indexing in Parquet dictionary encoded values readers [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11247: URL: https://github.com/apache/iceberg/pull/11247#discussion_r1805582734 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/parquet/vectorized/TestParquetDictionaryEncodedVectorizedReads.java: ## @@ -93,4 +125,64 @@ p

Re: [PR] Revert "Support wasb[s] paths in ADLSFileIO" [iceberg]

2024-10-17 Thread via GitHub
danielcweeks commented on PR #11344: URL: https://github.com/apache/iceberg/pull/11344#issuecomment-2420833542 > Can we have some examples though to guard against these changes in the future? Strings that won't parse correctly? There are a lot of subtle issues like hashCode and equali

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420244677 https://github.com/apache/iceberg-python/actions/runs/11389073844/job/31690659534?pr=1232

Re: [PR] Small fix to TestSerializableTypes.java [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer merged PR #11342: URL: https://github.com/apache/iceberg/pull/11342

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420292280 > https://github.com/apache/iceberg-python/actions/runs/11389073844/job/31690659534?pr=1232 Yup, linting is in place now. Thanks for the reminder!

[PR] Bump mkdocstrings from 0.26.1 to 0.26.2 [iceberg-python]

2024-10-17 Thread via GitHub
dependabot[bot] opened a new pull request, #1235: URL: https://github.com/apache/iceberg-python/pull/1235 Bumps [mkdocstrings](https://github.com/mkdocstrings/mkdocstrings) from 0.26.1 to 0.26.2. Release notes Sourced from https://github.com/mkdocstrings/mkdocstrings/releases mkd

[PR] Bump mypy-boto3-glue from 1.35.23 to 1.35.25 [iceberg-python]

2024-10-17 Thread via GitHub
dependabot[bot] opened a new pull request, #1236: URL: https://github.com/apache/iceberg-python/pull/1236 Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 1.35.23 to 1.35.25. Commits See full diff in https://github.com/youtype/mypy_boto3_builder/commi

[PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-17 Thread via GitHub
omkenge opened a new pull request, #1234: URL: https://github.com/apache/iceberg-python/pull/1234 ### Support for Additional Date Format Summary This PR extends the date/time handling functions by adding support for one additional format: - `YYYYMMDD` (e.g., `20241018`)

Re: [PR] Revert "Support wasb[s] paths in ADLSFileIO" [iceberg]

2024-10-17 Thread via GitHub
mrcnc commented on PR #11344: URL: https://github.com/apache/iceberg/pull/11344#issuecomment-2420687278 I'll follow up with another PR that doesn't use java.net.URI for parsing. I'm happy to have more 👀 on this

Re: [PR] Bump pypa/cibuildwheel from 2.21.1 to 2.21.3 [iceberg-python]

2024-10-17 Thread via GitHub
sungwy merged PR #1224: URL: https://github.com/apache/iceberg-python/pull/1224

Re: [PR] Bump getdaft from 0.3.2 to 0.3.8 [iceberg-python]

2024-10-17 Thread via GitHub
sungwy merged PR #1228: URL: https://github.com/apache/iceberg-python/pull/1228

Re: [PR] Bump moto from 5.0.14 to 5.0.17 [iceberg-python]

2024-10-17 Thread via GitHub
sungwy merged PR #1230: URL: https://github.com/apache/iceberg-python/pull/1230

[PR] [KafkaConnect] Fix RecordConverter [iceberg]

2024-10-17 Thread via GitHub
singhpk234 opened a new pull request, #11346: URL: https://github.com/apache/iceberg/pull/11346 ## About the change The UUID type in the parquet writer expects ByteBuffer rather than UUID otherwise writer fails with : ``` class java.util.UUID cannot be cast to class [B (jav

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2420723401 @pvary I have created a PR to disable the flaky test for now. https://github.com/apache/iceberg/pull/11347

Re: [PR] Flink: make FLIP-27 default in SQL and mark the old FlinkSource as deprecated [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on code in PR #11345: URL: https://github.com/apache/iceberg/pull/11345#discussion_r1805519628 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/FlinkConfigOptions.java: ## @@ -88,7 +88,7 @@ private FlinkConfigOptions() {} public static final Con

Re: [I] Support writing to a branch [iceberg-python]

2024-10-17 Thread via GitHub
vinjai commented on issue #306: URL: https://github.com/apache/iceberg-python/issues/306#issuecomment-2420730890 PR is ready for review

Re: [PR] Glue and Hive catalog return only Iceberg tables [iceberg-python]

2024-10-17 Thread via GitHub
sungwy commented on code in PR #1145: URL: https://github.com/apache/iceberg-python/pull/1145#discussion_r1805681185 ## pyiceberg/catalog/dynamodb.py: ## @@ -393,7 +393,7 @@ def drop_namespace(self, namespace: Union[str, Identifier]) -> None: raise NoSuchNamespaceE

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-10-17 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2420559048 @dwilson1988 Sounds good, I've made the changes, please take a look.

Re: [I] flink:FlinkSink support dynamically changed schema [iceberg]

2024-10-17 Thread via GitHub
pvary commented on issue #4190: URL: https://github.com/apache/iceberg/issues/4190#issuecomment-2420417129 It sends the schema along with every record. I'm playing around with a somewhat similar, but more performant solution, where we send only the schemaId instead of the full schema. The t

Re: [PR] Spark: Add RewriteTablePath action interface [iceberg]

2024-10-17 Thread via GitHub
flyrain commented on code in PR #10920: URL: https://github.com/apache/iceberg/pull/10920#discussion_r1805439452 ## api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java: ## @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-17 Thread via GitHub
szehon-ho commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1805653101 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct valu

Re: [PR] Add flag to allow disabling creation of catalog tables [iceberg-python]

2024-10-17 Thread via GitHub
sungwy merged PR #1155: URL: https://github.com/apache/iceberg-python/pull/1155

Re: [PR] Add flag to allow disabling creation of catalog tables [iceberg-python]

2024-10-17 Thread via GitHub
sungwy commented on PR #1155: URL: https://github.com/apache/iceberg-python/pull/1155#issuecomment-2420987590 Thank you again for working on this PR @isc-patrick !

Re: [PR] Core: delete temp metadata file when version already exists [iceberg]

2024-10-17 Thread via GitHub
leesf commented on PR #11350: URL: https://github.com/apache/iceberg/pull/11350#issuecomment-2421372412 @rdblue please help to review this PR, thanks.

Re: [I] flink:FlinkSink support dynamically changed schema [iceberg]

2024-10-17 Thread via GitHub
pvary commented on issue #4190: URL: https://github.com/apache/iceberg/issues/4190#issuecomment-2421426631 The current tradeoff is more like doubled CPU time (we need caching and an extra serialization/deserialization step, which is on an already well optimized hot path). We are still looki

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-17 Thread via GitHub
arkadius commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2421498966 > @arkadius: The flaky test will be handled. In the meantime, could you please retrigger the tests, so we can have a green run? It looks like I have no right to do it - this butt

Re: [PR] Flink Support for TIMESTAMP_NANOS data type for PARQUET [iceberg]

2024-10-17 Thread via GitHub
pvary commented on PR #11348: URL: https://github.com/apache/iceberg/pull/11348#issuecomment-2421446880 Do the schema conversions handle the nano ts correctly? Do we get a `Timestamp(9)` when we convert to Flink schema, or a nano timestamp when we convert from the Iceberg schema?

Re: [PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2024-10-17 Thread via GitHub
yunlou11 commented on PR #7914: URL: https://github.com/apache/iceberg/pull/7914#issuecomment-2421517117 ```sql CALL nessie.system.remove_orphan_files(table => 'nessie.robot_dev.robot_data') ``` ```text Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileS

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2024-10-17 Thread via GitHub
namrathamyske commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-2421151917 Would love to have this

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805821651 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] Spec: Support geo type [iceberg]

2024-10-17 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1805831055 ## format/spec.md: ## @@ -483,6 +485,8 @@ Notes: 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the IEEE 754 `totalOrder` predicate. NaN

Re: [I] Support Vended Credentials for Azure Data Lake Store [iceberg-python]

2024-10-17 Thread via GitHub
sfc-gh-tbenroeck commented on issue #1146: URL: https://github.com/apache/iceberg-python/issues/1146#issuecomment-2421306564 @sungwy and @ndrluis [#961](https://github.com/apache/iceberg-python/pull/961) isn't a fix for this issue. There are two aspects, first the prefix `adls.sas-token` v

Re: [PR] Flink: Add RowConverter for Iceberg Source [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on code in PR #11301: URL: https://github.com/apache/iceberg/pull/11301#discussion_r1805835532 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/source/reader/RowConverter.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Flink: Add RowConverter for Iceberg Source [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on code in PR #11301: URL: https://github.com/apache/iceberg/pull/11301#discussion_r1805836604 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceBoundedRow.java: ## @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache Software Fo

[PR] Core: delete temp metadata file when version already exists [iceberg]

2024-10-17 Thread via GitHub
leesf opened a new pull request, #11350: URL: https://github.com/apache/iceberg/pull/11350 (no comment)

Re: [PR] Spark 3.5: Display write metrics on SQL UI [iceberg]

2024-10-17 Thread via GitHub
manuzhang commented on PR #11340: URL: https://github.com/apache/iceberg/pull/11340#issuecomment-2421230380 Add `number of total data files` to write command `AppendData` ![CleanShot 2024-10-18 at 11 31 55@2x](https://github.com/user-attachments/assets/e3cbb4b1-9649-4fe5-b2fe-624abb6a

[I] feat: abstract the MetricsEvaluator [iceberg-rust]

2024-10-17 Thread via GitHub
sundy-li opened a new issue, #674: URL: https://github.com/apache/iceberg-rust/issues/674 Every query engine has its expression framework to prune files. The current `Predicate` in `iceberg-rust` is very simple, it's not powerful yet. For example, it does not support Cast Expre

Re: [PR] Flink: make FLIP-27 default in SQL and mark the old FlinkSource as deprecated [iceberg]

2024-10-17 Thread via GitHub
pvary commented on code in PR #11345: URL: https://github.com/apache/iceberg/pull/11345#discussion_r1805883034 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java: ## @@ -86,7 +85,6 @@ import org.slf4j.Logger; import org.slf4j.LoggerFactory;

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-17 Thread via GitHub
pvary commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2421434091 @arkadius: The flaky test will be handled. In the meantime, could you please retrigger the tests, so we can have a green run?

Re: [PR] Flink: disable the flaky range distribution bucketing tests for now [iceberg]

2024-10-17 Thread via GitHub
pvary commented on PR #11347: URL: https://github.com/apache/iceberg/pull/11347#issuecomment-2421432093 Nit: could we add a comment why this was ignored?

Re: [PR] Core: Add credentials to loadTable / loadView responses [iceberg]

2024-10-17 Thread via GitHub
nastra commented on code in PR #11173: URL: https://github.com/apache/iceberg/pull/11173#discussion_r1804920592 ## core/src/main/java/org/apache/iceberg/rest/credentials/CredentialParser.java: ## @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on code in PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#discussion_r1805007235 ## mkdocs/docs/configuration.md: ## @@ -30,16 +30,25 @@ Iceberg tables support table properties to configure table behavior. ### Write options -| Key

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2419887779 I noticed these 3 options are missing https://github.com/apache/iceberg-python/blob/7cf0c225c3cdb32ac5e390de06b7b0e4fe7de92e/pyiceberg/table/__init__.py#L197-L204

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1805015450 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -154,6 +164,12 @@ void caseSensitive(boolean newCaseSensitive) { void delete(F

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2419893328 Also curious if you have suggestions for preventing documentation drift in the future

[PR] feat: Add 'Create Namespace' command to CLI [iceberg-go]

2024-10-17 Thread via GitHub
alex-kar opened a new pull request, #179: URL: https://github.com/apache/iceberg-go/pull/179 As `ListNamespaces` is already implemented in `RestCatalog`, we could add that command to the CLI. ``` iceberg create [options] (namespace | table) IDENTIFIER ```

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1805018372 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -162,11 +178,13 @@ void delete(F file) { void delete(CharSequence path) {

Re: [PR] Spec: add variant type [iceberg]

2024-10-17 Thread via GitHub
aihuaxu commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1805030809 ## format/spec.md: ## @@ -178,6 +178,8 @@ A **`list`** is a collection of values with some element type. The element field A **`map`** is a collection of key-valu

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1805055837 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -77,9 +78,12 @@ public String partition() { private boolean failMissingDeletePa

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-17 Thread via GitHub
danielcweeks commented on code in PR #11279: URL: https://github.com/apache/iceberg/pull/11279#discussion_r1805082199 ## build.gradle: ## @@ -1006,6 +1009,37 @@ project(':iceberg-open-api') { recommend.set(true) } check.dependsOn('validateRESTCatalogSpec') + + // Cre
