Re: [I] [feat] Support update table's sort order [iceberg-python]

2024-11-14 Thread via GitHub
Fokko commented on issue #1245: URL: https://github.com/apache/iceberg-python/issues/1245#issuecomment-2478154332 For inspiration we can look at the Java side, there is a method on the table called `replaceSortOrder()`: I would expect something like, similar to [Java](https://github

Re: [PR] Core,Open-API: Don't expose the `last-column-id` [iceberg]

2024-11-14 Thread via GitHub
Fokko commented on PR #11514: URL: https://github.com/apache/iceberg/pull/11514#issuecomment-2478125759 @danielcweeks You're right! I wanted to show that after updating the code, all the existing tests still pass. I've updated the tests in a separate commit https://github.com/apache

Re: [PR] Core,Open-API: Don't expose the `last-column-id` [iceberg]

2024-11-14 Thread via GitHub
Fokko commented on code in PR #11514: URL: https://github.com/apache/iceberg/pull/11514#discussion_r1843296963 ## open-api/rest-catalog-open-api.yaml: ## @@ -2692,7 +2692,14 @@ components: $ref: '#/components/schemas/Schema' last-column-id: type: i

[PR] Flink: make `StatisticsOrRecord` to be correctly serialized and deser… [iceberg]

2024-11-14 Thread via GitHub
huyuanfeng2018 opened a new pull request, #11557: URL: https://github.com/apache/iceberg/pull/11557 ## background When I configure distribution-mode=RANGE in the flink task, the task can process the data as expected, but the processing speed is insufficient. Like #7393, there is a seri

Re: [I] Ignore downcasting of column types when "mergeSchema" is set. [iceberg]

2024-11-14 Thread via GitHub
nastra closed issue #4849: Ignore downcasting of column types when "mergeSchema" is set. URL: https://github.com/apache/iceberg/issues/4849 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [PR] API, Core, Spark: Ignore schema merge updates from long -> int [iceberg]

2024-11-14 Thread via GitHub
nastra merged PR #11419: URL: https://github.com/apache/iceberg/pull/11419

[PR] Spark 3.5: Implement RewriteTablePath [iceberg]

2024-11-14 Thread via GitHub
szehon-ho opened a new pull request, #11555: URL: https://github.com/apache/iceberg/pull/11555 This is the implementation for #10920 (an action to prepare metadata for an Iceberg table for DR copy) This has been used in production for a while in our setup, although support for rewrite

Re: [PR] Remove Hive 2 [iceberg]

2024-11-14 Thread via GitHub
nastra commented on code in PR #10996: URL: https://github.com/apache/iceberg/pull/10996#discussion_r1843196936 ## .github/workflows/hive-ci.yml: ## @@ -66,34 +66,34 @@ concurrency: cancel-in-progress: ${{ github.event_name == 'pull_request' }} jobs: - hive2-tests: + mr-

Re: [I] Parsing and Writing Tests for V3 Metadata [iceberg]

2024-11-14 Thread via GitHub
HonahX commented on issue #10764: URL: https://github.com/apache/iceberg/issues/10764#issuecomment-2477837411 Hi @RussellSpitzer, may I take this one if no one else has already started?

Re: [PR] Spark: Remove extra columns for ColumnBatch [iceberg]

2024-11-14 Thread via GitHub
huaxingao commented on code in PR #11551: URL: https://github.com/apache/iceberg/pull/11551#discussion_r1843189307 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnarBatchReader.java: ## @@ -45,11 +45,23 @@ public class ColumnarBatchReader extends

Re: [PR] feat: support append data file and add e2e test [iceberg-rust]

2024-11-14 Thread via GitHub
ZENOTME commented on PR #349: URL: https://github.com/apache/iceberg-rust/pull/349#issuecomment-2477887517 Hi, I think this PR has been blocked for a long time. Recently AFAIK there have been some users who want to get write ability. Can we make progress on this? cc @Fokko @liurenjie1024 @

[PR] feat(FileIO): Adds user extensible FileIO [iceberg-rust]

2024-11-14 Thread via GitHub
BlakeOrth opened a new pull request, #699: URL: https://github.com/apache/iceberg-rust/pull/699 - Stubs out the initial required items to build a backwards compatible FileIO extension ## Based on the discussion in #172 it seemed like a backwards compatible FileIO extension ma

[I] Remove usage of deprecated functions from the codebase [iceberg-python]

2024-11-14 Thread via GitHub
kevinjqliu opened a new issue, #1327: URL: https://github.com/apache/iceberg-python/issues/1327 ### Apache Iceberg version None ### Please describe the bug 🐞 Functions that are marked deprecated emit a warning. We should remove the use of deprecated functions from the co

Re: [I] [Spark Integration Tests] TestCreateTable::testCreateTableCommitProperties won't work on RESTCatalog [iceberg]

2024-11-14 Thread via GitHub
haizhou-zhao commented on issue #11554: URL: https://github.com/apache/iceberg/issues/11554#issuecomment-2477732940 cc: @dramaticlly (author of the test in the issue)

[I] [Spark Integration Tests] [iceberg]

2024-11-14 Thread via GitHub
haizhou-zhao opened a new issue, #11554: URL: https://github.com/apache/iceberg/issues/11554 ### Apache Iceberg version None ### Query engine None ### Please describe the bug 🐞 Part of: https://github.com/apache/iceberg/issues/11079 ## Intro The te

Re: [I] Restrict generated locations to URI syntax [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed issue #10168: Restrict generated locations to URI syntax URL: https://github.com/apache/iceberg/issues/10168

Re: [PR] Iceberg Kafka Connect :: Writer Per Topic Partition Design [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #11290: URL: https://github.com/apache/iceberg/pull/11290#issuecomment-2477676002 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Ability to build for all Scala versions [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed pull request #10606: Ability to build for all Scala versions URL: https://github.com/apache/iceberg/pull/10606

Re: [PR] Fixed an incorrect example [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #10627: URL: https://github.com/apache/iceberg/pull/10627#issuecomment-2477675507 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [I] `rewrite_data_files` does not respect table sort order [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10346: URL: https://github.com/apache/iceberg/issues/10346#issuecomment-2477675248 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Core: improve DefaultErrorHandler message for unhandled codes [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed pull request #10640: Core: improve DefaultErrorHandler message for unhandled codes URL: https://github.com/apache/iceberg/pull/10640

Re: [PR] Support create multiple element ns together for nessie [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #10630: URL: https://github.com/apache/iceberg/pull/10630#issuecomment-2477675538 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Fixed an incorrect example [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed pull request #10627: Fixed an incorrect example URL: https://github.com/apache/iceberg/pull/10627

Re: [PR] Core: improve DefaultErrorHandler message for unhandled codes [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #10640: URL: https://github.com/apache/iceberg/pull/10640#issuecomment-2477675576 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Core: Fix possible proken in `Tasks.Builder.runSingleThreaded` [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #10613: URL: https://github.com/apache/iceberg/pull/10613#issuecomment-2477675482 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Core: Fix possible proken in `Tasks.Builder.runSingleThreaded` [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed pull request #10613: Core: Fix possible proken in `Tasks.Builder.runSingleThreaded` URL: https://github.com/apache/iceberg/pull/10613

Re: [PR] Doc: Spark quickstart needs to create context directory first [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #10572: URL: https://github.com/apache/iceberg/pull/10572#issuecomment-2477675351 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] S3 InputsStream: Reopen connection on Connection Reset [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #10470: URL: https://github.com/apache/iceberg/pull/10470#issuecomment-2477675294 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] S3 InputsStream: Reopen connection on Connection Reset [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed pull request #10470: S3 InputsStream: Reopen connection on Connection Reset URL: https://github.com/apache/iceberg/pull/10470

Re: [PR] Spec: Make NDV blob metadata property required [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #10549: URL: https://github.com/apache/iceberg/pull/10549#issuecomment-2477675322 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Spec: Add GCS and ADLS configuration to REST table load [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #10576: URL: https://github.com/apache/iceberg/pull/10576#issuecomment-2477675386 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Support create multiple element ns together for nessie [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed pull request #10630: Support create multiple element ns together for nessie URL: https://github.com/apache/iceberg/pull/10630

Re: [I] spark.table() raises warn: Unclosed S3FileIO instance in HadoopTableOperations [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed issue #10145: spark.table() raises warn: Unclosed S3FileIO instance in HadoopTableOperations URL: https://github.com/apache/iceberg/issues/10145

Re: [I] Changes in describe behaviour of a table break partition info? [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed issue #10174: Changes in describe behaviour of a table break partition info? URL: https://github.com/apache/iceberg/issues/10174

Re: [I] Iceberg may occur data duplication when use flink to write data to iceberg and commit failed [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10165: URL: https://github.com/apache/iceberg/issues/10165#issuecomment-2477675057 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Newly generated Positional Delete file has lowerbound & upperbound values as empty after running rewrite_position_delete_files spark procedure [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10146: URL: https://github.com/apache/iceberg/issues/10146#issuecomment-2477674997 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Ability to build for all Scala versions [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on PR #10606: URL: https://github.com/apache/iceberg/pull/10606#issuecomment-2477675438 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [I] Does the FlushOnEveryBlock feature in Avro affect Iceberg data integrity? [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed issue #10142: Does the FlushOnEveryBlock feature in Avro affect Iceberg data integrity? URL: https://github.com/apache/iceberg/issues/10142

Re: [I] Cannot insert table created by spark temp into iceberg table [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10164: URL: https://github.com/apache/iceberg/issues/10164#issuecomment-2477675028 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] spark.table() raises warn: Unclosed S3FileIO instance in NessieTableOperations [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10144: URL: https://github.com/apache/iceberg/issues/10144#issuecomment-2477674940 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] spark.table() raises warn: Unclosed S3FileIO instance in HadoopTableOperations [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10145: URL: https://github.com/apache/iceberg/issues/10145#issuecomment-2477674966 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Doc: Spark quickstart needs to create context directory first [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed pull request #10572: Doc: Spark quickstart needs to create context directory first URL: https://github.com/apache/iceberg/pull/10572

Re: [PR] Spec: Add GCS and ADLS configuration to REST table load [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed pull request #10576: Spec: Add GCS and ADLS configuration to REST table load URL: https://github.com/apache/iceberg/pull/10576

Re: [I] spark.table() raises warn: Unclosed S3FileIO instance in NessieTableOperations [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed issue #10144: spark.table() raises warn: Unclosed S3FileIO instance in NessieTableOperations URL: https://github.com/apache/iceberg/issues/10144

Re: [I] Does the FlushOnEveryBlock feature in Avro affect Iceberg data integrity? [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10142: URL: https://github.com/apache/iceberg/issues/10142#issuecomment-2477674912 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Spec: Make NDV blob metadata property required [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed pull request #10549: Spec: Make NDV blob metadata property required URL: https://github.com/apache/iceberg/pull/10549

Re: [I] Newly generated Positional Delete file has lowerbound & upperbound values as empty after running rewrite_position_delete_files spark procedure [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed issue #10146: Newly generated Positional Delete file has lowerbound & upperbound values as empty after running rewrite_position_delete_files spark procedure URL: https://github.com/apache/iceberg/issues/10146

Re: [I] Support for writing Parquet files from the Iceberg Java API without the Hadoop Configuration class [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10180: URL: https://github.com/apache/iceberg/issues/10180#issuecomment-2477675152 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Support for writing Parquet files from the Iceberg Java API without the Hadoop Configuration class [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed issue #10180: Support for writing Parquet files from the Iceberg Java API without the Hadoop Configuration class URL: https://github.com/apache/iceberg/issues/10180

Re: [I] Changes in describe behaviour of a table break partition info? [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10174: URL: https://github.com/apache/iceberg/issues/10174#issuecomment-2477675125 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Iceberg may occur data duplication when use flink to write data to iceberg and commit failed [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed issue #10165: Iceberg may occur data duplication when use flink to write data to iceberg and commit failed URL: https://github.com/apache/iceberg/issues/10165

Re: [I] Restrict generated locations to URI syntax [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] commented on issue #10168: URL: https://github.com/apache/iceberg/issues/10168#issuecomment-2477675093 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Cannot insert table created by spark temp into iceberg table [iceberg]

2024-11-14 Thread via GitHub
github-actions[bot] closed issue #10164: Cannot insert table created by spark temp into iceberg table URL: https://github.com/apache/iceberg/issues/10164

Re: [I] Deprecate `adlfs.*` configuration properties in favor of `adls.*` [iceberg-python]

2024-11-14 Thread via GitHub
kevinjqliu closed issue #866: Deprecate `adlfs.*` configuration properties in favor of `adls.*` URL: https://github.com/apache/iceberg-python/issues/866

Re: [I] Deprecate `adlfs.*` configuration properties in favor of `adls.*` [iceberg-python]

2024-11-14 Thread via GitHub
kevinjqliu commented on issue #866: URL: https://github.com/apache/iceberg-python/issues/866#issuecomment-2477665161 closed by #961

[I] Iceberg fails to read the parquet file while rewriting data files. [iceberg]

2024-11-14 Thread via GitHub
himani1126 opened a new issue, #11553: URL: https://github.com/apache/iceberg/issues/11553 ### Apache Iceberg version 1.2.1 ### Query engine Spark ### Please describe the bug 🐞 While compacting data files using spark 3.3, I am seeing the following error as

Re: [PR] REST: Use HEAD request to check table existence [iceberg]

2024-11-14 Thread via GitHub
ebyhr commented on PR #10999: URL: https://github.com/apache/iceberg/pull/10999#issuecomment-2477617070 @Fokko Thank you for your review. CI is green now.

[PR] Docs: Use the correct YAML text block indicator to prevent formatting issues [iceberg]

2024-11-14 Thread via GitHub
neodon opened a new pull request, #11552: URL: https://github.com/apache/iceberg/pull/11552 In the Spark Quickstart guide, the initial docker-compose example specifies an entrypoint script in a multi-line string. The string is mistakenly started with the 'folded style' indicator `>`, which

[PR] Bump coverage from 7.6.4 to 7.6.5 [iceberg-python]

2024-11-14 Thread via GitHub
dependabot[bot] opened a new pull request, #1325: URL: https://github.com/apache/iceberg-python/pull/1325 Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.6.4 to 7.6.5. Changelog Sourced from https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst coverage's cha

[PR] Bump mkdocstrings from 0.26.2 to 0.27.0 [iceberg-python]

2024-11-14 Thread via GitHub
dependabot[bot] opened a new pull request, #1324: URL: https://github.com/apache/iceberg-python/pull/1324 Bumps [mkdocstrings](https://github.com/mkdocstrings/mkdocstrings) from 0.26.2 to 0.27.0. Release notes Sourced from https://github.com/mkdocstrings/mkdocstrings/releases mkd

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-11-14 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2477543077 @dwilson1988 made the suggested changes, there's a deprecation warning on the S3 config EndpointResolver methods that I haven't had time to look into, maybe you could take a look?

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-11-14 Thread via GitHub
loicalleyne commented on code in PR #176: URL: https://github.com/apache/iceberg-go/pull/176#discussion_r1842965194 ## io/blob.go: ## @@ -0,0 +1,311 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-11-14 Thread via GitHub
loicalleyne commented on code in PR #176: URL: https://github.com/apache/iceberg-go/pull/176#discussion_r1842949566 ## io/blob.go: ## @@ -0,0 +1,311 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] Docs: 4 Spaces are Required for Sublists [iceberg]

2024-11-14 Thread via GitHub
RussellSpitzer merged PR #11549: URL: https://github.com/apache/iceberg/pull/11549

Re: [PR] Parquet: Use native getRowIndexOffset support instead of calculating it [iceberg]

2024-11-14 Thread via GitHub
wypoon commented on PR #11520: URL: https://github.com/apache/iceberg/pull/11520#issuecomment-2477450381 @huaxingao @Fokko I have updated the PR; please review again.

Re: [PR] Use Snapshot's statistics file in SparkScan [iceberg]

2024-11-14 Thread via GitHub
jeesou commented on PR #11040: URL: https://github.com/apache/iceberg/pull/11040#issuecomment-2477244116 Hi @karuppayya , @amogh-jahagirdar as per our discussion to introduce a config to let users decide if they are fine with best effort search, I was thinking of adding a kind of threshold

Re: [PR] Bug Fix: `metadata_location` to be optional in `TableResponse` [iceberg-python]

2024-11-14 Thread via GitHub
kevinjqliu merged PR #1321: URL: https://github.com/apache/iceberg-python/pull/1321

Re: [PR] Bug Fix: `metadata_location` to be optional in `TableResponse` [iceberg-python]

2024-11-14 Thread via GitHub
sungwy commented on code in PR #1321: URL: https://github.com/apache/iceberg-python/pull/1321#discussion_r1842812452 ## iceberg: ## Review Comment: good question - I'll get it removed

Re: [PR] Bug Fix: `metadata_location` to be optional in `TableResponse` [iceberg-python]

2024-11-14 Thread via GitHub
kevinjqliu commented on code in PR #1321: URL: https://github.com/apache/iceberg-python/pull/1321#discussion_r1842796318 ## iceberg: ## Review Comment: nit: whats this?

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-11-14 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1842726587 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,20 +87,29 @@ protected static Object[][] parameters() { }

[I] RC verification script should use artifacts fro apache repo [iceberg-go]

2024-11-14 Thread via GitHub
kevinjqliu opened a new issue, #204: URL: https://github.com/apache/iceberg-go/issues/204 ### Apache Iceberg version None ### Please describe the bug 🐞 https://github.com/apache/iceberg-go/blob/main/dev/release/verify_rc.sh#L38 should reference https://dist.apache.org/

Re: [PR] Spark: Remove extra columns for ColumnBatch [iceberg]

2024-11-14 Thread via GitHub
singhpk234 commented on code in PR #11551: URL: https://github.com/apache/iceberg/pull/11551#discussion_r1842720884 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnarBatchReader.java: ## @@ -45,11 +45,23 @@ public class ColumnarBatchReader extend

Re: [PR] Core,Open-API: Don't expose the `last-column-id` [iceberg]

2024-11-14 Thread via GitHub
danielcweeks commented on PR #11514: URL: https://github.com/apache/iceberg/pull/11514#issuecomment-2477166647 @Fokko as part of the proposed deprecation, we should update all tests that use this (I found multiple references).

Re: [PR] Bug Fix: `metadata_location` to be optional in `TableResponse` [iceberg-python]

2024-11-14 Thread via GitHub
sungwy commented on PR #1321: URL: https://github.com/apache/iceberg-python/pull/1321#issuecomment-2477173721 > I think it would be good to have a test to cover this. It should be pretty straightforward by adding another fixture that covers this case: Just a test for testing the creat

Re: [PR] Core,Open-API: Don't expose the `last-column-id` [iceberg]

2024-11-14 Thread via GitHub
danielcweeks commented on code in PR #11514: URL: https://github.com/apache/iceberg/pull/11514#discussion_r1842725865 ## open-api/rest-catalog-open-api.yaml: ## @@ -2692,7 +2692,14 @@ components: $ref: '#/components/schemas/Schema' last-column-id:

Re: [PR] Ignore schema merge updates from long -> int [iceberg]

2024-11-14 Thread via GitHub
rocco408 commented on code in PR #11419: URL: https://github.com/apache/iceberg/pull/11419#discussion_r1842672093 ## core/src/main/java/org/apache/iceberg/schema/UnionByNameVisitor.java: ## @@ -180,6 +180,17 @@ private void updateColumn(Types.NestedField field, Types.NestedFiel

Re: [PR] Remove Hive 2 [iceberg]

2024-11-14 Thread via GitHub
nastra commented on code in PR #10996: URL: https://github.com/apache/iceberg/pull/10996#discussion_r1842669520 ## mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergInputFormat.java: ## @@ -63,19 +61,15 @@ public class HiveIcebergInputFormat extends MapredIcebergInputForma

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-11-14 Thread via GitHub
danielcweeks commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1842636900 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,20 +87,29 @@ protected static Object[][] parameters() { }

Re: [PR] API: Add Variant data type [iceberg]

2024-11-14 Thread via GitHub
RussellSpitzer commented on PR #11324: URL: https://github.com/apache/iceberg/pull/11324#issuecomment-2477045093 @rdblue I think you are the last reviewer on this, do you have any further comments for this one?

Re: [I] Support Parquet Files with Delta Encoding and other Parquet V2 Features [iceberg]

2024-11-14 Thread via GitHub
RussellSpitzer commented on issue #11371: URL: https://github.com/apache/iceberg/issues/11371#issuecomment-2477036769 Cool, I think @JonasJ-ap may also be checking it out so coordinate if you can :)

Re: [PR] Spark: Remove extra columns for ColumnBatch [iceberg]

2024-11-14 Thread via GitHub
huaxingao commented on code in PR #11551: URL: https://github.com/apache/iceberg/pull/11551#discussion_r1842606127 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderDeletes.java: ## @@ -622,6 +624,41 @@ public void testPosDeletesOnParquetFileWithM

[PR] Spark: Remove extra columns for ColumnBatch [iceberg]

2024-11-14 Thread via GitHub
huaxingao opened a new pull request, #11551: URL: https://github.com/apache/iceberg/pull/11551 In Equality Delete, we build `ColumnarBatchReader` for the equality delete filter columns to read their values and determine which rows are deleted. If these filter columns are not among the reque

Re: [PR] Core, Spark: Refactor RewriteFileGroup planner to core [iceberg]

2024-11-14 Thread via GitHub
RussellSpitzer commented on code in PR #11513: URL: https://github.com/apache/iceberg/pull/11513#discussion_r1842585965 ## core/src/main/java/org/apache/iceberg/actions/RewriteFileGroupPlanner.java: ## @@ -0,0 +1,175 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Support WASB scheme in ADLSFileIO [iceberg]

2024-11-14 Thread via GitHub
bryanck commented on PR #11504: URL: https://github.com/apache/iceberg/pull/11504#issuecomment-2476519972 Thanks for the PR @mrcnc ! And for the reviews @RussellSpitzer @amogh-jahagirdar @jbonofre !

Re: [PR] Docs: 4 Spaces are Required for Sublists [iceberg]

2024-11-14 Thread via GitHub
RussellSpitzer commented on PR #11549: URL: https://github.com/apache/iceberg/pull/11549#issuecomment-2476874193 Preview https://github.com/user-attachments/assets/cf16dc80-56cb-4a2e-b5b2-af1a3a628d33

Re: [I] Remove Dependency on Hadoop's Filesystem Class from Remove Orphan Files [iceberg]

2024-11-14 Thread via GitHub
danielcweeks commented on issue #11541: URL: https://github.com/apache/iceberg/issues/11541#issuecomment-2476866028 @amogh-jahagirdar I don't think #7914 is a good approach to addressing this as it's not scalable. This trades off distributed listing for single iteration. To support p

Re: [I] Flink Merge On Read Behavior? Equality & Positional Deletes [iceberg]

2024-11-14 Thread via GitHub
FranMorilloAWS commented on issue #11535: URL: https://github.com/apache/iceberg/issues/11535#issuecomment-2476650638 Why do we need to use both? Is there an example scenario we can go over? Thanks in advance for answering me :)

Re: [PR] Support WASB scheme in ADLSFileIO [iceberg]

2024-11-14 Thread via GitHub
bryanck merged PR #11504: URL: https://github.com/apache/iceberg/pull/11504

Re: [PR] Build: Bump kafka from 3.8.1 to 3.9.0 [iceberg]

2024-11-14 Thread via GitHub
bryanck commented on PR #11508: URL: https://github.com/apache/iceberg/pull/11508#issuecomment-2476513155 Thanks for the reviews @nastra and @jbonofre !

Re: [PR] Build: Bump kafka from 3.8.1 to 3.9.0 [iceberg]

2024-11-14 Thread via GitHub
bryanck merged PR #11508: URL: https://github.com/apache/iceberg/pull/11508

Re: [I] Store min/max stats per column per partition [iceberg]

2024-11-14 Thread via GitHub
tbaeg commented on issue #11083: URL: https://github.com/apache/iceberg/issues/11083#issuecomment-2476490616 +1

Re: [PR] Spec: Add cross-region bucket access property to config [iceberg]

2024-11-14 Thread via GitHub
munendrasn commented on PR #11260: URL: https://github.com/apache/iceberg/pull/11260#issuecomment-2476506543 @nastra Could you please take a look? Would this be the correct place to add this info?

[PR] Core, Rest: Enable useSystemProperties on RESTClient [iceberg]

2024-11-14 Thread via GitHub
munendrasn opened a new pull request, #11548: URL: https://github.com/apache/iceberg/pull/11548 * When `useSystemProperties` is enabled on the httpClient, it reads the proxy settings from system properties if they are not configured explicitly. * Introduce a new REST client property to enable or disable consumin

Re: [I] Flink Merge On Read Behavior? Equality & Positional Deletes [iceberg]

2024-11-14 Thread via GitHub
pvary commented on issue #11535: URL: https://github.com/apache/iceberg/issues/11535#issuecomment-2476482562 Equality delete: - Written if the ID is first deleted during a checkpoint Positional delete: - A record is inserted with a given ID, and then it is deleted during the same c

Re: [PR] Core, Spark: Refactor RewriteFileGroup planner to core [iceberg]

2024-11-14 Thread via GitHub
pvary commented on PR #11513: URL: https://github.com/apache/iceberg/pull/11513#issuecomment-2476466989 @RussellSpitzer: This is ready for another round if you have time. Thanks, Peter

[PR] Docs: Fix level of Deletion Vectors [iceberg]

2024-11-14 Thread via GitHub
manuzhang opened a new pull request, #11547: URL: https://github.com/apache/iceberg/pull/11547 `Deletion Vectors` should be at the same level as `Position Delete Files` and `Equality Delete Files`.

Re: [PR] Core: Add support for `view-default` property in catalog [iceberg]

2024-11-14 Thread via GitHub
ebyhr commented on code in PR #11064: URL: https://github.com/apache/iceberg/pull/11064#discussion_r1842199190 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -1200,6 +1200,8 @@ private RESTViewBuilder(SessionContext context, TableIdentifier identif

Re: [PR] TableMetadataBuilder [iceberg-rust]

2024-11-14 Thread via GitHub
Fokko commented on code in PR #587: URL: https://github.com/apache/iceberg-rust/pull/587#discussion_r1842164987 ## crates/iceberg/src/spec/table_metadata_builder.rs: ## @@ -0,0 +1,2074 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

Re: [PR] TableMetadataBuilder [iceberg-rust]

2024-11-14 Thread via GitHub
Fokko commented on code in PR #587: URL: https://github.com/apache/iceberg-rust/pull/587#discussion_r1842113157 ## crates/iceberg/src/spec/table_metadata_builder.rs: ## @@ -0,0 +1,2074 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice
