Re: [PR] Core: Add a util to compute partition stats [iceberg]

2024-09-20 Thread via GitHub
aokolnychyi commented on code in PR #11146: URL: https://github.com/apache/iceberg/pull/11146#discussion_r1769469167 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -289,4 +289,21 @@ private static Schema lazyColumnProjection(TableScanContext context, Schema sche

Re: [I] metadata.json delete [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8007: URL: https://github.com/apache/iceberg/issues/8007#issuecomment-2364773275 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] Spark 3.4: Incremental scan specify branch [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed pull request #8384: Spark 3.4: Incremental scan specify branch URL: https://github.com/apache/iceberg/pull/8384

Re: [PR] Nessie: Fix possible table-metadata loss (backport #8413) [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on PR #8414: URL: https://github.com/apache/iceberg/pull/8414#issuecomment-2364773574 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] API: Build accessor from struct directly [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on PR #8367: URL: https://github.com/apache/iceberg/pull/8367#issuecomment-2364773496 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Fix Iceberg to handle literal short and byte [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on PR #8412: URL: https://github.com/apache/iceberg/pull/8412#issuecomment-2364773558 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spec: deprecate distinct_counts in data_file [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed pull request #8395: Spec: deprecate distinct_counts in data_file URL: https://github.com/apache/iceberg/pull/8395

Re: [PR] Spec: deprecate distinct_counts in data_file [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on PR #8395: URL: https://github.com/apache/iceberg/pull/8395#issuecomment-2364773527 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark 3.4, Docs: Add RemoveOrphanFiles time-interval specification and testing option to the exception message [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on PR #8324: URL: https://github.com/apache/iceberg/pull/8324#issuecomment-2364773475 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Core: Support changing compression codec for ManifestWriter [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed pull request #8284: Core: Support changing compression codec for ManifestWriter URL: https://github.com/apache/iceberg/pull/8284

Re: [PR] Fix Iceberg to handle literal short and byte [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed pull request #8412: Fix Iceberg to handle literal short and byte URL: https://github.com/apache/iceberg/pull/8412

Re: [I] CDC vectorized reader [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8089: URL: https://github.com/apache/iceberg/issues/8089#issuecomment-2364773376 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Move field into place when adding during schema evolution [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on PR #8409: URL: https://github.com/apache/iceberg/pull/8409#issuecomment-2364773545 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [I] Cannot add data files to target table because that table is partitioned and contains non-identity partition transforms which will not be compatible [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8095: URL: https://github.com/apache/iceberg/issues/8095#issuecomment-2364773393 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Move field into place when adding during schema evolution [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed pull request #8409: Move field into place when adding during schema evolution URL: https://github.com/apache/iceberg/pull/8409

Re: [PR] Nessie: Fix possible table-metadata loss (backport #8413) [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed pull request #8414: Nessie: Fix possible table-metadata loss (backport #8413) URL: https://github.com/apache/iceberg/pull/8414

Re: [PR] Spark 3.4, Docs: Add RemoveOrphanFiles time-interval specification and testing option to the exception message [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed pull request #8324: Spark 3.4, Docs: Add RemoveOrphanFiles time-interval specification and testing option to the exception message URL: https://github.com/apache/iceberg/pull/8324

Re: [PR] API: Build accessor from struct directly [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed pull request #8367: API: Build accessor from struct directly URL: https://github.com/apache/iceberg/pull/8367

Re: [PR] Spark 3.4: Incremental scan specify branch [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on PR #8384: URL: https://github.com/apache/iceberg/pull/8384#issuecomment-2364773514 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Core: Support changing compression codec for ManifestWriter [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on PR #8284: URL: https://github.com/apache/iceberg/pull/8284#issuecomment-2364773456 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] relativePath [wip] [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed pull request #8260: relativePath [wip] URL: https://github.com/apache/iceberg/pull/8260

Re: [I] Api:Fix add the same listener to the same listeners queue multiple times [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8107: URL: https://github.com/apache/iceberg/issues/8107#issuecomment-2364773402 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] How to decide bucket number [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8087: How to decide bucket number URL: https://github.com/apache/iceberg/issues/8087

Re: [I] Flink: Implements SupportsDynamicFiltering interface [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8048: Flink: Implements SupportsDynamicFiltering interface URL: https://github.com/apache/iceberg/issues/8048

Re: [I] The data of the same table is distributed across different file systems [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8055: URL: https://github.com/apache/iceberg/issues/8055#issuecomment-2364773340 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] How Can safely delete small files after executed rewriteDataFiles [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8066: How Can safely delete small files after executed rewriteDataFiles URL: https://github.com/apache/iceberg/issues/8066

Re: [I] remove_orphan_files throws reached maximum depth exception in AWS EMR-6.11.0 [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8022: remove_orphan_files throws reached maximum depth exception in AWS EMR-6.11.0 URL: https://github.com/apache/iceberg/issues/8022

Re: [I] performance degradation after migrating to spark 3.3.1 when using iceberg merge into [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #7998: performance degradation after migrating to spark 3.3.1 when using iceberg merge into URL: https://github.com/apache/iceberg/issues/7998

Re: [I] Write ordered by within unique physical partitions folder (exclude hash path). [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8008: URL: https://github.com/apache/iceberg/issues/8008#issuecomment-2364773292 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Write ordered by within unique physical partitions folder (exclude hash path). [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8008: Write ordered by within unique physical partitions folder (exclude hash path). URL: https://github.com/apache/iceberg/issues/8008

Re: [I] metadata.json delete [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8007: metadata.json delete URL: https://github.com/apache/iceberg/issues/8007

Re: [I] use tez can't write data [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #7990: URL: https://github.com/apache/iceberg/issues/7990#issuecomment-2364773253 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Docs: Add YouTube to the Apache website. [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #7967: Docs: Add YouTube to the Apache website. URL: https://github.com/apache/iceberg/issues/7967

Re: [I] I cannot package my application as uberjar using maven shade plugin. [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #7953: URL: https://github.com/apache/iceberg/issues/7953#issuecomment-2364773193 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Add FileIO docs [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #7966: URL: https://github.com/apache/iceberg/issues/7966#issuecomment-2364773212 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Improve Documentation on getting started with GCS [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #7948: Improve Documentation on getting started with GCS URL: https://github.com/apache/iceberg/issues/7948

Re: [I] Migrate/ snapshot action should exclude file that does not contain any record [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #7949: URL: https://github.com/apache/iceberg/issues/7949#issuecomment-2364773181 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Improve Documentation on getting started with GCS [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #7948: URL: https://github.com/apache/iceberg/issues/7948#issuecomment-2364773164 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] CDC vectorized reader [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8089: CDC vectorized reader URL: https://github.com/apache/iceberg/issues/8089

Re: [I] The data of the same table is distributed across different file systems [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8055: The data of the same table is distributed across different file systems URL: https://github.com/apache/iceberg/issues/8055

Re: [I] Flink: Implements SupportsDynamicFiltering interface [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8048: URL: https://github.com/apache/iceberg/issues/8048#issuecomment-2364773331 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Usage of Hidden Partitioning [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8031: Usage of Hidden Partitioning URL: https://github.com/apache/iceberg/issues/8031

Re: [PR] relativePath [wip] [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on PR #8260: URL: https://github.com/apache/iceberg/pull/8260#issuecomment-2364773434 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [I] Api:Fix add the same listener to the same listeners queue multiple times [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8107: Api:Fix add the same listener to the same listeners queue multiple times URL: https://github.com/apache/iceberg/issues/8107

Re: [I] Migrate/ snapshot action should exclude file that does not contain any record [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #7949: Migrate/ snapshot action should exclude file that does not contain any record URL: https://github.com/apache/iceberg/issues/7949

Re: [I] Cannot add data files to target table because that table is partitioned and contains non-identity partition transforms which will not be compatible [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #8095: Cannot add data files to target table because that table is partitioned and contains non-identity partition transforms which will not be compatible URL: https://github.com/apache/iceberg/issues/8095

Re: [I] remove_orphan_files throws reached maximum depth exception in AWS EMR-6.11.0 [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8022: URL: https://github.com/apache/iceberg/issues/8022#issuecomment-2364773304 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] How Can safely delete small files after executed rewriteDataFiles [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8066: URL: https://github.com/apache/iceberg/issues/8066#issuecomment-2364773350 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Usage of Hidden Partitioning [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #8031: URL: https://github.com/apache/iceberg/issues/8031#issuecomment-2364773313 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] use tez can't write data [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #7990: use tez can't write data URL: https://github.com/apache/iceberg/issues/7990

Re: [I] Docs: Add YouTube to the Apache website. [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] commented on issue #7967: URL: https://github.com/apache/iceberg/issues/7967#issuecomment-2364773236 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Add FileIO docs [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #7966: Add FileIO docs URL: https://github.com/apache/iceberg/issues/7966

Re: [I] I cannot package my application as uberjar using maven shade plugin. [iceberg]

2024-09-20 Thread via GitHub
github-actions[bot] closed issue #7953: I cannot package my application as uberjar using maven shade plugin. URL: https://github.com/apache/iceberg/issues/7953

Re: [PR] Core: Remove unused code for streaming position deletes [iceberg]

2024-09-20 Thread via GitHub
wypoon commented on PR #11175: URL: https://github.com/apache/iceberg/pull/11175#issuecomment-2364760327 @amogh-jahagirdar I have restored the removed public static methods in `Deletes` and deprecated them instead. However, I have replaced their implementations with the equivalent of using

Re: [PR] Parquet: update PruneColumns to inherit from TypeWithSchemaVisitor to have Iceberg type [iceberg]

2024-09-20 Thread via GitHub
aihuaxu commented on PR #11179: URL: https://github.com/apache/iceberg/pull/11179#issuecomment-2364749714 cc @rdblue and @RussellSpitzer

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-20 Thread via GitHub
huaxingao commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1769350856 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -198,27 +198,31 @@ protected Statistics estimateStatistics(Snapshot snapsho

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-20 Thread via GitHub
huaxingao commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1769350721 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -198,27 +198,31 @@ protected Statistics estimateStatistics(Snapshot snapsho

Re: [PR] Arrow: add support for null vectors [iceberg]

2024-09-20 Thread via GitHub
slessard commented on PR #10953: URL: https://github.com/apache/iceberg/pull/10953#issuecomment-2364726209 @amogh-jahagirdar This PR is ready for your review

Re: [PR] Arrow: add support for null vectors [iceberg]

2024-09-20 Thread via GitHub
slessard commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1769332765 ## arrow/src/test/java/org/apache/iceberg/arrow/vectorized/ArrowReaderTest.java: ## @@ -262,6 +264,120 @@ public void testReadColumnFilter2() throws Exception {

Re: [PR] Arrow: add support for null vectors [iceberg]

2024-09-20 Thread via GitHub
slessard commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1769332669 ## arrow/src/test/java/org/apache/iceberg/arrow/vectorized/ArrowReaderTest.java: ## @@ -262,6 +264,120 @@ public void testReadColumnFilter2() throws Exception {

[I] Add Variant type to iceberg [iceberg]

2024-09-20 Thread via GitHub
aihuaxu opened a new issue, #11178: URL: https://github.com/apache/iceberg/issues/11178 ### Feature Request / Improvement As discussed in the mailing list, and [described in this doc](https://docs.google.com/document/d/1QjhpG_SVNPZh3anFcpicMQx90ebwjL7rmzFYfUP89Iw/edit), I'd like to a

[PR] [DRAFT] Remove unused code for streaming position deletes [iceberg]

2024-09-20 Thread via GitHub
wypoon opened a new pull request, #11177: URL: https://github.com/apache/iceberg/pull/11177 For testing only.

Re: [PR] API: Deprecate ContentFile#path API and add location API which returns String [iceberg]

2024-09-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #11092: URL: https://github.com/apache/iceberg/pull/11092#discussion_r1769326601 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -43,9 +43,19 @@ public interface ContentFile { */ FileContent content(); - /** Retu

[I] Retry logic in JDBC catalog fails with class cast exception if driver exception class does not extend SQLTransientException [iceberg]

2024-09-20 Thread via GitHub
asolovey opened a new issue, #11176: URL: https://github.com/apache/iceberg/issues/11176 ### Apache Iceberg version 1.6.1 (latest release) ### Query engine None ### Please describe the bug 🐞 If retryable error codes are configured, and JDBC catalog connectio

Re: [PR] API: Deprecate ContentFile#path API and add location API which returns String [iceberg]

2024-09-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #11092: URL: https://github.com/apache/iceberg/pull/11092#discussion_r1769276929 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -43,9 +43,19 @@ public interface ContentFile { */ FileContent content(); - /** Retu

Re: [PR] Core: Remove unused code for streaming position deletes [iceberg]

2024-09-20 Thread via GitHub
wypoon commented on code in PR #11175: URL: https://github.com/apache/iceberg/pull/11175#discussion_r1769273124 ## core/src/main/java/org/apache/iceberg/deletes/Deletes.java: ## @@ -192,29 +185,6 @@ public static PositionDeleteIndex toPositionIndex(CloseableIterable posDel

Re: [PR] Use `cachetools's LRUCache` to cache manifest list [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu commented on code in PR #1187: URL: https://github.com/apache/iceberg-python/pull/1187#discussion_r1769271462 ## pyiceberg/manifest.py: ## @@ -620,6 +623,13 @@ def fetch_manifest_entry(self, io: FileIO, discard_deleted: bool = True) -> List ] +@cache

Re: [PR] Use `cachetools's LRUCache` to cache manifest list [iceberg-python]

2024-09-20 Thread via GitHub
sungwy commented on code in PR #1187: URL: https://github.com/apache/iceberg-python/pull/1187#discussion_r1769270495 ## pyiceberg/manifest.py: ## @@ -620,6 +623,13 @@ def fetch_manifest_entry(self, io: FileIO, discard_deleted: bool = True) -> List ] +@cached(ca

Re: [PR] Use `cachetools's LRUCache` to cache manifest list [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu commented on code in PR #1187: URL: https://github.com/apache/iceberg-python/pull/1187#discussion_r1769269027 ## pyiceberg/manifest.py: ## @@ -620,6 +623,13 @@ def fetch_manifest_entry(self, io: FileIO, discard_deleted: bool = True) -> List ] +@cache

Re: [I] Remove python 3.8 support [iceberg-python]

2024-09-20 Thread via GitHub
sungwy commented on issue #1121: URL: https://github.com/apache/iceberg-python/issues/1121#issuecomment-2364626655 I started the voting thread @kevinjqliu 🙂 https://lists.apache.org/thread/d9r91hgq4wr30c2qm5y2zbkqb1nhjngh

Re: [I] Add Docstrings to `pyiceberg/table/__init__.py` [iceberg-python]

2024-09-20 Thread via GitHub
sungwy closed issue #1190: Add Docstrings to `pyiceberg/table/__init__.py` URL: https://github.com/apache/iceberg-python/issues/1190

Re: [PR] Use `cachetools's LRUCache` to cache manifest list [iceberg-python]

2024-09-20 Thread via GitHub
sungwy commented on code in PR #1187: URL: https://github.com/apache/iceberg-python/pull/1187#discussion_r1769236181 ## pyiceberg/manifest.py: ## @@ -620,6 +623,13 @@ def fetch_manifest_entry(self, io: FileIO, discard_deleted: bool = True) -> List ] +@cached(ca

Re: [PR] [feature] reimplement Snapshot manifest cache [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu closed pull request #1185: [feature] reimplement Snapshot manifest cache URL: https://github.com/apache/iceberg-python/pull/1185

Re: [PR] [feature] reimplement Snapshot manifest cache [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu commented on PR #1185: URL: https://github.com/apache/iceberg-python/pull/1185#issuecomment-2364588372 Closing in favor of #1187

Re: [PR] Add Docstrings to `pyiceberg/table/__init__.py` [iceberg-python]

2024-09-20 Thread via GitHub
sungwy merged PR #1189: URL: https://github.com/apache/iceberg-python/pull/1189

Re: [PR] Add Docstrings to `pyiceberg/table/__init__.py` [iceberg-python]

2024-09-20 Thread via GitHub
sungwy commented on PR #1189: URL: https://github.com/apache/iceberg-python/pull/1189#issuecomment-2364575670 Thanks for the review @kevinjqliu 🙂

Re: [PR] Core: Remove unused code for streaming position deletes [iceberg]

2024-09-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #11175: URL: https://github.com/apache/iceberg/pull/11175#discussion_r1769208491 ## core/src/main/java/org/apache/iceberg/deletes/Deletes.java: ## @@ -192,29 +185,6 @@ public static PositionDeleteIndex toPositionIndex(CloseableIterable p

Re: [PR] Core: Remove unused code for streaming position deletes [iceberg]

2024-09-20 Thread via GitHub
wypoon commented on PR #11175: URL: https://github.com/apache/iceberg/pull/11175#issuecomment-2364531622 @amogh-jahagirdar @szehon-ho @aokolnychyi please review.

Re: [PR] Use Snapshot's statistics file in SparkScan [iceberg]

2024-09-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #11040: URL: https://github.com/apache/iceberg/pull/11040#discussion_r1769162105 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -194,9 +195,9 @@ protected Statistics estimateStatistics(Snapshot sna

Re: [PR] Docs: Uppercase keyword in branching [iceberg]

2024-09-20 Thread via GitHub
amogh-jahagirdar merged PR #11172: URL: https://github.com/apache/iceberg/pull/11172

Re: [PR] Docs: Uppercase keyword in branching [iceberg]

2024-09-20 Thread via GitHub
amogh-jahagirdar commented on PR #11172: URL: https://github.com/apache/iceberg/pull/11172#issuecomment-2364498611 Thanks @ebyhr !

[PR] [DRAFT] Remove unused code for streaming position deletes [iceberg]

2024-09-20 Thread via GitHub
wypoon opened a new pull request, #11175: URL: https://github.com/apache/iceberg/pull/11175 Follow up to https://github.com/apache/iceberg/pull/9117. Prior to that PR, there were two code paths in `DeleteFilter`: if the number of position deletes was below a certain number, we use the Po

Re: [PR] Support for ns [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu commented on code in PR #1188: URL: https://github.com/apache/iceberg-python/pull/1188#discussion_r1768989235 ## pyiceberg/io/pyarrow.py: ## @@ -1068,8 +1068,17 @@ def primitive(self, primitive: pa.DataType) -> PrimitiveType: return StringType()

Re: [PR] Support for ns [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu commented on code in PR #1188: URL: https://github.com/apache/iceberg-python/pull/1188#discussion_r1768984958 ## pyiceberg/io/pyarrow.py: ## @@ -1068,8 +1068,17 @@ def primitive(self, primitive: pa.DataType) -> PrimitiveType: return StringType()

Re: [PR] Use ArrowScan.to_table to replace project_table [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu commented on PR #1180: URL: https://github.com/apache/iceberg-python/pull/1180#issuecomment-2364203606 Thank you @JE-Chen for the contribution and @sungwy for the review -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [I] `project_table` is deprecated, remove references [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu closed issue #1119: `project_table` is deprecated, remove references URL: https://github.com/apache/iceberg-python/issues/1119 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [I] `project_table` is deprecated, remove references [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu commented on issue #1119: URL: https://github.com/apache/iceberg-python/issues/1119#issuecomment-2364202778 Closed by #1180 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Use ArrowScan.to_table to replace project_table [iceberg-python]

2024-09-20 Thread via GitHub
kevinjqliu merged PR #1180: URL: https://github.com/apache/iceberg-python/pull/1180 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@i

Re: [PR] Support for ns [iceberg-python]

2024-09-20 Thread via GitHub
zaryab-ali commented on PR #1188: URL: https://github.com/apache/iceberg-python/pull/1188#issuecomment-2364191346 @kevinjqliu can you please review my PR and let me know if I made a mistake (please check with extra caution as it is my first contribution to any open source project) -- Th

Re: [I] [feature request] Support Time64Type[ns] [iceberg-python]

2024-09-20 Thread via GitHub
zaryab-ali commented on issue #1169: URL: https://github.com/apache/iceberg-python/issues/1169#issuecomment-2364188341 @kevinjqliu can you please review my PR and let me know if I made a mistake (please check with extra caution as it is my first contribution to any open source project)

[PR] Support for ns [iceberg-python]

2024-09-20 Thread via GitHub
zaryab-ali opened a new pull request, #1188: URL: https://github.com/apache/iceberg-python/pull/1188 Added support for ns and enabled downcasting, similar to `pa.types.is_timestamp(primitive):`; didn't enable upcasting as it wasn't mentioned in the issue -- This is

Re: [PR] Spec: Support geo type [iceberg]

2024-09-20 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1768951335 ## format/spec.md: ## @@ -1117,27 +1136,28 @@ Schemas are serialized as a JSON object with the same fields as a struct in the Types are serialized according to thi

Re: [PR] Spec: Support geo type [iceberg]

2024-09-20 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1768949325 ## format/spec.md: ## @@ -1084,14 +1100,16 @@ The 32-bit hash implementation is 32-bit Murmur3 hash, x86 variant, seeded with | **`uuid`** | `hashBytes(uuidB

Re: [PR] Spec: Support geo type [iceberg]

2024-09-20 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1768947593 ## format/spec.md: ## @@ -968,13 +977,13 @@ Maps with non-string keys must use an array representation with the `map` logica |**`struct`**|`record`|| |**`list`**|`a

Re: [PR] Spec: Support geo type [iceberg]

2024-09-20 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1768947092 ## format/spec.md: ## @@ -454,28 +465,28 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the follo

Re: [PR] Spec: Support geo type [iceberg]

2024-09-20 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1768946521 ## format/spec.md: ## @@ -454,28 +465,28 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the follo

Re: [PR] Spec: Support geo type [iceberg]

2024-09-20 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1768943555 ## format/spec.md: ## @@ -323,16 +327,17 @@ Partition field IDs must be reused if an existing partition spec contains an equ Partition Transforms -| Transfo

Re: [PR] Spec: Support geo type [iceberg]

2024-09-20 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1768939469 ## format/spec.md: ## @@ -200,12 +200,15 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |
