[PR] fix: page index evaluator min/max args inverted [iceberg-rust]

2024-09-24 Thread via GitHub
sdd opened a new pull request, #648: URL: https://github.com/apache/iceberg-rust/pull/648 Fixes: https://github.com/apache/iceberg-rust/issues/647 I didn't catch all the places where I needed to change the order of the min / max args in an earlier refactor. Added a few more tes

Re: [I] Table has more than one bucket keys, but "show create table xxx" only displays one [iceberg]

2024-09-24 Thread via GitHub
madeirak commented on issue #11090: URL: https://github.com/apache/iceberg/issues/11090#issuecomment-2373143509 > The `show create table` result is following Spark SQL syntax, which only supports one bucket field. ok, fine. It would be better if it could be as shown in the Iceberg doc

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-09-24 Thread via GitHub
ashvina commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1774518287 ## format/spec.md: ## @@ -298,16 +298,143 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is: -|

Re: [I] Table has more than one bucket keys, but "show create table xxx" only displays one [iceberg]

2024-09-24 Thread via GitHub
manuzhang commented on issue #11090: URL: https://github.com/apache/iceberg/issues/11090#issuecomment-2372888905 The `show create table` result is following Spark SQL syntax, which only supports one bucket field. -- This is an automated message from the Apache Git Service. To respond to t

Re: [PR] feat: Reassign field ids for schema [iceberg-rust]

2024-09-24 Thread via GitHub
liurenjie1024 commented on code in PR #615: URL: https://github.com/apache/iceberg-rust/pull/615#discussion_r1774372777 ## crates/iceberg/src/spec/schema.rs: ## @@ -943,6 +965,122 @@ impl SchemaVisitor for PruneColumn { } } +struct ReassignFieldIds { +next_field_id:

Re: [I] Table has more than one bucket keys, but "show create table xxx" only displays one [iceberg]

2024-09-24 Thread via GitHub
madeirak commented on issue #11090: URL: https://github.com/apache/iceberg/issues/11090#issuecomment-2372852500 > > create table dbxx.tbxx (id INT COMMENT '11', name STRING COMMENT '') USING iceberg PARTITIONED BY (name, bucket(10, name), bucket(10, id )); > > insert into tbxx values (1

Re: [I] Table has more than one bucket keys, but "show create table xxx" only displays one [iceberg]

2024-09-24 Thread via GitHub
lurnagao-dahua commented on issue #11090: URL: https://github.com/apache/iceberg/issues/11090#issuecomment-2372845647 > create table dbxx.tbxx (id INT COMMENT '11', name STRING COMMENT '') USING iceberg PARTITIONED BY (name, bucket(10, name), bucket(10, id )); > insert into tbxx values

Re: [PR] Flink: Avoid metaspace memory leak by not registering ShutdownHook for ExecutorService in Flink [iceberg]

2024-09-24 Thread via GitHub
stevenzwu commented on PR #11073: URL: https://github.com/apache/iceberg/pull/11073#issuecomment-2372844534 @fengjiajie I think Ryan had a good point in the email thread that we probably shouldn't be using the `ThreadPools.newWorkerPool()` with explicit lifecycle management. I think we can

Re: [PR] feat: Reassign field ids for schema [iceberg-rust]

2024-09-24 Thread via GitHub
liurenjie1024 commented on PR #615: URL: https://github.com/apache/iceberg-rust/pull/615#issuecomment-2372832599 > cc @liurenjie1024 would you like to take a look too? Thanks for pinging me, I'll take a review.

Re: [PR] update PartitionSpec with snapshot'schema [iceberg]

2024-09-24 Thread via GitHub
lurnagao-dahua commented on PR #11196: URL: https://github.com/apache/iceberg/pull/11196#issuecomment-2372822799 Could you please take a review when you have time? @pvary @nastra @Fokko I would greatly appreciate it!

Re: [PR] Flink: Avoid metaspace memory leak by not registering ShutdownHook for ExecutorService in Flink [iceberg]

2024-09-24 Thread via GitHub
fengjiajie commented on PR #11073: URL: https://github.com/apache/iceberg/pull/11073#issuecomment-2372812766 @pvary @rdblue @danielcweeks @stevenzwu Thanks everyone for pushing this forward. I've followed the email thread and the review discussions, and it seems there's still some disagr

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-09-24 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1774350481 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -81,6 +81,7 @@ public String partition() { // cache filtered manifests to avo

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-09-24 Thread via GitHub
wypoon commented on PR #10935: URL: https://github.com/apache/iceberg/pull/10935#issuecomment-2372741865 @pvary can you please help move this forward then?

Re: [PR] Core: Add ContentFileSet and ContentFileWrapper [iceberg]

2024-09-24 Thread via GitHub
aokolnychyi commented on PR #11195: URL: https://github.com/apache/iceberg/pull/11195#issuecomment-2372638766 Will take a look tomorrow. Thanks, @nastra!

Re: [I] Request Timeout API to RestCatalog's HTTPClient is provided by Iceberg SDK [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8915: URL: https://github.com/apache/iceberg/issues/8915#issuecomment-2372612489 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Consumer Latency Monitoring Support in Iceberg ? [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8903: URL: https://github.com/apache/iceberg/issues/8903#issuecomment-2372612406 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] NamedReference::bind performance issue [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8196: URL: https://github.com/apache/iceberg/issues/8196#issuecomment-2372611237 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] NamedReference::bind performance issue [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] closed issue #8196: NamedReference::bind performance issue URL: https://github.com/apache/iceberg/issues/8196

Re: [I] Iceberg Java Api - S3 Session Token - 403 Forbidden exception [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8190: URL: https://github.com/apache/iceberg/issues/8190#issuecomment-2372611216 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Schema issue between Arrow and PyIceberg [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8913: URL: https://github.com/apache/iceberg/issues/8913#issuecomment-2372612463 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] operations fail after upgrading to spark 3.4 [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8904: URL: https://github.com/apache/iceberg/issues/8904#issuecomment-2372612423 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Pushdown SUBSTRING filter when equivalent to STARTSWITH [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8911: URL: https://github.com/apache/iceberg/issues/8911#issuecomment-2372612443 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Hive's performance for querying the Iceberg table is very poor. [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8901: URL: https://github.com/apache/iceberg/issues/8901#issuecomment-2372612385 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] BrotliDecompressor throwing precondition error on PySpark job with UDF and limit [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8211: URL: https://github.com/apache/iceberg/issues/8211#issuecomment-2372611290 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] BrotliDecompressor throwing precondition error on PySpark job with UDF and limit [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] closed issue #8211: BrotliDecompressor throwing precondition error on PySpark job with UDF and limit URL: https://github.com/apache/iceberg/issues/8211

Re: [I] Request to add KLL Datasketch and hive ColumnStatisticsObj and as standard blob types to puffin file. [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] commented on issue #8198: URL: https://github.com/apache/iceberg/issues/8198#issuecomment-2372611269 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Request to add KLL Datasketch and hive ColumnStatisticsObj and as standard blob types to puffin file. [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] closed issue #8198: Request to add KLL Datasketch and hive ColumnStatisticsObj and as standard blob types to puffin file. URL: https://github.com/apache/iceberg/issues/8198

Re: [I] Iceberg Java Api - S3 Session Token - 403 Forbidden exception [iceberg]

2024-09-24 Thread via GitHub
github-actions[bot] closed issue #8190: Iceberg Java Api - S3 Session Token - 403 Forbidden exception URL: https://github.com/apache/iceberg/issues/8190

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-09-24 Thread via GitHub
aokolnychyi commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1774236061 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -81,6 +81,7 @@ public String partition() { // cache filtered manifests to avoid ex

Re: [PR] Core: Support iterating over positions in PositionDeleteIndex [iceberg]

2024-09-24 Thread via GitHub
aokolnychyi commented on PR #11202: URL: https://github.com/apache/iceberg/pull/11202#issuecomment-2372591151 > Just for my understanding, is the roaring bitmap inherently sorted? Correct, @anuragmantri.

Re: [PR] Core: Add a util to compute partition stats [iceberg]

2024-09-24 Thread via GitHub
aokolnychyi commented on code in PR #11146: URL: https://github.com/apache/iceberg/pull/11146#discussion_r1774082285 ## core/src/jmh/java/org/apache/iceberg/PartitionStatsUtilBenchmark.java: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[PR] Bump getdaft from 0.3.2 to 0.3.3 [iceberg-python]

2024-09-24 Thread via GitHub
dependabot[bot] opened a new pull request, #1204: URL: https://github.com/apache/iceberg-python/pull/1204 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.3.2 to 0.3.3. Release notes: sourced from [getdaft's releases](https://github.com/Eventual-Inc/Daft/releases).

[PR] Bump moto from 5.0.14 to 5.0.15 [iceberg-python]

2024-09-24 Thread via GitHub
dependabot[bot] opened a new pull request, #1203: URL: https://github.com/apache/iceberg-python/pull/1203 Bumps [moto](https://github.com/getmoto/moto) from 5.0.14 to 5.0.15. Changelog: sourced from [moto's changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md).

Re: [PR] HA HMS support [iceberg-python]

2024-09-24 Thread via GitHub
awdavidson commented on code in PR #752: URL: https://github.com/apache/iceberg-python/pull/752#discussion_r1774140098 ## pyiceberg/catalog/hive.py: ## @@ -271,6 +271,19 @@ def __init__(self, name: str, **properties: str): DEFAULT_LOCK_CHECK_RETRIES, ) +

Re: [PR] HA HMS support [iceberg-python]

2024-09-24 Thread via GitHub
kevinjqliu commented on code in PR #752: URL: https://github.com/apache/iceberg-python/pull/752#discussion_r1774112353 ## pyiceberg/catalog/hive.py: ## @@ -271,6 +271,19 @@ def __init__(self, name: str, **properties: str): DEFAULT_LOCK_CHECK_RETRIES, ) +

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-09-24 Thread via GitHub
ookumuso commented on PR #2: URL: https://github.com/apache/iceberg/pull/2#issuecomment-2372409237 > > @danielcweeks let me know what you think about this. One alternative is to maybe provide both as in having S3LocationProvider as is and a base2 option for the ObjectStoreLocationPr

Re: [I] Flink SQL with Iceberg snapshots doesn't react if table has upsert [iceberg]

2024-09-24 Thread via GitHub
CJDrew commented on issue #9948: URL: https://github.com/apache/iceberg/issues/9948#issuecomment-2372401940 @pvary Hi Peter, I am also interested in this feature. I see in iceberg-spark we have the ability to use the procedure "create_changelog_view" and I wonder how difficult it wou

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-09-24 Thread via GitHub
stevenzwu commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1774095086 ## format/spec.md: ## @@ -298,16 +298,143 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is:

Re: [PR] Core: Add rewritten delete files to write results [iceberg]

2024-09-24 Thread via GitHub
aokolnychyi commented on code in PR #11203: URL: https://github.com/apache/iceberg/pull/11203#discussion_r1774073091 ## core/src/main/java/org/apache/iceberg/io/DeleteWriteResult.java: ## @@ -32,25 +32,39 @@ public class DeleteWriteResult { private final List deleteFiles;

Re: [PR] HA HMS support [iceberg-python]

2024-09-24 Thread via GitHub
kevinjqliu commented on code in PR #752: URL: https://github.com/apache/iceberg-python/pull/752#discussion_r1774066513 ## pyiceberg/catalog/hive.py: ## @@ -271,6 +271,19 @@ def __init__(self, name: str, **properties: str): DEFAULT_LOCK_CHECK_RETRIES, ) +

Re: [PR] Spark: Add RewriteTablePath action interface [iceberg]

2024-09-24 Thread via GitHub
laithalzyoud commented on PR #10920: URL: https://github.com/apache/iceberg/pull/10920#issuecomment-2372335859 @nastra can you rerun the workflows please? I fixed a formatting error

[PR] Core: Add rewritten delete files to write results [iceberg]

2024-09-24 Thread via GitHub
aokolnychyi opened a new pull request, #11203: URL: https://github.com/apache/iceberg/pull/11203 This PR adds a way to pass around rewritten delete files from writers to enable sync maintenance.

Re: [PR] Core: Replace use of CharSequenceMap in DeleteFileIndex with Map [iceberg]

2024-09-24 Thread via GitHub
amogh-jahagirdar commented on code in PR #11199: URL: https://github.com/apache/iceberg/pull/11199#discussion_r1773993913 ## core/src/main/java/org/apache/iceberg/util/ContentFileUtil.java: ## @@ -49,7 +50,21 @@ public static , K> K copy( } } + /** + * @deprecated s

[PR] Core: Support iterating over positions in PositionDeleteIndex [iceberg]

2024-09-24 Thread via GitHub
aokolnychyi opened a new pull request, #11202: URL: https://github.com/apache/iceberg/pull/11202 This PR adds support for iterating over positions in `PositionDeleteIndex`.

Re: [PR] Arrow: add support for null vectors [iceberg]

2024-09-24 Thread via GitHub
slessard commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1773673837 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorHolder.java: ## @@ -140,12 +141,18 @@ public static class ConstantVectorHolder extends VectorHolder

Re: [PR] Add Support for Dynamic Overwrite [iceberg-python]

2024-09-24 Thread via GitHub
sungwy commented on PR #931: URL: https://github.com/apache/iceberg-python/pull/931#issuecomment-2372038772 Hi @Fokko - this PR looks good from my end. Would you have some time to take a look? Since this is a new API (which comes with another level of caution), I'd love to get your r

[PR] Core: Add support for view-default property in catalog [iceberg]

2024-09-24 Thread via GitHub
nk1506 opened a new pull request, #11200: URL: https://github.com/apache/iceberg/pull/11200 (no comment)

Re: [I] Remove python 3.8 support [iceberg-python]

2024-09-24 Thread via GitHub
kevinjqliu commented on issue #1121: URL: https://github.com/apache/iceberg-python/issues/1121#issuecomment-2371931643 This was voted on via the devlist and passed https://lists.apache.org/thread/50jjtdyfovwqbw8mp5m6rfgmmbxv7qcr

Re: [PR] fix: fixing tests to work with s3Express [iceberg]

2024-09-24 Thread via GitHub
steveloughran commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1773779337 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -194,9 +194,9 @@ public void testNewInputStreamWithAccessPoint() throw

Re: [I] PyIceberg Cookbook [iceberg-python]

2024-09-24 Thread via GitHub
kevinjqliu commented on issue #1201: URL: https://github.com/apache/iceberg-python/issues/1201#issuecomment-2371910478 Copying over from community sync Cookbook suggestions * Support for incremental processing with "change table" ([link](https://netflixtechblog.com/incremental-pro

[I] PyIceberg Cookbook [iceberg-python]

2024-09-24 Thread via GitHub
kevinjqliu opened a new issue, #1201: URL: https://github.com/apache/iceberg-python/issues/1201 ### Feature Request / Improvement It was brought up at the [recent community sync](https://docs.google.com/document/d/1oMKodaZJrOJjPfc8PDVAoTdl02eGQKHlhwuggiw7s9U/edit#bookmark=kix.76h0j5pw

Re: [I] Support data files compaction [iceberg-python]

2024-09-24 Thread via GitHub
sungwy commented on issue #1092: URL: https://github.com/apache/iceberg-python/issues/1092#issuecomment-2371906242 Unassigning to work on other near-term priorities

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-09-24 Thread via GitHub
flyrain commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1773760835 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

Re: [I] Merge into / Upsert [iceberg-python]

2024-09-24 Thread via GitHub
sungwy commented on issue #402: URL: https://github.com/apache/iceberg-python/issues/402#issuecomment-2371902578 Hi @Minfante377 sorry for the delayed response, and thank you for the interest! Unfortunately, this is still an open issue on PyIceberg with no assignee. MERGE INTO with t

Re: [PR] Core: Replace use of CharSequenceMap in DeleteFileIndex with String [iceberg]

2024-09-24 Thread via GitHub
amogh-jahagirdar commented on code in PR #11199: URL: https://github.com/apache/iceberg/pull/11199#discussion_r1773720327 ## core/src/main/java/org/apache/iceberg/util/ContentFileUtil.java: ## @@ -75,4 +76,32 @@ public static CharSequence referencedDataFile(DeleteFile deleteFil

[PR] Core: Replace use of CharSequenceMap in DeleteFileIndex with String [iceberg]

2024-09-24 Thread via GitHub
amogh-jahagirdar opened a new pull request, #11199: URL: https://github.com/apache/iceberg/pull/11199 (no comment)

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-09-24 Thread via GitHub
jackye1995 commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1773717401 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-09-24 Thread via GitHub
nastra commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1773708587 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-09-24 Thread via GitHub
danielcweeks commented on PR #2: URL: https://github.com/apache/iceberg/pull/2#issuecomment-2371796750 > @danielcweeks let me know what you think about this. One alternative is to maybe provide both as in having S3LocationProvider as is and a base2 option for the ObjectStoreLocation

Re: [PR] AWS: Set better defaults for S3 retry behaviour [iceberg]

2024-09-24 Thread via GitHub
ookumuso commented on code in PR #11052: URL: https://github.com/apache/iceberg/pull/11052#discussion_r1773679309 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -393,6 +403,21 @@ public class S3FileIOProperties implements Serializable { */ p

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-09-24 Thread via GitHub
danielcweeks commented on code in PR #2: URL: https://github.com/apache/iceberg/pull/2#discussion_r1773678341 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3LocationProvider.java: ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] API, AWS: Retry S3InputStream reads [iceberg]

2024-09-24 Thread via GitHub
amogh-jahagirdar merged PR #10433: URL: https://github.com/apache/iceberg/pull/10433

Re: [I] javax.net.ssl.SSLException: Connection reset on S3 w/ S3FileIO and Apache HTTP client [iceberg]

2024-09-24 Thread via GitHub
amogh-jahagirdar closed issue #10340: javax.net.ssl.SSLException: Connection reset on S3 w/ S3FileIO and Apache HTTP client URL: https://github.com/apache/iceberg/issues/10340

Re: [PR] API, AWS: Retry S3InputStream reads [iceberg]

2024-09-24 Thread via GitHub
amogh-jahagirdar commented on PR #10433: URL: https://github.com/apache/iceberg/pull/10433#issuecomment-2371767216 Thanks for the reviews @danielcweeks! Merging.

Re: [PR] Bump pydantic from 2.9.1 to 2.9.2 [iceberg-python]

2024-09-24 Thread via GitHub
sungwy commented on PR #1182: URL: https://github.com/apache/iceberg-python/pull/1182#issuecomment-2371686195 The version upgrade is casting format-version: `1` to `true` in the following test: ``` FAILED tests/table/test_metadata.py::test_serialize_v1 - assert '{"location":...d-i

Re: [PR] Bump mkdocs-material from 9.5.35 to 9.5.36 [iceberg-python]

2024-09-24 Thread via GitHub
sungwy merged PR #1195: URL: https://github.com/apache/iceberg-python/pull/1195

Re: [I] [bug?] cannot run integration test [iceberg-python]

2024-09-24 Thread via GitHub
sungwy closed issue #1162: [bug?] cannot run integration test URL: https://github.com/apache/iceberg-python/issues/1162

Re: [PR] Use `cachetools's LRUCache` to cache manifest list [iceberg-python]

2024-09-24 Thread via GitHub
sungwy merged PR #1187: URL: https://github.com/apache/iceberg-python/pull/1187

Re: [I] NumPy Hardpin 1.26 issue [iceberg-python]

2024-09-24 Thread via GitHub
sungwy commented on issue #1198: URL: https://github.com/apache/iceberg-python/issues/1198#issuecomment-2371675786 Hi @anuraggautam14 thanks for raising this issue. This was actually intentionally imposed to solve an issue with pandas and numpy version compatibility. I'll take anothe

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-09-24 Thread via GitHub
jackye1995 commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1773613570 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

Re: [I] Support writing to a branch [iceberg-python]

2024-09-24 Thread via GitHub
sungwy commented on issue #306: URL: https://github.com/apache/iceberg-python/issues/306#issuecomment-2371644536 Hi @vinjai thank you very much for working on this issue. I'm just working through the list of open items to check if they are still actively being worked on. Are you still inter

Re: [PR] Bump thrift from 0.20.0 to 0.21.0 [iceberg-python]

2024-09-24 Thread via GitHub
sungwy merged PR #1197: URL: https://github.com/apache/iceberg-python/pull/1197

Re: [PR] Access delegation [iceberg-python]

2024-09-24 Thread via GitHub
sungwy commented on code in PR #1033: URL: https://github.com/apache/iceberg-python/pull/1033#discussion_r1773578960 ## pyiceberg/catalog/rest.py: ## @@ -532,7 +534,7 @@ def _config_headers(self, session: Session) -> None: session.headers["Content-type"] = "application/

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-09-24 Thread via GitHub
rodmeneses commented on PR #10179: URL: https://github.com/apache/iceberg/pull/10179#issuecomment-2371596204 no code has been implemented for this. I see at least 2 options: 1. Implement a brand new `IcebergTableSink` that uses the new `IcebergSink`. 2. Control what underlying sink to u

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-09-24 Thread via GitHub
arkadius commented on PR #10179: URL: https://github.com/apache/iceberg/pull/10179#issuecomment-2371580187 Thank you for the quick response. By coexistence do you mean that it will be possible to pick the new implementation for dynamic tables for example by some property in the catalog conf

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-09-24 Thread via GitHub
rodmeneses commented on PR #10179: URL: https://github.com/apache/iceberg/pull/10179#issuecomment-2371560378 > Hi, Is there a plan to replace the previous implementation (`FlinkSink`) with the new one (`IcebergSink`) also for dynamic tables (in `FlinkDynamicTableFactory`)? When it will happ

Re: [PR] Access delegation [iceberg-python]

2024-09-24 Thread via GitHub
guitcastro commented on code in PR #1033: URL: https://github.com/apache/iceberg-python/pull/1033#discussion_r1773534032 ## pyiceberg/catalog/rest.py: ## @@ -532,7 +534,7 @@ def _config_headers(self, session: Session) -> None: session.headers["Content-type"] = "applicat

Re: [PR] feat: implement IcebergTableProviderFactory for datafusion [iceberg-rust]

2024-09-24 Thread via GitHub
matthewmturner commented on code in PR #600: URL: https://github.com/apache/iceberg-rust/pull/600#discussion_r1773470190 ## crates/integrations/datafusion/src/table/table_provider_factory.rs: ## @@ -0,0 +1,312 @@ +// Licensed to the Apache Software Foundation (ASF) under one +//

[I] NumPy Hardpin 1.26 issue [iceberg-python]

2024-09-24 Thread via GitHub
anuraggautam14 opened a new issue, #1198: URL: https://github.com/apache/iceberg-python/issues/1198 ### Apache Iceberg version 0.7.1 (latest release) ### Please describe the bug 🐞 Issue: In the latest version (0.7.1), we notice that NumPy is hard-pinned at 1.26 (py

[I] Kryo serialization problem for `GenericDataFile` [iceberg]

2024-09-24 Thread via GitHub
arkadius opened a new issue, #11197: URL: https://github.com/apache/iceberg/issues/11197 ### Query engine Flink 1.19.1 ### Question Hi, I'm using Flink with Iceberg 1.6.1 (I also tried the current snapshot). While inserting the `Row` into a Iceberg table I'm getting an e

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-09-24 Thread via GitHub
nastra commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1773438348 ## open-api/rest-catalog-open-api.yaml: ## @@ -3129,6 +3204,11 @@ components: - `s3.secret-access-key`: secret for credentials that provide access to data i

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-09-24 Thread via GitHub
nastra commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1773432500 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-09-24 Thread via GitHub
arkadius commented on PR #10179: URL: https://github.com/apache/iceberg/pull/10179#issuecomment-2371390372 Hi, Is there a plan to replace the previous implementation (`FlinkSink`) with the new one (`IcebergSink`) also for dynamic tables (in `FlinkDynamicTableFactory`)? When it will happen?

Re: [PR] Core: Add a util to compute partition stats [iceberg]

2024-09-24 Thread via GitHub
ajantha-bhat commented on code in PR #11146: URL: https://github.com/apache/iceberg/pull/11146#discussion_r1773361634 ## core/src/main/java/org/apache/iceberg/PartitionStats.java: ## @@ -0,0 +1,249 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Core: Add a util to compute partition stats [iceberg]

2024-09-24 Thread via GitHub
ajantha-bhat commented on code in PR #11146: URL: https://github.com/apache/iceberg/pull/11146#discussion_r1773357839 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,136 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [I] Add apply interface in transaction [iceberg-rust]

2024-09-24 Thread via GitHub
ZENOTME commented on issue #596: URL: https://github.com/apache/iceberg-rust/issues/596#issuecomment-2371283315 > Sorry, I don't quite get the point, if updates are sent to rest catalog server, why we need to update it in local first? E.g. the user wants to batch multiple updates and

Re: [PR] fix: compile error due to merge stale PR [iceberg-rust]

2024-09-24 Thread via GitHub
Xuanwo commented on PR #646: URL: https://github.com/apache/iceberg-rust/pull/646#issuecomment-2371271788 > @Xuanwo @liurenjie1024 @sdd Should we add "require PR to be up to date" or "merge queue"? (I prefer merge queue if possible) I will try to contact INFRA about this.

Re: [PR] fix: compile error due to merge stale PR [iceberg-rust]

2024-09-24 Thread via GitHub
Xuanwo merged PR #646: URL: https://github.com/apache/iceberg-rust/pull/646

Re: [PR] fix: compile error due to merge stale PR [iceberg-rust]

2024-09-24 Thread via GitHub
xxchan commented on PR #646: URL: https://github.com/apache/iceberg-rust/pull/646#issuecomment-2371017171 @Xuanwo @liurenjie1024 @sdd Should we add "require PR to be up to date" or "merge queue"? (I prefer merge queue if possible)

Re: [PR] fix: compile error due to merge stale PR [iceberg-rust]

2024-09-24 Thread via GitHub
xxchan commented on code in PR #646: URL: https://github.com/apache/iceberg-rust/pull/646#discussion_r1773169760 ## crates/iceberg/src/arrow/reader.rs: ## @@ -245,7 +245,7 @@ impl ArrowReader { record_batch_stream_builder.metadata(), &se

Re: [PR] fix: compile error due to merge stale PR [iceberg-rust]

2024-09-24 Thread via GitHub
xxchan commented on code in PR #646: URL: https://github.com/apache/iceberg-rust/pull/646#discussion_r1773169008 ## crates/iceberg/src/expr/visitors/page_index_evaluator.rs: ## @@ -24,14 +24,14 @@ use ordered_float::OrderedFloat; use parquet::arrow::arrow_reader::{RowSelection,

Re: [I] Table has more than one bucket keys, but "show create table xxx" only displays one [iceberg]

2024-09-24 Thread via GitHub
madeirak commented on issue #11090: URL: https://github.com/apache/iceberg/issues/11090#issuecomment-2370966772 > Sorry, I missed `name_bucket_10` part. How did you create your table? With which catalog? With HiveCatalog

Re: [PR] Core: Add a util to compute partition stats [iceberg]

2024-09-24 Thread via GitHub
ajantha-bhat commented on code in PR #11146: URL: https://github.com/apache/iceberg/pull/11146#discussion_r1773135088 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,136 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or
