Re: [PR] OpenAPI: Deprecate `oauth/tokens` endpoint [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on PR #10603: URL: https://github.com/apache/iceberg/pull/10603#issuecomment-2224954819 The vote has been passed: https://lists.apache.org/thread/o4qmrm5jx50mk1mqws0t9f1z2op4gvvm -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [I] Set properties boolean value to lowercase string [iceberg-python]

2024-07-12 Thread via GitHub
soumya-ghosh commented on issue #919: URL: https://github.com/apache/iceberg-python/issues/919#issuecomment-2224996587 @kevinjqliu could I pick this up? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[PR] Nit: fix/suppress false-positive errorprone warning [iceberg]

2024-07-12 Thread via GitHub
snazy opened a new pull request, #10690: URL: https://github.com/apache/iceberg/pull/10690 ``` .../hive3/src/main/java/org/apache/iceberg/mr/hive/serde/objectinspector/IcebergTimestampObjectInspectorHive3.java:60: warning: [JavaLocalDateTimeGetNano] localDateTime.getNano() only accesss t

Re: [PR] Add row-level operation benchmarks [iceberg]

2024-07-12 Thread via GitHub
ajantha-bhat commented on code in PR #10687: URL: https://github.com/apache/iceberg/pull/10687#discussion_r1675471181 ## benchmark/src/main/scala/org/apache/iceberg/spark/source/EqDeltaWriter.scala: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-12 Thread via GitHub
nastra commented on code in PR #10680: URL: https://github.com/apache/iceberg/pull/10680#discussion_r1675475160 ## core/src/test/java/org/apache/iceberg/TestReplacePartitions.java: ## @@ -67,6 +67,34 @@ public class TestReplacePartitions extends TestBase { .withRecord

Re: [PR] Core: Expose table incremental scan for appends API in SerializableTable [iceberg]

2024-07-12 Thread via GitHub
nastra merged PR #10682: URL: https://github.com/apache/iceberg/pull/10682 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [I] Access by column position in parquet data files [iceberg]

2024-07-12 Thread via GitHub
AndreiZhoraven closed issue #10582: Access by column position in parquet data files URL: https://github.com/apache/iceberg/issues/10582 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

Re: [PR] OpenAPI: Deprecate `oauth/tokens` endpoint [iceberg]

2024-07-12 Thread via GitHub
Fokko merged PR #10603: URL: https://github.com/apache/iceberg/pull/10603 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

Re: [PR] OpenAPI: Deprecate `oauth/tokens` endpoint [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on PR #10603: URL: https://github.com/apache/iceberg/pull/10603#issuecomment-2225148000 Thanks everyone, moving this forward for the 1.6.0 release 👍 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Bump Nessie from 0.91.3 to 0.92.0 [iceberg]

2024-07-12 Thread via GitHub
Fokko merged PR #10689: URL: https://github.com/apache/iceberg/pull/10689 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

Re: [PR] Bump Nessie from 0.91.3 to 0.92.0 [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on PR #10689: URL: https://github.com/apache/iceberg/pull/10689#issuecomment-2225149031 Thanks @snazy and @jbonofre & @ajantha-bhat for the prompt review 🚀 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

Re: [PR] Core: Prevent dropping column which is referenced by active partition… [iceberg]

2024-07-12 Thread via GitHub
advancedxy commented on code in PR #10352: URL: https://github.com/apache/iceberg/pull/10352#discussion_r1675556183 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -533,6 +537,34 @@ private static Schema applyChanges( } } +Map> specToDeletes =

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on code in PR #10680: URL: https://github.com/apache/iceberg/pull/10680#discussion_r1675557564 ## core/src/test/java/org/apache/iceberg/TestReplacePartitions.java: ## @@ -67,6 +67,34 @@ public class TestReplacePartitions extends TestBase { .withRecordC

Re: [I] Javadoc issues [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on issue #10378: URL: https://github.com/apache/iceberg/issues/10378#issuecomment-2225157354 I'll bump this to 1.7.0. It would be best if we could fix this using the macro plugin in combination with the monorepo plugin: https://github.com/backstage/mkdocs-monorepo-plugin/iss

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on code in PR #10680: URL: https://github.com/apache/iceberg/pull/10680#discussion_r1675562849 ## core/src/test/java/org/apache/iceberg/TestReplacePartitions.java: ## @@ -67,6 +67,34 @@ public class TestReplacePartitions extends TestBase { .withRecordC

Re: [PR] Ignore iceberg-build.properties file loading exception [iceberg]

2024-07-12 Thread via GitHub
EugeneChung commented on PR #10520: URL: https://github.com/apache/iceberg/pull/10520#issuecomment-2225167268 Fortunately, EMR 7.1.0 fixed the problem. ``` Welcome to __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_

Re: [PR] Ignore iceberg-build.properties file loading exception [iceberg]

2024-07-12 Thread via GitHub
EugeneChung closed pull request #10520: Ignore iceberg-build.properties file loading exception URL: https://github.com/apache/iceberg/pull/10520 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

Re: [I] How to get one table partition values without doing a query, from metadata [iceberg-python]

2024-07-12 Thread via GitHub
doctormohamed closed issue #916: How to get one table partition values without doing a query, from metadata URL: https://github.com/apache/iceberg-python/issues/916 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1675581619 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -175,7 +181,25 @@ public Statistics estimateStatistics() { protected Statis

Re: [PR] fix DataFileStats invalidation logic [iceberg-python]

2024-07-12 Thread via GitHub
Fokko merged PR #911: URL: https://github.com/apache/iceberg-python/pull/911 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1675598848 ## core/src/test/java/org/apache/iceberg/lock/HttpLockManagerTest.java: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1675602524 ## core/src/test/java/org/apache/iceberg/lock/RedisLockManagerTest.java: ## @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] View Spec implementation [iceberg-rust]

2024-07-12 Thread via GitHub
nastra commented on code in PR #331: URL: https://github.com/apache/iceberg-rust/pull/331#discussion_r1675578086 ## crates/iceberg/src/spec/view_metadata.rs: ## @@ -0,0 +1,682 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agree

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1675607925 ## core/src/main/java/org/apache/iceberg/lock/ServerSideHttpLockManager.java: ## @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1675609730 ## core/src/main/java/org/apache/iceberg/lock/RedissonLockManager.java: ## @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-12 Thread via GitHub
nastra commented on code in PR #10680: URL: https://github.com/apache/iceberg/pull/10680#discussion_r1675611984 ## core/src/main/java/org/apache/iceberg/util/PartitionSet.java: ## @@ -200,7 +200,7 @@ public String toString() { StringBuilder partitionStringBuilder = ne

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on PR #10688: URL: https://github.com/apache/iceberg/pull/10688#issuecomment-2225235649 tips: According to the original plan, I originally intended to extend a zookeeper-based lockManager. However, I found that the whole iceberg project's dependency on zookeeper is quit

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on code in PR #10680: URL: https://github.com/apache/iceberg/pull/10680#discussion_r1675638785 ## core/src/main/java/org/apache/iceberg/util/PartitionSet.java: ## @@ -200,7 +200,7 @@ public String toString() { StringBuilder partitionStringBuilder = new

Re: [PR] Core: Exclude unexpected namespaces JdbcCatalog.listNamespaces [iceberg]

2024-07-12 Thread via GitHub
Fokko merged PR #10498: URL: https://github.com/apache/iceberg/pull/10498 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

Re: [I] Iceberg rest catalog with postgres - List namespaces with parent returns wrong children namespaces [iceberg]

2024-07-12 Thread via GitHub
Fokko closed issue #10213: Iceberg rest catalog with postgres - List namespaces with parent returns wrong children namespaces URL: https://github.com/apache/iceberg/issues/10213 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] Core: Exclude unexpected namespaces JdbcCatalog.listNamespaces [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on PR #10498: URL: https://github.com/apache/iceberg/pull/10498#issuecomment-2225265862 Thanks @ebyhr for working on this 👍 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1675648520 ## core/src/test/java/org/apache/iceberg/lock/TestHttpServer.java: ## @@ -0,0 +1,148 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1675655377 ## core/src/main/java/org/apache/iceberg/lock/ServerSideHttpLockManager.java: ## @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Glue endpoint config variable, continue #530 [iceberg-python]

2024-07-12 Thread via GitHub
Fokko commented on code in PR #920: URL: https://github.com/apache/iceberg-python/pull/920#discussion_r1675671820 ## pyiceberg/catalog/glue.py: ## @@ -109,6 +109,10 @@ GLUE_SKIP_ARCHIVE = "glue.skip-archive" GLUE_SKIP_ARCHIVE_DEFAULT = True +# Configure an alternative endpoi

Re: [I] Forward incompatible types introduced when writing Iceberg data [iceberg-python]

2024-07-12 Thread via GitHub
Fokko commented on issue #887: URL: https://github.com/apache/iceberg-python/issues/887#issuecomment-2225318343 Closing this one since https://github.com/apache/iceberg-python/pull/902 has been merged. Thanks @syun64 for reporting this 🙌 -- This is an automated message from the Apache Gi

Re: [I] Set properties boolean value to lowercase string [iceberg-python]

2024-07-12 Thread via GitHub
Fokko commented on issue #919: URL: https://github.com/apache/iceberg-python/issues/919#issuecomment-2225349338 @soumya-ghosh assigned it to you 👍 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [I] Javadoc issues [iceberg]

2024-07-12 Thread via GitHub
jbonofre commented on issue #10378: URL: https://github.com/apache/iceberg/issues/10378#issuecomment-2225436992 @Fokko thanks! It makes sense. I will take a look at this one for 1.7.0 (definitely not a blocker for 1.6.0). -- This is an automated message from the Apache Git Service. To re

Re: [PR] Flink: Migrate remaining source classes to JUnit5 [iceberg]

2024-07-12 Thread via GitHub
tomtongue commented on PR #10684: URL: https://github.com/apache/iceberg/pull/10684#issuecomment-2225457564 Fixing the test failure of failover classes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] Deprecate to_requested_schema [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on PR #918: URL: https://github.com/apache/iceberg-python/pull/918#issuecomment-2225467379 > #910 Yeah good question - this is on our list of things to do before we hit 1.0.0 milestone: https://github.com/apache/iceberg-python/issues/334 > How about also adding

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-12 Thread via GitHub
Fokko merged PR #10680: URL: https://github.com/apache/iceberg/pull/10680 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on PR #10680: URL: https://github.com/apache/iceberg/pull/10680#issuecomment-2225472776 Thanks for fixing this @boroknagyz , and thanks for the review @ajantha-bhat, @jbonofre, @deniskuzZ and @nastra -- This is an automated message from the Apache Git Service. To respond

Re: [PR] Deprecate to_requested_schema [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on PR #918: URL: https://github.com/apache/iceberg-python/pull/918#issuecomment-2225485718 > Do we want to delete the tests added in #910? We do have tests for internal functions, and I think the casting behavior of timestamps through `_to_requested_schema` is one tha

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
dekimir commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1675856578 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -20,65 +20,68 @@ import java.io.Closeable; import java.io.IOException; +import java.uti

Re: [PR] Glue endpoint config variable, continue #530 [iceberg-python]

2024-07-12 Thread via GitHub
sebpretzer commented on PR #920: URL: https://github.com/apache/iceberg-python/pull/920#issuecomment-2225578133 > This PR builds on @sebpretzer 's great work #530. It rebases on main and add doc for glue catalog properties > > @sebpretzer I hope you don’t mind me taking this further

Re: [PR] Glue endpoint config variable [iceberg-python]

2024-07-12 Thread via GitHub
sebpretzer commented on PR #530: URL: https://github.com/apache/iceberg-python/pull/530#issuecomment-2225579524 Close in favor of https://github.com/apache/iceberg-python/pull/920 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

Re: [PR] Glue endpoint config variable [iceberg-python]

2024-07-12 Thread via GitHub
sebpretzer closed pull request #530: Glue endpoint config variable URL: https://github.com/apache/iceberg-python/pull/530 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1675920451 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -20,65 +20,68 @@ import java.io.Closeable; import java.io.IOException; +import java.uti

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1675922119 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -20,65 +20,68 @@ import java.io.Closeable; import java.io.IOException; +import java.uti

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1675927724 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -88,7 +91,18 @@ private ParallelIterator( @Override public void close() {

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225619730 I don't have enough knowledge of this piece of code to merge this without any benchmarks or profiling. Maybe @rdblue? -- This is an automated message from the Apache Git S

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225620332 @Heltman @losipiuk @alexjo2144 you might want to take a look -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225622443 @Fokko fair. Also, note this is not a performance improvement or something. It's "just" bounding memory usage to prevent OOM. As such, I wouldn't expect this change to require benchmark

Re: [PR] Core: Add param to limit manifest parallel reader queue size [iceberg]

2024-07-12 Thread via GitHub
findepi commented on PR #7844: URL: https://github.com/apache/iceberg/pull/7844#issuecomment-2225627726 I created a PR aiming to make the queue bounded, but without requiring separate executor pool. The change is effectively transparent to class consumers. Please see https://github.com/apac

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225632947 My thinking here is that we bound the queue. The `ParallelIterable` is often used at places where it is IO intensive. This will limit the parallelism of calls to the object stores and tha

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
dekimir commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225641596 > My thinking here is that we bound the queue. The `ParallelIterable` is often used at places where it is IO intensive. This will limit the parallelism of calls to the object stores and

Re: [PR] feat(visitors): Implement basic boolean expression visitors [iceberg-go]

2024-07-12 Thread via GitHub
Fokko commented on code in PR #108: URL: https://github.com/apache/iceberg-go/pull/108#discussion_r1675971746 ## exprs.go: ## @@ -538,11 +557,11 @@ func (up *unboundUnaryPredicate) Bind(schema *Schema, caseSensitive bool) (Boole // fast case optimizations switch

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225662200 > If the consumer of `ParallelIterable` is fast enough, then this limit should have no impact, right? Yes, I agree, but on the other hand, if the limit is too high, it doesn't help

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
dekimir commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225671754 > on the other hand, if the limit is too high, it doesn't help either. Can't the caller set a lower limit then, by calling the new constructor overload? -- This is an automate

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
Fokko commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225760543 > Can't the caller set a lower limit then, by calling the new constructor overload? Yes, that's possible but then you already have to inherit quite a few classes to overload certai

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
stevenzwu commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1676050363 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -88,7 +91,18 @@ private ParallelIterator( @Override public void close() {

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
stevenzwu commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1676065032 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -192,4 +209,65 @@ public synchronized T next() { return queue.poll(); } }

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
stevenzwu commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225793651 @findepi thanks for working on this. we also ran into the memory issue internally when some manifest files are super large (like hundreds of MBs or GBs). Curious if you have d

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
nastra commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676095102 ## core/src/test/java/org/apache/iceberg/TestFastAppend.java: ## @@ -317,6 +317,60 @@ public void testRecoveryWithoutManifestList() { assertThat(metadata.current

Re: [PR] OpenAPI: Express server capabilities via /config endpoint [iceberg]

2024-07-12 Thread via GitHub
dimas-b commented on code in PR #9940: URL: https://github.com/apache/iceberg/pull/9940#discussion_r1676113409 ## open-api/rest-catalog-open-api.yaml: ## @@ -100,6 +121,16 @@ paths: Common catalog configuration settings are documented at https://iceberg.apac

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on PR #10523: URL: https://github.com/apache/iceberg/pull/10523#issuecomment-2225850994 Sorry for the delay on reviewing this @grantatspothero I'm taking a look with fresh eyes on the latest updates since the approach is different now -- This is an automated mes

[I] MERGE INTO SQL clause using java API [iceberg]

2024-07-12 Thread via GitHub
salah-djb opened a new issue, #10692: URL: https://github.com/apache/iceberg/issues/10692 ### Query engine - Spark 3.3.4 - Iceberg 1.5.2 ### Question I would like to perform merge into using java API (not SQL MERGE INTO clause) to update records in an iceberg table ba

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-12 Thread via GitHub
karuppayya commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1676148283 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -175,7 +181,25 @@ public Statistics estimateStatistics() { protected Sta

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225902783 > > Can't the caller set a lower limit then, by calling the new constructor overload? > > Yes, that's possible but then you already have to inherit quite a few classes to overloa

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225908686 @stevenzwu thanks for your comments! > Curious if you have done any performance testing. echo to another comment. wondering if the default queue size of 10K would affect the thro

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1676163120 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -192,4 +209,65 @@ public synchronized T next() { return queue.poll(); } } +

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1676170877 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -88,7 +91,18 @@ private ParallelIterator( @Override public void close() {

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1676177423 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -175,7 +181,25 @@ public Statistics estimateStatistics() { protected Statis

Re: [PR] Docs: Update defaults for distribution mode [iceberg]

2024-07-12 Thread via GitHub
szehon-ho commented on code in PR #10575: URL: https://github.com/apache/iceberg/pull/10575#discussion_r1676206823 ## docs/docs/configuration.md: ## @@ -67,7 +67,7 @@ Iceberg tables support table properties to configure table behavior, like the de | write.metadata.metrics.colu

Re: [PR] Core: Prevent dropping column which is referenced by active partition… [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on code in PR #10352: URL: https://github.com/apache/iceberg/pull/10352#discussion_r1676213273 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -533,6 +537,34 @@ private static Schema applyChanges( } } +Map> specToDel

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

2024-07-12 Thread via GitHub
sl255051 commented on PR #10678: URL: https://github.com/apache/iceberg/pull/10678#issuecomment-2225993288 > @sl255051 appreciate you are taking the stub for the PR. > > But I am wondering why do you think column name case insensitivity is the right behavior when building PartitionSpe

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

2024-07-12 Thread via GitHub
dramaticlly commented on PR #10678: URL: https://github.com/apache/iceberg/pull/10678#issuecomment-2226008338 > > @sl255051 appreciate you are taking the stub for the PR. > > But I am wondering why do you think column name case insensitivity is the right behavior when building PartitionSp

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676113425 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -565,6 +570,10 @@ protected boolean canInheritSnapshotId() { return canInheritSnap

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on PR #10523: URL: https://github.com/apache/iceberg/pull/10523#issuecomment-2226118154 Thanks @grantatspothero the overall approach makes sense and this time it is closely dependent on the internal state of `FastAppend` which combined with the new tests should ma

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676297839 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -198,6 +198,14 @@ protected void cleanUncommitted(Set committed) { } } + @Overrid

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1676303612 ## build.gradle: ## @@ -358,6 +358,7 @@ project(':iceberg-core') { implementation libs.jackson.databind implementation libs.caffeine implementation libs

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676308188 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -198,6 +198,14 @@ protected void cleanUncommitted(Set committed) { } } + @Overrid

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
danielcweeks commented on PR #10688: URL: https://github.com/apache/iceberg/pull/10688#issuecomment-2226150038 I don't think we should be adding new `LockManager` implementations. The general discussion has been that we want to deprecate catalog implementations that rely on external lockin

Re: [I] How to move Iceberg table from one location to another [iceberg]

2024-07-12 Thread via GitHub
namangoel31 commented on issue #3142: URL: https://github.com/apache/iceberg/issues/3142#issuecomment-2226154660 @cccs-jc, how do you determine the schema for writing to avro? I'm not able to get anything useful.

Re: [PR] Core: Prevent dropping column which is referenced by active partition… [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on PR #10352: URL: https://github.com/apache/iceberg/pull/10352#issuecomment-2226155639 Sorry about the delay on this, got busy and forgot I had this open! I've seen more related issue reports to this, so I'm going to prioritize it.

Re: [PR] Allow writing dataframes that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
kevinjqliu commented on code in PR #829: URL: https://github.com/apache/iceberg-python/pull/829#discussion_r1676323868 ## pyiceberg/table/__init__.py: ## @@ -484,10 +484,6 @@ def append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT) _check_schema

Re: [I] provide night build [iceberg-python]

2024-07-12 Thread via GitHub
kevinjqliu closed issue #734: provide night build URL: https://github.com/apache/iceberg-python/issues/734

Re: [I] provide night build [iceberg-python]

2024-07-12 Thread via GitHub
kevinjqliu commented on issue #734: URL: https://github.com/apache/iceberg-python/issues/734#issuecomment-2226162839 Closing in favor of #872

Re: [PR] Allow writing dataframes that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on code in PR #829: URL: https://github.com/apache/iceberg-python/pull/829#discussion_r1676337956 ## pyiceberg/table/__init__.py: ## @@ -484,10 +484,6 @@ def append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT) _check_schema_com

Re: [I] Chnage the description in Table metadata spec about the cardinality/mapping between snapshot and puffin [iceberg]

2024-07-12 Thread via GitHub
karuppayya commented on issue #10693: URL: https://github.com/apache/iceberg/issues/10693#issuecomment-2226179436 cc: @findepi

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-12 Thread via GitHub
HonahX merged PR #910: URL: https://github.com/apache/iceberg-python/pull/910
