Re: [I] Add anchors to sections in "Configuration" documentation page [iceberg-python]

2024-06-11 Thread via GitHub
Anirudh-Narra commented on issue #808: URL: https://github.com/apache/iceberg-python/issues/808#issuecomment-2162206856 Other than the FileIO section, what other sections can be changed? -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [PR] Add accessor for Schema identifier_field_ids [iceberg-rust]

2024-06-11 Thread via GitHub
liurenjie1024 merged PR #388: URL: https://github.com/apache/iceberg-rust/pull/388

Re: [I] JdbcCatalog fails to initialize with MS SQL Server [iceberg]

2024-06-11 Thread via GitHub
jbonofre commented on issue #10068: URL: https://github.com/apache/iceberg/issues/10068#issuecomment-2162180915 @OElabed I'm working on 2 PRs: - short term, I'm implementing backend adapters as I did in ActiveMQ - mid term, I'm evaluating using jbid as abstract jdbc backend I wil

Re: [PR] Core: Calling rewrite_position_delete_files fails on tables with more than 1k columns [iceberg]

2024-06-11 Thread via GitHub
szehon-ho commented on code in PR #10020: URL: https://github.com/apache/iceberg/pull/10020#discussion_r1635601982 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -132,6 +138,35 @@ private Schema calculateSchema() { Types.StringType.get

Re: [PR] Add accessor for Schema identifier_field_ids [iceberg-rust]

2024-06-11 Thread via GitHub
c-thiel commented on PR #388: URL: https://github.com/apache/iceberg-rust/pull/388#issuecomment-2162156979 Oh sorry, last change was on my mobile. Yes, I am going to fix it.

Re: [PR] Add support for orc format [iceberg-python]

2024-06-11 Thread via GitHub
HonahX commented on code in PR #790: URL: https://github.com/apache/iceberg-python/pull/790#discussion_r1632797941 ## pyiceberg/io/pyarrow.py: ## @@ -912,6 +916,9 @@ def primitive(self, primitive: pa.DataType) -> PrimitiveType: return TimestamptzType()

Re: [PR] Add support for orc format [iceberg-python]

2024-06-11 Thread via GitHub
HonahX commented on code in PR #790: URL: https://github.com/apache/iceberg-python/pull/790#discussion_r1632769062 ## pyiceberg/io/pyarrow.py: ## @@ -799,11 +802,12 @@ def primitive(self, primitive: pa.DataType) -> T: def _get_field_id(field: pa.Field) -> Optional[int]: -

Re: [PR] Spark 3.5: Parallelize reading files in snapshot and migrate procedures [iceberg]

2024-06-11 Thread via GitHub
manuzhang commented on PR #10037: URL: https://github.com/apache/iceberg/pull/10037#issuecomment-2162137634 @nastra @RussellSpitzer @aokolnychyi Could you please take another look?

Re: [I] Run RevAPI without Gradle [iceberg]

2024-06-11 Thread via GitHub
ajantha-bhat commented on issue #10368: URL: https://github.com/apache/iceberg/issues/10368#issuecomment-2162111097 @findepi: The problem is that the Gradle plugin is not officially maintained by the Revapi maintainers. A separate group owns it, and they stopped maintaining it (the last release was in 2022).

Re: [PR] Enhancement: refine the reader interface [iceberg-rust]

2024-06-11 Thread via GitHub
Xuanwo commented on code in PR #401: URL: https://github.com/apache/iceberg-rust/pull/401#discussion_r1635797291 ## crates/iceberg/src/spec/values.rs: ## @@ -82,6 +91,46 @@ pub enum PrimitiveLiteral { Decimal(i128), } +fn serialize_ordered_float( +float: &OrderedFloa

Re: [PR] Derive Clone for TableUpdate [iceberg-rust]

2024-06-11 Thread via GitHub
liurenjie1024 merged PR #402: URL: https://github.com/apache/iceberg-rust/pull/402

Re: [PR] Hive: Return new scan after applying column project parameter [iceberg]

2024-06-11 Thread via GitHub
zhangbutao commented on code in PR #10449: URL: https://github.com/apache/iceberg/pull/10449#discussion_r1635749793 ## mr/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java: ## @@ -125,11 +125,9 @@ public List getSplits(JobContext context) { } String

Re: [PR] Hive: Return new scan after applying column project parameter [iceberg]

2024-06-11 Thread via GitHub
zhangbutao commented on code in PR #10449: URL: https://github.com/apache/iceberg/pull/10449#discussion_r1635743782 ## mr/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java: ## @@ -125,11 +125,9 @@ public List getSplits(JobContext context) { } String

Re: [I] AWS: Creating a Glue table with Lake Formation enabled fails [iceberg]

2024-06-11 Thread via GitHub
nickdelnano commented on issue #10226: URL: https://github.com/apache/iceberg/issues/10226#issuecomment-2161972842 @singhpk234 @jackye1995 @xiaoxuandev (authors of previous lake formation PRs), could you please check this issue

Re: [PR] Adding `add_files_overwrite` method [iceberg-python]

2024-06-11 Thread via GitHub
enkidulan commented on code in PR #810: URL: https://github.com/apache/iceberg-python/pull/810#discussion_r1635675320 ## pyiceberg/table/__init__.py: ## @@ -474,6 +474,26 @@ def add_files(self, file_paths: List[str], snapshot_properties: Dict[str, str] = for data_f

[I] Table scan using functional filters [iceberg-python]

2024-06-11 Thread via GitHub
bigluck opened a new issue, #170: URL: https://github.com/apache/iceberg-python/issues/170 ### Feature Request / Improvement Ciao @Fokko Seems like `table.scan()` supports a limited set of `filter` conditions, and it fails when a user specifies a complex one. In my case, I h

Re: [I] Iceberg tables creation using Spark2.4 [iceberg]

2024-06-11 Thread via GitHub
manuzhang commented on issue #10479: URL: https://github.com/apache/iceberg/issues/10479#issuecomment-2161862728 You can find the latest supported version, 1.2.1, [in this table](https://iceberg.apache.org/multi-engine-support/#apache-spark)

Re: [I] Table scan using functional filters [iceberg-python]

2024-06-11 Thread via GitHub
github-actions[bot] commented on issue #170: URL: https://github.com/apache/iceberg-python/issues/170#issuecomment-2161829857 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Core: Calling rewrite_position_delete_files fails on tables with more than 1k columns [iceberg]

2024-06-11 Thread via GitHub
szehon-ho commented on code in PR #10020: URL: https://github.com/apache/iceberg/pull/10020#discussion_r1635624765 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -140,6 +142,30 @@ public StructType partitionType() { return lazyPartitionType; } + /**

Re: [I] Table scan using functional filters [iceberg-python]

2024-06-11 Thread via GitHub
github-actions[bot] closed issue #170: Table scan using functional filters URL: https://github.com/apache/iceberg-python/issues/170

Re: [PR] Flink: handle rescale properly for range bounds in sketch statistics [iceberg]

2024-06-11 Thread via GitHub
stevenzwu commented on code in PR #10457: URL: https://github.com/apache/iceberg/pull/10457#discussion_r1635624299 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/AggregatedStatistics.java: ## @@ -35,28 +35,28 @@ class AggregatedStatistics implements Ser

Re: [I] iceberg-bundled-guava module shadowJar relocate use questions ask [iceberg]

2024-06-11 Thread via GitHub
github-actions[bot] closed issue #1750: iceberg-bundled-guava module shadowJar relocate use questions ask URL: https://github.com/apache/iceberg/issues/1750

Re: [I] iceberg-bundled-guava module shadowJar relocate use questions ask [iceberg]

2024-06-11 Thread via GitHub
github-actions[bot] commented on issue #1750: URL: https://github.com/apache/iceberg/issues/1750#issuecomment-2161828109 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Flink: handle rescale properly for range bounds in sketch statistics [iceberg]

2024-06-11 Thread via GitHub
stevenzwu commented on code in PR #10457: URL: https://github.com/apache/iceberg/pull/10457#discussion_r1635623085 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/DataStatisticsOperator.java: ## @@ -103,14 +105,41 @@ public void initializeState(StateInit

Re: [PR] Flink: Maintenance - TriggerManager [iceberg]

2024-06-11 Thread via GitHub
stevenzwu commented on code in PR #10484: URL: https://github.com/apache/iceberg/pull/10484#discussion_r1635591884 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/MetricConstants.java: ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Core: Calling rewrite_position_delete_files fails on tables with more than 1k columns [iceberg]

2024-06-11 Thread via GitHub
szehon-ho commented on code in PR #10020: URL: https://github.com/apache/iceberg/pull/10020#discussion_r1635600379 ## api/src/main/java/org/apache/iceberg/Schema.java: ## @@ -507,4 +539,48 @@ public String toString() { .map(this::identifierFieldToString)

Re: [PR] Run Flink, Spark3 tests on Java 17 too [iceberg]

2024-06-11 Thread via GitHub
stevenzwu commented on PR #10477: URL: https://github.com/apache/iceberg/pull/10477#issuecomment-2161743778 > [flink-scala-2-12-tests (8, 1.17)](https://github.com/apache/iceberg/actions/runs/9464829758/job/26072977977?pr=10477#logs) failed, not sure why. @pvary is your fix applicabl

Re: [PR] #10275 - fix NullPointerException [iceberg]

2024-06-11 Thread via GitHub
slessard commented on PR #10284: URL: https://github.com/apache/iceberg/pull/10284#issuecomment-2161712697 @nastra I've added a new unit test in iceberg/arrow/src/test/java/org/apache/iceberg/arrow/vectorized/ArrowReaderTest.java. I believe / hope this new unit test satisfies your requirem

Re: [I] JdbcCatalog fails to initialize with MS SQL Server [iceberg]

2024-06-11 Thread via GitHub
OElabed commented on issue #10068: URL: https://github.com/apache/iceberg/issues/10068#issuecomment-2161584736 @jbonofre thank you for supporting that. I have just 2 questions: - I see that you aim to support multiple backends (MySQL, MS SQL, ...). Do you aim to support oracle d

[PR] Bump msal from 1.26.0 to 1.28.0 [iceberg-python]

2024-06-11 Thread via GitHub
dependabot[bot] opened a new pull request, #812: URL: https://github.com/apache/iceberg-python/pull/812 Bumps [msal](https://github.com/AzureAD/microsoft-authentication-library-for-python) from 1.26.0 to 1.28.0. Release notes Sourced from https://github.com/AzureAD/microsoft-authe

[PR] Bump azure-identity from 1.15.0 to 1.16.1 [iceberg-python]

2024-06-11 Thread via GitHub
dependabot[bot] opened a new pull request, #811: URL: https://github.com/apache/iceberg-python/pull/811 Bumps [azure-identity](https://github.com/Azure/azure-sdk-for-python) from 1.15.0 to 1.16.1. Release notes Sourced from https://github.com/Azure/azure-sdk-for-python/releases

Re: [PR] Run revapi workflow on workflow/build system changes [iceberg]

2024-06-11 Thread via GitHub
findepi commented on PR #10485: URL: https://github.com/apache/iceberg/pull/10485#issuecomment-2161533974 It doesn't fix anything, because nothing is currently broken. The change is supposed to make it less likely that things break in the future.

Re: [PR] Run Flink, Spark3 tests on Java 17 too [iceberg]

2024-06-11 Thread via GitHub
findepi commented on PR #10477: URL: https://github.com/apache/iceberg/pull/10477#issuecomment-2161510779 This time the build was green, including [flink-scala-2-12-tests (8, 1.17)](https://github.com/apache/iceberg/actions/runs/9465703451/job/26075762139?pr=10477#logs)

Re: [PR] Support building with Java 21 [iceberg]

2024-06-11 Thread via GitHub
findepi commented on PR #10474: URL: https://github.com/apache/iceberg/pull/10474#issuecomment-2161509523 cc @rdblue @szehon-ho @Fokko @jbonofre

Re: [PR] Support building with Java 21 [iceberg]

2024-06-11 Thread via GitHub
findepi commented on PR #10474: URL: https://github.com/apache/iceberg/pull/10474#issuecomment-2161508864 The build is green, ready for review, but still blocked by https://github.com/apache/iceberg/issues/10368#issuecomment-2161039351. Therefore keeping as a draft.

Re: [PR] Run revapi workflow on workflow/build system changes [iceberg]

2024-06-11 Thread via GitHub
jbonofre commented on PR #10485: URL: https://github.com/apache/iceberg/pull/10485#issuecomment-2161482804 I understand your point, but I'm not sure it would happen. Revapi just checks the Java modules. I'm not against your change; I just wonder whether it actually fixes or improves something 😃

Re: [PR] Run revapi workflow on workflow/build system changes [iceberg]

2024-06-11 Thread via GitHub
findepi commented on PR #10485: URL: https://github.com/apache/iceberg/pull/10485#issuecomment-2161479241 thank you @jbonofre for your review and feedback! > Not sure it makes sense to trigger revapi (api compatibility workflow) when gradle or github folders are updated. > > Ca

Re: [PR] #10275 - fix NullPointerException [iceberg]

2024-06-11 Thread via GitHub
slessard commented on PR #10284: URL: https://github.com/apache/iceberg/pull/10284#issuecomment-2161448440 @nastra I believe the bug I am trying to fix is associated with Arrow readers, not Spark readers. The file in which you requested I add a unit test is specific to Spark. Is there a pla

Re: [I] Flink sink writes duplicate data in upsert mode [iceberg]

2024-06-11 Thread via GitHub
pvary commented on issue #10431: URL: https://github.com/apache/iceberg/issues/10431#issuecomment-2161437645 Seems like an issue with checkpoint retry. Will be out of office for a bit, but this needs to be investigated.

Re: [PR] Implement BoundPredicateVisitor trait for ManifestFilterVisitor [iceberg-rust]

2024-06-11 Thread via GitHub
sdd commented on PR #367: URL: https://github.com/apache/iceberg-rust/pull/367#issuecomment-2161359056 Hi @liurenjie1024 - sorry to pester you but are you able to re-review this please? It's the last major piece of the puzzle on the read side.

Re: [PR] Cache Manifest files [iceberg-python]

2024-06-11 Thread via GitHub
chinmay-bhat commented on PR #787: URL: https://github.com/apache/iceberg-python/pull/787#issuecomment-2161194170 > Being able to configure (and also disable) the caching would be a very nice touch Hi @Fokko, I'm curious to know why we would want to allow custom-sized manifest caches

Re: [PR] Hive: Return new scan after applying column project parameter [iceberg]

2024-06-11 Thread via GitHub
pvary commented on code in PR #10449: URL: https://github.com/apache/iceberg/pull/10449#discussion_r1635167684 ## mr/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java: ## @@ -125,11 +125,9 @@ public List getSplits(JobContext context) { } String sche

Re: [PR] Adding `add_files_overwrite` method [iceberg-python]

2024-06-11 Thread via GitHub
enkidulan commented on code in PR #810: URL: https://github.com/apache/iceberg-python/pull/810#discussion_r1635151718 ## pyiceberg/table/__init__.py: ## @@ -474,6 +474,26 @@ def add_files(self, file_paths: List[str], snapshot_properties: Dict[str, str] = for data_f

Re: [PR] Support snapshot management operations like creating tags by adding `ManageSnapshots` API [iceberg-python]

2024-06-11 Thread via GitHub
chinmay-bhat commented on PR #728: URL: https://github.com/apache/iceberg-python/pull/728#issuecomment-2161078410 @Fokko @HonahX I believe all suggestions have now been added to the PR. Can we re-trigger the tests?

Re: [I] Run RevAPI without Gradle [iceberg]

2024-06-11 Thread via GitHub
jbonofre commented on issue #10368: URL: https://github.com/apache/iceberg/issues/10368#issuecomment-2161039351 @findepi if you change a public API (let's say `SessionCatalog` and `RESTSessionCatalog`, adding a new method there), with Gradle 8.2+, no error, whereas it should have detected a

Re: [PR] Run revapi workflow on workflow/build system changes [iceberg]

2024-06-11 Thread via GitHub
jbonofre commented on PR #10485: URL: https://github.com/apache/iceberg/pull/10485#issuecomment-2161023449 Not sure it makes sense to trigger revapi (api compatibility workflow) when gradle or github folders are updated. Can you please provide the rationale here ? The problem w

Re: [PR] Run revapi workflow on workflow/build system changes [iceberg]

2024-06-11 Thread via GitHub
findepi commented on PR #10485: URL: https://github.com/apache/iceberg/pull/10485#issuecomment-2161008747 cc @jbonofre @Fokko

[PR] Run revapi workflow on workflow/build system changes [iceberg]

2024-06-11 Thread via GitHub
findepi opened a new pull request, #10485: URL: https://github.com/apache/iceberg/pull/10485 a PR updating gradle (https://github.com/apache/iceberg/pull/10474) should not have green CI because of the revapi problem outlined here https://github.com/apache/iceberg/pull/10476#issuecomment-216

Re: [PR] HELP WANTED: Jackson access issue [iceberg]

2024-06-11 Thread via GitHub
pan3793 commented on PR #10460: URL: https://github.com/apache/iceberg/pull/10460#issuecomment-2160973234 @Vampire big thanks! I confirmed that test passed after `org.apache.calcite.avatica:avatica`, I'm going to evaluate if we can exclude this dep directly or upgrade to a newer version.

Re: [PR] HELP WANTED: Jackson access issue [iceberg]

2024-06-11 Thread via GitHub
Vampire commented on PR #10460: URL: https://github.com/apache/iceberg/pull/10460#issuecomment-2160965919 `org.apache.calcite.avatica:avatica:1.8.0` is misbehaving. It shades an old version of `jackson-core` without relocating it. You have that dependency in your `test` classpaths and i
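
The diagnosis above (a jar that bundles `jackson-core` classes under their original package, i.e. shaded without relocation) can be checked mechanically. The following is a minimal sketch in Python, assuming the jars on the test classpath are available as local files. The function name `find_bundled_packages` and its default prefix are illustrative assumptions and are not part of the Iceberg build.

```python
import zipfile
from pathlib import Path


def find_bundled_packages(jar_paths, package_prefix="com/fasterxml/jackson/core/"):
    """Return {jar file name: class count} for jars holding classes under package_prefix.

    A jar that shades a library *with* relocation stores the classes under a
    different package path, so it is not reported here. A jar that shades
    *without* relocation is reported alongside the genuine library jar.
    """
    hits = {}
    for jar in jar_paths:
        # A jar is a zip archive; entry names use '/' as the package separator.
        with zipfile.ZipFile(jar) as zf:
            count = sum(
                1
                for name in zf.namelist()
                if name.startswith(package_prefix) and name.endswith(".class")
            )
        if count:
            hits[Path(jar).name] = count
    return hits
```

If more than one jar is reported for the same prefix, the classpath order decides which copy of the classes is loaded, which is the kind of conflict described in the comment.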

Re: [I] Flink: Maintenance - TriggerManager [iceberg]

2024-06-11 Thread via GitHub
pvary commented on issue #10301: URL: https://github.com/apache/iceberg/issues/10301#issuecomment-2160956890 PR created

Re: [I] Flink: Maintenance - TriggerManager [iceberg]

2024-06-11 Thread via GitHub
pvary closed issue #10301: Flink: Maintenance - TriggerManager URL: https://github.com/apache/iceberg/issues/10301

[PR] Flink: Maintenance - TriggerManager [iceberg]

2024-06-11 Thread via GitHub
pvary opened a new pull request, #10484: URL: https://github.com/apache/iceberg/pull/10484 The responsibility of the Trigger Manager is to start the Maintenance Tasks based on the incoming Table Change messages and prevent overlapping Maintenance Task runs. The event time of the Trigger mes
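
The two responsibilities named in the PR description (starting maintenance tasks from incoming table change messages, and preventing overlapping runs) can be illustrated with a toy model. This is only a sketch in Python under stated assumptions: the class below is not the Flink operator from the PR, and the `change_threshold` rule and the method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TriggerManager:
    """Toy model: start a maintenance task on table changes, never two at once."""

    change_threshold: int = 1   # hypothetical rule: changes needed before triggering
    _pending_changes: int = 0   # changes seen since the last trigger
    _running: bool = False      # True while a triggered task has not finished

    def on_table_change(self, changes: int = 1) -> bool:
        """Record a table change; return True if a task should start now."""
        self._pending_changes += changes
        return self._maybe_trigger()

    def on_task_finished(self) -> bool:
        """Mark the running task as done; return True if a new task should start."""
        self._running = False
        return self._maybe_trigger()

    def _maybe_trigger(self) -> bool:
        # Overlap prevention: never trigger while a task is still running.
        if self._running or self._pending_changes < self.change_threshold:
            return False
        self._running = True
        self._pending_changes = 0
        return True
```

Changes that arrive while a task is running are only counted; they lead to a new trigger once `on_task_finished` is called, so no change is lost and no two runs overlap.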

Re: [PR] Flink: handle rescale properly for range bounds in sketch statistics [iceberg]

2024-06-11 Thread via GitHub
stevenzwu commented on code in PR #10457: URL: https://github.com/apache/iceberg/pull/10457#discussion_r1635019206 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/GlobalStatistics.java: ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Core: Calling rewrite_position_delete_files fails on tables with more than 1k columns [iceberg]

2024-06-11 Thread via GitHub
RussellSpitzer commented on code in PR #10020: URL: https://github.com/apache/iceberg/pull/10020#discussion_r163570 ## api/src/main/java/org/apache/iceberg/Schema.java: ## @@ -507,4 +539,48 @@ public String toString() { .map(this::identifierFieldToString)

Re: [PR] Core: Calling rewrite_position_delete_files fails on tables with more than 1k columns [iceberg]

2024-06-11 Thread via GitHub
RussellSpitzer commented on code in PR #10020: URL: https://github.com/apache/iceberg/pull/10020#discussion_r1634994844 ## api/src/main/java/org/apache/iceberg/Schema.java: ## @@ -507,4 +539,48 @@ public String toString() { .map(this::identifierFieldToString)

Re: [PR] Implement Kerberos authentication support for Hive Catalog [iceberg-python]

2024-06-11 Thread via GitHub
yothinix commented on PR #766: URL: https://github.com/apache/iceberg-python/pull/766#issuecomment-2160907957 Hi @Fokko, I added a new change as commented; could you help review it again? Also, as I tested more on a Kerberized HMS, we found out that it requires the thrift client to b

Re: [PR] Implement Kerberos authentication support for Hive Catalog [iceberg-python]

2024-06-11 Thread via GitHub
yothinix commented on code in PR #766: URL: https://github.com/apache/iceberg-python/pull/766#discussion_r1634984001 ## mkdocs/docs/configuration.md: ## @@ -228,19 +228,19 @@ catalog: catalog: default: uri: thrift://localhost:9083 -s3.endpoint: http://localhost:9000

Re: [PR] Core: Calling rewrite_position_delete_files fails on tables with more than 1k columns [iceberg]

2024-06-11 Thread via GitHub
RussellSpitzer commented on code in PR #10020: URL: https://github.com/apache/iceberg/pull/10020#discussion_r1634983902 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -140,6 +142,30 @@ public StructType partitionType() { return lazyPartitionType; } +

Re: [PR] Implement Kerberos authentication support for Hive Catalog [iceberg-python]

2024-06-11 Thread via GitHub
yothinix commented on code in PR #766: URL: https://github.com/apache/iceberg-python/pull/766#discussion_r163498 ## mkdocs/docs/configuration.md: ## @@ -228,19 +228,19 @@ catalog: catalog: default: uri: thrift://localhost:9083 -s3.endpoint: http://localhost:9000

Re: [PR] Implement Kerberos authentication support for Hive Catalog [iceberg-python]

2024-06-11 Thread via GitHub
yothinix commented on code in PR #766: URL: https://github.com/apache/iceberg-python/pull/766#discussion_r1634979621 ## mkdocs/docs/configuration.md: ## @@ -228,19 +228,19 @@ catalog: catalog: default: uri: thrift://localhost:9083 -s3.endpoint: http://localhost:9000

Re: [I] Flink sink writes duplicate data in upsert mode [iceberg]

2024-06-11 Thread via GitHub
zhongqishang commented on issue #10431: URL: https://github.com/apache/iceberg/issues/10431#issuecomment-2160846434 @pvary I encountered the same problem on another table; this time it was caused by a checkpoint RPC timeout. JM log ``` 2024-06-07 15:50:10.472 [Checkpoint Timer]

Re: [PR] [Build] Add a script to execute revapi without gradle plugin [iceberg]

2024-06-11 Thread via GitHub
Fokko commented on code in PR #10386: URL: https://github.com/apache/iceberg/pull/10386#discussion_r1634919716 ## dev/revapi: ## @@ -0,0 +1,88 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. S

Re: [PR] [Build] Add a script to execute revapi without gradle plugin [iceberg]

2024-06-11 Thread via GitHub
jbonofre commented on code in PR #10386: URL: https://github.com/apache/iceberg/pull/10386#discussion_r1634907444 ## dev/revapi: ## @@ -0,0 +1,88 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements.

Re: [PR] [Build] Add a script to execute revapi without gradle plugin [iceberg]

2024-06-11 Thread via GitHub
Fokko commented on code in PR #10386: URL: https://github.com/apache/iceberg/pull/10386#discussion_r1634900403 ## dev/revapi: ## @@ -0,0 +1,88 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. S

Re: [PR] HELP WANTED: Jackson access issue [iceberg]

2024-06-11 Thread via GitHub
pan3793 commented on PR #10460: URL: https://github.com/apache/iceberg/pull/10460#issuecomment-2160764510 output of `./gradlew :iceberg-mr:dependencies | grep 'com.fasterxml.jackson'` ``` ||+--- com.fasterxml.jackson.core:jackson-core:2.6.3 -> 2.14.2 |||\--- com.f

[I] Getting schema of the all tables instead of actual data [iceberg]

2024-06-11 Thread via GitHub
FenilJain2301 opened a new issue, #10483: URL: https://github.com/apache/iceberg/issues/10483 ### Apache Iceberg version 1.5.2 (latest release) ### Query engine Dremio ### Please describe the bug 🐞 Hi, we are using Debezium as a part of our architecture wher

Re: [PR] Adding `add_files_overwrite` method [iceberg-python]

2024-06-11 Thread via GitHub
syun64 commented on code in PR #810: URL: https://github.com/apache/iceberg-python/pull/810#discussion_r1634878236 ## pyiceberg/table/__init__.py: ## @@ -474,6 +474,26 @@ def add_files(self, file_paths: List[str], snapshot_properties: Dict[str, str] = for data_file

Re: [PR] Run Flink, Spark3 tests on Java 17 too [iceberg]

2024-06-11 Thread via GitHub
findepi closed pull request #10477: Run Flink, Spark3 tests on Java 17 too URL: https://github.com/apache/iceberg/pull/10477

Re: [PR] Run Flink, Spark3 tests on Java 17 too [iceberg]

2024-06-11 Thread via GitHub
findepi commented on PR #10477: URL: https://github.com/apache/iceberg/pull/10477#issuecomment-2160660331 [flink-scala-2-12-tests (8, 1.17)](https://github.com/apache/iceberg/actions/runs/9464829758/job/26072977977?pr=10477#logs) failed, not sure why.

Re: [PR] Manifest list encryption [iceberg]

2024-06-11 Thread via GitHub
ggershinsky commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1634760209 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -147,6 +175,42 @@ static Snapshot fromJson(JsonNode node) { if (node.has(MANIFEST_LIST)) {

Re: [PR] Build: Upgrade to gradle 8.6 [iceberg]

2024-06-11 Thread via GitHub
jbonofre commented on PR #8486: URL: https://github.com/apache/iceberg/pull/8486#issuecomment-2160608165 @findepi that's correct for the Gradle update PR, because we want to fix the revapi and shadow plugins first. I will update the PR to avoid confusion. Thanks!

Re: [PR] Manifest list encryption [iceberg]

2024-06-11 Thread via GitHub
ggershinsky commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1634750937 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -93,6 +102,21 @@ static void toJson(Snapshot snapshot, JsonGenerator generator) throws IOExcept

Re: [PR] Manifest list encryption [iceberg]

2024-06-11 Thread via GitHub
ggershinsky commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1634747442 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -93,6 +102,21 @@ static void toJson(Snapshot snapshot, JsonGenerator generator) throws IOExcept

Re: [PR] Pin 3rd party CI action version [iceberg]

2024-06-11 Thread via GitHub
Fokko merged PR #10481: URL: https://github.com/apache/iceberg/pull/10481

Re: [PR] Manifest list encryption [iceberg]

2024-06-11 Thread via GitHub
anuragmantri commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1634724398 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -93,6 +102,21 @@ static void toJson(Snapshot snapshot, JsonGenerator generator) throws IOExcep

Re: [PR] Update Gradle to 8.8 [iceberg]

2024-06-11 Thread via GitHub
findepi commented on PR #10476: URL: https://github.com/apache/iceberg/pull/10476#issuecomment-2160557067 thanks @Fokko closing in favor of https://github.com/apache/iceberg/pull/8486

Re: [PR] Update Gradle to 8.8 [iceberg]

2024-06-11 Thread via GitHub
findepi closed pull request #10476: Update Gradle to 8.8 URL: https://github.com/apache/iceberg/pull/10476

Re: [I] Enhancement: refine the reader interface [iceberg-rust]

2024-06-11 Thread via GitHub
ZENOTME commented on issue #398: URL: https://github.com/apache/iceberg-rust/issues/398#issuecomment-2160551434 I sent a PR (#401) to draft the idea; feel free to tell me if there is something that can be improved.

[PR] Enhancement: refine the reader interface [iceberg-rust]

2024-06-11 Thread via GitHub
ZENOTME opened a new pull request, #401: URL: https://github.com/apache/iceberg-rust/pull/401 This PR is a draft for #398. If it looks good, I will fill in the tests later. The basic idea here is to move the info needed for the reader to FileScanTask. In this way, we can avoid the inconsistency

Re: [PR] Run Flink, Spark3 tests on Java 17 too [iceberg]

2024-06-11 Thread via GitHub
findepi commented on PR #10477: URL: https://github.com/apache/iceberg/pull/10477#issuecomment-2160536777 > I'm open to running against Java 17. My only concern is that the Spark/Flink tests take a very long time to run, and adding another item to the test-matrix will consume a lot of CI ca

Re: [PR] Docs: Add flinkVersion and flinkVersionMajor instead of hardcode [iceberg]

2024-06-11 Thread via GitHub
manuzhang commented on code in PR #10463: URL: https://github.com/apache/iceberg/pull/10463#discussion_r1634650624 ## docs/docs/flink.md: ## @@ -115,15 +115,15 @@ wget ${FLINK_CONNECTOR_URL}/${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_V Install the Apache Flink dependen

Re: [PR] Build: Bump mkdocs-material from 9.5.25 to 9.5.26 [iceberg]

2024-06-11 Thread via GitHub
Fokko merged PR #10464: URL: https://github.com/apache/iceberg/pull/10464

Re: [PR] Docs: Add flinkVersion and flinkVersionMajor instead of hardcode [iceberg]

2024-06-11 Thread via GitHub
Fokko commented on code in PR #10463: URL: https://github.com/apache/iceberg/pull/10463#discussion_r1634624062 ## docs/docs/flink.md: ## @@ -115,15 +115,15 @@ wget ${FLINK_CONNECTOR_URL}/${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_V Install the Apache Flink dependency u

Re: [PR] Build: Bump com.azure:azure-sdk-bom from 1.2.23 to 1.2.24 [iceberg]

2024-06-11 Thread via GitHub
Fokko merged PR #10420: URL: https://github.com/apache/iceberg/pull/10420

Re: [PR] Parquet: Make row-group filters cooperate to filter [iceberg]

2024-06-11 Thread via GitHub
zhongyujiang commented on code in PR #10090: URL: https://github.com/apache/iceberg/pull/10090#discussion_r1634617829 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetBloomRowGroupFilter.java: ## @@ -290,7 +299,7 @@ private boolean shouldRead( hashValue

Re: [PR] Remove redundant -XX:+IgnoreUnrecognizedVMOptions [iceberg]

2024-06-11 Thread via GitHub
Fokko commented on PR #10475: URL: https://github.com/apache/iceberg/pull/10475#issuecomment-2160402174 @singhpk234 Any specific reason for adding the `-XX:+IgnoreUnrecognizedVMOptions` flag?

Re: [PR] Update Gradle to 8.8 [iceberg]

2024-06-11 Thread via GitHub
Fokko commented on PR #10476: URL: https://github.com/apache/iceberg/pull/10476#issuecomment-2160331173 Semi duplicate of https://github.com/apache/iceberg/pull/8486 @findepi Thanks for raising this, and we're working on getting the Gradle version bumped to a later one. @jbonofre is d

Re: [PR] Run Flink, Spark3 tests on Java 17 too [iceberg]

2024-06-11 Thread via GitHub
Fokko commented on PR #10477: URL: https://github.com/apache/iceberg/pull/10477#issuecomment-2160311336 Thanks for raising this PR @findepi. Looks like the exclusion isn't working: https://github.com/apache/iceberg/assets/1134248/df7ab42d-b6be-4307-980c-5f9b01f019ae I

Re: [PR] Flink: handle rescale properly for range bounds in sketch statistics [iceberg]

2024-06-11 Thread via GitHub
pvary commented on code in PR #10457: URL: https://github.com/apache/iceberg/pull/10457#discussion_r1634507795 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/GlobalStatistics.java: ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (

[PR] Pin 3rd party CI action version [iceberg]

2024-06-11 Thread via GitHub
findepi opened a new pull request, #10481: URL: https://github.com/apache/iceberg/pull/10481 GitHub allows a tag to be deleted and re-published, so referencing a 3rd party action by tag name should be discouraged.

Re: [PR] Run Flink and Spark3 tests on Java 17 too [iceberg]

2024-06-11 Thread via GitHub
findepi commented on code in PR #10477: URL: https://github.com/apache/iceberg/pull/10477#discussion_r1634417701 ## .github/workflows/flink-ci.yml: ## @@ -70,8 +70,9 @@ jobs: flink-scala-2-12-tests: runs-on: ubuntu-22.04 strategy: + fail-fast: false matr

Re: [PR] feat: Add storage features for iceberg [iceberg-rust]

2024-06-11 Thread via GitHub
Xuanwo commented on code in PR #400: URL: https://github.com/apache/iceberg-rust/pull/400#discussion_r1634312506 ## Cargo.toml: ## @@ -65,7 +65,7 @@ log = "^0.4" mockito = "^1" murmur3 = "0.5.2" once_cell = "1" -opendal = "0.46" +opendal = "0.47" Review Comment: I update