Re: [I] Support Defining PartitionSpec and SortOrder without field-ids in create_table [iceberg-python]

2024-05-10 Thread via GitHub
jiaoew1991 commented on issue #338: URL: https://github.com/apache/iceberg-python/issues/338#issuecomment-2104062133 Hi @syun64 How is this issue going? I have been tormented by this restriction for several days and still haven't figured it out. 😓 -- This is an automated message from the

Re: [PR] feat: Extract FileRead and FileWrite trait [iceberg-rust]

2024-05-10 Thread via GitHub
Xuanwo commented on code in PR #364: URL: https://github.com/apache/iceberg-rust/pull/364#discussion_r1596403546 ## crates/iceberg/src/arrow/reader.rs: ## @@ -187,3 +197,43 @@ impl ArrowReader { } } } + +/// ArrowFileReader is a wrapper around a FileRead that impl

Re: [PR] feat: Extract FileRead and FileWrite trait [iceberg-rust]

2024-05-10 Thread via GitHub
Xuanwo commented on code in PR #364: URL: https://github.com/apache/iceberg-rust/pull/364#discussion_r1596405143 ## crates/iceberg/src/io.rs: ## @@ -206,6 +205,35 @@ impl FileIO { } } +/// The struct the represents the metadata of a file. +/// +/// TODO: we can add last

Re: [PR] feat: Extract FileRead and FileWrite trait [iceberg-rust]

2024-05-10 Thread via GitHub
Xuanwo commented on code in PR #364: URL: https://github.com/apache/iceberg-rust/pull/364#discussion_r1596407734 ## crates/iceberg/src/arrow/reader.rs: ## @@ -91,12 +98,15 @@ impl ArrowReader { Ok(try_stream! { while let Some(Ok(task)) = tasks.next().awai

Re: [I] Upgrade HadoopTableOperations.version from int32 to long64 [iceberg]

2024-05-10 Thread via GitHub
nastra commented on issue #10277: URL: https://github.com/apache/iceberg/issues/10277#issuecomment-2104207834 @jkolash you might want to report this to Snowflake as the version should currently be an int instead of a long to comply with the implementation in Iceberg

[I] Flink: Maintenance - MonitorSource [iceberg]

2024-05-10 Thread via GitHub
pvary opened a new issue, #10300: URL: https://github.com/apache/iceberg/issues/10300 ### Feature Request / Improvement The responsibility of the Monitor source is to periodically check the table metadata and based on the new commits, generate the TableChange messages for the Trigger

[I] Flink: Maintenance - TriggerManager [iceberg]

2024-05-10 Thread via GitHub
pvary opened a new issue, #10301: URL: https://github.com/apache/iceberg/issues/10301 ### Feature Request / Improvement The responsibility of the Trigger Manager is to start the Maintenance Tasks based on the incoming Table Change messages and prevent overlapping Maintenance Task run

[I] Flink: Maintenance - CommitConverter [iceberg]

2024-05-10 Thread via GitHub
pvary opened a new issue, #10302: URL: https://github.com/apache/iceberg/issues/10302 ### Feature Request / Improvement The responsibility of the Commit Converter is to convert the IcebergCommittables from the PostCommitTopology to Table Changes. ### Query engine None

[I] Flink: Maintenance - RewriteManifestFiles [iceberg]

2024-05-10 Thread via GitHub
pvary opened a new issue, #10305: URL: https://github.com/apache/iceberg/issues/10305 ### Feature Request / Improvement Reduce the number of manifest files generated by the high number of commits ### Query engine None

[I] Flink: Maintenance - DeleteOrphanFiles [iceberg]

2024-05-10 Thread via GitHub
pvary opened a new issue, #10306: URL: https://github.com/apache/iceberg/issues/10306 ### Feature Request / Improvement Remove files from the table directory which are not referenced by the table metadata ### Query engine None

Re: [I] Flink: Maintenance - MonitorSource [iceberg]

2024-05-10 Thread via GitHub
pvary closed issue #10300: Flink: Maintenance - MonitorSource URL: https://github.com/apache/iceberg/issues/10300

[I] Snowflake can't read tables generated by pyiceberg? [iceberg-python]

2024-05-10 Thread via GitHub
djouallah opened a new issue, #723: URL: https://github.com/apache/iceberg-python/issues/723 ### Question I keep getting this error ? `Failed to read from Iceberg file 'abfss://xx.dfs.core.windows.net/data/iceberg_dwh/scada/metadata/snap-2728627078701324745-0-7c1d442e-7321-

Re: [PR] AWS: Change S3FileIO to use SHA1 based checksums [iceberg]

2024-05-10 Thread via GitHub
muddyfish commented on PR #10293: URL: https://github.com/apache/iceberg/pull/10293#issuecomment-2104654017 I think that's a reasonable comment, and I'd be happy going for that path forward.

Re: [PR] AWS: Fix S3FileIO tests failing on ListObjects for Express buckets [iceberg]

2024-05-10 Thread via GitHub
muddyfish commented on PR #10292: URL: https://github.com/apache/iceberg/pull/10292#issuecomment-2104754756 I did not try with HadoopFileIO or with Iceberg maintenance operations. According to this PR https://github.com/apache/iceberg/pull/7914, it doesn't seem that `delete_orphan_fi

Re: [I] Cannot access table endpoint in REST catalog when table name contains a slash character (`/`) [iceberg-python]

2024-05-10 Thread via GitHub
ndrluis commented on issue #710: URL: https://github.com/apache/iceberg-python/issues/710#issuecomment-2104825672 Hello @RoseGoldIsntGay, how did you create this table? I couldn't find any specifications in the Iceberg documentation about naming, but I believe the convention is that names m

[PR] Remove pylintrc file [iceberg-python]

2024-05-10 Thread via GitHub
ndrluis opened a new pull request, #724: URL: https://github.com/apache/iceberg-python/pull/724 Resolves #666

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596959383 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java: ## @@ -0,0 +1,774 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596964200 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java: ## @@ -0,0 +1,774 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596970595 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java: ## @@ -0,0 +1,774 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596972921 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java: ## @@ -0,0 +1,774 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596974983 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkAggregator.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596976706 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkCommittable.java: ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596982907 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkCommitter.java: ## @@ -0,0 +1,465 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596983987 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkCommitter.java: ## @@ -0,0 +1,465 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596998147 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/writer/IcebergSinkWriter.java: ## @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1596997524 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/WriteResultSerializer.java: ## @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-10 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1597002086 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/sink/committer/TestIcebergFlinkManifest.java: ## @@ -0,0 +1,303 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Core: Retry connections in JDBC catalog with user configured error code list [iceberg]

2024-05-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #10140: URL: https://github.com/apache/iceberg/pull/10140#discussion_r1597014821 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -40,6 +40,8 @@ final class JdbcUtil { // property to control if view support is added t

Re: [PR] Core: Retry connections in JDBC catalog with user configured error code list [iceberg]

2024-05-10 Thread via GitHub
amogh-jahagirdar commented on PR #10140: URL: https://github.com/apache/iceberg/pull/10140#issuecomment-2104984953 I'll go ahead and merge this since I think we have consensus on https://github.com/apache/iceberg/pull/10140#discussion_r1581123625. Thanks for reviewing @nastra @jbonofre !

Re: [PR] Core: Retry connections in JDBC catalog with user configured error code list [iceberg]

2024-05-10 Thread via GitHub
amogh-jahagirdar merged PR #10140: URL: https://github.com/apache/iceberg/pull/10140

[PR] Spark 3.5: Remove obsolete conf parsing logic [iceberg]

2024-05-10 Thread via GitHub
aokolnychyi opened a new pull request, #10309: URL: https://github.com/apache/iceberg/pull/10309 This PR removes some conf parsing logic that predates the split of Spark 2 and Spark 3.

Re: [PR] Spark 3.5: Remove obsolete conf parsing logic [iceberg]

2024-05-10 Thread via GitHub
aokolnychyi commented on code in PR #10309: URL: https://github.com/apache/iceberg/pull/10309#discussion_r1597024872 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -70,6 +70,14 @@ public DurationConfParser durationConf() { return new

Re: [PR] Spark 3.5: Remove obsolete conf parsing logic [iceberg]

2024-05-10 Thread via GitHub
aokolnychyi commented on code in PR #10309: URL: https://github.com/apache/iceberg/pull/10309#discussion_r1597026246 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -220,14 +228,10 @@ public ThisT tableProperty(String name) { }

Re: [PR] Spark 3.5: Remove obsolete conf parsing logic [iceberg]

2024-05-10 Thread via GitHub
aokolnychyi commented on code in PR #10309: URL: https://github.com/apache/iceberg/pull/10309#discussion_r1597026741 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -220,14 +228,10 @@ public ThisT tableProperty(String name) { }

Re: [I] Snowflake Iceberg Partitioned data read issue [iceberg]

2024-05-10 Thread via GitHub
tnatssb commented on issue #9404: URL: https://github.com/apache/iceberg/issues/9404#issuecomment-2105002267 @sfc-gh-rortloff are there plans for Snowflake to support Iceberg partitions? This seems like a very basic feature you should support.

Re: [PR] Support partial deletes [iceberg-python]

2024-05-10 Thread via GitHub
jqin61 commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1597063514 ## pyiceberg/table/__init__.py: ## @@ -434,6 +460,8 @@ def overwrite( if table_arrow_schema != df.schema: df = df.cast(table_arrow_schema) +

Re: [PR] Support partial deletes [iceberg-python]

2024-05-10 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1597144084 ## pyiceberg/table/__init__.py: ## @@ -434,6 +460,8 @@ def overwrite( if table_arrow_schema != df.schema: df = df.cast(table_arrow_schema) +

Re: [PR] feat: Extract FileRead and FileWrite trait [iceberg-rust]

2024-05-10 Thread via GitHub
sdd commented on code in PR #364: URL: https://github.com/apache/iceberg-rust/pull/364#discussion_r1597145954 ## crates/iceberg/src/arrow/reader.rs: ## @@ -187,3 +197,43 @@ impl ArrowReader { } } } + +/// ArrowFileReader is a wrapper around a FileRead that impls p

Re: [PR] feat: Convert predicate to arrow filter and push down to parquet reader [iceberg-rust]

2024-05-10 Thread via GitHub
viirya commented on PR #295: URL: https://github.com/apache/iceberg-rust/pull/295#issuecomment-2105157805 @liurenjie1024 I've addressed your comments. Please take a look when you can. Thanks.

Re: [PR] feat: Extract FileRead and FileWrite trait [iceberg-rust]

2024-05-10 Thread via GitHub
sdd commented on code in PR #364: URL: https://github.com/apache/iceberg-rust/pull/364#discussion_r1597146716 ## crates/iceberg/src/io.rs: ## @@ -206,6 +205,35 @@ impl FileIO { } } +/// The struct the represents the metadata of a file. +/// +/// TODO: we can add last mod

[PR] Remove NoSuchNamespaceError on namespace creation [iceberg-python]

2024-05-10 Thread via GitHub
ndrluis opened a new pull request, #726: URL: https://github.com/apache/iceberg-python/pull/726 Resolves #430

Re: [I] Support `snapshot_properties` argument for `add_files` function [iceberg-python]

2024-05-10 Thread via GitHub
syun64 closed issue #694: Support `snapshot_properties` argument for `add_files` function URL: https://github.com/apache/iceberg-python/issues/694

[PR] Spark 3.5: Support camel case session configs and options [iceberg]

2024-05-10 Thread via GitHub
aokolnychyi opened a new pull request, #10310: URL: https://github.com/apache/iceberg/pull/10310 This PR adds support for parsing camel case session configs and options. Our keys contain `-` but all built-in Spark configs follow the camel case style. Apart from the inconsistency, setting Ic

Re: [PR] Implement BoundPredicateVisitor trait for ManifestFilterVisitor [iceberg-rust]

2024-05-10 Thread via GitHub
s-akhtar-baig commented on PR #367: URL: https://github.com/apache/iceberg-rust/pull/367#issuecomment-2105247342 @sdd, thank you for reviewing the changes and providing references! I have modified my code based on your suggestions. Please take a look and let me know if I miss anything.

[PR] Add EnumConfParser to SparkConfParser [iceberg]

2024-05-10 Thread via GitHub
huaxingao opened a new pull request, #10311: URL: https://github.com/apache/iceberg/pull/10311 Add `EnumConfParser` to `SparkConfParser` to parse Enum type properties

Re: [PR] Add EnumConfParser to SparkConfParser [iceberg]

2024-05-10 Thread via GitHub
huaxingao commented on PR #10311: URL: https://github.com/apache/iceberg/pull/10311#issuecomment-2105457128 cc @aokolnychyi

[I] Rename IO traits to `IcebergFileRead` or `IcebergRead`? [iceberg-rust]

2024-05-10 Thread via GitHub
Xuanwo opened a new issue, #368: URL: https://github.com/apache/iceberg-rust/issues/368 Maybe we could call this `IcebergFileRead` or `IcebergRead`? I suppose that's a bit redundant as it would be clear that it is from us by navigating to the `use` statement where it gets impo

Re: [PR] feat: Extract FileRead and FileWrite trait [iceberg-rust]

2024-05-10 Thread via GitHub
Xuanwo commented on code in PR #364: URL: https://github.com/apache/iceberg-rust/pull/364#discussion_r1597332691 ## crates/iceberg/src/io.rs: ## @@ -206,6 +205,35 @@ impl FileIO { } } +/// The struct the represents the metadata of a file. +/// +/// TODO: we can add last

[I] Equality delete files lost after compact data files [iceberg]

2024-05-10 Thread via GitHub
CodingJun opened a new issue, #10312: URL: https://github.com/apache/iceberg/issues/10312 ### Apache Iceberg version 1.5.1 ### Query engine Spark ### Please describe the bug 🐞 I have a program that continuously write streaming data to iceberg, and regularly

Re: [I] Equality delete files lost after compact data files [iceberg]

2024-05-10 Thread via GitHub
lurnagao-dahua commented on issue #10312: URL: https://github.com/apache/iceberg/issues/10312#issuecomment-2105515833 Is there any error log for equality delete?

Re: [I] Equality delete files lost after compact data files [iceberg]

2024-05-10 Thread via GitHub
CodingJun commented on issue #10312: URL: https://github.com/apache/iceberg/issues/10312#issuecomment-2105518645 > Is there any error log for equality delete? No error. If I read directly from snapshot id: 3, the result is correct.

Re: [I] Equality delete files lost after compact data files [iceberg]

2024-05-10 Thread via GitHub
CodingJun commented on issue #10312: URL: https://github.com/apache/iceberg/issues/10312#issuecomment-2105525236 I found the code to drop the equality delete files here. https://github.com/apache/iceberg/blob/2b21020aedb63c26295005d150c05f0a5a5f0eb2/core/src/main/java/org/apache/icebe

Re: [PR] feat: Adding literals [iceberg-go]

2024-05-10 Thread via GitHub
wolfeidau commented on PR #76: URL: https://github.com/apache/iceberg-go/pull/76#issuecomment-2105542638 @zeroshade this is a considerable amount of work, I really need to learn more about the internals of iceberg, mostly working to understand the metadata. Looks great, nothing stands

Re: [I] Equality delete lost after compact data files [iceberg]

2024-05-10 Thread via GitHub
pvary commented on issue #10312: URL: https://github.com/apache/iceberg/issues/10312#issuecomment-2105594944 @CodingJun: Your analysis seems correct to me. We need to take the `minDataSequenceNumber` and `startingSequenceNumber`. @RussellSpitzer and @aokolnychyi might know more.

Re: [PR] Kafka Connect: Add kerberos authentication option [iceberg]

2024-05-10 Thread via GitHub
Dawnpool commented on PR #10173: URL: https://github.com/apache/iceberg/pull/10173#issuecomment-2105600977 @bryanck Hi, I have removed Hadoop dependencies and applied reflection to dynamically load `UserGroupInformation` class and methods, as you guided. Please take a look at the change