Re: [PR] feat(FileScanTask): partial execute impl for parquet [iceberg-rust]

2024-02-28 Thread via GitHub
sdd commented on code in PR #207: URL: https://github.com/apache/iceberg-rust/pull/207#discussion_r1507143315 ## crates/iceberg/src/scan.rs: ## @@ -163,6 +178,54 @@ impl TableScan { Ok(iter(file_scan_tasks).boxed()) } + +/// Transforms a stream of FileScanTas

Re: [PR] feat(FileScanTask): partial execute impl for parquet [iceberg-rust]

2024-02-28 Thread via GitHub
sdd commented on code in PR #207: URL: https://github.com/apache/iceberg-rust/pull/207#discussion_r1507141512 ## crates/iceberg/src/scan.rs: ## @@ -163,6 +178,54 @@ impl TableScan { Ok(iter(file_scan_tasks).boxed()) } + +/// Transforms a stream of FileScanTas

Re: [I] Calling `rewrite_position_delete_files` rewrites into same amount of files [iceberg]

2024-02-28 Thread via GitHub
manuzhang commented on issue #9833: URL: https://github.com/apache/iceberg/issues/9833#issuecomment-1970584595 have you tried option `'rewrite-all', 'true'`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

Re: [I] Iceberg support ranger to make access data more safety [iceberg]

2024-02-28 Thread via GitHub
shohamyamin commented on issue #3619: URL: https://github.com/apache/iceberg/issues/3619#issuecomment-1970568320 Instead of a Ranger plugin, why not build an OPA (Open Policy Agent) plugin? Similar to what Trino did in the last month with their trino-opa plugin, this will enable a new way

Re: [PR] feat(FileScanTask): partial execute impl for parquet [iceberg-rust]

2024-02-28 Thread via GitHub
sdd commented on code in PR #207: URL: https://github.com/apache/iceberg-rust/pull/207#discussion_r1507126728 ## crates/iceberg/src/scan.rs: ## @@ -163,6 +178,54 @@ impl TableScan { Ok(iter(file_scan_tasks).boxed()) } + +/// Transforms a stream of FileScanTas

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-28 Thread via GitHub
ajantha-bhat commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1970555983 Thanks for helping narrow this down, @RussellSpitzer 👍 We still need to figure out the solution to this problem, but I am not sure how to reproduce it locally with small data.

[I] Rewrite delete position files rewrites into same amount of files [iceberg]

2024-02-28 Thread via GitHub
bk-mz opened a new issue, #9833: URL: https://github.com/apache/iceberg/issues/9833 ### Apache Iceberg version 1.4.3 (latest release) ### Query engine Spark ### Please describe the bug 🐞 Hey folks, we're using `rewrite_position_delete_files` to compact delet

Re: [I] add support for DuckDB views as a valid data format [iceberg-python]

2024-02-28 Thread via GitHub
corleyma commented on issue #407: URL: https://github.com/apache/iceberg-python/issues/407#issuecomment-1970543473 Sounds like the ask here is for similar functionality in duckdb as was implemented in polars scan_iceberg.

Re: [I] Make the OAuth2 request audience configurable [iceberg-python]

2024-02-28 Thread via GitHub
himadripal commented on issue #479: URL: https://github.com/apache/iceberg-python/issues/479#issuecomment-1970540290 `scope` is already configurable. Apart from `scope` and `audience`, there are a few more optional parameters mentioned [here](https://datatracker.ietf.org/doc/html/rfc8693#name-

Re: [I] Support iceberg hadoop catalog in python library [iceberg-python]

2024-02-28 Thread via GitHub
corleyma commented on issue #17: URL: https://github.com/apache/iceberg-python/issues/17#issuecomment-1970530722 @Fokko We do a setup similar to this for integration tests, but the ability to write faster unit tests that depend only on a temp directory fixture in pytest has been great for o

Re: [PR] Docs: Fix links to internal files [iceberg]

2024-02-28 Thread via GitHub
bitsondatadev commented on code in PR #9819: URL: https://github.com/apache/iceberg/pull/9819#discussion_r1507058032 ## docs/docs/configuration.md: ## @@ -122,16 +122,16 @@ The value of these properties are not persisted as a part of the table metadata. Iceberg catalogs supp

[PR] Build: Bump Spark 3.5 to 3.5.1 [iceberg]

2024-02-28 Thread via GitHub
manuzhang opened a new pull request, #9832: URL: https://github.com/apache/iceberg/pull/9832 (no comment)

Re: [PR] Flink: Supports specifying comment for iceberg fields in create table and addcolumn syntax using flinksql [iceberg]

2024-02-28 Thread via GitHub
pvary commented on PR #9606: URL: https://github.com/apache/iceberg/pull/9606#issuecomment-1970519161 @huyuanfeng2018: Please fix the checkstyle issue. @stevenzwu: do you want to take another look? I asked for somewhat wider changes.

Re: [PR] Basic manifest encryption [iceberg]

2024-02-28 Thread via GitHub
ggershinsky commented on code in PR #8252: URL: https://github.com/apache/iceberg/pull/8252#discussion_r1507086496 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmOutputStream.java: ## @@ -129,6 +129,11 @@ public void close() throws IOException { targetStream.clos

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-28 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1505224636 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -368,58 +431,63 @@ private void renameToFinal(FileSystem fs, Path src, Path dst, int

[I] Support CreateTableTransaction [iceberg-python]

2024-02-28 Thread via GitHub
HonahX opened a new issue, #483: URL: https://github.com/apache/iceberg-python/issues/483 ### Feature Request / Improvement Since we can write to tables, it would be great if we can create a new table and insert data in a single transaction. Example use case includes: - https://git

Re: [I] Delete column in iceberg table with hive catalog throws exception [iceberg]

2024-02-28 Thread via GitHub
wngus606 commented on issue #1092: URL: https://github.com/apache/iceberg/issues/1092#issuecomment-1970423445 @rdblue Hello. I'm leaving an inquiry because I want to understand the "hive.metastore.disallow.incompatible.col.type.changes" setting in detail. When I read the [docu

Re: [I] [Spark 3.4] java.lang.Integer cannot be cast to org.apache.iceberg.StructLike [iceberg]

2024-02-28 Thread via GitHub
amogh-jahagirdar closed issue #9831: [Spark 3.4] java.lang.Integer cannot be cast to org.apache.iceberg.StructLike URL: https://github.com/apache/iceberg/issues/9831

Re: [I] [Spark 3.4] java.lang.Integer cannot be cast to org.apache.iceberg.StructLike [iceberg]

2024-02-28 Thread via GitHub
amogh-jahagirdar commented on issue #9831: URL: https://github.com/apache/iceberg/issues/9831#issuecomment-1970400720 I'm going to close this for now, but if you still see this problem after upgrading to 1.5 (still not released, but should be very soon) please reopen.

Re: [I] [Spark 3.4] java.lang.Integer cannot be cast to org.apache.iceberg.StructLike [iceberg]

2024-02-28 Thread via GitHub
amogh-jahagirdar commented on issue #9831: URL: https://github.com/apache/iceberg/issues/9831#issuecomment-1970399853 @bluzy this should be fixed in Iceberg 1.5; here's the PR that should address it: https://github.com/apache/iceberg/pull/9176

Re: [PR] Dynamically support Spark native engine in Iceberg [iceberg]

2024-02-28 Thread via GitHub
zinking commented on PR #9826: URL: https://github.com/apache/iceberg/pull/9826#issuecomment-1970381461 Are there any performance metrics?

Re: [PR] Flink: Made IcebergFilesCommitter work with single phase commit [iceberg]

2024-02-28 Thread via GitHub
mudit-97 commented on PR #9694: URL: https://github.com/apache/iceberg/pull/9694#issuecomment-1970362915 @stevenzwu, the Pubsub operator will ack the messages in notifyCheckpointComplete()

Re: [I] Spark 3.5.0 `MERGE INTO` breaks [iceberg]

2024-02-28 Thread via GitHub
manuzhang commented on issue #9827: URL: https://github.com/apache/iceberg/issues/9827#issuecomment-1970355575 Same issue as reported in https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1704469705606459. Can you try removing the `accept-any-schema` table property if it's set?

[I] [Spark 3.4] java.lang.Integer cannot be cast to org.apache.iceberg.StructLike [iceberg]

2024-02-28 Thread via GitHub
bluzy opened a new issue, #9831: URL: https://github.com/apache/iceberg/issues/9831 ### Apache Iceberg version 1.3.1 ### Query engine Spark ### Please describe the bug 🐞 Hi, I am currently using Spark 3.2 and am considering upgrading to Spark 3.4

Re: [PR] feat: add parquet writer [iceberg-rust]

2024-02-28 Thread via GitHub
liurenjie1024 commented on PR #176: URL: https://github.com/apache/iceberg-rust/pull/176#issuecomment-1970342629 > My personal approach is to develop a simple, functional API that meets the required features and then build upon it. Having a powerful API from the start makes me nervous. Addi

Re: [PR] detect breaking changes [iceberg-python]

2024-02-28 Thread via GitHub
syun64 commented on code in PR #394: URL: https://github.com/apache/iceberg-python/pull/394#discussion_r1506659256 ## tests/api/exclude/pyiceberg-0.6.0.yaml: ## @@ -0,0 +1,47 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agree

Re: [I] Spark Hive Iceberg Table Locks -- Settings Unclear in Docs + Overrides Not Working [iceberg]

2024-02-28 Thread via GitHub
zhanghe-git commented on issue #6667: URL: https://github.com/apache/iceberg/issues/6667#issuecomment-1970339493 @GabeChurch How did you solve this problem, specifically? I encountered the same problem.

Re: [PR] feat: add parquet writer [iceberg-rust]

2024-02-28 Thread via GitHub
Xuanwo commented on PR #176: URL: https://github.com/apache/iceberg-rust/pull/176#issuecomment-1970337631 > Here, I just found that this design gives users more power to customize and reuse things, so I followed it. It's ok for me to modify it if there is a simpler design. Thank y

Re: [PR] feat: add parquet writer [iceberg-rust]

2024-02-28 Thread via GitHub
liurenjie1024 commented on PR #176: URL: https://github.com/apache/iceberg-rust/pull/176#issuecomment-1970336526 > I still question whether following Java's API is a good idea. I believe we can create a better and more user-friendly API for rust users. In fact this is not end user api

Re: [PR] feat: add parquet writer [iceberg-rust]

2024-02-28 Thread via GitHub
ZENOTME commented on PR #176: URL: https://github.com/apache/iceberg-rust/pull/176#issuecomment-1970332953 > I still question whether following Java's API is a good idea. I believe we can create a better and more user-friendly API for rust users. > > However, this shouldn't hold us ba

[PR] Views, Spark: Add support for Materialized Views; Integrate with Spark SQL [iceberg]

2024-02-28 Thread via GitHub
wmoustafa opened a new pull request, #9830: URL: https://github.com/apache/iceberg/pull/9830 ## Spec This patch adds support for materialized views in Iceberg and integrates the implementation with Spark SQL. It reuses the current spec of Iceberg views and tables by leveraging table prop

Re: [PR] feat: add parquet writer [iceberg-rust]

2024-02-28 Thread via GitHub
liurenjie1024 commented on PR #176: URL: https://github.com/apache/iceberg-rust/pull/176#issuecomment-1970299150 cc @Xuanwo Any other comments?

[I] Spark <> Iceberg bug integration test [iceberg-python]

2024-02-28 Thread via GitHub
kevinjqliu opened a new issue, #482: URL: https://github.com/apache/iceberg-python/issues/482 ### Apache Iceberg version None ### Please describe the bug 🐞 While working on #444, I ran into a weird bug with Spark integration test. Particularly here https://git

[PR] Remove unused catalog from integration test [iceberg-python]

2024-02-28 Thread via GitHub
kevinjqliu opened a new pull request, #481: URL: https://github.com/apache/iceberg-python/pull/481 (no comment)

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-28 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1506916203 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -234,7 +254,7 @@ public long newSnapshotId() { } @VisibleForTesting - Path ge

Re: [PR] Flink: Supports specifying comment for iceberg fields in create table and addcolumn syntax using flinksql [iceberg]

2024-02-28 Thread via GitHub
huyuanfeng2018 commented on code in PR #9606: URL: https://github.com/apache/iceberg/pull/9606#discussion_r1506909261 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkDynamicTableFactory.java: ## @@ -187,7 +188,7 @@ private static TableLoader createTableLoader(

Re: [PR] add github add to check md link [iceberg-python]

2024-02-28 Thread via GitHub
kevinjqliu commented on PR #324: URL: https://github.com/apache/iceberg-python/pull/324#issuecomment-1970272506 Ran `make lint` locally; should be good now.

Re: [PR] Flink: Supports specifying comment for iceberg fields in create table and addcolumn syntax using flinksql [iceberg]

2024-02-28 Thread via GitHub
huyuanfeng2018 commented on code in PR #9606: URL: https://github.com/apache/iceberg/pull/9606#discussion_r1506909630 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkSchemaUtil.java: ## @@ -68,6 +75,42 @@ public static Schema convert(TableSchema schema) { r

Re: [PR] feat(FileScanTask): partial execute impl for parquet [iceberg-rust]

2024-02-28 Thread via GitHub
ZENOTME commented on code in PR #207: URL: https://github.com/apache/iceberg-rust/pull/207#discussion_r1506902491 ## crates/iceberg/src/scan.rs: ## @@ -163,6 +178,54 @@ impl TableScan { Ok(iter(file_scan_tasks).boxed()) } + +/// Transforms a stream of FileSca

Re: [PR] [WIP] feat: basic implementation of dynamodb catalog [iceberg-rust]

2024-02-28 Thread via GitHub
liurenjie1024 commented on PR #223: URL: https://github.com/apache/iceberg-rust/pull/223#issuecomment-1970242823 > Update: We decided to use Postgres, because none of us know Glue😢. I will start working on the SQL Catalog, which I think may be more widely used. I will close this PR and move th

Re: [PR] feat(FileScanTask): partial execute impl for parquet [iceberg-rust]

2024-02-28 Thread via GitHub
liurenjie1024 commented on code in PR #207: URL: https://github.com/apache/iceberg-rust/pull/207#discussion_r1506881445 ## crates/iceberg/src/scan.rs: ## @@ -178,9 +241,16 @@ pub struct FileScanTask { pub type ArrowRecordBatchStream = BoxStream<'static, crate::Result>; impl

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506882144 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordConverter.java: ## @@ -0,0 +1,508 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506878562 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordConverter.java: ## @@ -0,0 +1,508 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506868255 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/IcebergWriter.java: ## @@ -77,8 +78,47 @@ public void write(SinkRecord record) { }

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506867621 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/BaseDeltaTaskWriter.java: ## @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506867118 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/BaseDeltaTaskWriter.java: ## @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506866739 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/UnpartitionedDeltaWriter.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506866469 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordWrapper.java: ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506866029 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/IcebergWriter.java: ## @@ -77,8 +78,47 @@ public void write(SinkRecord record) { }

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506865757 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordProjection.java: ## @@ -0,0 +1,200 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506865531 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordConverter.java: ## @@ -0,0 +1,508 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506865229 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordConverter.java: ## @@ -0,0 +1,508 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506864728 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/PartitionedDeltaWriter.java: ## @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] feat: Add expression builder and display. [iceberg-rust]

2024-02-28 Thread via GitHub
liurenjie1024 commented on code in PR #169: URL: https://github.com/apache/iceberg-rust/pull/169#discussion_r1506861883 ## crates/iceberg/src/expr/term.rs: ## @@ -17,21 +17,89 @@ //! Term definition. -use crate::spec::NestedFieldRef; +use crate::expr::{BinaryExpression, Pre

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on code in PR #9641: URL: https://github.com/apache/iceberg/pull/9641#discussion_r1506855437 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/IcebergWriter.java: ## @@ -52,20 +51,22 @@ public IcebergWriter(Table table, String tableNa

Re: [PR] Kafka Connect: Record converters and delta writers [iceberg]

2024-02-28 Thread via GitHub
bryanck commented on PR #9641: URL: https://github.com/apache/iceberg/pull/9641#issuecomment-1970183410 FYI, I removed the delta writers along with upsert and CDC related code.

Re: [I] Create a bash script to verify the apache iceberg release tarball automatically [iceberg]

2024-02-28 Thread via GitHub
github-actions[bot] commented on issue #1700: URL: https://github.com/apache/iceberg/issues/1700#issuecomment-1970144332 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Where can I find the design documents? For example, support java to write iceberg [iceberg]

2024-02-28 Thread via GitHub
github-actions[bot] commented on issue #1690: URL: https://github.com/apache/iceberg/issues/1690#issuecomment-1970144306 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] iceberg flink sink job can't restart due to metadata location not found [iceberg]

2024-02-28 Thread via GitHub
github-actions[bot] commented on issue #1688: URL: https://github.com/apache/iceberg/issues/1688#issuecomment-1970144281 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Support timestamp partition using truncate [iceberg]

2024-02-28 Thread via GitHub
github-actions[bot] commented on issue #1671: URL: https://github.com/apache/iceberg/issues/1671#issuecomment-1970144256 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Run iceberg-mr and iceberg-hive tests on Hive 3.1.2 as well [iceberg]

2024-02-28 Thread via GitHub
github-actions[bot] commented on issue #1371: URL: https://github.com/apache/iceberg/issues/1371#issuecomment-1970144078 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Missing docs for Entries metadata table [iceberg]

2024-02-28 Thread via GitHub
github-actions[bot] closed issue #1365: Missing docs for Entries metadata table URL: https://github.com/apache/iceberg/issues/1365

Re: [I] Run iceberg-mr and iceberg-hive tests on Hive 3.1.2 as well [iceberg]

2024-02-28 Thread via GitHub
github-actions[bot] closed issue #1371: Run iceberg-mr and iceberg-hive tests on Hive 3.1.2 as well URL: https://github.com/apache/iceberg/issues/1371

Re: [I] Missing docs for Entries metadata table [iceberg]

2024-02-28 Thread via GitHub
github-actions[bot] commented on issue #1365: URL: https://github.com/apache/iceberg/issues/1365#issuecomment-1970144057 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[I] Can we load iceberg table using external volume instead of external stage ? [iceberg]

2024-02-28 Thread via GitHub
darshininarayanamurthy opened a new issue, #9828: URL: https://github.com/apache/iceberg/issues/9828 ### Query engine _No response_ ### Question Can we load iceberg table using external volume instead of external stage? Copy into table works using external stage to load

Re: [PR] Flink: Made IcebergFilesCommitter work with single phase commit [iceberg]

2024-02-28 Thread via GitHub
stevenzwu commented on PR #9694: URL: https://github.com/apache/iceberg/pull/9694#issuecomment-1970079323 > Keeping metrics consistent, whatever shows as acked, is actually in data this is also incorrect with 1PC. PubSub sources may have acknowledged the messages as source operator is

Re: [PR] Fix retrying logic [iceberg-python]

2024-02-28 Thread via GitHub
anupam-saini commented on code in PR #480: URL: https://github.com/apache/iceberg-python/pull/480#discussion_r1506787323 ## tests/catalog/test_rest.py: ## @@ -306,24 +306,39 @@ def test_list_namespace_with_parent_200(rest_mock: Mocker) -> None: ] -def test_list_namespa

Re: [PR] feat(FileScanTask): partial execute impl for parquet [iceberg-rust]

2024-02-28 Thread via GitHub
sdd commented on PR #207: URL: https://github.com/apache/iceberg-rust/pull/207#issuecomment-1970059117 PTAL @liurenjie1024 @ZENOTME - took a different approach, inspired by the Java implementation but a lot simpler.

Re: [PR] Fix retrying logic [iceberg-python]

2024-02-28 Thread via GitHub
Fokko commented on PR #480: URL: https://github.com/apache/iceberg-python/pull/480#issuecomment-1970033106 cc @anupam-saini

Re: [PR] Fix retrying logic [iceberg-python]

2024-02-28 Thread via GitHub
Fokko commented on code in PR #480: URL: https://github.com/apache/iceberg-python/pull/480#discussion_r1506750317 ## tests/catalog/test_rest.py: ## @@ -306,24 +306,39 @@ def test_list_namespace_with_parent_200(rest_mock: Mocker) -> None: ] -def test_list_namespaces_419

[PR] Fix retrying logic [iceberg-python]

2024-02-28 Thread via GitHub
Fokko opened a new pull request, #480: URL: https://github.com/apache/iceberg-python/pull/480 It was refreshing on each of the calls. I think the `before_sleep` is the correct hook.

Re: [PR] [WIP] feat: basic implementation of dynamodb catalog [iceberg-rust]

2024-02-28 Thread via GitHub
odysa commented on PR #223: URL: https://github.com/apache/iceberg-rust/pull/223#issuecomment-1969995591 Update: We decided to use Postgres, because none of us know Glue😢. I will start working on the SQL Catalog, which I think may be more widely used. I will close this PR and move this imple

Re: [PR] [WIP] feat: basic implementation of dynamodb catalog [iceberg-rust]

2024-02-28 Thread via GitHub
odysa closed pull request #223: [WIP] feat: basic implementation of dynamodb catalog URL: https://github.com/apache/iceberg-rust/pull/223

Re: [PR] [DRAFT] Fix ASF links on homepage to comply with trademark [iceberg]

2024-02-28 Thread via GitHub
munabedan commented on PR #9729: URL: https://github.com/apache/iceberg/pull/9729#issuecomment-1969927716 Currently working on styling the Apache Iceberg footer. ![apache_iceberg_footer](https://github.com/apache/iceberg/assets/45054928/2cac6445-f298-4bb1-bc53-a16c2f4fa0a7)

Re: [PR] detect breaking changes [iceberg-python]

2024-02-28 Thread via GitHub
syun64 commented on code in PR #394: URL: https://github.com/apache/iceberg-python/pull/394#discussion_r1506659256 ## tests/api/exclude/pyiceberg-0.6.0.yaml: ## @@ -0,0 +1,47 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agree

[I] Make the OAuth2 request audience configurable [iceberg-python]

2024-02-28 Thread via GitHub
flyrain opened a new issue, #479: URL: https://github.com/apache/iceberg-python/issues/479 ### Feature Request / Improvement [OAuth Audience](https://www.ory.sh/docs/hydra/guides/audiences) helps to prevent unauthorized access to resources. When a resource server receives a token, it

Re: [PR] Construction of filenames for partitioned writes [iceberg-python]

2024-02-28 Thread via GitHub
jqin61 commented on code in PR #453: URL: https://github.com/apache/iceberg-python/pull/453#discussion_r1506256332 ## pyiceberg/partitioning.py: ## @@ -215,3 +246,59 @@ def assign_fresh_partition_spec_ids(spec: PartitionSpec, old_schema: Schema, fre ) )

Re: [PR] [Bug Fix] cast None `current-snapshot-id` as -1 for Backwards Compatibility [iceberg-python]

2024-02-28 Thread via GitHub
Fokko commented on PR #473: URL: https://github.com/apache/iceberg-python/pull/473#issuecomment-1969659333 I prefer to not exclude certain groups (Java <1.3.0 in this case, I'm not sure on which versions all the proprietary implementations). I think a flag is an elegant way of enabling this

Re: [PR] Basic manifest encryption [iceberg]

2024-02-28 Thread via GitHub
RussellSpitzer commented on code in PR #8252: URL: https://github.com/apache/iceberg/pull/8252#discussion_r1506401655 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmOutputStream.java: ## @@ -129,6 +129,11 @@ public void close() throws IOException { targetStream.c

[I] Spark 3.5.0 `MERGE INTO` breaks [iceberg]

2024-02-28 Thread via GitHub
bk-mz opened a new issue, #9827: URL: https://github.com/apache/iceberg/issues/9827 ### Apache Iceberg version 1.4.3 (latest release) ### Query engine Spark ### Please describe the bug 🐞 Hey folks, when migrating from spark 3.4.1 to spark 3.5.0 we observe br

Re: [PR] Basic manifest encryption [iceberg]

2024-02-28 Thread via GitHub
ggershinsky commented on code in PR #8252: URL: https://github.com/apache/iceberg/pull/8252#discussion_r1506219710 ## .palantir/revapi.yml: ## @@ -1018,6 +1018,30 @@ acceptedBreaks: old: "method void org.apache.iceberg.PositionDeletesTable.PositionDeletesBatchScan::(org.

Re: [PR] API, Core: add multi-arg transform and add zOrder as the first one [iceberg]

2024-02-28 Thread via GitHub
szehon-ho commented on code in PR #9662: URL: https://github.com/apache/iceberg/pull/9662#discussion_r1506370636 ## api/src/main/java/org/apache/iceberg/StructTransform.java: ## @@ -51,11 +53,16 @@ class StructTransform implements StructLike, Serializable { this.transforms

Re: [PR] Basic manifest encryption [iceberg]

2024-02-28 Thread via GitHub
ggershinsky commented on code in PR #8252: URL: https://github.com/apache/iceberg/pull/8252#discussion_r1506370516 ## .palantir/revapi.yml: ## @@ -1018,6 +1018,30 @@ acceptedBreaks: old: "method void org.apache.iceberg.PositionDeletesTable.PositionDeletesBatchScan::(org.

Re: [PR] Basic manifest encryption [iceberg]

2024-02-28 Thread via GitHub
ggershinsky commented on code in PR #8252: URL: https://github.com/apache/iceberg/pull/8252#discussion_r1506368675 ## api/src/main/java/org/apache/iceberg/io/PositionOutputStream.java: ## @@ -29,4 +29,15 @@ public abstract class PositionOutputStream extends OutputStream { *

Re: [PR] Construction of filenames for partitioned writes [iceberg-python]

2024-02-28 Thread via GitHub
jqin61 commented on PR #453: URL: https://github.com/apache/iceberg-python/pull/453#issuecomment-1969534661 rebased; removed the comment; renamed the ambiguous function name

Re: [PR] Basic manifest encryption [iceberg]

2024-02-28 Thread via GitHub
rdblue commented on code in PR #8252: URL: https://github.com/apache/iceberg/pull/8252#discussion_r1506325920 ## api/src/main/java/org/apache/iceberg/io/PositionOutputStream.java: ## @@ -29,4 +29,15 @@ public abstract class PositionOutputStream extends OutputStream { * @thr

Re: [PR] Basic manifest encryption [iceberg]

2024-02-28 Thread via GitHub
rdblue commented on code in PR #8252: URL: https://github.com/apache/iceberg/pull/8252#discussion_r1506324791 ## .palantir/revapi.yml: ## @@ -1018,6 +1018,30 @@ acceptedBreaks: old: "method void org.apache.iceberg.PositionDeletesTable.PositionDeletesBatchScan::(org.apach

Re: [PR] Dynamically support Spark native engine in Iceberg [iceberg]

2024-02-28 Thread via GitHub
huaxingao commented on PR #9721: URL: https://github.com/apache/iceberg/pull/9721#issuecomment-1969479329 @aokolnychyi I tried the customized `PartitionReaderFactory` approach. Seems I only need to duplicate two more classes: `CustomizedSparkColumnarReaderFactory` and `CustomizedBa

[PR] Dynamically support Spark native engine in Iceberg [iceberg]

2024-02-28 Thread via GitHub
huaxingao opened a new pull request, #9826: URL: https://github.com/apache/iceberg/pull/9826 This PR tries the approach of injecting a customized `PartitionReaderFactory` to support Spark native execution engines, e.g. [Comet](https://github.com/apache/arrow-datafusion-comet) Example

Re: [PR] [Bug Fix] cast None `current-snapshot-id` as -1 for Backwards Compatibility [iceberg-python]

2024-02-28 Thread via GitHub
syun64 commented on PR #473: URL: https://github.com/apache/iceberg-python/pull/473#issuecomment-1969470701 > The [spec calls out](https://iceberg.apache.org/spec/#table-metadata-fields) the `current_snapshot_id` as optional, so omitting it "should" be valid. I think I agree with @Fokko th

Re: [PR] OpenAPI: Add ContentFile types to spec for the PreplanTable and PlanTable API [iceberg]

2024-02-28 Thread via GitHub
rdblue merged PR #9717: URL: https://github.com/apache/iceberg/pull/9717

Re: [PR] OpenAPI: Add ContentFile types to spec for the PreplanTable and PlanTable API [iceberg]

2024-02-28 Thread via GitHub
rdblue commented on PR #9717: URL: https://github.com/apache/iceberg/pull/9717#issuecomment-1969426771 This looks good to me now. Since we have 3 approvals, I'll go ahead and merge it.

Re: [PR] OpenAPI: Add ContentFile types to spec for the PreplanTable and PlanTable API [iceberg]

2024-02-28 Thread via GitHub
rdblue commented on code in PR #9717: URL: https://github.com/apache/iceberg/pull/9717#discussion_r1506284612 ## open-api/rest-catalog-open-api.yaml: ## @@ -3324,6 +3324,281 @@ components: type: integer format: int64 +BooleanTypeValue: + type: bo

Re: [PR] Flink: Made IcebergFilesCommitter work with single phase commit [iceberg]

2024-02-28 Thread via GitHub
mudit-97 commented on PR #9694: URL: https://github.com/apache/iceberg/pull/9694#issuecomment-1969417117 Yeah @stevenzwu, and that's why we wanted to have the choice of 1PC, for 2 major reasons: 1. Keeping metrics consistent: whatever shows as acked is actually in the data 2. and, no ha

Re: [PR] Flink: Made IcebergFilesCommitter work with single phase commit [iceberg]

2024-02-28 Thread via GitHub
stevenzwu commented on PR #9694: URL: https://github.com/apache/iceberg/pull/9694#issuecomment-1969410653 > if 2PC is used, then notifyCheckpointComplete will be called parallely and there is no guarantee the messages which are acked in PubSub are even written to Iceberg or not, they might

Re: [PR] OpenAPI: Point to latest doc in rest catalog [iceberg]

2024-02-28 Thread via GitHub
nastra merged PR #9825: URL: https://github.com/apache/iceberg/pull/9825

Re: [PR] Basic manifest encryption [iceberg]

2024-02-28 Thread via GitHub
ggershinsky commented on code in PR #8252: URL: https://github.com/apache/iceberg/pull/8252#discussion_r1506255579 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmOutputStream.java: ## @@ -129,6 +129,11 @@ public void close() throws IOException { targetStream.clos

