[PR] feat (datafusion integration): convert datafusion expr filters to Iceberg Predicate [iceberg-rust]

2024-08-27 Thread via GitHub
a-agmon opened a new pull request, #588: URL: https://github.com/apache/iceberg-rust/pull/588 This PR partially closes #585 - Adds datafusion filters to the `IcebergTableScan` struct - Converts datafusion filters (or `Expr`) to Iceberg `Predicate` and applies them to the TableScan On dat

Re: [PR] chore(deps): Bump crate-ci/typos from 1.23.6 to 1.24.1 [iceberg-rust]

2024-08-27 Thread via GitHub
liurenjie1024 merged PR #583: URL: https://github.com/apache/iceberg-rust/pull/583 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ic

Re: [PR] Core: Fix the behavior of IncrementalFileCleanup when expire a snapshot [iceberg]

2024-08-27 Thread via GitHub
RussellSpitzer commented on PR #10983: URL: https://github.com/apache/iceberg/pull/10983#issuecomment-2314425466 So when I wrote the Spark procedure for this we were already aware that this code path has a lot of potential issues. We end up basically completely rewriting the logic of detect

Re: [PR] Flink: add unit tests for range distribution on bucket partition column [iceberg]

2024-08-27 Thread via GitHub
pvary commented on code in PR #11033: URL: https://github.com/apache/iceberg/pull/11033#discussion_r1734054049 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/sink/TestFlinkIcebergSinkRangeDistributionBucketing.java: ## @@ -0,0 +1,253 @@ +/* + * Licensed to the Apach

Re: [PR] Table Scan: Add Row Group Skipping [iceberg-rust]

2024-08-27 Thread via GitHub
sdd commented on code in PR #558: URL: https://github.com/apache/iceberg-rust/pull/558#discussion_r1734050618 ## crates/iceberg/src/expr/visitors/row_group_metrics_evaluator.rs: ## @@ -0,0 +1,1927 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more cont

[PR] Flink: add unit tests for range distribution on bucket partition column [iceberg]

2024-08-27 Thread via GitHub
stevenzwu opened a new pull request, #11033: URL: https://github.com/apache/iceberg/pull/11033 Also started to use the new `DataGeneratorSource`, which is only available in 1.19 and later. Hence, didn't add the unit test to 1.18.

Re: [PR] API: Add ParquetUtils.getSplitOffsets that takes an InputFile [iceberg]

2024-08-27 Thread via GitHub
rustyconover commented on PR #7267: URL: https://github.com/apache/iceberg/pull/7267#issuecomment-2314105060 Can we still make this change?

Re: [PR] Core: Fix the behavior of IncrementalFileCleanup when expire a snapshot [iceberg]

2024-08-27 Thread via GitHub
hantangwangd commented on code in PR #10983: URL: https://github.com/apache/iceberg/pull/10983#discussion_r1733882501 ## core/src/main/java/org/apache/iceberg/IncrementalFileCleanup.java: ## @@ -327,4 +342,34 @@ private Set findFilesToDelete( return filesToDelete; } +

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-08-27 Thread via GitHub
FANNG1 commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1733826935 ## open-api/rest-catalog-open-api.yaml: ## @@ -2905,6 +2987,10 @@ components: - `token`: Authorization bearer token to use for view requests if OAuth2 secu

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-08-27 Thread via GitHub
FANNG1 commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1733824283 ## open-api/rest-catalog-open-api.yaml: ## @@ -2747,6 +2747,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-27 Thread via GitHub
wypoon commented on PR #10935: URL: https://github.com/apache/iceberg/pull/10935#issuecomment-2313986585 @pvary and @dramaticlly, thank you both for reviewing. I am busy with some other work at the moment, but I'll return to this by next week.

Re: [PR] Table Scan: Add Row Group Skipping [iceberg-rust]

2024-08-27 Thread via GitHub
liurenjie1024 commented on code in PR #558: URL: https://github.com/apache/iceberg-rust/pull/558#discussion_r1733780815 ## crates/iceberg/Cargo.toml: ## @@ -83,5 +83,6 @@ ctor = { workspace = true } iceberg-catalog-memory = { workspace = true } iceberg_test_utils = { path = ".

Re: [I] Convert datafusion table scan filter into iceberg table scan' filter. [iceberg-rust]

2024-08-27 Thread via GitHub
liurenjie1024 commented on issue #585: URL: https://github.com/apache/iceberg-rust/issues/585#issuecomment-2313941236 Hi @a-agmon > Then the predicates need to be applied as follows: > The impl of the ExecutionPlan::execute() function, calls get_batch_stream(), which in turn cal

Re: [PR] Kafka Connect: Disable publish tasks in runtime project [iceberg]

2024-08-27 Thread via GitHub
bryanck commented on PR #11032: URL: https://github.com/apache/iceberg/pull/11032#issuecomment-2313920507 cc @Fokko

[PR] Kafka Connect: Disable publish tasks in runtime project [iceberg]

2024-08-27 Thread via GitHub
bryanck opened a new pull request, #11032: URL: https://github.com/apache/iceberg/pull/11032 This PR disables the build publishing tasks for the `kafka-connect-runtime` project. This project doesn't have any libraries to publish. This should resolve https://github.com/apache/iceberg/issues/

Re: [PR] Core: fix NPE with HadoopFileIO because FileIOParser doesn't serialize Hadoop configuration [iceberg]

2024-08-27 Thread via GitHub
stevenzwu commented on code in PR #10926: URL: https://github.com/apache/iceberg/pull/10926#discussion_r1733692980 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java: ## @@ -63,7 +63,11 @@ public class HadoopFileIO implements HadoopConfigurable, DelegateFileIO {

[PR] Docs: Change to Flink directory for instructions [iceberg]

2024-08-27 Thread via GitHub
liuml07 opened a new pull request, #11031: URL: https://github.com/apache/iceberg/pull/11031 All following commands assume the directory is in flink-${FLINK_VERSION}. Let's update the instructions to make that clear.

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-27 Thread via GitHub
dramaticlly commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1733659899 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/ChangelogRowReader.java: ## @@ -112,13 +149,62 @@ private CloseableIterable openChangelogScanTa

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-27 Thread via GitHub
dramaticlly commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1733661588 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -133,51 +131,149 @@ private static Map computeSnapshotOrdinals(Deque snapsh

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-27 Thread via GitHub
dramaticlly commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1733677814 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -63,33 +60,43 @@ protected CloseableIterable doPlanFiles( return Closeabl

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-27 Thread via GitHub
dramaticlly commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1733661339 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -133,51 +131,149 @@ private static Map computeSnapshotOrdinals(Deque snapsh

Re: [I] Getting Original Schema of a DataFile in a FileScanTask? [iceberg-python]

2024-08-27 Thread via GitHub
github-actions[bot] commented on issue #401: URL: https://github.com/apache/iceberg-python/issues/401#issuecomment-2313787496 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Getting Original Schema of a DataFile in a FileScanTask? [iceberg-python]

2024-08-27 Thread via GitHub
github-actions[bot] closed issue #401: Getting Original Schema of a DataFile in a FileScanTask? URL: https://github.com/apache/iceberg-python/issues/401

Re: [I] Support iceberg hadoop catalog in python library [iceberg-python]

2024-08-27 Thread via GitHub
github-actions[bot] commented on issue #17: URL: https://github.com/apache/iceberg-python/issues/17#issuecomment-2313787511 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occu

[PR] Core: Add benchmark for FastAppend [iceberg]

2024-08-27 Thread via GitHub
aokolnychyi opened a new pull request, #11029: URL: https://github.com/apache/iceberg/pull/11029 This PR adds a benchmark for `FastAppend`. As shown below, Iceberg is currently very slow when an operation contains many new data files. I'll follow up with a fix separately. ``` Benc

Re: [PR] Spark 3.3: SQL Extensions for CREATE BRANCH AS OF TAG [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7294: URL: https://github.com/apache/iceberg/pull/7294#issuecomment-2313786410 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Spark3 structured streaming enable updates [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7295: URL: https://github.com/apache/iceberg/pull/7295#issuecomment-2313786428 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Spark: Show Create Round trip tests [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7300: URL: https://github.com/apache/iceberg/pull/7300#issuecomment-2313786491 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Spark 3.3: drop_namespace with CASCADE support [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7275: URL: https://github.com/apache/iceberg/pull/7275#issuecomment-2313786346 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink: support table comment [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7236: URL: https://github.com/apache/iceberg/pull/7236#issuecomment-2313786256 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Support commit operations in pyiceberg [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on issue #7259: URL: https://github.com/apache/iceberg/issues/7259#issuecomment-2313786310 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Drop the SQL issue when attempting to drop an Iceberg table whose location does not exist [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on issue #7227: URL: https://github.com/apache/iceberg/issues/7227#issuecomment-2313786184 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Add outputFile() method for FileAppender [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7233: URL: https://github.com/apache/iceberg/pull/7233#issuecomment-2313786213 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Fix for Drop SQL issue when attempting to drop an Iceberg table [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7228: URL: https://github.com/apache/iceberg/pull/7228#issuecomment-2313786197 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Updated python-integration.yml [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7210: URL: https://github.com/apache/iceberg/pull/7210#issuecomment-2313786142 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: Add metrics reporter for serializable table [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7144: URL: https://github.com/apache/iceberg/pull/7144#issuecomment-2313786049 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Support bulk remove orphan files [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on issue #7111: URL: https://github.com/apache/iceberg/issues/7111#issuecomment-2313785981 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Spark: Close auto broadcast join in delete orphan action [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7096: URL: https://github.com/apache/iceberg/pull/7096#issuecomment-2313785952 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Support case insensitive id assignment for applyNameMapping when reading parquet [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7299: URL: https://github.com/apache/iceberg/pull/7299#issuecomment-2313786461 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Support timestamp type in partition string when importing files [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7291: URL: https://github.com/apache/iceberg/pull/7291#issuecomment-2313786391 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Spec: metadata file (-.metadata.json) naming convention [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7107: URL: https://github.com/apache/iceberg/pull/7107#issuecomment-2313785970 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink 1.16: Change distribution modes [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7077: URL: https://github.com/apache/iceberg/pull/7077#issuecomment-2313785926 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] added section on distribution modes [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7073: URL: https://github.com/apache/iceberg/pull/7073#issuecomment-2313785911 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: reduce scale factor for HadoopFileIOTest prefix tests [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7047: URL: https://github.com/apache/iceberg/pull/7047#issuecomment-2313785892 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] [Parquet] Eagerly fetch row groups when reading parquet [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7279: URL: https://github.com/apache/iceberg/pull/7279#issuecomment-2313786367 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Iceberg add_files procedure with partition_filter scan non needed folders [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on issue #7027: URL: https://github.com/apache/iceberg/issues/7027#issuecomment-2313785851 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] API: Add ParquetUtils.getSplitOffsets that takes an InputFile [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7267: URL: https://github.com/apache/iceberg/pull/7267#issuecomment-2313786330 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] [WIP] AWS: Fix failing AWS integration tests [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7234: URL: https://github.com/apache/iceberg/pull/7234#issuecomment-2313786235 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Parquet: Add page filter using page indexes [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #6935: URL: https://github.com/apache/iceberg/pull/6935#issuecomment-2313785744 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] [Draft][HiveCatalog] Skip updating column schema when filed schema string is larger than maxHiveTablePropertySize [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7222: URL: https://github.com/apache/iceberg/pull/7222#issuecomment-2313786164 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Build: Remove services files introduced by third-party jars [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7209: URL: https://github.com/apache/iceberg/pull/7209#issuecomment-2313786118 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Replace Thread.sleep() usage in test code with Awaitility [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on issue #7154: URL: https://github.com/apache/iceberg/issues/7154#issuecomment-2313786070 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Refactoring metadata location and adding API to get data and metadata location #7187 [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7188: URL: https://github.com/apache/iceberg/pull/7188#issuecomment-2313786098 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core, Spark: Fix delete with filter on nested columns [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7132: URL: https://github.com/apache/iceberg/pull/7132#issuecomment-2313786029 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] [Core][Spark] Improve DeleteOrphanFiles action to return additional details of deleted orphan files [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7127: URL: https://github.com/apache/iceberg/pull/7127#issuecomment-2313786007 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Document available Flink config options. [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #7041: URL: https://github.com/apache/iceberg/pull/7041#issuecomment-2313785869 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Build: Upgrade netty-buffer to 4.1.89.Final [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #6986: URL: https://github.com/apache/iceberg/pull/6986#issuecomment-2313785822 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Push down group by for partition columns [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #6981: URL: https://github.com/apache/iceberg/pull/6981#issuecomment-2313785805 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Parquet: Implement column index filter and update row read path to support page skipping [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #6967: URL: https://github.com/apache/iceberg/pull/6967#issuecomment-2313785780 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: Add Catalog Transactions API [iceberg]

2024-08-27 Thread via GitHub
github-actions[bot] commented on PR #6948: URL: https://github.com/apache/iceberg/pull/6948#issuecomment-2313785762 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-27 Thread via GitHub
dramaticlly commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1733658370 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/ChangelogRowReader.java: ## @@ -112,13 +149,62 @@ private CloseableIterable openChangelogScanTa

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-27 Thread via GitHub
dramaticlly commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1733657338 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/ChangelogRowReader.java: ## @@ -112,13 +149,62 @@ private CloseableIterable openChangelogScanTa

[PR] Bump deptry from 0.19.1 to 0.20.0 [iceberg-python]

2024-08-27 Thread via GitHub
dependabot[bot] opened a new pull request, #1107: URL: https://github.com/apache/iceberg-python/pull/1107 Bumps [deptry](https://github.com/fpgmaas/deptry) from 0.19.1 to 0.20.0. Release notes Sourced from deptry's releases (https://github.com/fpgmaas/deptry/releases). 0.20.

Re: [PR] OpenAPI, Build: Apply spotless to testFixtures source code [iceberg]

2024-08-27 Thread via GitHub
amogh-jahagirdar merged PR #11024: URL: https://github.com/apache/iceberg/pull/11024

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733513836 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +3818,176 @@ components: type: integer description: "List of equality field IDs" +Pre

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733509644 ## open-api/rest-catalog-open-api.yaml: ## @@ -537,6 +537,113 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

[PR] Spark 3.5: Use FileGenerationUtil in PlanningBenchmark [iceberg]

2024-08-27 Thread via GitHub
aokolnychyi opened a new pull request, #11027: URL: https://github.com/apache/iceberg/pull/11027 This PR migrates `PlanningBenchmark` to use `FileGenerationUtil` instead of writing a number of real data files and then replicating them. The new approach is more reflective of the actual perfo

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733503188 ## open-api/rest-catalog-open-api.yaml: ## @@ -532,6 +532,100 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733502773 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733501809 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733497286 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Core: Generate realistic bounds in benchmarks [iceberg]

2024-08-27 Thread via GitHub
aokolnychyi merged PR #11022: URL: https://github.com/apache/iceberg/pull/11022

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733498821 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Core: Generate realistic bounds in benchmarks [iceberg]

2024-08-27 Thread via GitHub
aokolnychyi commented on PR #11022: URL: https://github.com/apache/iceberg/pull/11022#issuecomment-2313506846 Thanks, @singhpk234 @danielcweeks @Fokko @amogh-jahagirdar!

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733496627 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733495023 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733493409 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733490422 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733481079 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Core: Generate realistic bounds in benchmarks [iceberg]

2024-08-27 Thread via GitHub
aokolnychyi commented on code in PR #11022: URL: https://github.com/apache/iceberg/pull/11022#discussion_r1733431425 ## core/src/test/java/org/apache/iceberg/TestFileGenerationUtil.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Core: Generate realistic bounds in benchmarks [iceberg]

2024-08-27 Thread via GitHub
aokolnychyi commented on code in PR #11022: URL: https://github.com/apache/iceberg/pull/11022#discussion_r1733431163 ## core/src/test/java/org/apache/iceberg/TestFileGenerationUtil.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Flink: Maintenance - Lock remover [iceberg]

2024-08-27 Thread via GitHub
pvary commented on code in PR #11010: URL: https://github.com/apache/iceberg/pull/11010#discussion_r1733430117 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TableMaintenanceMetrics.java: ## @@ -28,6 +28,11 @@ public class TableMaintenanceMetric

Re: [PR] Flink: Maintenance - Lock remover [iceberg]

2024-08-27 Thread via GitHub
pvary commented on code in PR #11010: URL: https://github.com/apache/iceberg/pull/11010#discussion_r1733429843 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/LockRemover.java: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Flink: Maintenance - Lock remover [iceberg]

2024-08-27 Thread via GitHub
pvary commented on code in PR #11010: URL: https://github.com/apache/iceberg/pull/11010#discussion_r1733429045 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/LockRemover.java: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Flink: Maintenance - Lock remover [iceberg]

2024-08-27 Thread via GitHub
pvary commented on code in PR #11010: URL: https://github.com/apache/iceberg/pull/11010#discussion_r1733428833 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TaskResult.java: ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Flink: Maintenance - Lock remover [iceberg]

2024-08-27 Thread via GitHub
pvary commented on code in PR #11010: URL: https://github.com/apache/iceberg/pull/11010#discussion_r1733428026 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/LockRemover.java: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Flink: Maintenance - Lock remover [iceberg]

2024-08-27 Thread via GitHub
pvary commented on code in PR #11010: URL: https://github.com/apache/iceberg/pull/11010#discussion_r1733427810 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/LockRemover.java: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Core: Generate realistic bounds in benchmarks [iceberg]

2024-08-27 Thread via GitHub
amogh-jahagirdar commented on code in PR #11022: URL: https://github.com/apache/iceberg/pull/11022#discussion_r1733426151 ## core/src/test/java/org/apache/iceberg/TestFileGenerationUtil.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Flink: Maintenance - Lock remover [iceberg]

2024-08-27 Thread via GitHub
pvary commented on code in PR #11010: URL: https://github.com/apache/iceberg/pull/11010#discussion_r1733426011 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/LockRemover.java: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Core: Fix the behavior of IncrementalFileCleanup when expire a snapshot [iceberg]

2024-08-27 Thread via GitHub
amogh-jahagirdar commented on code in PR #10983: URL: https://github.com/apache/iceberg/pull/10983#discussion_r1733319305 ## core/src/main/java/org/apache/iceberg/IncrementalFileCleanup.java: ## @@ -327,4 +342,34 @@ private Set findFilesToDelete( return filesToDelete;

Re: [PR] Core: Generate realistic bounds in benchmarks [iceberg]

2024-08-27 Thread via GitHub
danielcweeks commented on code in PR #11022: URL: https://github.com/apache/iceberg/pull/11022#discussion_r1733380233 ## core/src/test/java/org/apache/iceberg/TestFileGenerationUtil.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Core: Generate realistic bounds in benchmarks [iceberg]

2024-08-27 Thread via GitHub
danielcweeks commented on code in PR #11022: URL: https://github.com/apache/iceberg/pull/11022#discussion_r1733377796 ## core/src/test/java/org/apache/iceberg/TestFileGenerationUtil.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [I] ICEBERG performance is slow when querying tables with a large number of partitions. [iceberg]

2024-08-27 Thread via GitHub
VladRodionov commented on issue #8161: URL: https://github.com/apache/iceberg/issues/8161#issuecomment-2313246358 Is this **Big** or **Small** data technology? Let us say table size is 300TB. With 400K partitions, average partition size is 750MB, which looks normal to me. 3PB tables? I k

[I] Issue with duplicate kafka connect artifacts [iceberg]

2024-08-27 Thread via GitHub
Fokko opened a new issue, #11026: URL: https://github.com/apache/iceberg/issues/11026 ### Feature Request / Improvement I tried to push the Iceberg artifacts locally to test https://github.com/apache/iceberg/pull/10996 but got an error around duplicate Kafka-connect artifacts:

Re: [I] Issue with duplicate kafka connect artifacts [iceberg]

2024-08-27 Thread via GitHub
Fokko commented on issue #11026: URL: https://github.com/apache/iceberg/issues/11026#issuecomment-2313235669 cc @bryanck -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-08-27 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1733326978 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/na
