Re: [I] Enforce that test classes start with "Test" [iceberg]

2025-07-14 Thread via GitHub
nastra closed issue #13442: Enforce that test classes start with "Test" URL: https://github.com/apache/iceberg/issues/13442 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

Re: [PR] Enforce that test classes start with "Test" instead of suffix [iceberg]

2025-07-14 Thread via GitHub
nastra merged PR #13466: URL: https://github.com/apache/iceberg/pull/13466

Re: [PR] Spark 3.4, 3.5: Expose cleanExpiredMetadata in expire_snapshots Spark procedure [iceberg]

2025-07-14 Thread via GitHub
gaborkaszab commented on PR #13553: URL: https://github.com/apache/iceberg/pull/13553#issuecomment-3072204545 Thx @nastra !

Re: [I] Cannot create table named after metadata table in Spark using REST catalog [iceberg]

2025-07-14 Thread via GitHub
elphastori commented on issue #13388: URL: https://github.com/apache/iceberg/issues/13388#issuecomment-3072169930 > its for both the REST client and the underlying catalog. > > Okay, I wasn't sure whether the REST catalog supports empty namespaces. I'll try updating the server to not load

Re: [I] Cannot create table named after metadata table in Spark using REST catalog [iceberg]

2025-07-14 Thread via GitHub
elphastori commented on issue #13388: URL: https://github.com/apache/iceberg/issues/13388#issuecomment-3072143403 > It'd be great to add a test for creating a table named `nyc.entries` and then perhaps also query its metadata table `nyc.entries.entries` Sounds good. I've added the test wh

Re: [PR] feat: RegisterTable support for InMemoryCatalog [iceberg-cpp]

2025-07-14 Thread via GitHub
lishuxu commented on code in PR #142: URL: https://github.com/apache/iceberg-cpp/pull/142#discussion_r2206472958 ## src/iceberg/catalog.h: ## @@ -166,8 +166,7 @@ class ICEBERG_EXPORT Catalog { /// \param identifier a table identifier /// \return instance of Table implement

Re: [PR] feat: avro support applying field-ids based on name mapping [iceberg-cpp]

2025-07-14 Thread via GitHub
MisterRaindrop commented on PR #127: URL: https://github.com/apache/iceberg-cpp/pull/127#issuecomment-3072050471 @wgtmac Hi, I updated the code and all tests pass.

Re: [PR] feat: RegisterTable support for InMemoryCatalog [iceberg-cpp]

2025-07-14 Thread via GitHub
wgtmac commented on code in PR #142: URL: https://github.com/apache/iceberg-cpp/pull/142#discussion_r2206437585 ## src/iceberg/catalog.h: ## @@ -166,8 +166,7 @@ class ICEBERG_EXPORT Catalog { /// \param identifier a table identifier /// \return instance of Table implementa

Re: [I] Merge snapshots into 1 under transaction of multiple operations [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #2201: URL: https://github.com/apache/iceberg-python/issues/2201#issuecomment-3071980918 > Can pyiceberg provide the options such that only 1 snapshot will be generated under a transaction? I'm surprised that multiple snapshots are produced. I would expec

Re: [I] Support writing Arrow RecordBatchReader or Scanner to Iceberg tables [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #2152: URL: https://github.com/apache/iceberg-python/issues/2152#issuecomment-3071909371 I think this is a good idea. I also want to continue the discussion from #1004 :) For context, here's something iceberg-go has implemented https://github.com/apache

Re: [I] Ensure absolute path when referencing any file paths [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #1730: URL: https://github.com/apache/iceberg-python/issues/1730#issuecomment-3071896134 sure @rambleraptor assigned to ya

Re: [I] Ensure absolute path when referencing any file paths [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #1730: URL: https://github.com/apache/iceberg-python/issues/1730#issuecomment-3071895976 > Are there other places we want to check this? The two that come to mind are add_files and the example you have above I think we can check lower in the stack. This

Re: [I] UUIDType with BucketTransform incorrectly converts int to str in PartitionKey [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #2002: URL: https://github.com/apache/iceberg-python/issues/2002#issuecomment-3071872524 Reopening this since we do not yet support BucketTransform on UUID type https://github.com/apache/iceberg-python/pull/2007/files#diff-7f3dd1244d08ce27c003cd091da10aa049

[I] UUIDType with BucketTransform incorrectly converts int to str in PartitionKey [iceberg-python]

2025-07-14 Thread via GitHub
dingo4dev opened a new issue, #2002: URL: https://github.com/apache/iceberg-python/issues/2002 ### Apache Iceberg version 0.9.0 (latest release) ### Please describe the bug 🐞 ## Description When using UUIDType as a BucketTransform Partition, an error occurs during tab

Re: [I] Error creating table from pyarrow schema with pa.uuid() [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #1986: URL: https://github.com/apache/iceberg-python/issues/1986#issuecomment-3071860837 Good catch, i think this was auto-closed when the PR was merged I was able to verify that this issue still exists by changing `pa.binary(16)` to `pa.uuid()` in `test

[I] Error creating table from pyarrow schema with pa.uuid() [iceberg-python]

2025-07-14 Thread via GitHub
simw opened a new issue, #1986: URL: https://github.com/apache/iceberg-python/issues/1986 ### Apache Iceberg version 0.9.0 (latest release) ### Please describe the bug 🐞 Preamble: using a local sqlite db: ```python from pyiceberg.catalog import load_catalog w

Re: [PR] feat: update pyiceberg/catalog/hive.py to support hive 4.x.x [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on PR #2206: URL: https://github.com/apache/iceberg-python/pull/2206#issuecomment-3071857219 ah looks like CI failed because we mock `.get_table` in tests https://grep.app/search?f.path=tests%2F&f.path.pattern=tests&f.repo.pattern=iceberg-python&q=.get_table n

Re: [PR] feat: update pyiceberg/catalog/hive.py to support hive 4.x.x [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on code in PR #2206: URL: https://github.com/apache/iceberg-python/pull/2206#discussion_r2206258792 ## pyiceberg/catalog/hive.py: ## @@ -389,8 +389,8 @@ def _create_hive_table(self, open_client: Client, hive_table: HiveTable) -> None def _get_hive_ta

Re: [PR] Fix support for writing to nested field partition [iceberg-python]

2025-07-14 Thread via GitHub
geruh commented on code in PR #2204: URL: https://github.com/apache/iceberg-python/pull/2204#discussion_r2206242808 ## tests/io/test_pyarrow.py: ## @@ -2350,6 +2350,72 @@ def test_partition_for_demo() -> None: ) +def test_partition_for_nested_field() -> None: +schem

Re: [I] docs: clarify `check_duplicate_files` option in the `add_files` api docs [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #2132: URL: https://github.com/apache/iceberg-python/issues/2132#issuecomment-3071834144 this is the `add_files` function signature https://github.com/apache/iceberg-python/blob/86bf71ceb602061d0958619ebd806aaccde348b3/pyiceberg/table/__init__.py#L861

Re: [PR] feat: RegisterTable support for InMemoryCatalog [iceberg-cpp]

2025-07-14 Thread via GitHub
lishuxu commented on code in PR #142: URL: https://github.com/apache/iceberg-cpp/pull/142#discussion_r2206242877 ## src/iceberg/catalog/in_memory_catalog.cc: ## @@ -440,44 +376,60 @@ Result> InMemoryCatalogImpl::ListTables( return table_idents; } -Result> InMemoryCatalogI

Re: [PR] refactor: Add SchemaById and SnapshotById to TableMetadata [iceberg-cpp]

2025-07-14 Thread via GitHub
gty404 commented on code in PR #144: URL: https://github.com/apache/iceberg-cpp/pull/144#discussion_r2206234479 ## src/iceberg/table_metadata.h: ## @@ -123,12 +123,17 @@ struct ICEBERG_EXPORT TableMetadata { /// \brief Get the current schema, return NotFoundError if not fou

Re: [PR] maint: catalog implementation roundtripping tests [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on PR #2090: URL: https://github.com/apache/iceberg-python/pull/2090#issuecomment-3071813000 oops, im late to the party. > I'd prefer to avoid mock and rely on the integration tests that we have with the [rest](https://github.com/apache/iceberg-python/blob/41ff7

Re: [PR] feat: add manifest list reader [iceberg-cpp]

2025-07-14 Thread via GitHub
gty404 commented on code in PR #143: URL: https://github.com/apache/iceberg-cpp/pull/143#discussion_r2206233437 ## src/iceberg/manifest_reader_internal.cc: ## @@ -0,0 +1,251 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agr

[PR] feat(catalog): Add register_table to Catalog trait [iceberg-rust]

2025-07-14 Thread via GitHub
CTTY opened a new pull request, #1509: URL: https://github.com/apache/iceberg-rust/pull/1509 ## Which issue does this PR close? - Closes #1508 ## What changes are included in this PR? - Added `register_table` to `Catalog` trait - Implemented `register_tabl

Re: [PR] feat: add manifest list reader [iceberg-cpp]

2025-07-14 Thread via GitHub
dongxiao1198 commented on code in PR #143: URL: https://github.com/apache/iceberg-cpp/pull/143#discussion_r2206219457 ## src/iceberg/manifest_reader_internal.cc: ## @@ -0,0 +1,251 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor licen

Re: [I] Support Rest Catalog Metrics Endpoint [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #474: URL: https://github.com/apache/iceberg-python/issues/474#issuecomment-3071789895 Thanks for bringing this up! I think we should have a wider discussion to standardize the approach between the different language implementations (and perhaps also align the

Re: [PR] Build: Bump datafusion from 47.0.0 to 48.0.0 [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu merged PR #2207: URL: https://github.com/apache/iceberg-python/pull/2207

Re: [PR] Fix support for writing to nested field partition [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on code in PR #2204: URL: https://github.com/apache/iceberg-python/pull/2204#discussion_r2206215236 ## pyiceberg/io/pyarrow.py: ## @@ -2765,3 +2767,22 @@ def _determine_partitions(spec: PartitionSpec, schema: Schema, arrow_table: pa.T ) retu

Re: [PR] Fix support for writing to nested field partition [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on code in PR #2204: URL: https://github.com/apache/iceberg-python/pull/2204#discussion_r2206212298 ## pyiceberg/io/pyarrow.py: ## @@ -2765,3 +2767,22 @@ def _determine_partitions(spec: PartitionSpec, schema: Schema, arrow_table: pa.T ) retu

Re: [PR] add PARTITION_SUMMARY_PROP [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on PR #2202: URL: https://github.com/apache/iceberg-python/pull/2202#issuecomment-3071743878 Thanks for the PR @gtrettenero and @bryanck for the review :)

Re: [PR] Build: Bump daft from 0.5.8 to 0.5.10 [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu merged PR #2211: URL: https://github.com/apache/iceberg-python/pull/2211

Re: [PR] Build: Bump huggingface-hub from 0.33.2 to 0.33.4 [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu merged PR #2210: URL: https://github.com/apache/iceberg-python/pull/2210

Re: [PR] Build: Bump duckdb from 1.3.1 to 1.3.2 [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu merged PR #2208: URL: https://github.com/apache/iceberg-python/pull/2208

Re: [I] Support reading table metadata with partition statistics files [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu closed issue #2034: Support reading table metadata with partition statistics files URL: https://github.com/apache/iceberg-python/issues/2034

Re: [I] Support reading table metadata with partition statistics files [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #2034: URL: https://github.com/apache/iceberg-python/issues/2034#issuecomment-3071748102 #2146 is merged! We can address actually using these stat files separately :)

Re: [PR] add PARTITION_SUMMARY_PROP [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu merged PR #2202: URL: https://github.com/apache/iceberg-python/pull/2202

Re: [PR] Allow updating table scans with cached properties and non-argument members [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on code in PR #2178: URL: https://github.com/apache/iceberg-python/pull/2178#discussion_r2206187346 ## pyiceberg/table/__init__.py: ## @@ -1691,7 +1691,12 @@ def to_polars(self) -> pl.DataFrame: ... def update(self: S, **overrides: Any) -> S:

Re: [PR] Allow updating table scans with cached properties and non-argument members [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on PR #2178: URL: https://github.com/apache/iceberg-python/pull/2178#issuecomment-3071741466 Thanks for the thoughtful explanation @jayceslesar I prefer option 1 too. We're making the intent of `**self.__dict__` more explicit by only using the parameters of `__in
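
A minimal sketch of the idea discussed in this comment, assuming the truncated `__in` refers to `__init__`: rebuild the object from its constructor parameters only, rather than from everything stored in `__dict__`. The `Scan` class, its fields, and the use of `inspect.signature` are hypothetical illustrations, not pyiceberg's actual implementation.

```python
import inspect
from functools import cached_property
from typing import Any, Optional


class Scan:
    """Hypothetical stand-in for a table scan; not pyiceberg's real class."""

    def __init__(self, row_filter: str = "true", limit: Optional[int] = None) -> None:
        self.row_filter = row_filter
        self.limit = limit
        self._calls = 0  # non-argument member; must not be passed back to __init__

    @cached_property
    def plan(self) -> str:
        # cached_property stores its result in self.__dict__ under "plan",
        # which is why copying **self.__dict__ into __init__ would fail.
        return f"filter={self.row_filter} limit={self.limit}"

    def update(self, **overrides: Any) -> "Scan":
        # Keep only the attributes that are real __init__ parameters.
        params = inspect.signature(type(self).__init__).parameters
        kwargs = {name: getattr(self, name) for name in params if name != "self"}
        return type(self)(**{**kwargs, **overrides})
```

With this shape, a cached value or a private counter on the original object never reaches the constructor of the copy, so the copy recomputes its own cached properties.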

[I] Add register_table to Catalog trait [iceberg-rust]

2025-07-14 Thread via GitHub
CTTY opened a new issue, #1508: URL: https://github.com/apache/iceberg-rust/issues/1508 ### Is your feature request related to a problem or challenge? Currently `Catalog` trait in iceberg-rs doesn't have `register_table` to register existing tables to a catalog. This feature can be re

Re: [PR] feat: add manifest list reader [iceberg-cpp]

2025-07-14 Thread via GitHub
wgtmac commented on code in PR #143: URL: https://github.com/apache/iceberg-cpp/pull/143#discussion_r2206155534 ## src/iceberg/manifest_reader.h: ## @@ -35,35 +35,32 @@ namespace iceberg { class ICEBERG_EXPORT ManifestReader { public: virtual ~ManifestReader() = default; -

Re: [PR] add PARTITION_SUMMARY_PROP [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on code in PR #2202: URL: https://github.com/apache/iceberg-python/pull/2202#discussion_r2206169714 ## pyiceberg/table/snapshots.py: ## @@ -306,6 +307,8 @@ def build(self) -> Dict[str, str]: changed_partitions_size = len(self.partition_metrics)

[PR] Build: Bump daft from 0.5.8 to 0.5.10 [iceberg-python]

2025-07-14 Thread via GitHub
dependabot[bot] opened a new pull request, #2211: URL: https://github.com/apache/iceberg-python/pull/2211 Bumps [daft](https://github.com/Eventual-Inc/Daft) from 0.5.8 to 0.5.10. Release notes: sourced from [daft's releases](https://github.com/Eventual-Inc/Daft/releases). v0.5

[PR] Build: Bump huggingface-hub from 0.33.2 to 0.33.4 [iceberg-python]

2025-07-14 Thread via GitHub
dependabot[bot] opened a new pull request, #2210: URL: https://github.com/apache/iceberg-python/pull/2210 Bumps [huggingface-hub](https://github.com/huggingface/huggingface_hub) from 0.33.2 to 0.33.4. Release notes Sourced from https://github.com/huggingface/huggingface_hub/release

Re: [PR] Add RemovePartitionStatisticsUpdate and SetPartitionStatisticsUpdate [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on PR #2192: URL: https://github.com/apache/iceberg-python/pull/2192#issuecomment-3071705304 Thanks for the PR @rambleraptor and @smaheshwar-pltr for the review! :)

Re: [I] Support SetPartitionStatistics and RemovePartitionStatistics [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu closed issue #2191: Support SetPartitionStatistics and RemovePartitionStatistics URL: https://github.com/apache/iceberg-python/issues/2191

Re: [PR] Add RemovePartitionStatisticsUpdate and SetPartitionStatisticsUpdate [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu merged PR #2192: URL: https://github.com/apache/iceberg-python/pull/2192

Re: [PR] feat: add schema conversion from avro `timestamp-millis` and `uuid` [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu merged PR #2173: URL: https://github.com/apache/iceberg-python/pull/2173

Re: [I] optimize `_combine_positional_deletes` [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on issue #1271: URL: https://github.com/apache/iceberg-python/issues/1271#issuecomment-3071695095 Cool! Looks like this will be part of pyarrow 21 I'd love to get some thoughts around how we can better support multiple pyarrow versions in pyiceberg, https://git

[I] [discussion] dealing with multiple pyarrow versions [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu opened a new issue, #2209: URL: https://github.com/apache/iceberg-python/issues/2209 ### Feature Request / Improvement I've seen in multiple issues and across multiple PRs where we depend on a specific version of pyarrow. - Some features are only available after a certa

[PR] Build: Bump duckdb from 1.3.1 to 1.3.2 [iceberg-python]

2025-07-14 Thread via GitHub
dependabot[bot] opened a new pull request, #2208: URL: https://github.com/apache/iceberg-python/pull/2208 Bumps [duckdb](https://github.com/duckdb/duckdb) from 1.3.1 to 1.3.2. Release notes: sourced from [duckdb's releases](https://github.com/duckdb/duckdb/releases). v1.3.2 Bug

Re: [PR] Add support for Bodo DataFrame [iceberg-python]

2025-07-14 Thread via GitHub
ehsantn commented on code in PR #2167: URL: https://github.com/apache/iceberg-python/pull/2167#discussion_r2206136677 ## tests/integration/test_writes/test_partitioned_writes.py: ## @@ -547,14 +552,14 @@ def test_summaries_with_null(spark: SparkSession, session_catalog: Catalog

Re: [PR] refactor: Add SchemaById and SnapshotById to TableMetadata [iceberg-cpp]

2025-07-14 Thread via GitHub
wgtmac commented on code in PR #144: URL: https://github.com/apache/iceberg-cpp/pull/144#discussion_r2206135898 ## src/iceberg/table_metadata.h: ## @@ -123,12 +123,17 @@ struct ICEBERG_EXPORT TableMetadata { /// \brief Get the current schema, return NotFoundError if not fou

Re: [I] org.apache.thrift.TApplicationException: Invalid method name: 'get_table' [iceberg]

2025-07-14 Thread via GitHub
pan3793 commented on issue #12878: URL: https://github.com/apache/iceberg/issues/12878#issuecomment-3071666097 @shorrocka I think the simplest way is to rebuild the latest Hive branch-2.3 with reverting https://github.com/apache/hive/commit/c78ff81915ff3f54d3b1e7c3ce1f11a6fdf749b2, to repl

Re: [PR] Add support for Bodo DataFrame [iceberg-python]

2025-07-14 Thread via GitHub
ehsantn commented on PR #2167: URL: https://github.com/apache/iceberg-python/pull/2167#issuecomment-3071665954 > @ehsantn i think this should make the tests more maintainable. wdyt? Sure, will work on it ASAP.

Re: [PR] Add support for Bodo DataFrame [iceberg-python]

2025-07-14 Thread via GitHub
kevinjqliu commented on code in PR #2167: URL: https://github.com/apache/iceberg-python/pull/2167#discussion_r2206128117 ## tests/integration/test_writes/test_partitioned_writes.py: ## @@ -547,14 +552,14 @@ def test_summaries_with_null(spark: SparkSession, session_catalog: Cata

Re: [PR] Docs: Add BladePipe to list of vendors and blog posts [iceberg]

2025-07-14 Thread via GitHub
ChocZoe commented on PR #13510: URL: https://github.com/apache/iceberg/pull/13510#issuecomment-3071641332 > Thanks @ChocZoe I'll go ahead and merge Hi, thanks for merging. But I found that only the nightly part is updated but not the latest part in the Docs, and I didn't find anywhere

Re: [PR] fix: Fix mock dependency [iceberg-rust]

2025-07-14 Thread via GitHub
liurenjie1024 merged PR #1507: URL: https://github.com/apache/iceberg-rust/pull/1507

[PR] Build: Bump datafusion from 47.0.0 to 48.0.0 [iceberg-python]

2025-07-14 Thread via GitHub
dependabot[bot] opened a new pull request, #2207: URL: https://github.com/apache/iceberg-python/pull/2207 Bumps [datafusion](https://github.com/apache/datafusion-python) from 47.0.0 to 48.0.0. Commits https://github.com/apache/datafusion-python/commit/f10d3b892b81f48b4dfc2ccd84

Re: [I] How to load Iceberg Table in Java using Spark 4 (preview2) and Iceberg 1.5.1 [iceberg]

2025-07-14 Thread via GitHub
JeonDaehong commented on issue #13551: URL: https://github.com/apache/iceberg/issues/13551#issuecomment-3071553219 Due to a runtime version conflict with ANTLR, even directly calling CALL procedures is currently failing. I will need to wait until utility class support for Spark 4.0 be

Re: [I] Kafka Connect Sporadic Commit Delay [iceberg]

2025-07-14 Thread via GitHub
aaronphilip commented on issue #11796: URL: https://github.com/apache/iceberg/issues/11796#issuecomment-3071490907 > Yeah, I solved it by setting - > > ``` > "iceberg.kafka.session.timeout.ms": "24", > "iceberg.kafka.heartbeat.interval.ms": "3" > ``` > > session

Re: [PR] [docs] Tidy up left-hand navigation [iceberg]

2025-07-14 Thread via GitHub
manuzhang commented on code in PR #13491: URL: https://github.com/apache/iceberg/pull/13491#discussion_r2206045971 ## docs/mkdocs.yml: ## @@ -22,69 +22,79 @@ plugins: nav: - index.md - - Tables: -- branching.md -- configuration.md -- evolution.md -- mainte

Re: [PR] [docs] Tidy up left-hand navigation [iceberg]

2025-07-14 Thread via GitHub
manuzhang commented on code in PR #13491: URL: https://github.com/apache/iceberg/pull/13491#discussion_r2206041639 ## site/nav.yml: ## @@ -23,23 +23,23 @@ nav: - Docs: - nightly: '!include docs/docs/nightly/mkdocs.yml' - latest: '!include docs/docs/latest/mkdocs.ym

Re: [I] Kafka Connect Sporadic Commit Delay [iceberg]

2025-07-14 Thread via GitHub
fenil25 commented on issue #11796: URL: https://github.com/apache/iceberg/issues/11796#issuecomment-3071441403 Yeah, I solved it by setting - ``` "iceberg.kafka.session.timeout.ms": "24", "iceberg.kafka.heartbeat.interval.ms": "3" ``` session timeout for control gr

Re: [I] docs: add apache amoro(incubating) with iceberg [iceberg]

2025-07-14 Thread via GitHub
github-actions[bot] commented on issue #11965: URL: https://github.com/apache/iceberg/issues/11965#issuecomment-3071402534 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Kafka Connect Sporadic Commit Delay [iceberg]

2025-07-14 Thread via GitHub
aaronphilip commented on issue #11796: URL: https://github.com/apache/iceberg/issues/11796#issuecomment-3071420083 Hi, we are running into the same issue for a topic with constant throughput. Our timeout and flush interval configs are all set to the defaults.

Re: [PR] File Format API without registry [iceberg]

2025-07-14 Thread via GitHub
github-actions[bot] closed pull request #13257: File Format API without registry URL: https://github.com/apache/iceberg/pull/13257

Re: [PR] File Format API without registry [iceberg]

2025-07-14 Thread via GitHub
github-actions[bot] commented on PR #13257: URL: https://github.com/apache/iceberg/pull/13257#issuecomment-3071402685 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [I] Iceberg API is unable to connect to Hive Metastore > 4.0.0-beta-1 [iceberg]

2025-07-14 Thread via GitHub
github-actions[bot] commented on issue #11928: URL: https://github.com/apache/iceberg/issues/11928#issuecomment-3071402489 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Gradle task to update LICENSE and NOTICE on every build for runtime jars [iceberg]

2025-07-14 Thread via GitHub
github-actions[bot] commented on issue #11559: URL: https://github.com/apache/iceberg/issues/11559#issuecomment-3071402247 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] s3:DeleteObject giving because no session policy allows the s3:DeleteObject action [iceberg]

2025-07-14 Thread via GitHub
github-actions[bot] commented on issue #11153: URL: https://github.com/apache/iceberg/issues/11153#issuecomment-3071402186 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Iceberg to configure AWS S3 configuration with the Hadoop and Hive4 setup is hanging without giving ant error [iceberg]

2025-07-14 Thread via GitHub
github-actions[bot] commented on issue #11145: URL: https://github.com/apache/iceberg/issues/11145#issuecomment-3071402137 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] DeleteOrphanFilesSparkAction.listDirRecursively - No FileSystem for scheme "s3" [iceberg]

2025-07-14 Thread via GitHub
github-actions[bot] commented on issue #10539: URL: https://github.com/apache/iceberg/issues/10539#issuecomment-3071402074 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Core: Batch load new files when validating replaced partitions [iceberg]

2025-07-14 Thread via GitHub
gabeiglio commented on code in PR #13556: URL: https://github.com/apache/iceberg/pull/13556#discussion_r2206024061 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -281,17 +282,21 @@ private static Iterable toIds(Iterable snapshots) { return Iterables

Re: [PR] Core: Batch load new files when validating replaced partitions [iceberg]

2025-07-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #13556: URL: https://github.com/apache/iceberg/pull/13556#discussion_r2205984872 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -281,17 +282,21 @@ private static Iterable toIds(Iterable snapshots) { return It

Re: [PR] Core: Batch load new files when validating replaced partitions [iceberg]

2025-07-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #13556: URL: https://github.com/apache/iceberg/pull/13556#discussion_r2205986062 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -281,17 +282,21 @@ private static Iterable toIds(Iterable snapshots) { return It

Re: [PR] [docs] Tidy up left-hand navigation [iceberg]

2025-07-14 Thread via GitHub
stevenzwu commented on code in PR #13491: URL: https://github.com/apache/iceberg/pull/13491#discussion_r2205964465 ## docs/mkdocs.yml: ## @@ -22,69 +22,79 @@ plugins: nav: - index.md - - Tables: -- branching.md -- configuration.md -- evolution.md -- mainte

[PR] Delegate delete to JUnit [iceberg]

2025-07-14 Thread via GitHub
SaiLalithPrasad opened a new pull request, #13557: URL: https://github.com/apache/iceberg/pull/13557 Hello, I took the liberty of working on the issue. I managed to run the tests on my machine after setting up the code base. Creating this Draft PR to see if I understood the issue correctly.

Re: [PR] Add support for DELTA_BINARY_PACKED Parquet encoding [iceberg]

2025-07-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #13391: URL: https://github.com/apache/iceberg/pull/13391#discussion_r2205950951 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/parquet/VectorizedDeltaEncodedValuesReader.java: ## @@ -0,0 +1,275 @@ +/* + * Licensed to the Apac

[PR] Core: Batch load new files when validating replaced partitions [iceberg]

2025-07-14 Thread via GitHub
gabeiglio opened a new pull request, #13556: URL: https://github.com/apache/iceberg/pull/13556 This change ensures that data files are not all loaded into memory at once when validating replaced partitions. Instead it uses `ParallelIterable` to load new files in batches with a hard limit o
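
As a rough illustration of the batching idea described in this PR summary, the sketch below consumes an iterable in fixed-size chunks so that only one chunk is held in memory at a time. It is written in Python for brevity and does not reproduce Iceberg's Java `ParallelIterable`; the helper name `in_batches` and the `batch_size` parameter are assumptions made for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls at most batch_size items without materializing the rest.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

A caller would validate each batch and discard it before requesting the next, so peak memory is bounded by the batch size rather than by the total number of files.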

Re: [I] How to load Iceberg Table in Java using Spark 4 (preview2) and Iceberg 1.5.1 [iceberg]

2025-07-14 Thread via GitHub
JeonDaehong commented on issue #13551: URL: https://github.com/apache/iceberg/issues/13551#issuecomment-3071144649 @manuzhang @amogh-jahagirdar Ah, so does that mean there's no way to do it at the moment? I've been studying this recently with a focus on integration with Spark,

Re: [I] How to load Iceberg Table in Java using Spark 4 (preview2) and Iceberg 1.5.1 [iceberg]

2025-07-14 Thread via GitHub
amogh-jahagirdar commented on issue #13551: URL: https://github.com/apache/iceberg/issues/13551#issuecomment-3071025230 @manuzhang is right that it will be released in 1.10.0, but one thing I just recalled from the backport is we still call the util class `Spark3Util` which is a bit awkward

Re: [I] Metrics Reporting [iceberg-go]

2025-07-14 Thread via GitHub
zeroshade commented on issue #485: URL: https://github.com/apache/iceberg-go/issues/485#issuecomment-3070996160 Thanks for taking this on. I think it's a great idea for us to ensure we can align all the implementations to utilize the same metrics names for the same things. Even better if yo

Re: [PR] Docs: Add BladePipe to list of vendors and blog posts [iceberg]

2025-07-14 Thread via GitHub
amogh-jahagirdar merged PR #13510: URL: https://github.com/apache/iceberg/pull/13510

Re: [PR] Docs: Add BladePipe to list of vendors and blog posts [iceberg]

2025-07-14 Thread via GitHub
amogh-jahagirdar commented on PR #13510: URL: https://github.com/apache/iceberg/pull/13510#issuecomment-3070968413 Thanks @ChocZoe I'll go ahead and merge

[PR] feat: update pyiceberg/catalog/hive.py to support hive 4.x.x [iceberg-python]

2025-07-14 Thread via GitHub
igorvoltaic opened a new pull request, #2206: URL: https://github.com/apache/iceberg-python/pull/2206 resolves #1222 # Rationale for this change Starting at version 4.0.1, Hive metastore removed deprecated thrift APIs that py-iceberg is currently using. When trying to create a

Re: [I] org.apache.thrift.TApplicationException: Invalid method name: 'get_table' [iceberg]

2025-07-14 Thread via GitHub
shorrocka commented on issue #12878: URL: https://github.com/apache/iceberg/issues/12878#issuecomment-3070919475 Hi Everyone, I am catching up on this but it seems as though support for using a hive 4.0.1 metastore with spark 4.0 is not included in iceberg even in the latest release when bu

Re: [PR] feat(transaction): Add retry logic to transaction [iceberg-rust]

2025-07-14 Thread via GitHub
CTTY commented on code in PR #1484: URL: https://github.com/apache/iceberg-rust/pull/1484#discussion_r2205765539 ## Cargo.toml: ## @@ -82,6 +83,7 @@ itertools = "0.13" linkedbytes = "0.1.8" metainfo = "0.7.14" mimalloc = "0.1.46" +mockall = "0.13.1" Review Comment: Yes, I

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-14 Thread via GitHub
stevenzwu merged PR #13310: URL: https://github.com/apache/iceberg/pull/13310

Re: [PR] Spark: Add Variant read support for Spark Iceberg tables [iceberg]

2025-07-14 Thread via GitHub
aihuaxu commented on code in PR #13219: URL: https://github.com/apache/iceberg/pull/13219#discussion_r2205722186 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkVariants.java: ## @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Spark: Add Variant read support for Spark Iceberg tables [iceberg]

2025-07-14 Thread via GitHub
aihuaxu commented on code in PR #13219: URL: https://github.com/apache/iceberg/pull/13219#discussion_r2127083589 ## parquet/src/main/java/org/apache/iceberg/parquet/TripleIterator.java: ## @@ -21,7 +21,7 @@ import java.util.Iterator; import org.apache.parquet.io.api.Binary;

Re: [PR] Spark: Add Variant read support for Spark Iceberg tables [iceberg]

2025-07-14 Thread via GitHub
aihuaxu commented on PR #13219: URL: https://github.com/apache/iceberg/pull/13219#issuecomment-3070819690 > @aihuaxu, I caught up with @danielcweeks about this yesterday and I think his concern was that we need to support reading shredded values. It would be nice to be able to write them as

Re: [I] Wrong name for parquet page row count min and max stats [iceberg]

2025-07-14 Thread via GitHub
lizdotsh commented on issue #11770: URL: https://github.com/apache/iceberg/issues/11770#issuecomment-3070749452 Still an issue; I also ran into this. The property also seems to be undocumented on the website.

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2205666415 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/ExtractRowLineage.java: ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] feat(transaction): Add initial support for update spec in transaction API [iceberg-go]

2025-07-14 Thread via GitHub
zeroshade merged PR #467: URL: https://github.com/apache/iceberg-go/pull/467

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2205641072 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/ExtractRowLineage.java: ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2205634757 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/ExtractRowLineage.java: ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundat
