Re: [PR] Migrate Spark 3.4 TestBase-related remaining tests in actions [iceberg]

2025-03-20 Thread via GitHub
tomtongue commented on code in PR #12579: URL: https://github.com/apache/iceberg/pull/12579#discussion_r2006953306 ## spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRemoveOrphanFilesAction.java: ## @@ -423,11 +438,10 @@ public void testRemoveUnreachableMetad

Re: [PR] Spark 3.4: Read DVs when reading from .position_deletes table [iceberg]

2025-03-20 Thread via GitHub
nastra closed pull request #12597: Spark 3.4: Read DVs when reading from .position_deletes table URL: https://github.com/apache/iceberg/pull/12597 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[PR] Spec: Add details on GZIP compressed metadata files [iceberg]

2025-03-20 Thread via GitHub
emkornfield opened a new pull request, #12598: URL: https://github.com/apache/iceberg/pull/12598 I am not sure if this is the best place to put the details (open to suggestions on creating another section elsewhere or if there are thoughts on a better section to add the sentences to).

Re: [PR] Migrate Spark 3.4 TestBase-related remaining tests in actions [iceberg]

2025-03-20 Thread via GitHub
nastra commented on code in PR #12579: URL: https://github.com/apache/iceberg/pull/12579#discussion_r2006928513 ## spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRemoveOrphanFilesAction.java: ## @@ -423,11 +438,10 @@ public void testRemoveUnreachableMetadata

Re: [PR] Migrate Spark 3.4 TestBase-related remaining tests in actions [iceberg]

2025-03-20 Thread via GitHub
nastra commented on code in PR #12579: URL: https://github.com/apache/iceberg/pull/12579#discussion_r2006924675 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java: ## @@ -1560,7 +1560,6 @@ public void testAutoSortShuffleOutput() {

Re: [PR] Core: Handle NamespaceNotEmptyException in NamespaceErrorHandler [iceberg]

2025-03-20 Thread via GitHub
nastra merged PR #12505: URL: https://github.com/apache/iceberg/pull/12505

Re: [PR] Core: Handle NamespaceNotEmptyException in NamespaceErrorHandler [iceberg]

2025-03-20 Thread via GitHub
nastra commented on PR #12505: URL: https://github.com/apache/iceberg/pull/12505#issuecomment-2742422343 thanks for the reviews @danielcweeks @amogh-jahagirdar

[PR] Core, Spark: Add row lineage metadata columns, and surface them in SparkTable metadata columns [iceberg]

2025-03-20 Thread via GitHub
amogh-jahagirdar opened a new pull request, #12596: URL: https://github.com/apache/iceberg/pull/12596 This change adds 1. Row id and last updated sequence number metadata columns with field IDs as defined per the spec https://iceberg.apache.org/spec/#reserved-field-ids 2. Expos

Re: [PR] Core, Spark: Add row lineage metadata columns, and surface them in SparkTable metadata columns [iceberg]

2025-03-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #12596: URL: https://github.com/apache/iceberg/pull/12596#discussion_r2006888010 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkMetadataColumns.java: ## @@ -311,6 +317,22 @@ public void testConflictingColumns()

Re: [I] Support RowDeltaAction [iceberg-rust]

2025-03-20 Thread via GitHub
ZENOTME commented on issue #1104: URL: https://github.com/apache/iceberg-rust/issues/1104#issuecomment-2742230817 > For metadata conflict detection, what is the exact design outline that you are looking to implement? conflict detection implementation based on the validation phase. I w

Re: [I] Spark can't get information from metadata tables [iceberg]

2025-03-20 Thread via GitHub
varpa89 commented on issue #12466: URL: https://github.com/apache/iceberg/issues/12466#issuecomment-2742245484 @nastra @singhpk234 ok, got it, thank you very much!

Re: [I] Support RowDeltaAction [iceberg-rust]

2025-03-20 Thread via GitHub
jonathanc-n commented on issue #1104: URL: https://github.com/apache/iceberg-rust/issues/1104#issuecomment-2742239268 I'll take a deeper look into the implementation tomorrow

Re: [PR] refactor: Split `manifest` module into multiple modules [iceberg-rust]

2025-03-20 Thread via GitHub
liurenjie1024 merged PR #1119: URL: https://github.com/apache/iceberg-rust/pull/1119

Re: [I] Can not read data when there is an required filed under an optional stuct [iceberg]

2025-03-20 Thread via GitHub
lliangyu-lin commented on issue #12441: URL: https://github.com/apache/iceberg/issues/12441#issuecomment-2741740243 I wasn't able to reproduce the issue... Reproduction step I tried * Create schema with required `id` and optional `school`(with required `year` and optional `details`) co

Re: [I] catalog table-default and table-override properties are not supported in CREATE_OR_REPLACE operation in IRC [iceberg]

2025-03-20 Thread via GitHub
puchengy commented on issue #12506: URL: https://github.com/apache/iceberg/issues/12506#issuecomment-2742208580 @nastra Thanks, we can close for now. We have an internal fix for this so I will likely come back on this once I get a chance.

Re: [I] Support for Parameterized Views in Iceberg [iceberg]

2025-03-20 Thread via GitHub
ajantha-bhat commented on issue #12594: URL: https://github.com/apache/iceberg/issues/12594#issuecomment-2742177958 @WasimIsmail: Have you checked the proposal about SQL UDFs? https://github.com/apache/iceberg/issues/10432 I believe SQL Udfs will also cover the usecase of parametri

[PR] Build: Bump griffe from 1.6.1 to 1.6.2 [iceberg-python]

2025-03-20 Thread via GitHub
dependabot[bot] opened a new pull request, #1823: URL: https://github.com/apache/iceberg-python/pull/1823 Bumps [griffe](https://github.com/mkdocstrings/griffe) from 1.6.1 to 1.6.2. Release notes Sourced from [griffe's releases](https://github.com/mkdocstrings/griffe/releases).

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2006404901 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantReaders.java: ## @@ -879,6 +891,270 @@ public void testMixedRecords() throws IOException { VariantT

Re: [PR] feat: Add `SnapshotSummaries` [iceberg-rust]

2025-03-20 Thread via GitHub
liurenjie1024 commented on code in PR #1085: URL: https://github.com/apache/iceberg-rust/pull/1085#discussion_r2006784586 ## crates/iceberg/src/spec/snapshot_summary.rs: ## @@ -0,0 +1,755 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

Re: [PR] feat: re-export name mapping [iceberg-rust]

2025-03-20 Thread via GitHub
liurenjie1024 commented on code in PR #1116: URL: https://github.com/apache/iceberg-rust/pull/1116#discussion_r2006760395 ## crates/iceberg/src/spec/mapped_fields.rs: ## @@ -0,0 +1,123 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

Re: [PR] feat: Add `IndexByName`, `IndexById` and `CreateMapping` to `NameMapping` [iceberg-rust]

2025-03-20 Thread via GitHub
jonathanc-n commented on PR #1082: URL: https://github.com/apache/iceberg-rust/pull/1082#issuecomment-2742134074 the reasoning for not implementing `MappedFields` is mentioned in #1072. cc @liurenjie1024 @Fokko

Re: [I] RestCatalog append table is slow (2+s) [iceberg-python]

2025-03-20 Thread via GitHub
corleyma commented on issue #1806: URL: https://github.com/apache/iceberg-python/issues/1806#issuecomment-2742117275 I wonder if @c-thiel has any thoughts about the best way to profile this from the Lakekeeper side? My guess is enable tracing logs?

[PR] Build: Bump mkdocstrings-python from 1.16.6 to 1.16.7 [iceberg-python]

2025-03-20 Thread via GitHub
dependabot[bot] opened a new pull request, #1824: URL: https://github.com/apache/iceberg-python/pull/1824 Bumps [mkdocstrings-python](https://github.com/mkdocstrings/python) from 1.16.6 to 1.16.7. Release notes Sourced from https://github.com/mkdocstrings/python/releases mkdocstr

Re: [PR] V3: Introduce `timestamp_ns` and `timestamptz_ns` [iceberg-python]

2025-03-20 Thread via GitHub
sungwy commented on code in PR #1632: URL: https://github.com/apache/iceberg-python/pull/1632#discussion_r2006637432 ## pyiceberg/types.py: ## @@ -62,6 +63,12 @@ FIXED_PARSER = ParseNumberFromBrackets(FIXED) +class TableVersion(IntEnum): +ONE = 1 +TWO = 2 +THREE

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2006398578 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantReaders.java: ## @@ -900,6 +1176,31 @@ private static GenericRecord record(GroupType type, Map fields)

Re: [PR] Test out Iceberg 1.7.2 RC3 [iceberg-python]

2025-03-20 Thread via GitHub
Fokko closed pull request #1581: Test out Iceberg 1.7.2 RC3 URL: https://github.com/apache/iceberg-python/pull/1581

Re: [PR] Core: Fix failure when reading files table with branch [iceberg]

2025-03-20 Thread via GitHub
ebyhr commented on code in PR #11719: URL: https://github.com/apache/iceberg/pull/11719#discussion_r2006724974 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -671,6 +671,76 @@ public void testFilesTableTimeTravelWi

Re: [I] Split manifest module into multiple modules [iceberg-rust]

2025-03-20 Thread via GitHub
liurenjie1024 closed issue #1083: Split manifest module into multiple modules URL: https://github.com/apache/iceberg-rust/issues/1083

Re: [PR] Spec: update to reflect lineage is required [iceberg]

2025-03-20 Thread via GitHub
danielcweeks commented on code in PR #12580: URL: https://github.com/apache/iceberg/pull/12580#discussion_r2006453139 ## format/spec.md: ## @@ -367,37 +367,35 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is:

Re: [PR] feat: Add `SnapshotSummaries` [iceberg-rust]

2025-03-20 Thread via GitHub
jonathanc-n commented on code in PR #1085: URL: https://github.com/apache/iceberg-rust/pull/1085#discussion_r2006708335 ## crates/iceberg/src/spec/snapshot_summary.rs: ## @@ -0,0 +1,747 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [PR] Core, Parquet, ORC: Fix missing data when writing unknown [iceberg]

2025-03-20 Thread via GitHub
rdblue merged PR #12581: URL: https://github.com/apache/iceberg/pull/12581

Re: [PR] refactor: Split manifest crate [iceberg-rust]

2025-03-20 Thread via GitHub
jonathanc-n closed pull request #1109: refactor: Split manifest crate URL: https://github.com/apache/iceberg-rust/pull/1109

[PR] refactor: Split scan module [iceberg-rust]

2025-03-20 Thread via GitHub
jonathanc-n opened a new pull request, #1120: URL: https://github.com/apache/iceberg-rust/pull/1120 ## Which issue does this PR close? - Closes #. ## What changes are included in this PR? Split Scan module, i was going through it and all the information for scan was

Re: [PR] Core, Rest: Enable useSystemProperties on RESTClient [iceberg]

2025-03-20 Thread via GitHub
github-actions[bot] commented on PR #11548: URL: https://github.com/apache/iceberg/pull/11548#issuecomment-2741932146 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Spark: prefix SparkTable with 'iceberg' to clearly identify Iceberg table [iceberg]

2025-03-20 Thread via GitHub
wypoon commented on PR #12543: URL: https://github.com/apache/iceberg/pull/12543#issuecomment-2741967578 Can you elaborate on how you use this and why you want it? So you just want `SparkTable::toString()` to have "iceberg" in it? There was a similar change a while back, https://github.co

Re: [I] V3 Tracking issue [iceberg-python]

2025-03-20 Thread via GitHub
sungwy commented on issue #1818: URL: https://github.com/apache/iceberg-python/issues/1818#issuecomment-2741939223 I'd love to get my hands on working with some more new types (when the dependencies are resolved). Could you assign me to either the VariantType or GeoTypes? (or both? h

Re: [PR] Core: Fix move/update/makeRequire/makeOptional fail after rename schema (#10830) [iceberg]

2025-03-20 Thread via GitHub
github-actions[bot] commented on PR #12202: URL: https://github.com/apache/iceberg/pull/12202#issuecomment-2741932319 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] support create table like in flink catalog and watermark in windows [iceberg]

2025-03-20 Thread via GitHub
github-actions[bot] closed pull request #12116: support create table like in flink catalog and watermark in windows URL: https://github.com/apache/iceberg/pull/12116

Re: [PR] support create table like in flink catalog and watermark in windows [iceberg]

2025-03-20 Thread via GitHub
github-actions[bot] commented on PR #12116: URL: https://github.com/apache/iceberg/pull/12116#issuecomment-2741932274 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Core: Add list/map block sizes [iceberg]

2025-03-20 Thread via GitHub
github-actions[bot] commented on PR #10973: URL: https://github.com/apache/iceberg/pull/10973#issuecomment-2741932024 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [I] Varying Partitioning [iceberg]

2025-03-20 Thread via GitHub
ravindrabajpai commented on issue #12431: URL: https://github.com/apache/iceberg/issues/12431#issuecomment-2741900475 Hi @RussellSpitzer , Thanks for your reply. "**allowing a limited set of expressions could be possible**" - Even this would address a large number of uses acrosss the indus

Re: [I] Forbidden Exception creating Polaris Rest catalog with Flink 1.20 [iceberg]

2025-03-20 Thread via GitHub
george-zubrienko commented on issue #11836: URL: https://github.com/apache/iceberg/issues/11836#issuecomment-2741860337 > [@george-zubrienko](https://github.com/george-zubrienko) does that work with Iceberg>=1.7? I looked through the Iceberg code but I don't see the HttpClient requests bein

Re: [PR] Core: Enable row lineage for all v3 tables [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12593: URL: https://github.com/apache/iceberg/pull/12593#discussion_r2006579340 ## core/src/test/java/org/apache/iceberg/TestRowLineageMetadata.java: ## @@ -1,334 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or mo

Re: [PR] Core: Enable row lineage for all v3 tables [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12593: URL: https://github.com/apache/iceberg/pull/12593#discussion_r2006579702 ## core/src/test/java/org/apache/iceberg/TestTableMetadata.java: ## @@ -232,8 +231,6 @@ public void testJsonConversion() throws Exception { assertThat(metadata.st

Re: [PR] Core: Enable row lineage for all v3 tables [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12593: URL: https://github.com/apache/iceberg/pull/12593#discussion_r2006578958 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -299,9 +290,8 @@ public String toString() { Map refs, List statisticsFiles, Lis

Re: [PR] Core: Use InternalData with avro for readers. [iceberg]

2025-03-20 Thread via GitHub
danielcweeks merged PR #12476: URL: https://github.com/apache/iceberg/pull/12476

[I] Minimally required pyarrow version [iceberg-python]

2025-03-20 Thread via GitHub
gli-chris-hao opened a new issue, #1822: URL: https://github.com/apache/iceberg-python/issues/1822 Hey folks, is it possible to relax the minimum requirement on `pyarrow` versions? In 0.9.0 it requires `>=17.0.0,<20.0.0`: https://github.com/apache/iceberg-python/blob/pyiceberg-0.9.0/pyproje

Re: [PR] Core: Handle NamespaceNotEmptyException in NamespaceErrorHandler [iceberg]

2025-03-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #12505: URL: https://github.com/apache/iceberg/pull/12505#discussion_r2005994448 ## core/src/main/java/org/apache/iceberg/rest/ErrorHandlers.java: ## @@ -181,6 +191,20 @@ public void accept(ErrorResponse error) { } } + /** Requ

Re: [PR] fix: Clickhouse does not support "null" as partition spec metadata [iceberg-go]

2025-03-20 Thread via GitHub
arnaudbriche commented on PR #347: URL: https://github.com/apache/iceberg-go/pull/347#issuecomment-2741017842 Is there anything else to do before it's merged ?

Re: [PR] Prototyping Spark 3.4 row lineage [iceberg]

2025-03-20 Thread via GitHub
amogh-jahagirdar commented on PR #12592: URL: https://github.com/apache/iceberg/pull/12592#issuecomment-2741759994 Next steps: 1. I'll be primarily be looking at the Spark plan side of things. So this means handling the rest of the cases (CoW/pure appends) and to begin with I'll focu

Re: [PR] Core, Parquet, ORC: Fix missing data when writing unknown [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12581: URL: https://github.com/apache/iceberg/pull/12581#discussion_r2006502201 ## parquet/src/main/java/org/apache/iceberg/data/parquet/InternalWriter.java: ## @@ -40,11 +40,15 @@ public class InternalWriter extends BaseParquetWriter { priv

Re: [PR] Implement MergeFiles operation [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade commented on code in PR #354: URL: https://github.com/apache/iceberg-go/pull/354#discussion_r2006483602 ## table/snapshot_producers.go: ## @@ -92,6 +92,65 @@ func (fa *fastAppendFiles) deletedEntries() ([]iceberg.ManifestEntry, error) { return nil, nil } +f

Re: [PR] Spec: update to reflect lineage is required [iceberg]

2025-03-20 Thread via GitHub
RussellSpitzer commented on code in PR #12580: URL: https://github.com/apache/iceberg/pull/12580#discussion_r2006463935 ## format/spec.md: ## @@ -367,37 +367,35 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is

Re: [PR] Core, Parquet, ORC: Fix missing data when writing unknown [iceberg]

2025-03-20 Thread via GitHub
danielcweeks commented on code in PR #12581: URL: https://github.com/apache/iceberg/pull/12581#discussion_r2006421739 ## parquet/src/main/java/org/apache/iceberg/data/parquet/InternalWriter.java: ## @@ -40,11 +40,15 @@ public class InternalWriter extends BaseParquetWriter {

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2006420320 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantReaders.java: ## @@ -1086,14 +1387,22 @@ private static GroupType field(String name, Type shreddedType)

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2006395249 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantReaders.java: ## @@ -879,6 +891,270 @@ public void testMixedRecords() throws IOException { VariantT

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2006385823 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantReaders.java: ## @@ -900,6 +1176,31 @@ private static GenericRecord record(GroupType type, Map fields)

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2006370387 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantReaders.java: ## @@ -900,6 +1176,31 @@ private static GenericRecord record(GroupType type, Map fields)

[I] [DISCUSS] We have some C++ code to share for Iceberg read/write [iceberg-cpp]

2025-03-20 Thread via GitHub
jovezhong opened a new issue, #46: URL: https://github.com/apache/iceberg-cpp/issues/46 Just a quick note, we are open-sourcing our C++ implementation for Iceberg read/write, starting from this "small" PR: https://github.com/timeplus-io/proton/pull/928 It supports Iceberg REST Catal

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2006284066 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetVariantReaders.java: ## @@ -95,6 +96,19 @@ public static VariantValueReader objects( fieldReaders);

Re: [PR] feat: validate snapshot write compatibility [iceberg-python]

2025-03-20 Thread via GitHub
Fokko commented on code in PR #1772: URL: https://github.com/apache/iceberg-python/pull/1772#discussion_r2006284761 ## pyiceberg/table/update/snapshot.py: ## @@ -251,6 +253,13 @@ def _commit(self) -> UpdatesAndRequirements: ) location_provider = self._transacti

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2006275486 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetVariantReaders.java: ## @@ -332,6 +346,57 @@ public void setPageSource(PageReadStore pageStore) { }

Re: [PR] fix(catalog): strip trailing slash for warehouse paths [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade merged PR #349: URL: https://github.com/apache/iceberg-go/pull/349

Re: [PR] fix(table): remove inconsistencies in metadata serialize [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade merged PR #350: URL: https://github.com/apache/iceberg-go/pull/350

Re: [I] Delete operation is not executed with ThreadPools#DELETE_WORKER_POOL [iceberg]

2025-03-20 Thread via GitHub
amogh-jahagirdar commented on issue #12590: URL: https://github.com/apache/iceberg/issues/12590#issuecomment-2741215904 I think the intention of that threadpool is a bit different than what you're thinking. `DELETE_WORKER_POOL` is meant for reading delete files concurrently into in-memory s

Re: [PR] fix(catalog): add rest integration write test [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade merged PR #352: URL: https://github.com/apache/iceberg-go/pull/352

Re: [I] Inconsistencies in Iceberg metadata serialization [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade closed issue #345: Inconsistencies in Iceberg metadata serialization URL: https://github.com/apache/iceberg-go/issues/345

Re: [PR] Core: Use InternalData with avro and common DataIterable for readers. [iceberg]

2025-03-20 Thread via GitHub
danielcweeks commented on code in PR #12476: URL: https://github.com/apache/iceberg/pull/12476#discussion_r2006137086 ## core/src/main/java/org/apache/iceberg/ManifestReader.java: ## @@ -133,12 +131,18 @@ private > PartitionSpec readPartitionSpec(InputFile inp private static

Re: [PR] fix(table): fix funky test paths [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade commented on code in PR #351: URL: https://github.com/apache/iceberg-go/pull/351#discussion_r2006135579 ## table/table_test.go: ## @@ -280,10 +280,10 @@ func (t *TableWritingTestSuite) writeParquet(fio iceio.WriteFileIO, filePath str func (t *TableWritingTestSuite)

Re: [PR] fix: Clickhouse does not support "null" as partition spec metadata [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade commented on PR #347: URL: https://github.com/apache/iceberg-go/pull/347#issuecomment-2741094944 I was just waiting for the CI to finish. Gonna merge now. Thanks!

Re: [PR] fix(table): fix funky test paths [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade commented on code in PR #351: URL: https://github.com/apache/iceberg-go/pull/351#discussion_r2006133656 ## table/table_test.go: ## @@ -280,10 +280,10 @@ func (t *TableWritingTestSuite) writeParquet(fio iceio.WriteFileIO, filePath str func (t *TableWritingTestSuite)

Re: [PR] fix(table): remove inconsistencies in metadata serialize [iceberg-go]

2025-03-20 Thread via GitHub
kevinjqliu commented on code in PR #350: URL: https://github.com/apache/iceberg-go/pull/350#discussion_r2006131459 ## table/metadata.go: ## @@ -1095,7 +1095,7 @@ func (c *commonMetadata) Version() int { return c.FormatVersion } type metadataV1 struct { Schema*ice

Re: [PR] fix(table): remove inconsistencies in metadata serialize [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade commented on code in PR #350: URL: https://github.com/apache/iceberg-go/pull/350#discussion_r2006127434 ## table/metadata.go: ## @@ -1095,7 +1095,7 @@ func (c *commonMetadata) Version() int { return c.FormatVersion } type metadataV1 struct { Schema*iceb

Re: [PR] fix(table): remove inconsistencies in metadata serialize [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade commented on code in PR #350: URL: https://github.com/apache/iceberg-go/pull/350#discussion_r2006125628 ## table/metadata.go: ## @@ -1095,7 +1095,7 @@ func (c *commonMetadata) Version() int { return c.FormatVersion } type metadataV1 struct { Schema*iceb

Re: [PR] fix(table): remove inconsistencies in metadata serialize [iceberg-go]

2025-03-20 Thread via GitHub
kevinjqliu commented on code in PR #350: URL: https://github.com/apache/iceberg-go/pull/350#discussion_r2006112959 ## table/metadata.go: ## @@ -1095,7 +1095,7 @@ func (c *commonMetadata) Version() int { return c.FormatVersion } type metadataV1 struct { Schema*ice

Re: [PR] fix(table): fix funky test paths [iceberg-go]

2025-03-20 Thread via GitHub
kevinjqliu commented on code in PR #351: URL: https://github.com/apache/iceberg-go/pull/351#discussion_r2006099386 ## table/table_test.go: ## @@ -280,10 +280,10 @@ func (t *TableWritingTestSuite) writeParquet(fio iceio.WriteFileIO, filePath str func (t *TableWritingTestSuite

Re: [PR] fix: Clickhouse does not support "null" as partition spec metadata [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade merged PR #347: URL: https://github.com/apache/iceberg-go/pull/347

[PR] fix(catalog): add rest integration write test [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade opened a new pull request, #352: URL: https://github.com/apache/iceberg-go/pull/352 Includes a few fixes that were found when writing the test

Re: [PR] Enable HTTP proxy support for the client used by REST Catalog [iceberg]

2025-03-20 Thread via GitHub
danielcweeks commented on PR #12406: URL: https://github.com/apache/iceberg/pull/12406#issuecomment-2741090134 I have a few concerns here and it may overlap a little with what @adutra was getting at. This appears to tunnel a separate config/auth to the http client as opposed to extending a

Re: [I] Delete operation is not executed with ThreadPools#DELETE_WORKER_POOL [iceberg]

2025-03-20 Thread via GitHub
sopel39 commented on issue #12590: URL: https://github.com/apache/iceberg/issues/12590#issuecomment-2741086927 cc @findepi @raunaq

Re: [PR] Spark: Use correct statistics file in SparkScan::estimateStatistics(Snapshot) [iceberg]

2025-03-20 Thread via GitHub
wypoon commented on PR #12482: URL: https://github.com/apache/iceberg/pull/12482#issuecomment-2741063720 Thanks @findepi!

Re: [PR] feat: re-export name mapping [iceberg-rust]

2025-03-20 Thread via GitHub
jdockerty commented on code in PR #1116: URL: https://github.com/apache/iceberg-rust/pull/1116#discussion_r2005389130 ## crates/iceberg/src/spec/name_mapping.rs: ## @@ -32,9 +36,12 @@ pub struct NameMapping { #[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)] #[ser

Re: [PR] feat(iceberg): introduce remove schemas [iceberg-rust]

2025-03-20 Thread via GitHub
ZENOTME commented on code in PR #1115: URL: https://github.com/apache/iceberg-rust/pull/1115#discussion_r2005799771 ## crates/iceberg/src/spec/table_metadata_builder.rs: ## @@ -1210,6 +1210,35 @@ impl TableMetadataBuilder { fn highest_sort_order_id(&self) -> Option {

Re: [PR] Spec: update to reflect lineage is required [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12580: URL: https://github.com/apache/iceberg/pull/12580#discussion_r2005935738 ## format/spec.md: ## @@ -700,18 +697,18 @@ The `first_row_id` is only inherited for added data files. The inherited value m A snapshot consists of the following f

Re: [PR] Spark: Add some tests for variant fixup [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12497: URL: https://github.com/apache/iceberg/pull/12497#discussion_r2005968071 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestSparkFixupTypes.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] fix: Clickhouse does not support "null" as partition spec metadata [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade commented on code in PR #347: URL: https://github.com/apache/iceberg-go/pull/347#discussion_r2005949334 ## manifest.go: ## @@ -951,7 +951,13 @@ func (w *ManifestWriter) meta() (map[string][]byte, error) { return nil, err } - specFieldsJ

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-03-20 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r2005927235 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HMSTablePropertyHelper.java: ## @@ -0,0 +1,189 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Spec: update to reflect lineage is required [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12580: URL: https://github.com/apache/iceberg/pull/12580#discussion_r2005925034 ## format/spec.md: ## @@ -367,37 +367,35 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is: -| F

Re: [PR] Spec: update to reflect lineage is required [iceberg]

2025-03-20 Thread via GitHub
rdblue commented on code in PR #12580: URL: https://github.com/apache/iceberg/pull/12580#discussion_r2005920486 ## format/spec.md: ## @@ -367,37 +367,35 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is: -| F

[I] Spark Executors erroring out with Exit Code : 134, after running compaction [iceberg]

2025-03-20 Thread via GitHub
kaushikranjan opened a new issue, #12588: URL: https://github.com/apache/iceberg/issues/12588 ### Query engine Spark ### Question I have a spark-streaming job which writes data from source to destination table using MERGE INTO MERGE INTO nessie.local.dst dst

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-03-20 Thread via GitHub
gaborkaszab commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r2005886814 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HMSTablePropertyHelper.java: ## @@ -0,0 +1,189 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] feat: validate snapshot write compatibility [iceberg-python]

2025-03-20 Thread via GitHub
Fokko commented on PR #1772: URL: https://github.com/apache/iceberg-python/pull/1772#issuecomment-2740745140 Let's add some tests as well: ```python @pytest.mark.integration @pytest.mark.parametrize("format_version", [1, 2]) def test_conflict( spark: SparkSession, sessi

[PR] Added New Blog Post: Loading Data into Apache Iceberg [iceberg]

2025-03-20 Thread via GitHub
SourabhEstuary opened a new pull request, #12587: URL: https://github.com/apache/iceberg/pull/12587 Added a new blog post: "How to Load Data into Apache Iceberg: A Step-by-Step Tutorial" by Estuary. -- This is an automated message from the Apache Git Service. To respond to the message, pl

Re: [I] NullPointerException after deleting old partition column [iceberg]

2025-03-20 Thread via GitHub
RussellSpitzer commented on issue #10626: URL: https://github.com/apache/iceberg/issues/10626#issuecomment-2740528052 @jacksbox I believe the intent here is that we want to block the drop until partition spec referencing that column has been removed from the metadata. We don't have the tool

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-03-20 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r2005852058 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -230,12 +220,12 @@ protected void doCommit(TableMetadata base, TableMetadata

Re: [PR] Flink: backport for fix read config of connector.iceberg.max-allowed-planning-failures to 1.18 and 1.19 [iceberg]

2025-03-20 Thread via GitHub
pvary commented on PR #12589: URL: https://github.com/apache/iceberg/pull/12589#issuecomment-2740772798 Merged to main! Thanks for the fix @Guosmilesmile! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

Re: [PR] fix: Clickhouse does not support "null" as partition spec metadata [iceberg-go]

2025-03-20 Thread via GitHub
zeroshade commented on code in PR #347: URL: https://github.com/apache/iceberg-go/pull/347#discussion_r2005838886 ## manifest_test.go: ## @@ -867,6 +868,16 @@ func (m *ManifestTestSuite) TestManifestEntryBuilder() { m.Assert().Equal(0, *data.SortOrderID()) } +func (m

Re: [PR] feat: validate snapshot write compatibility [iceberg-python]

2025-03-20 Thread via GitHub
Fokko commented on PR #1772: URL: https://github.com/apache/iceberg-python/pull/1772#issuecomment-2740743985 @kaushiksrini can you check the CI? It looks like `mypy` has some issues: ``` pyiceberg/table/update/snapshot.py:303: error: Item "None" of "Snapshot | None" has no attribut
