Re: [PR] Build: Let revapi compare against 1.9.0 [iceberg]

2025-04-28 Thread via GitHub
nastra commented on PR #12912: URL: https://github.com/apache/iceberg/pull/12912#issuecomment-2837694580 > Let me add that I feel like we're working around Immutables here. The interface itself should be fine with Map<>. Is there a way to tell Immutables to use a specific concrete implemen

Re: [I] Unexpected FileIO `remove_all` behavior for S3 [iceberg-rust]

2025-04-28 Thread via GitHub
Xuanwo commented on issue #1118: URL: https://github.com/apache/iceberg-rust/issues/1118#issuecomment-2837692421 > cc @Xuanwo Do you have time to work on this? I can handle it if you are busy. Will finish this today 😚 -- This is an automated message from the Apache Git Service. To

Re: [I] Feature request: manifest file can track deletion vector [iceberg-rust]

2025-04-28 Thread via GitHub
liurenjie1024 commented on issue #1272: URL: https://github.com/apache/iceberg-rust/issues/1272#issuecomment-2837683599 Thanks @dentiny for raising this. It's only about puffin format reader/writer. Statistics and deletion vector are not supported yet. -- This is an automated message from

Re: [I] Unexpected FileIO `remove_all` behavior for S3 [iceberg-rust]

2025-04-28 Thread via GitHub
liurenjie1024 commented on issue #1118: URL: https://github.com/apache/iceberg-rust/issues/1118#issuecomment-2837667220 cc @Xuanwo Do you have time to work on this? I can handle it if you are busy. -- This is an automated message from the Apache Git Service. To respond to the message, plea

Re: [I] Cannot write nullable values to non-null column in the Iceberg Table [iceberg]

2025-04-28 Thread via GitHub
aagamasjain commented on issue #9488: URL: https://github.com/apache/iceberg/issues/9488#issuecomment-2837639922 Hello @1316147945 may I know if COALESCE(column,0) is applied on dataFrame or on spark.sql view? -- This is an automated message from the Apache Git Service. To respond to the

Re: [D] [Question] `drop_namespace` behavior inconsistent for different catalogs [iceberg-rust]

2025-04-28 Thread via GitHub
GitHub user dentiny added a comment to the discussion: [Question] `drop_namespace` behavior inconsistent for different catalogs Thanks! It would be nice if you or others who are senior in the Iceberg field could help consolidate the behavior to reduce confusion :) GitHub link: https://github.com/

Re: [PR] feat(rest): support AWS SIGv4 [iceberg-rust]

2025-04-28 Thread via GitHub
liurenjie1024 commented on code in PR #1241: URL: https://github.com/apache/iceberg-rust/pull/1241#discussion_r2065547513 ## crates/catalog/rest/src/client.rs: ## @@ -220,6 +225,39 @@ impl HttpClient { /// Executes the given `Request` and returns a `Response`. pub asyn

Re: [PR] Update Spark Parquet vectorized read tests to uses Iceberg Record instead of Avro GenericRecord [iceberg]

2025-04-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #12925: URL: https://github.com/apache/iceberg/pull/12925#discussion_r2065200539 ## data/src/test/java/org/apache/iceberg/data/TestLocalScan.java: ## @@ -263,9 +264,37 @@ public void testRandomData() throws IOException { append.com

Re: [D] [Question] `drop_namespace` behavior inconsistent for different catalogs [iceberg-rust]

2025-04-28 Thread via GitHub
GitHub user liurenjie1024 added a comment to the discussion: [Question] `drop_namespace` behavior inconsistent for different catalogs There was some issue there: https://github.com/apache/iceberg-rust/issues/519 GitHub link: https://github.com/apache/iceberg-rust/discussions/1274#discussionco

Re: [I] what's the recommended way to test rest catalog changes? [iceberg-rust]

2025-04-28 Thread via GitHub
liurenjie1024 commented on issue #1270: URL: https://github.com/apache/iceberg-rust/issues/1270#issuecomment-2837538361 >Since https://github.com/apache/iceberg-rust/pull/1266 is a bug fix, is it possible to add a unit test / integration test, which fails before the change, but passes after

Re: [D] [Question] `drop_namespace` behavior inconsistent for different catalogs [iceberg-rust]

2025-04-28 Thread via GitHub
GitHub user dentiny added a comment to the discussion: [Question] `drop_namespace` behavior inconsistent for different catalogs Another benefit is that we could share tests among catalogs; right now they are scattered around. GitHub link: https://github.com/apache/iceberg-rust/discussions/1274#disc

Re: [PR] Core: Implement source-ids to deal with multi arguments transforms [iceberg]

2025-04-28 Thread via GitHub
jbonofre commented on code in PR #12897: URL: https://github.com/apache/iceberg/pull/12897#discussion_r2065540144 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -625,7 +634,7 @@ PartitionSpec buildUnchecked() { static void checkCompatibility(PartitionSpec

Re: [D] [Question] `drop_namespace` behavior inconsistent for different catalogs [iceberg-rust]

2025-04-28 Thread via GitHub
GitHub user liurenjie1024 added a comment to the discussion: [Question] `drop_namespace` behavior inconsistent for different catalogs This seems to be a common problem in the Java/Python implementations as well, and there is no clear answer to that. Personally I think it's reasonable to unify behavio

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2025-04-28 Thread via GitHub
pvary commented on PR #11497: URL: https://github.com/apache/iceberg/pull/11497#issuecomment-2837510693 @stevenzwu: Thanks for the review! I was OOO for a while, but I'm back now. Addressed all of your concerns. If you have time, could you please review? Thanks, Peter -- This i

Re: [PR] Flink: Add lockFactory open in LockRemover [iceberg]

2025-04-28 Thread via GitHub
Guosmilesmile commented on code in PR #12900: URL: https://github.com/apache/iceberg/pull/12900#discussion_r2065501971 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/maintenance/operator/TestLockRemover.java: ## @@ -294,24 +294,33 @@ private void processAndCheck(

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2025-04-28 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r2065501298 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/LogUtil.java: ## @@ -0,0 +1,26 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Flink: Add lockFactory open in LockRemover [iceberg]

2025-04-28 Thread via GitHub
pvary commented on code in PR #12900: URL: https://github.com/apache/iceberg/pull/12900#discussion_r2065478202 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/maintenance/operator/TestLockRemover.java: ## @@ -294,24 +294,33 @@ private void processAndCheck( } p

Re: [PR] Docs: Add versioned Javadocs for 1.9.0 [iceberg]

2025-04-28 Thread via GitHub
ajantha-bhat commented on PR #12920: URL: https://github.com/apache/iceberg/pull/12920#issuecomment-2837385839 I just followed the steps from https://iceberg.apache.org/how-to-release/#versioned-javadoc `./gradlew refreshJavadoc` JDK 21, gradle version 8.13, I have used release

Re: [PR] Enable HTTP proxy support for the client used by REST Catalog [iceberg]

2025-04-28 Thread via GitHub
akhilputhiry commented on PR #12406: URL: https://github.com/apache/iceberg/pull/12406#issuecomment-2837363721 Wanted to follow up on this @amogh-jahagirdar @danielcweeks @nastra @adutra -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] [WIP] feat(catalog): Add drop namespace doumentation [iceberg-rust]

2025-04-28 Thread via GitHub
dentiny commented on PR #1273: URL: https://github.com/apache/iceberg-rust/pull/1273#issuecomment-2837360790 I found that different catalogs have different behavior, so I switched to a discussion thread: https://github.com/apache/iceberg-rust/discussions/1274 -- This is an automated message from the Ap

Re: [PR] [WIP] feat(catalog): Add drop namespace doumentation [iceberg-rust]

2025-04-28 Thread via GitHub
dentiny closed pull request #1273: [WIP] feat(catalog): Add drop namespace doumentation URL: https://github.com/apache/iceberg-rust/pull/1273 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

Re: [PR] build(deps): bump github.com/go-jose/go-jose/v4 from 4.0.4 to 4.0.5 [iceberg-go]

2025-04-28 Thread via GitHub
zeroshade merged PR #409: URL: https://github.com/apache/iceberg-go/pull/409 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [D] [Question] `drop_namespace` behavior inconsistent for different catalogs [iceberg-rust]

2025-04-28 Thread via GitHub
GitHub user dentiny edited a discussion: [Question] `drop_namespace` behavior inconsistent for different catalogs When checking the `drop_namespace` implementation, I found that different catalogs don't conform to the same behavior. For example, when we delete a parent namespace, what would happen to t

Re: [I] Iceberg 1.9.0 : iceberg-build.properties [iceberg]

2025-04-28 Thread via GitHub
ajantha-bhat commented on issue #12926: URL: https://github.com/apache/iceberg/issues/12926#issuecomment-2837332378 @nastra : Do we have guidelines for publishing this jar? I think it was an automated script that we use all the time. -- This is an automated message from the Apache Git Servic

Re: [PR] Core: Remove deprecations for 1.10.0 [iceberg]

2025-04-28 Thread via GitHub
manuzhang commented on code in PR #12909: URL: https://github.com/apache/iceberg/pull/12909#discussion_r2065338635 ## .palantir/revapi.yml: ## @@ -1178,6 +1178,9 @@ acceptedBreaks: new: "class org.apache.iceberg.Metrics" justification: "Java serialization across ve

Re: [I] org.apache.thrift.TApplicationException: Invalid method name: 'get_table' [iceberg]

2025-04-28 Thread via GitHub
zhangbutao commented on issue #12878: URL: https://github.com/apache/iceberg/issues/12878#issuecomment-2837326376 @wypoon Yes, IMO, Hive 4 is a big major version, which may have some incompatible changes compared with the previous version(Hive2&Hive3). Although personally I think we should

Re: [I] [bug] REST catalog `namespace_exists` fails with 400 Bad Request [iceberg-rust]

2025-04-28 Thread via GitHub
Xuanwo commented on issue #1271: URL: https://github.com/apache/iceberg-rust/issues/1271#issuecomment-2837315573 > If the namespace doesn't exist in the catalog, `get_namespace` returns error (which is expected) > > Error: Unexpected => Tried to get a namespace that does not exist

Re: [PR] feat: support azure blob storage [iceberg-rust]

2025-04-28 Thread via GitHub
Xuanwo commented on PR #1242: URL: https://github.com/apache/iceberg-rust/pull/1242#issuecomment-2837311085 > I think which blob storage to use in Azure should be a choice for the folks deploying the warehouse and not something that needs to be decided by iceberg sdks -- in other words, why

Re: [PR] Parquet variant array write [iceberg]

2025-04-28 Thread via GitHub
XBaith commented on code in PR #12847: URL: https://github.com/apache/iceberg/pull/12847#discussion_r2065304170 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantWriters.java: ## @@ -104,6 +137,11 @@ public class TestVariantWriters { Variant.of(EMPTY_METADA

Re: [PR] [WIP] feat(catalog): Add drop namespace doumentation [iceberg-rust]

2025-04-28 Thread via GitHub
dentiny commented on code in PR #1273: URL: https://github.com/apache/iceberg-rust/pull/1273#discussion_r2065294379 ## crates/iceberg/src/catalog/mod.rs: ## @@ -68,6 +68,10 @@ pub trait Catalog: Debug + Sync + Send { ) -> Result<()>; /// Drop a namespace from the cat

[PR] [WIP] feat(catalog): Add drop namespace doumentation [iceberg-rust]

2025-04-28 Thread via GitHub
dentiny opened a new pull request, #1273: URL: https://github.com/apache/iceberg-rust/pull/1273 ## What changes are included in this PR? This PR adds documentation on `drop_namespace` behavior, which I think is unclear. The behavior is taken from the memory catalog and the s3 table catalog.

Re: [PR] Spark 3.5: Support case sensitive in replace where statement [iceberg]

2025-04-28 Thread via GitHub
zhoujinsong commented on PR #12706: URL: https://github.com/apache/iceberg/pull/12706#issuecomment-2837281399 Thanks for the work! @dolcino-li It works for me after merging this patch! @amogh-jahagirdar do you have time to help review this PR? -- This is an automated message

Re: [PR] fix(table): Handle nullable struct with Required Field [iceberg-go]

2025-04-28 Thread via GitHub
EthanBlackburn commented on PR #408: URL: https://github.com/apache/iceberg-go/pull/408#issuecomment-2837261873 @zeroshade it works! 🥳 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Update Spark Parquet vectorized read tests to uses Iceberg Record instead of Avro GenericRecord [iceberg]

2025-04-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #12925: URL: https://github.com/apache/iceberg/pull/12925#discussion_r2065200539 ## data/src/test/java/org/apache/iceberg/data/TestLocalScan.java: ## @@ -263,9 +264,37 @@ public void testRandomData() throws IOException { append.com

Re: [PR] Update Spark Parquet vectorized read tests to uses Iceberg Record instead of Avro GenericRecord [iceberg]

2025-04-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #12925: URL: https://github.com/apache/iceberg/pull/12925#discussion_r2065195777 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/GenericsHelpers.java: ## @@ -289,11 +317,27 @@ private static void assertEqualsUnsafe(Type ty

Re: [I] Add Docstrings to `pyiceberg/table/inspect.py` [iceberg-python]

2025-04-28 Thread via GitHub
github-actions[bot] commented on issue #1191: URL: https://github.com/apache/iceberg-python/issues/1191#issuecomment-2837091181 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2025-04-28 Thread via GitHub
github-actions[bot] commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2837085773 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Iceberg metadata table is not created if using MYSQL JDBC catalog and there are existing iceberg meta tables in another database. [iceberg]

2025-04-28 Thread via GitHub
github-actions[bot] commented on issue #11423: URL: https://github.com/apache/iceberg/issues/11423#issuecomment-2837085597 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Iceberg does not work with Spark's default hive metastore (embedded Derby database) [iceberg]

2025-04-28 Thread via GitHub
github-actions[bot] commented on issue #7847: URL: https://github.com/apache/iceberg/issues/7847#issuecomment-2837084780 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Parquet variant array write [iceberg]

2025-04-28 Thread via GitHub
aihuaxu commented on code in PR #12847: URL: https://github.com/apache/iceberg/pull/12847#discussion_r2065074735 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantWriters.java: ## @@ -104,6 +137,11 @@ public class TestVariantWriters { Variant.of(EMPTY_METAD

Re: [PR] Parquet variant array write [iceberg]

2025-04-28 Thread via GitHub
aihuaxu commented on code in PR #12847: URL: https://github.com/apache/iceberg/pull/12847#discussion_r2065070837 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantWriters.java: ## @@ -104,6 +137,11 @@ public class TestVariantWriters { Variant.of(EMPTY_METAD

[PR] Build: Bump pyarrow from 19.0.1 to 20.0.0 [iceberg-python]

2025-04-28 Thread via GitHub
dependabot[bot] opened a new pull request, #1957: URL: https://github.com/apache/iceberg-python/pull/1957 Bumps [pyarrow](https://github.com/apache/arrow) from 19.0.1 to 20.0.0. Release notes Sourced from pyarrow's releases (https://github.com/apache/arrow/releases). Apache A

[PR] Build: Bump mypy-boto3-glue from 1.37.31 to 1.38.0 [iceberg-python]

2025-04-28 Thread via GitHub
dependabot[bot] opened a new pull request, #1956: URL: https://github.com/apache/iceberg-python/pull/1956 Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 1.37.31 to 1.38.0. Release notes Sourced from https://github.com/youtype/mypy_boto3_builder/releases

[PR] Build: Bump polars from 1.27.1 to 1.28.1 [iceberg-python]

2025-04-28 Thread via GitHub
dependabot[bot] opened a new pull request, #1955: URL: https://github.com/apache/iceberg-python/pull/1955 Bumps [polars](https://github.com/pola-rs/polars) from 1.27.1 to 1.28.1. Release notes Sourced from polars's releases (https://github.com/pola-rs/polars/releases). Pytho

[PR] Build: Bump griffe from 1.7.2 to 1.7.3 [iceberg-python]

2025-04-28 Thread via GitHub
dependabot[bot] opened a new pull request, #1954: URL: https://github.com/apache/iceberg-python/pull/1954 Bumps [griffe](https://github.com/mkdocstrings/griffe) from 1.7.2 to 1.7.3. Release notes Sourced from griffe's releases (https://github.com/mkdocstrings/griffe/releases).

[PR] Build: Bump pypa/cibuildwheel from 2.23.2 to 2.23.3 [iceberg-python]

2025-04-28 Thread via GitHub
dependabot[bot] opened a new pull request, #1953: URL: https://github.com/apache/iceberg-python/pull/1953 Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.23.2 to 2.23.3. Release notes Sourced from https://github.com/pypa/cibuildwheel/releases pypa/cibuildwh

Re: [PR] Data: Handle case where partition location is missing for `TableMigrationUtil` [iceberg]

2025-04-28 Thread via GitHub
jshmchenxi commented on code in PR #12212: URL: https://github.com/apache/iceberg/pull/12212#discussion_r2065008196 ## data/src/main/java/org/apache/iceberg/data/TableMigrationUtil.java: ## @@ -163,10 +167,19 @@ public static List listPartition( Path partitionDir = new

Re: [I] Nessie should throw a NoSuchNamespaceException when listing a non-existing namespace [iceberg]

2025-04-28 Thread via GitHub
coderfender commented on issue #12875: URL: https://github.com/apache/iceberg/issues/12875#issuecomment-2837001318 https://github.com/apache/iceberg/pull/12901 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

Re: [I] Clarify Error Logs for GCS Directory/Object Conflicts [iceberg-python]

2025-04-28 Thread via GitHub
HyunWooZZ commented on issue #1952: URL: https://github.com/apache/iceberg-python/issues/1952#issuecomment-2836984873 I found where that file location information come from! https://github.com/fsspec/filesystem_spec/blob/master/fsspec/spec.py#L97 ![Image](https://github.com/u

Re: [PR] Parquet variant array write [iceberg]

2025-04-28 Thread via GitHub
aihuaxu commented on code in PR #12847: URL: https://github.com/apache/iceberg/pull/12847#discussion_r2064838380 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantWriters.java: ## @@ -104,6 +137,11 @@ public class TestVariantWriters { Variant.of(EMPTY_METAD

Re: [I] [bug] REST catalog `namespace_exists` fails with 400 Bad Request [iceberg-rust]

2025-04-28 Thread via GitHub
dentiny commented on issue #1271: URL: https://github.com/apache/iceberg-rust/issues/1271#issuecomment-2836941338 Hi @Xuanwo , thank you for the quick response! I just realized I didn't push my devcontainer config to the repo; it should be fixed now, and you should be able to easily reproduce the

Re: [PR] Core: Deep copy Record values for equality deletes [iceberg]

2025-04-28 Thread via GitHub
hsingh574 commented on code in PR #12855: URL: https://github.com/apache/iceberg/pull/12855#discussion_r2064954317 ## core/src/main/java/org/apache/iceberg/data/GenericRecord.java: ## @@ -65,13 +68,36 @@ private GenericRecord(StructType struct) { this.nameToPos = NAME_MAP_C

Re: [I] Clarify Error Logs for GCS Directory/Object Conflicts [iceberg-python]

2025-04-28 Thread via GitHub
HyunWooZZ commented on issue #1952: URL: https://github.com/apache/iceberg-python/issues/1952#issuecomment-2836935894 Also, in the Object Storage Engine, it is weird that there is a directory type. -- This is an automated message from the Apache Git Service. To respond to the message, pl

Re: [PR] feat: support azure blob storage [iceberg-rust]

2025-04-28 Thread via GitHub
corleyma commented on PR #1242: URL: https://github.com/apache/iceberg-rust/pull/1242#issuecomment-2836902572 I think which blob storage to use in Azure should be a choice for the folks deploying the warehouse and not something that needs to be decided by iceberg sdks -- in other words, why

Re: [PR] Core: Deep copy Record values for equality deletes [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on code in PR #12855: URL: https://github.com/apache/iceberg/pull/12855#discussion_r2064926813 ## core/src/main/java/org/apache/iceberg/data/GenericRecord.java: ## @@ -65,13 +68,36 @@ private GenericRecord(StructType struct) { this.nameToPos = NAME_

[PR] API, Core: Add table metadata keys for encryption [iceberg]

2025-04-28 Thread via GitHub
rdblue opened a new pull request, #12927: URL: https://github.com/apache/iceberg/pull/12927 This implements the spec changes from #12162. It adds the `encryption-keys` list to `TableMetadata` that stores an `EncryptedKey`. The `TableMetadata.Builder` is updated with `addEncryptionKey`

Re: [PR] Hive: Throw exception when listNamespaces takes non-empty namespace [iceberg]

2025-04-28 Thread via GitHub
ebyhr closed pull request #12884: Hive: Throw exception when listNamespaces takes non-empty namespace URL: https://github.com/apache/iceberg/pull/12884 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

Re: [PR] Parquet variant array write [iceberg]

2025-04-28 Thread via GitHub
aihuaxu commented on code in PR #12847: URL: https://github.com/apache/iceberg/pull/12847#discussion_r2064838380 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantWriters.java: ## @@ -104,6 +137,11 @@ public class TestVariantWriters { Variant.of(EMPTY_METAD

Re: [PR] SPARK: Remove dependency on hadoop's filesystem class from remove orphan files [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on code in PR #12254: URL: https://github.com/apache/iceberg/pull/12254#discussion_r2064728006 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -302,21 +303,29 @@ private Dataset actualFileIdentD

Re: [PR] SPARK: Remove dependency on hadoop's filesystem class from remove orphan files [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on code in PR #12254: URL: https://github.com/apache/iceberg/pull/12254#discussion_r2064518937 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -302,21 +303,29 @@ private Dataset actualFileIdentD

Re: [I] org.apache.thrift.TApplicationException: Invalid method name: 'get_table' [iceberg]

2025-04-28 Thread via GitHub
wypoon commented on issue #12878: URL: https://github.com/apache/iceberg/issues/12878#issuecomment-2836452058 On the other hand, IIUC, this means that the over-the-wire protocol (Thrift) compatibility is broken, which is why a Hive 2 client cannot be used with a Hive 4 HMS. Is this compatib

[I] Iceberg 1.9.0 : iceberg-build.properties [iceberg]

2025-04-28 Thread via GitHub
sullis opened a new issue, #12926: URL: https://github.com/apache/iceberg/issues/12926 ### Apache Iceberg version 1.9.0 (latest release) ### Query engine None ### Please describe the bug 🐞 Apache Iceberg 1.9.0 (java library) iceberg-api-1.9.0.jar cont

Re: [PR] Feature: Write to branches [iceberg-python]

2025-04-28 Thread via GitHub
vinjai commented on PR #941: URL: https://github.com/apache/iceberg-python/pull/941#issuecomment-2836376523 Hey @Fokko Will try to resolve the conflicts over the weekend. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] Docs: Add versioned Javadocs for 1.9.0 [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on PR #12920: URL: https://github.com/apache/iceberg/pull/12920#issuecomment-2836373972 A few more superficial changes? 1.8.1 has a "legal" directory, 1.9.0 does not 1.8.1 has a "script-dir", 1.9.0 does not 1.9.0 includes docs for classes in ``` or

Re: [PR] Docs: Add versioned Javadocs for 1.9.0 [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on PR #12920: URL: https://github.com/apache/iceberg/pull/12920#issuecomment-2836354025 Briefly looking, there seem to be a few additional indexing files. I wonder if it's the JVM version being used to generate docs? -- This is an automated message from the Apache

Re: [PR] Docs: Add versioned Javadocs for 1.9.0 [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on PR #12920: URL: https://github.com/apache/iceberg/pull/12920#issuecomment-2836320442 Not sure what happened here but we have nearly twice as many added lines as 1.8.0/1.8.1 but only a few more files, (1724 -> 1788). Any idea what happened? -- This is an automat

Re: [PR] Update Spark Parquet vectorized read tests to uses Iceberg Record instead of Avro GenericRecord [iceberg]

2025-04-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #12925: URL: https://github.com/apache/iceberg/pull/12925#discussion_r2064398015 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/GenericsHelpers.java: ## @@ -289,11 +317,27 @@ private static void assertEqualsUnsafe(Type ty

Re: [PR] Update Spark Parquet vectorized read tests to uses Iceberg Record instead of Avro GenericRecord [iceberg]

2025-04-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #12925: URL: https://github.com/apache/iceberg/pull/12925#discussion_r2064377681 ## data/src/test/java/org/apache/iceberg/data/RandomGenericData.java: ## @@ -155,11 +175,19 @@ protected Object randomValue(Type.PrimitiveType primitive, Ra

Re: [PR] Core: Deep copy Record values for equality deletes [iceberg]

2025-04-28 Thread via GitHub
hsingh574 commented on code in PR #12855: URL: https://github.com/apache/iceberg/pull/12855#discussion_r2064373229 ## core/src/main/java/org/apache/iceberg/data/GenericRecord.java: ## @@ -65,13 +68,36 @@ private GenericRecord(StructType struct) { this.nameToPos = NAME_MAP_C

Re: [PR] Build: Let revapi compare against 1.9.0 [iceberg]

2025-04-28 Thread via GitHub
danielcweeks commented on PR #12912: URL: https://github.com/apache/iceberg/pull/12912#issuecomment-2836274559 I think we should be ok accepting the revapi violation here. The storage credential serialization is scoped to runtime and the serialization through REST API is json, so I don't t

[PR] Update Spark Parquet vectorized read tests to uses Iceberg Record instead of Avro GenericRecord [iceberg]

2025-04-28 Thread via GitHub
amogh-jahagirdar opened a new pull request, #12925: URL: https://github.com/apache/iceberg/pull/12925 This change updates the Spark Parquet Vectorized read tests to write and validate against Iceberg Records instead of Avro generic records. Iceberg generic record is the interface that we sh

Re: [PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-28 Thread via GitHub
talatuyarer commented on code in PR #12808: URL: https://github.com/apache/iceberg/pull/12808#discussion_r2064341361 ## bigquery/src/main/java/org/apache/iceberg/gcp/bigquery/BigQueryMetastoreCatalog.java: ## @@ -0,0 +1,394 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-28 Thread via GitHub
talatuyarer commented on code in PR #12808: URL: https://github.com/apache/iceberg/pull/12808#discussion_r2064339989 ## bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/BigQueryMetastoreTestUtils.java: ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-28 Thread via GitHub
talatuyarer commented on code in PR #12808: URL: https://github.com/apache/iceberg/pull/12808#discussion_r2064334939 ## bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/BigQueryMetastoreTestUtils.java: ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-28 Thread via GitHub
talatuyarer commented on code in PR #12808: URL: https://github.com/apache/iceberg/pull/12808#discussion_r2064334228 ## bigquery/src/main/java/org/apache/iceberg/gcp/bigquery/BigQueryMetastoreCatalog.java: ## @@ -0,0 +1,394 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Parquet: Fix column pruning for deeply nested fields [iceberg]

2025-04-28 Thread via GitHub
sriharshaj commented on PR #12634: URL: https://github.com/apache/iceberg/pull/12634#issuecomment-2836184517 @amogh-jahagirdar Can you please take a look? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] Core: Implement source-ids to deal with multi arguments transforms [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on code in PR #12897: URL: https://github.com/apache/iceberg/pull/12897#discussion_r2064268775 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -625,7 +634,7 @@ PartitionSpec buildUnchecked() { static void checkCompatibility(Partiti

Re: [PR] Spark 3.5: Disable executor cache for delete files in RewriteDataFilesSparkAction [iceberg]

2025-04-28 Thread via GitHub
anuragmantri commented on PR #12893: URL: https://github.com/apache/iceberg/pull/12893#issuecomment-2836136070 > Why do we want to disable the executor cache? The theory is that delete cache maybe causing stalls due to connection pool exhaustion (https://github.com/apache/iceberg/iss

Re: [PR] Spark 3.5: Disable executor cache for delete files in RewriteDataFilesSparkAction [iceberg]

2025-04-28 Thread via GitHub
anuragmantri commented on PR #12893: URL: https://github.com/apache/iceberg/pull/12893#issuecomment-2836129808 Thanks for the reviews, I will separate out the spark conf PR and then plumb it to `DeleteFilter` -- This is an automated message from the Apache Git Service. To respond to the me

Re: [PR] Docs: Incorrect property in CREATE CATALOG for Flink [iceberg]

2025-04-28 Thread via GitHub
mrsubhash commented on code in PR #12894: URL: https://github.com/apache/iceberg/pull/12894#discussion_r2061221293 ## docs/docs/aws.md: ## @@ -84,7 +84,7 @@ With those dependencies, you can create a Flink catalog like the following: CREATE CATALOG my_catalog WITH ( 'type'='

[PR] Infra: Add release notes for current version [iceberg]

2025-04-28 Thread via GitHub
ajantha-bhat opened a new pull request, #12924: URL: https://github.com/apache/iceberg/pull/12924 (no comment)

Re: [PR] Spec: Add details on GZIP compressed metadata files [iceberg]

2025-04-28 Thread via GitHub
danielcweeks commented on code in PR #12598: URL: https://github.com/apache/iceberg/pull/12598#discussion_r2064219250 ## format/spec.md: ## @@ -1761,6 +1764,10 @@ The reference Java implementation uses a type 4 uuid and XORs the 4 most signifi Java writes `-1` for "no curren

Re: [PR] refactor partition_summary_limit into SnapshotSummaryCollector constr… [iceberg-python]

2025-04-28 Thread via GitHub
kevinjqliu commented on PR #1940: URL: https://github.com/apache/iceberg-python/pull/1940#issuecomment-2836053049 Thanks @stevie9868 for the PR and @Fokko for the review :)

Re: [I] Refactor setting the `max_changed_partitions_for_summaries` [iceberg-python]

2025-04-28 Thread via GitHub
kevinjqliu closed issue #1779: Refactor setting the `max_changed_partitions_for_summaries` URL: https://github.com/apache/iceberg-python/issues/1779

Re: [PR] refactor partition_summary_limit into SnapshotSummaryCollector constr… [iceberg-python]

2025-04-28 Thread via GitHub
kevinjqliu merged PR #1940: URL: https://github.com/apache/iceberg-python/pull/1940

Re: [PR] Spark 3.5: Update MERGE and UPDATE for row lineage [iceberg]

2025-04-28 Thread via GitHub
rdblue merged PR #12736: URL: https://github.com/apache/iceberg/pull/12736

Re: [PR] Spark 3.5: Update MERGE and UPDATE for row lineage [iceberg]

2025-04-28 Thread via GitHub
rdblue commented on PR #12736: URL: https://github.com/apache/iceberg/pull/12736#issuecomment-2836046881 I'm going to merge this since it's been ready for a few days. We can still follow up if there is additional feedback. -- This is an automated message from the Apache Git Service. To re

Re: [PR] Core: Deep copy Record values for equality deletes [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on code in PR #12855: URL: https://github.com/apache/iceberg/pull/12855#discussion_r2064187940 ## core/src/main/java/org/apache/iceberg/data/GenericRecord.java: ## @@ -65,13 +68,36 @@ private GenericRecord(StructType struct) { this.nameToPos = NAME_

Re: [PR] Docs: Fix version doc release step [iceberg]

2025-04-28 Thread via GitHub
ajantha-bhat commented on code in PR #12922: URL: https://github.com/apache/iceberg/pull/12922#discussion_r2064181809 ## site/docs/how-to-release.md: ## @@ -323,11 +323,11 @@ Please follow the instructions on the GitHub repository in the [`README.md` in t Versioned Docs

Re: [PR] Core: Deep copy Record values for equality deletes [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on code in PR #12855: URL: https://github.com/apache/iceberg/pull/12855#discussion_r2064181063 ## core/src/main/java/org/apache/iceberg/data/GenericRecord.java: ## @@ -65,13 +68,36 @@ private GenericRecord(StructType struct) { this.nameToPos = NAME_

[I] Feature request: manifest file can track deletion vector [iceberg-rust]

2025-04-28 Thread via GitHub
dentiny opened a new issue, #1272: URL: https://github.com/apache/iceberg-rust/issues/1272 ### Is your feature request related to a problem or challenge? Hi team, this feature request is half a question on puffin / deletion vector progress, and half on feature request for manifest sup

Re: [PR] Parquet variant array write [iceberg]

2025-04-28 Thread via GitHub
rdblue commented on code in PR #12847: URL: https://github.com/apache/iceberg/pull/12847#discussion_r2064166754 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantWriters.java: ## @@ -104,6 +137,11 @@ public class TestVariantWriters { Variant.of(EMPTY_METADA

Re: [PR] Spec: Add details on GZIP compressed metadata files [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on PR #12598: URL: https://github.com/apache/iceberg/pull/12598#issuecomment-2835959921 This looks fine to me. Do we have any other outstanding issues, or can we vote on this?

Re: [PR] Spark 3.5: Disable executor cache for delete files in RewriteDataFilesSparkAction [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on PR #12893: URL: https://github.com/apache/iceberg/pull/12893#issuecomment-2835954365 I would recommend seeing if we can get this into SparkDeleteFilter, rather than changing the ExecutorCache itself. Basically set it up so that when we create a SparkDeleteFilter

Re: [PR] Spark 3.5: Disable executor cache for delete files in RewriteDataFilesSparkAction [iceberg]

2025-04-28 Thread via GitHub
RussellSpitzer commented on PR #12893: URL: https://github.com/apache/iceberg/pull/12893#issuecomment-2835943212 I feel like this should be plumbed all the way through the scanner. It feels like we are setting this property essentially through a workaround rather than through an explicit ap

Re: [PR] Core: Do not reuse containers when reading delete files (#11239) [iceberg]

2025-04-28 Thread via GitHub
hsingh574 commented on PR #12855: URL: https://github.com/apache/iceberg/pull/12855#issuecomment-2835939519 @RussellSpitzer Updated with a deepCopy method. I tried to use existing code wherever possible; let me know if there's a better approach.
