Re: [PR] rewrite v2 tables by skip deletes planning and join deletes data tables [iceberg]
ajantha-bhat commented on PR #8807: URL: https://github.com/apache/iceberg/pull/8807#issuecomment-1759043430 Is running rewrite_position_delete before running rewrite_data_files not helping in this scenario? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] Nessie: Adopt to Nessie 0.71.1 release [iceberg]
ajantha-bhat commented on PR #8798: URL: https://github.com/apache/iceberg/pull/8798#issuecomment-1759048641 cc: @dimas-b, @snazy Dependabot is raising PRs for Nessie bumps now. This is a follow up for the latest bump.
Re: [PR] Spark: support rewrite on specified target branch [iceberg]
ajantha-bhat commented on code in PR #8797: URL: https://github.com/apache/iceberg/pull/8797#discussion_r1356348310 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/RewriteDataFilesProcedure.java: ## @@ -109,6 +110,8 @@ public InternalRow[] call(InternalRow args) { action = checkAndApplyStrategy(action, strategy, sortOrderString, table.schema()); } + action = checkAndApplyBranch(table, action); Review Comment: Yes. Support extracting the branch info from the table identifier.
Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]
ajantha-bhat commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1356434493 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.iceberg.connect.events; + +import java.util.UUID; +import org.apache.avro.Schema; +import org.apache.avro.SchemaBuilder; + +public class CommitCompletePayload implements Payload { + + private UUID commitId; + private Long vtts; + private final Schema avroSchema; + + private static final Schema AVRO_SCHEMA = + SchemaBuilder.builder() + .record(CommitCompletePayload.class.getName()) + .fields() + .name("commitId") + .prop(FIELD_ID_PROP, DUMMY_FIELD_ID) + .type(UUID_SCHEMA) + .noDefault() + .name("vtts") + .prop(FIELD_ID_PROP, DUMMY_FIELD_ID) + .type() + .nullable() + .longType() + .noDefault() + .endRecord(); + + // Used by Avro reflection to instantiate this class when reading events + public CommitCompletePayload(Schema avroSchema) { +this.avroSchema = avroSchema; + } + + public CommitCompletePayload(UUID commitId, Long vtts) { +this.commitId = commitId; +this.vtts = vtts; +this.avroSchema = AVRO_SCHEMA; + } + + public UUID commitId() { +return commitId; + } + + public Long vtts() { Review Comment: I think we can also add events.md doc with this PR now. https://github.com/tabular-io/iceberg-kafka-connect/blob/main/docs/events.md
Re: [PR] push down min/max/count to iceberg [iceberg]
amogh-jahagirdar commented on PR #6252: URL: https://github.com/apache/iceberg/pull/6252#issuecomment-1759179813 My mistake, yes you can have format version 2 and have copy on write. The remaining issue is why you are even seeing delete files if CoW is set. That seems to be the fundamental issue here. I'll try and repro that.
Re: [PR] Add ASF DOAP rdf file [iceberg]
nastra commented on PR #8586: URL: https://github.com/apache/iceberg/pull/8586#issuecomment-1759247885 thanks @jbonofre
Re: [PR] Add ASF DOAP rdf file [iceberg]
nastra merged PR #8586: URL: https://github.com/apache/iceberg/pull/8586
Re: [PR] Add ASF DOAP rdf file [iceberg]
jbonofre commented on PR #8586: URL: https://github.com/apache/iceberg/pull/8586#issuecomment-1759269677 Awesome ! Thanks, I'm dealing with the ASF record now ;)
Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]
ajantha-bhat commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1356710076 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.iceberg.connect.events; + +import java.util.UUID; +import org.apache.avro.Schema; +import org.apache.avro.SchemaBuilder; + +public class CommitCompletePayload implements Payload { + + private UUID commitId; + private Long vtts; + private final Schema avroSchema; + + private static final Schema AVRO_SCHEMA = + SchemaBuilder.builder() + .record(CommitCompletePayload.class.getName()) + .fields() + .name("commitId") + .prop(FIELD_ID_PROP, DUMMY_FIELD_ID) + .type(UUID_SCHEMA) + .noDefault() + .name("vtts") + .prop(FIELD_ID_PROP, DUMMY_FIELD_ID) + .type() + .nullable() + .longType() + .noDefault() + .endRecord(); + + // Used by Avro reflection to instantiate this class when reading events + public CommitCompletePayload(Schema avroSchema) { +this.avroSchema = avroSchema; + } + + public CommitCompletePayload(UUID commitId, Long vtts) { +this.commitId = commitId; +this.vtts = vtts; +this.avroSchema = AVRO_SCHEMA; + } + + public UUID commitId() { +return commitId; + } + + public Long vtts() { Review Comment: We can add this info as javadoc > VTTS (valid-through timestamp) property indicating through what timestamp records have been fully processed, i.e. all records processed from then on will have a timestamp greater than the VTTS. This is calculated by taking the maximum timestamp of records processed from each topic partition, and taking the minimum of these. If any partitions were not processed as part of the commit then the VTTS is not set https://github.com/tabular-io/iceberg-kafka-connect/blob/main/docs/design.md#snapshot-properties
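The VTTS rule quoted in the review comment above (maximum timestamp per topic partition, then the minimum of those maxima, unset if any partition was not processed) can be sketched in plain Java. This is only an illustration under stated assumptions: `VttsSketch`, `assignedPartitions`, and `maxTimestampByPartition` are hypothetical names, partitions are modeled as plain integers, and none of this is taken from the Kafka Connect code under review.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;

final class VttsSketch {
  // assignedPartitions: hypothetical ids of the topic partitions covered by the commit.
  // maxTimestampByPartition: max record timestamp processed per partition
  // (a partition is absent from the map if it was not processed).
  static OptionalLong vtts(
      Set<Integer> assignedPartitions, Map<Integer, Long> maxTimestampByPartition) {
    // If any partition was not processed in this commit, the VTTS is left unset.
    if (!maxTimestampByPartition.keySet().containsAll(assignedPartitions)) {
      return OptionalLong.empty();
    }
    // Otherwise take the minimum of the per-partition maxima.
    return assignedPartitions.stream()
        .mapToLong(p -> maxTimestampByPartition.get(p))
        .min();
  }

  public static void main(String[] args) {
    Set<Integer> partitions = Set.of(0, 1, 2);
    OptionalLong complete = vtts(partitions, Map.of(0, 100L, 1, 250L, 2, 180L));
    OptionalLong partial = vtts(partitions, Map.of(0, 100L, 1, 250L));
    System.out.println(complete.getAsLong()); // 100
    System.out.println(partial.isPresent());  // false
  }
}
```

Taking the minimum of the maxima is what makes the value safe to read as "valid through": every partition has been processed at least up to that timestamp.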
[I] Make iceberg an idempotent sink for Spark like delta lake [iceberg]
paulpaul1076 opened a new issue, #8809: URL: https://github.com/apache/iceberg/issues/8809
### Feature Request / Improvement
Delta Lake has an interesting feature which you can read about here: https://docs.delta.io/latest/delta-streaming.html#idempotent-table-writes-in-foreachbatch
From what I understand, Iceberg does not support this, but I think that it is a really important feature. Can we add this to Iceberg? I don't think that multi-table transactions will solve this problem, because from my understanding foreachBatch commits its offsets only after the entire lambda function passed to it has executed. Now imagine you have this code with multi-table transactions:
```
dfStr.writeStream.foreachBatch((df: DataFrame, id: Long) => {
  // create transaction1
  // create transaction2
  // multi_table_commit(transaction1, transaction2)
  // send something to kafka
}).start().awaitTermination()
```
From what I understand, if the "send something to kafka" step fails, the entire microbatch is re-executed and the multi-table transaction will write the same data a second time, which will cause data duplication. At my job, for example, we use this kind of logic, and we frequently kill our streaming jobs to redeploy new code, after which we restart them. So, from my understanding, Iceberg is not an idempotent sink and you can't expect to have end-to-end exactly-once with Iceberg?
### Query engine
Spark
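The replay problem described in this issue, and the general shape of an idempotent sink, can be sketched with a small in-memory model. This is a hedged illustration only: `IdempotentSinkSketch`, `Table`, and `commit` are hypothetical names, and the code uses neither the Iceberg nor the Delta Lake API. The assumption is that the sink records the last committed batch id per writer together with the data, so a replayed micro-batch can be recognized and skipped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IdempotentSinkSketch {
  // Hypothetical in-memory "table"; not an Iceberg or Delta Lake API.
  static final class Table {
    final List<String> rows = new ArrayList<>();
    // Last batch id committed per writer (application) id.
    final Map<String, Long> lastCommitted = new HashMap<>();

    // Appends the rows unless this (appId, batchId) was already committed.
    // Returns true if the rows were written, false if the batch was skipped.
    boolean commit(String appId, long batchId, List<String> batchRows) {
      Long last = lastCommitted.get(appId);
      if (last != null && batchId <= last) {
        return false; // replay of an already committed micro-batch
      }
      rows.addAll(batchRows);
      lastCommitted.put(appId, batchId); // recorded together with the data
      return true;
    }
  }

  public static void main(String[] args) {
    Table table = new Table();
    List<String> batch = List.of("a", "b");
    boolean first = table.commit("job-1", 7L, batch);
    // Simulate a failure in a later step: the engine replays batch 7.
    boolean replay = table.commit("job-1", 7L, batch);
    System.out.println(first + " " + replay + " " + table.rows.size()); // true false 2
  }
}
```

In a real table format the batch marker would have to be stored atomically with the data commit, which is the part the issue asks Iceberg to provide.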
[PR] Add Blogs Related to Hive & Iceberg. [iceberg-docs]
ayushtkn opened a new pull request, #282: URL: https://github.com/apache/iceberg-docs/pull/282 (no comment)
Re: [PR] Add Blogs Related to Hive & Iceberg. [iceberg-docs]
ayushtkn commented on PR #282: URL: https://github.com/apache/iceberg-docs/pull/282#issuecomment-1759537789 Tried building locally to validate; no link is broken. Attaching screenshot: https://github.com/apache/iceberg-docs/assets/25608848/205e05af-1eb8-41f3-b0e4-7cfb66c80c58
Re: [PR] Nessie: Adopt to Nessie 0.71.1 release [iceberg]
dimas-b commented on code in PR #8798: URL: https://github.com/apache/iceberg/pull/8798#discussion_r1356768108 ## nessie/src/test/java/org/apache/iceberg/nessie/TestCustomNessieClient.java: ## @@ -78,30 +77,11 @@ public void testNonExistentCustomClient() { temp.toUri().toString(), CatalogProperties.URI, uri, - NessieConfigConstants.CONF_NESSIE_CLIENT_BUILDER_IMPL, - "non.existent.ClientBuilderImpl")); -}) -.isInstanceOf(RuntimeException.class) -.hasMessageContaining("Cannot load Nessie client builder implementation class"); - } - - @Test - public void testCustomClientByImpl() { Review Comment: These tests are different, this one uses `NessieConfigConstants.CONF_NESSIE_CLIENT_BUILDER_IMPL`, the other one uses `NessieConfigConstants.CONF_NESSIE_CLIENT_NAME`, right?
Re: [PR] Nessie: Adopt to Nessie 0.71.1 release [iceberg]
ajantha-bhat commented on code in PR #8798: URL: https://github.com/apache/iceberg/pull/8798#discussion_r1356770832 ## nessie/src/test/java/org/apache/iceberg/nessie/TestCustomNessieClient.java: ## @@ -78,30 +77,11 @@ public void testNonExistentCustomClient() { temp.toUri().toString(), CatalogProperties.URI, uri, - NessieConfigConstants.CONF_NESSIE_CLIENT_BUILDER_IMPL, - "non.existent.ClientBuilderImpl")); -}) -.isInstanceOf(RuntimeException.class) -.hasMessageContaining("Cannot load Nessie client builder implementation class"); - } - - @Test - public void testCustomClientByImpl() { Review Comment: Since CONF_NESSIE_CLIENT_BUILDER_IMPL is deprecated, we would need to change the test case to use CONF_NESSIE_CLIENT_NAME instead of removing it. But there is already a test case that does that, so I guess he removed the CONF_NESSIE_CLIENT_BUILDER_IMPL test case.
Re: [PR] Add Blogs Related to Hive & Iceberg. [iceberg-docs]
pvary merged PR #282: URL: https://github.com/apache/iceberg-docs/pull/282
[I] Replace `.size() > 0` with `!.isEmpty()` [iceberg]
Fokko opened a new issue, #8810: URL: https://github.com/apache/iceberg/issues/8810
### Feature Request / Improvement
Suggestion by IDEA: replace `.size() > 0` checks with `!isEmpty()`. I think this is nice because `isEmpty` should be faster. We also have different implementations in `PartitionSet.java`:
```java
@Override
public int size() {
  return partitionSetById.values().stream().mapToInt(Set::size).sum();
}

@Override
public boolean isEmpty() {
  return partitionSetById.values().stream().allMatch(Set::isEmpty);
}
```
### Query engine
Other
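Why `isEmpty` can be cheaper than `size() > 0` for a composite collection like the one quoted above can be shown with a small counting sketch. This is an assumption-laden illustration, not the `PartitionSet` implementation: `IsEmptyVsSizeSketch` and its `setsVisited` counter are hypothetical, and the nested sets are modeled as a plain list. The point is only that a sum must visit every inner set, while an emptiness check can stop at the first non-empty one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class IsEmptyVsSizeSketch {
  // Counts how many inner sets each approach has to look at.
  static int setsVisited = 0;

  // Sums the sizes of all inner sets, so every set is visited.
  static int size(List<Set<String>> sets) {
    int total = 0;
    for (Set<String> set : sets) {
      setsVisited++;
      total += set.size();
    }
    return total;
  }

  // Stops at the first non-empty inner set.
  static boolean isEmpty(List<Set<String>> sets) {
    for (Set<String> set : sets) {
      setsVisited++;
      if (!set.isEmpty()) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Set<String>> sets = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      sets.add(Set.of("p" + i));
    }

    setsVisited = 0;
    boolean viaSize = size(sets) > 0;
    int visitsForSize = setsVisited;

    setsVisited = 0;
    boolean viaIsEmpty = !isEmpty(sets);
    int visitsForIsEmpty = setsVisited;

    System.out.println(viaSize + " " + visitsForSize);       // true 1000
    System.out.println(viaIsEmpty + " " + visitsForIsEmpty); // true 1
  }
}
```

Both checks give the same answer; only the amount of work differs, which is the motivation stated in the issue.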
Re: [PR] Update roadmap.md [iceberg-docs]
ajantha-bhat commented on code in PR #272: URL: https://github.com/apache/iceberg-docs/pull/272#discussion_r1356790636 ## landing-page/content/common/roadmap.md: ## @@ -22,28 +22,36 @@ disableSidebar: true # Roadmap Overview -This roadmap outlines projects that the Iceberg community is working on, their priority, and a rough size estimate. -This is based on the latest [community priority discussion](https://lists.apache.org/thread.html/r84e80216c259c81f824c6971504c321cd8c785774c489d52d4fc123f%40%3Cdev.iceberg.apache.org%3E). +This roadmap outlines projects that the Iceberg community is working on. Each high-level item links to a Github project board that tracks the current status. Related design docs will be linked on the planning boards. -# Priority 1 - -* API: [Iceberg 1.0.0](https://github.com/apache/iceberg/projects/3) [medium] -* Python: [Pythonic refactor](https://github.com/apache/iceberg/projects/7) [medium] -* Spec: [Z-ordering / Space-filling curves](https://github.com/apache/iceberg/projects/16) [medium] -* Spec: [Snapshot tagging and branching](https://github.com/apache/iceberg/projects/4) [small] -* Views: [Spec](https://github.com/apache/iceberg/projects/6) [medium] -* Puffin: [Implement statistics information in table snapshot](https://github.com/apache/iceberg/pull/4741) [medium] -* Flink: [FLIP-27 based Iceberg source](https://github.com/apache/iceberg/projects/23) [large] - -# Priority 2 - -* ORC: [Support delete files stored as ORC](https://github.com/apache/iceberg/projects/13) [small] -* Spark: [DSv2 streaming improvements](https://github.com/apache/iceberg/projects/2) [small] -* Flink: [Inline file compaction](https://github.com/apache/iceberg/projects/14) [small] -* Flink: [Support UPSERT](https://github.com/apache/iceberg/projects/15) [small] -* Spec: [Secondary indexes](https://github.com/apache/iceberg/projects/17) [large] -* Spec v3: [Encryption](https://github.com/apache/iceberg/projects/5) [large] -* Spec v3: [Relative 
paths](https://github.com/apache/iceberg/projects/18) [large] -* Spec v3: [Default field values](https://github.com/apache/iceberg/projects/19) [medium] +# General + +* [Multi-table transaction support](https://github.com/apache/iceberg/projects/30) +* [Views Support](https://github.com/apache/iceberg/projects/29) +* [Change Data Capture (CDC) Support](https://github.com/apache/iceberg/projects/26) +* [Snapshot tagging and branching](https://github.com/apache/iceberg/projects/4) +* [Inline file compaction](https://github.com/apache/iceberg/projects/14) +* [Delete File compaction](https://github.com/apache/iceberg/projects/10) +* [Z-ordering / Space-filling curves](https://github.com/apache/iceberg/projects/16) +* [Support UPSERT](https://github.com/apache/iceberg/projects/15) + Review Comment: Can you please add partition stats? https://github.com/apache/iceberg/projects/31
Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]
nk1506 commented on PR #8804: URL: https://github.com/apache/iceberg/pull/8804#issuecomment-1759573960 @nastra , Please take a look.
Re: [PR] Nessie: Adapt to Nessie 0.71.1 release [iceberg]
nk1506 commented on PR #8798: URL: https://github.com/apache/iceberg/pull/8798#issuecomment-1759595831
> nit: `Adopt` -> `Adapt` in title?
>
> I believe the removed test case is worth keeping.

Since `CONF_NESSIE_CLIENT_BUILDER_IMPL` has been deprecated, replacing it with `CONF_NESSIE_CLIENT_NAME` will make [testCustomClientByImpl](https://github.com/apache/iceberg/blob/b5ea0d5a7f55e5b8d9eec8e764bbcc35f8301db3/nessie/src/test/java/org/apache/iceberg/nessie/TestCustomNessieClient.java#L89) and [testCustomClientByName](https://github.com/apache/iceberg/blob/b5ea0d5a7f55e5b8d9eec8e764bbcc35f8301db3/nessie/src/test/java/org/apache/iceberg/nessie/TestCustomNessieClient.java#L108) duplicates of each other.
Re: [PR] Core: DeleteMarker to mark row as deleted [iceberg]
Humbedooh closed pull request #2434: Core: DeleteMarker to mark row as deleted URL: https://github.com/apache/iceberg/pull/2434
Re: [PR] Hive: Allow to create external table to access the iceberg table managed in hive catalog [iceberg]
Humbedooh closed pull request #3539: Hive: Allow to create external table to access the iceberg table managed in hive catalog URL: https://github.com/apache/iceberg/pull/3539
Re: [PR] Core: Add RocksDBStructLikeMap [iceberg]
Humbedooh closed pull request #2680: Core: Add RocksDBStructLikeMap URL: https://github.com/apache/iceberg/pull/2680
Re: [PR] Hive: Bug when runing SQL with multiple table join. [iceberg]
Humbedooh closed pull request #3392: Hive: Bug when runing SQL with multiple table join. URL: https://github.com/apache/iceberg/pull/3392
Re: [PR] Update the comment content of 'commit.status-check.total-timeout-ms' [iceberg]
Humbedooh closed pull request #2894: Update the comment content of 'commit.status-check.total-timeout-ms' URL: https://github.com/apache/iceberg/pull/2894
Re: [PR] Aliyun: Add iceberg-aliyun document [iceberg]
Humbedooh closed pull request #3686: Aliyun: Add iceberg-aliyun document URL: https://github.com/apache/iceberg/pull/3686
[PR] Build: Bump org.springframework:spring-web from 5.3.9 to 6.0.13 [iceberg]
dependabot[bot] opened a new pull request, #8811: URL: https://github.com/apache/iceberg/pull/8811 Bumps [org.springframework:spring-web](https://github.com/spring-projects/spring-framework) from 5.3.9 to 6.0.13.

Release notes, sourced from org.springframework:spring-web's releases (https://github.com/spring-projects/spring-framework/releases):

v6.0.13

:star: New Features
* Improve diagnostics for negative repeated text count in SpEL (#31342: https://redirect.github.com/spring-projects/spring-framework/issues/31342)
* Improve diagnostics when repeated text size calculation results in overflow in SpEL (#31341: https://redirect.github.com/spring-projects/spring-framework/issues/31341)
* UnknownContentTypeException is not Serializable (#31283: https://redirect.github.com/spring-projects/spring-framework/issues/31283)
* Reintroduce FastClass in CGLIB class names for @Configuration classes (#31272: https://redirect.github.com/spring-projects/spring-framework/issues/31272)

:lady_beetle: Bug Fixes
* HibernateJpaDialect and HibernateExceptionTranslator throw SQLExceptionTranslator-provided exception instead of returning it (#31409: https://redirect.github.com/spring-projects/spring-framework/issues/31409)
* AnnotationScanner scanning leads to StackOverflowError with recursive annotation (#31400: https://redirect.github.com/spring-projects/spring-framework/issues/31400)
* NamedParameterJdbcTemplate throws unexpected exception for null query (#31391: https://redirect.github.com/spring-projects/spring-framework/issues/31391)
* HTTP server exchange observations have incorrect UNKNOWN status tag if the client disconnected (#31388: https://redirect.github.com/spring-projects/spring-framework/issues/31388)
* Breaking change from 6.0.11 to 6.0.12 if you expect query parameters in @RequestBody (#31327: https://redirect.github.com/spring-projects/spring-framework/issues/31327)
* SpEL's CompoundExpression.toStringAST() omits ? for null-safe navigation (#31326: https://redirect.github.com/spring-projects/spring-framework/issues/31326)
* ConcurrentLruCache no longer supports capacity = 0 (#31317: https://redirect.github.com/spring-projects/spring-framework/issues/31317)
* Using R2dbc transactional and non transactional on a database connection pool will fail for Oracle. (#31268: https://redirect.github.com/spring-projects/spring-framework/issues/31268)
* AOT-generated code no longer set bean class for beans created from a @Bean method (#31242: https://redirect.github.com/spring-projects/spring-framework/issues/31242)
* CGLIB proxy classes are no longer cached properly (#31238: https://redirect.github.com/spring-projects/spring-framework/issues/31238)
* Illegal reflective access in ContextOverridingClassLoader.isEligibleForOverriding (#31232: https://redirect.github.com/spring-projects/spring-framework/issues/31232)
* Fix RuntimeHintsPredicates matching rules for public/declared elements (#31224: https://redirect.github.com/spring-projects/spring-framework/issues/31224)
* MultipartParser should respect read position (#31110: https://redirect.github.com/spring-projects/spring-framework/issues/31110)
* WebClient reports 'Host is not specified' for URI with hostname and port, but without scheme (#31033: https://redirect.github.com/spring-projects/spring-framework/issues/31033)
* R2DBC Connection is closed during transaction when using TransactionAwareConnectionFactoryProxy (#28133: https://redirect.github.com/spring-projects/spring-framework/issues/28133)
* SpEL cannot evaluate or compile expression with null-safe void method invocation (#27421: https://redirect.github.com/spring-projects/spring-framework/issues/27421)
* LazyResolutionMessage does not implement proper toString (#21265: https://redirect.github.com/spring-projects/spring-framework/issues/21265)

:notebook_with_decorative_cover: Documentation
* Document Kotlin declaration site variance subtleties (#31370: https://redirect.github.com/spring-projects/spring-framework/issues/31370)
* Add missing conversionService field in doc example (#31330: https://redirect.github.com/spring-projects/spring-framework/pull/31330)
* Clarify documentation on Spring Web MVC pattern comparison (#31294: https://redirect.github.com/spring-projects/spring-framework/issues/31294)
* Improved documentation for MethodParameter#getAnnotatedElement (#30397: https://redirect.github.com/spring-projects/spring-framework/issues/30397)
* Javadoc for BeanPropertyRowMapper.getColumnValue(ResultSet, int, Class) is inconsistent with code (#29285: https://redirect.github.com/spring-projects/spring-framework/issues/29285)
* Referencing a @Bean method in a @Configuration class' @PostConstruct method leads to circular reference (#27876: https://redirect.github.com/spring-projects/spring-framework/issues/27876)
* Incorrect reference information about CGLIB supported method visibility https://redirect.github.com/spring-projects/spring-fra
Re: [PR] Build: Bump org.springframework:spring-web from 5.3.9 to 6.0.12 [iceberg]
dependabot[bot] commented on PR #8734: URL: https://github.com/apache/iceberg/pull/8734#issuecomment-1759624056 Superseded by #8811.
Re: [PR] Build: Bump org.springframework:spring-web from 5.3.9 to 6.0.12 [iceberg]
dependabot[bot] closed pull request #8734: Build: Bump org.springframework:spring-web from 5.3.9 to 6.0.12 URL: https://github.com/apache/iceberg/pull/8734
[PR] Build: Bump com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.24.0 [iceberg]
dependabot[bot] opened a new pull request, #8812: URL: https://github.com/apache/iceberg/pull/8812 Bumps [com.palantir.baseline:gradle-baseline-java](https://github.com/palantir/gradle-baseline) from 4.42.0 to 5.24.0. Release notes Sourced from https://github.com/palantir/gradle-baseline/releases";>com.palantir.baseline:gradle-baseline-java's releases. 5.24.0 Type Description Link Fix baseline-exact-dependencies is now far more lazy around Configuration creation in order to support Gradle 8. https://redirect.github.com/palantir/gradle-baseline/pull/2639";>palantir/gradle-baseline#2639 5.23.0 Type Description Link Fix Use a Proxy for JavaInstallationMetadata so we can work across Gradle 7 and 8. https://redirect.github.com/palantir/gradle-baseline/pull/2605";>palantir/gradle-baseline#2605 5.22.0 Automated release, no documented user facing changes 5.21.0 Type Description Link Improvement Upgrade error-prone to 2.21.1 (from 2.19.1) https://redirect.github.com/palantir/gradle-baseline/pull/2628";>palantir/gradle-baseline#2628 5.20.0 Type Description Link Improvement Improve SafeLoggingPropagation on Immutables, taking into account fields from superinterfaces https://redirect.github.com/palantir/gradle-baseline/pull/2629";>palantir/gradle-baseline#2629 5.19.0 Type Description Link Improvement Prefer InputStream.transferTo(OutputStream)Add error-prone check to automate migration to prefer InputStream.transferTo(OutputStream) instead of utility methods such as Guava's com.google.common.io.ByteStreams.copy(InputStream, OutputStream).Allow for optimization when underlying input stream (such as ByteArrayInputStream, ChannelInputStream) overrides transferTo(OutputStream) to avoid extra array allocations and copy larger chunks at a time (e.g. 
allowing 16KiB chunks via ApacheHttpClientBlockingChannel.ModulatingOutputStream from https://redirect.github.com/palantir/gradle-baseline/issues/1790";>#1790).When running on JDK 21+, this also enables 16KiB byte chunk copies via InputStream.transferTo(OutputStream) perhttps://bugs.openjdk.org/browse/JDK-8299336";>JDK-8299336, where as on JDK Closes https://redirect.github.com/palantir/gradle-baseline/issues/2615";>palantir/gradle-baseline#2615 https://redirect.github.com/palantir/gradle-baseline/pull/2615";>palantir/gradle-baseline#2615, https://redirect.github.com/palantir/gradle-baseline/pull/2616";>palantir/gradle-baseline#2616 5.18.0 Automated release, no documented user facing changes 5.17.0 Type Description Link Feature Add error-prone check to prefer ZoneId constants https://redirect.github.com/palantir/gradle-baseline/pull/2596";>palantir/gradle-baseline#2596 5.16.0 Type Description Link Fix Fix nullaway checkerframework dependency https://redirect.github.com/palantir/gradle-baseline/pull/2602";>palantir/gradle-baseline#2602 5.15.0 Automated release, no documented user facing changes 5.14.0 Type Description Link Fix Fix unintentional suppression of StrictUnusedVariable https://redirect.github.com/palantir/gradle-baseline/pull/2599";>palantir/gradle-baseline#2599 5.13.0 ... (truncated) Commits https://github.com/palantir/gradle-baseline/commit/2381f401fb0050eae77a10f27397ad5835cdd3ca";>2381f40 Autorelease 5.24.0 https://github.com/palantir/gradle-baseline/commit/f6e43095b61de662685810defa070661dad8ee08";>f6e4309 Lazy exact dependencies (https://redirect.github.com/palantir/gradle-baseline/issues/2639";>#2639) https://github.com/palantir/gradle-baseline/commit/642e1e3d675ca49517113b70f73144144914ba0e";>642e1e3 Autorelease 5.23.0 https://github.com/palantir/gradle-baseline/commit/437d3920294ba82c8296acf844ddfa866df1662c";>437d392 Use a Proxy for JavaInstallationMetadata so we can work across Gradle 7 a... 
https://github.com/palantir/gradle-baseline/commit/777e9f7e69a3b21daa5d2b8a70267dbb38bc8f4c";>777e9f7 Excavator: Upgrade buildscript dependencies (https://redirect.github.com/palantir/gradle-baseline/issues/2638";>#2638) https://github.com/palantir/gradle-baseline/commit/1eefbf55c3a2ced9af0785014a2eb9a2396a4b9a";>1eefbf5 Excavator: Upgrades Baseline to the latest version (https://redirect.github.com/palantir/gradle-baseline/issues/2637";>#2637) https://github.com/palantir/gradle-baseline/commit/dca88ac7faf672d9ab5d941061cd155fcf6fdf0c";>dca88ac Allow incubating method use inside other incubating methods (https://redirect.github.com/palantir/gradle-baseline/issues/2636";>#2636)
Re: [PR] Build: Bump com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.22.0 [iceberg]
dependabot[bot] commented on PR #8778: URL: https://github.com/apache/iceberg/pull/8778#issuecomment-1759625459 Superseded by #8812.
Re: [PR] Build: Bump com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.22.0 [iceberg]
dependabot[bot] closed pull request #8778: Build: Bump com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.22.0 URL: https://github.com/apache/iceberg/pull/8778
Re: [PR] Rename master branch to main [iceberg]
jbonofre commented on PR #8722: URL: https://github.com/apache/iceberg/pull/8722#issuecomment-1759631883 The `master` branch has been renamed to `main`. @Fokko @nastra, it would be great if you could merge this PR when you have time. Thanks!
Re: [PR] Rename master branch to main [iceberg]
Fokko merged PR #8722: URL: https://github.com/apache/iceberg/pull/8722
Re: [PR] Rename master branch to main [iceberg]
Fokko commented on PR #8722: URL: https://github.com/apache/iceberg/pull/8722#issuecomment-1759635303 Thanks @jbonofre for taking the lead on this!
Re: [I] Make iceberg an idempotent sink for Spark like delta lake [iceberg]
paulpaul1076 commented on issue #8809: URL: https://github.com/apache/iceberg/issues/8809#issuecomment-1759723529 @RussellSpitzer provided this code, which achieves the same:
```
foreachBatch (batch_df, batch_id) => {
  val lastBatch = Spark3Util.loadIcebergTable(spark, "db.timezoned").currentSnapshot().summary()(STREAMID)
  if (batch_id > lastBatch) {
    batch_df.writeTo(...).option("snapshot-property." + STREAMID, batch_id).append
  }
}
```
But I wonder if Delta Lake is faster here, because I assume that this metadata lookup `Spark3Util.loadIcebergTable(spark,"db.timezoned").currentSnapshot().summary()(STREAMID)` goes to S3?
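The idea behind the quoted snippet (store the last committed batch id in the snapshot summary and skip any batch at or below it) can be sketched without Spark or Iceberg. This is only an illustration under stated assumptions: `IdempotentSinkSketch`, the `stream-batch-id` key, and the in-memory map standing in for a snapshot summary are all hypothetical and are not Iceberg or Spark APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a table whose latest "snapshot summary"
// records the id of the last streaming batch that was committed.
class IdempotentSinkSketch {
    // Assumed property name; the thread uses a placeholder called STREAMID.
    static final String STREAM_ID_KEY = "stream-batch-id";

    private final Map<String, String> latestSummary = new HashMap<>();
    private final List<String> rows = new ArrayList<>();

    // Returns -1 when no batch has been committed yet.
    long lastCommittedBatch() {
        String value = latestSummary.get(STREAM_ID_KEY);
        return value == null ? -1L : Long.parseLong(value);
    }

    // Appends the batch only if its id is newer than the last committed one.
    // Returns true when the batch was written, false when it was skipped.
    boolean writeBatch(long batchId, List<String> batchRows) {
        if (batchId <= lastCommittedBatch()) {
            return false; // replayed batch: already committed, do nothing
        }
        rows.addAll(batchRows);
        // Data and batch id are recorded together, mimicking a single commit.
        latestSummary.put(STREAM_ID_KEY, Long.toString(batchId));
        return true;
    }

    int rowCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        IdempotentSinkSketch sink = new IdempotentSinkSketch();
        System.out.println(sink.writeBatch(0, List.of("a", "b"))); // first delivery
        System.out.println(sink.writeBatch(0, List.of("a", "b"))); // replay of batch 0
        System.out.println(sink.rowCount());
    }
}
```

The sketch compares batch ids numerically; whether the lookup cost raised in the comment matters in practice depends on where the table metadata lives, which this example does not model.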
Re: [I] Replace `.size() > 0` with `!.isempty()` [iceberg]
Fokko commented on issue #8810: URL: https://github.com/apache/iceberg/issues/8810#issuecomment-1759954274 There are quite a few:
```
./core/src/main/java/org/apache/iceberg/BaseDistributedDataScan.java: boolean mayHaveEqualityDeletes = deleteManifests.size() > 0 && mayHaveEqualityDeletes(snapshot);
./core/src/main/java/org/apache/iceberg/util/PartitionUtil.java: if (partitionType.fields().size() > 0) {
./core/src/main/java/org/apache/iceberg/TableMetadata.java: || (discardChanges && changes.size() > 0)
./core/src/main/java/org/apache/iceberg/io/DeleteWriteResult.java:return referencedDataFiles != null && referencedDataFiles.size() > 0;
./core/src/main/java/org/apache/iceberg/SnapshotSummary.java: setIf(changedPartitions.size() > 0, builder, PARTITION_SUMMARY_PROP, "true");
./core/src/main/java/org/apache/iceberg/FastAppend.java:if (newManifests == null && newFiles.size() > 0) {
./core/src/main/java/org/apache/iceberg/actions/RewritePositionDeletesGroup.java: Preconditions.checkArgument(tasks.size() > 0, "Tasks must not be empty");
./core/src/main/java/org/apache/iceberg/actions/BaseCommitService.java: while (running.get() || completedRewrites.size() > 0 || inProgressCommits.size() > 0) {
./core/src/main/java/org/apache/iceberg/actions/BaseCommitService.java: if (!running.get() && completedRewrites.size() > 0) {
./core/src/main/java/org/apache/iceberg/actions/BaseCommitService.java: boolean writingComplete = !running.get() && completedRewrites.size() > 0;
./core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: Preconditions.checkArgument(versions.size() > 0, "Invalid view: no versions were added");
./core/src/main/java/org/apache/iceberg/PositionDeletesTable.java:if (partitionType.fields().size() > 0) {
./core/src/main/java/org/apache/iceberg/BaseOverwriteFiles.java: if (deletedDataFiles.size() > 0) {
./core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: return deletePaths.size() > 0
./core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: || dropPartitions.size() > 0;
./core/src/main/java/org/apache/iceberg/ManifestFilterManager.java:if (dropPartitions.size() > 0) {
./core/src/main/java/org/apache/iceberg/ManifestFilterManager.java:} else if (deletePaths.size() > 0) {
./core/src/main/java/org/apache/iceberg/ReachableFileUtil.java:if (metadataLogEntries.size() > 0) {
./core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: return newDataFiles.size() > 0;
./core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: return newDeleteFilesBySpec.size() > 0;
./core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:if (newDataFiles.size() > 0) {
./core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:if (hasNewDeleteFiles && cachedNewDeleteManifests.size() > 0) {
./core/src/main/java/org/apache/iceberg/BaseRewriteFiles.java:if (replacedDataFiles.size() > 0) {
./core/src/main/java/org/apache/iceberg/ContentFileParser.java:return partitionData != null && partitionData.size() > 0;
./core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: if (snapshot.deleteManifests(table().io()).size() > 0) {
./aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/AliyunOSSMockLocalStore.java: return buckets.size() > 0 ? buckets.get(0) : null;
./mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java: if (dataFiles.size() > 0) {
./mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java: if (dataFiles.size() > 0) {
./flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java: if (equalityFieldColumns != null && equalityFieldColumns.size() > 0) {
./flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/DataStatisticsCoordinator.java: gateways[subtaskIndex].size() > 0,
./flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java: if (equalityFieldColumns != null && equalityFieldColumns.size() > 0) {
./flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java: if (equalityFieldColumns != null && equalityFieldColumns.size() > 0) {
./delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: if (filesToAdd.size() > 0 && filesToRemove.size() > 0) {
./delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: } else if (filesToAdd.size() > 0) {
./delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: } else if (filesToRemove.size() > 0) {
./hive3/src/main/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java: return hasBase || deltas.size() > 0;
./api/src/main/java/org/apache/iceberg/expressions/
```
Re: [PR] [ISSUE #8810] replaced .size() > 0 with isEmpty() [iceberg]
Fokko commented on PR #8813: URL: https://github.com/apache/iceberg/pull/8813#issuecomment-1759955441 Thanks for opening this PR @PickBas. There are a couple more in the codebase. What do you think of doing one PR per module, so we keep it manageable? In this case, everything in `core/*`?
Re: [PR] [ISSUE #8810] replaced .size() > 0 with isEmpty() [iceberg]
PickBas commented on PR #8813: URL: https://github.com/apache/iceberg/pull/8813#issuecomment-1759958867 @Fokko Sure, will be done. PR per module works for me.
Re: [I] Replace `.size() > 0` with `!.isempty()` [iceberg]
PickBas commented on issue #8810: URL: https://github.com/apache/iceberg/issues/8810#issuecomment-1759962395 @Fokko Will be done. Could you assign the issue to me, if you don't mind?
[I] struct value design [iceberg-rust]
ZENOTME opened a new issue, #77: URL: https://github.com/apache/iceberg-rust/issues/77 Using a lookup incurs a memory cost if we have multiple structs with the same type. One way to solve this is to use `Arc` in Struct. I tried this design in https://github.com/icelake-io/icelake/pull/136. Anyway, I think it's OK to solve this problem in another PR. If this design looks workable, I'm glad to port it. _Originally posted by @ZENOTME in https://github.com/apache/iceberg-rust/pull/20#discussion_r1282598302_
Re: [I] struct value design [iceberg-rust]
ZENOTME commented on issue #77: URL: https://github.com/apache/iceberg-rust/issues/77#issuecomment-1759997715 I found that our struct value doesn't include type info. Do we want to include type info in it?
1. If we include the type info in the struct, the struct value may look like
```
struct Struct {
    type: Arc ...
}
```
- The benefit is that we can look up field info directly from the struct.
- The drawback is the extra 8-byte pointer cost.
2. Another solution is to pass the struct type as another parameter when we need it, e.g.
```
fn write(struct_value: Struct,struct_type: StructType)
```
- The benefit is that it saves memory.
- The drawback is that I'm not sure whether the struct type is hard to obtain in some cases.

For now, I'm working on serializing/deserializing values, and both processes seem solvable the second way (passing a struct type as a parameter). But I'm not sure whether there is some scenario where we must include the type info in the struct value.
Re: [I] Replace `.size() > 0` with `!.isempty()` [iceberg]
PickBas commented on issue #8810: URL: https://github.com/apache/iceberg/issues/8810#issuecomment-1760006978 @Fokko I have changed every occurrence in the core/* module from `size() > 0` to `!isEmpty()`, except in _ContentFileParser.java_. To move away from `.size() > 0` there, an `isEmpty()` method would have to be added to the `StructLike` interface, which has 29 implementations. Should I add the `isEmpty()` method to that interface, or leave it as is?
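One way such a change could avoid touching every implementation is a Java `default` method on the interface. The sketch below is only an illustration of that language feature, not what the project decided: `SizedStruct` and `ArrayStruct` are hypothetical stand-ins, and the only thing assumed about the real interface is that it exposes `size()`.

```java
// Entry point first so the file can also be run directly with `java File.java`.
class IsEmptySketch {
    // Same shape as the check quoted from ContentFileParser in the thread,
    // rewritten against the hypothetical interface below.
    static boolean hasPartitionData(SizedStruct partitionData) {
        return partitionData != null && !partitionData.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(hasPartitionData(new ArrayStruct()));       // no fields
        System.out.println(hasPartitionData(new ArrayStruct(1, "a"))); // two fields
        System.out.println(hasPartitionData(null));                    // missing struct
    }
}

// Hypothetical stand-in for an interface that only exposes size().
interface SizedStruct {
    int size();

    // A default method gives every existing implementation isEmpty()
    // without requiring any of them to change.
    default boolean isEmpty() {
        return size() == 0;
    }
}

// Hypothetical implementation that does not override isEmpty().
class ArrayStruct implements SizedStruct {
    private final Object[] values;

    ArrayStruct(Object... values) {
        this.values = values;
    }

    @Override
    public int size() {
        return values.length;
    }
}
```

Because `ArrayStruct` inherits `isEmpty()` from the interface, existing implementations compile unchanged; an implementation with a cheaper emptiness check could still override it.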
Re: [PR] Use ParallelIterable in Deletes::toPositionIndex (6387) [iceberg]
wypoon commented on PR #8805: URL: https://github.com/apache/iceberg/pull/8805#issuecomment-1760074698 @rdblue @aokolnychyi as @rbalamohan indicated that he's not working on https://github.com/apache/iceberg/pull/6432 anymore, I have taken it up here. I rebased it on master, resolved the conflicts, moved the configuration from `SystemProperties` to the new `SystemConfigs`, and changed the default pool size to match the existing worker pool; the tests are green. @aokolnychyi I didn't see https://github.com/apache/iceberg/pull/6432#issuecomment-1758777424 until after I had put this up, so I didn't know that you're also working on this in https://github.com/apache/iceberg/pull/8755. I'll take a look at that.
Re: [PR] Update roadmap.md [iceberg-docs]
bitsondatadev commented on PR #272: URL: https://github.com/apache/iceberg-docs/pull/272#issuecomment-1760168587 Hey all, I'm looping back to this today.
Re: [PR] Update roadmap.md [iceberg-docs]
bitsondatadev commented on code in PR #272: URL: https://github.com/apache/iceberg-docs/pull/272#discussion_r1357238803
## landing-page/content/common/roadmap.md: ##
@@ -22,28 +22,36 @@ disableSidebar: true
 # Roadmap Overview
-This roadmap outlines projects that the Iceberg community is working on, their priority, and a rough size estimate.
-This is based on the latest [community priority discussion](https://lists.apache.org/thread.html/r84e80216c259c81f824c6971504c321cd8c785774c489d52d4fc123f%40%3Cdev.iceberg.apache.org%3E).
+This roadmap outlines projects that the Iceberg community is working on. Each high-level item links to a Github project board that tracks the current status. Related design docs will be linked on the planning boards.
-# Priority 1
-
-* API: [Iceberg 1.0.0](https://github.com/apache/iceberg/projects/3) [medium]
-* Python: [Pythonic refactor](https://github.com/apache/iceberg/projects/7) [medium]
-* Spec: [Z-ordering / Space-filling curves](https://github.com/apache/iceberg/projects/16) [medium]
-* Spec: [Snapshot tagging and branching](https://github.com/apache/iceberg/projects/4) [small]
-* Views: [Spec](https://github.com/apache/iceberg/projects/6) [medium]
-* Puffin: [Implement statistics information in table snapshot](https://github.com/apache/iceberg/pull/4741) [medium]
-* Flink: [FLIP-27 based Iceberg source](https://github.com/apache/iceberg/projects/23) [large]
-
-# Priority 2
-
-* ORC: [Support delete files stored as ORC](https://github.com/apache/iceberg/projects/13) [small]
-* Spark: [DSv2 streaming improvements](https://github.com/apache/iceberg/projects/2) [small]
-* Flink: [Inline file compaction](https://github.com/apache/iceberg/projects/14) [small]
-* Flink: [Support UPSERT](https://github.com/apache/iceberg/projects/15) [small]
-* Spec: [Secondary indexes](https://github.com/apache/iceberg/projects/17) [large]
-* Spec v3: [Encryption](https://github.com/apache/iceberg/projects/5) [large]
-* Spec v3: [Relative paths](https://github.com/apache/iceberg/projects/18) [large]
-* Spec v3: [Default field values](https://github.com/apache/iceberg/projects/19) [medium]
+# General
+
+* [Multi-table transaction support](https://github.com/apache/iceberg/projects/30)
+* [Views Support](https://github.com/apache/iceberg/projects/29)
+* [Change Data Capture (CDC) Support](https://github.com/apache/iceberg/projects/26)
+* [Snapshot tagging and branching](https://github.com/apache/iceberg/projects/4)
+* [Inline file compaction](https://github.com/apache/iceberg/projects/14)
+* [Delete File compaction](https://github.com/apache/iceberg/projects/10)
+* [Z-ordering / Space-filling curves](https://github.com/apache/iceberg/projects/16)
+* [Support UPSERT](https://github.com/apache/iceberg/projects/15)
+
Review Comment: Will do!
Re: [PR] Spec: Clarify spec_id field in Data File [iceberg]
Fokko commented on code in PR #8730: URL: https://github.com/apache/iceberg/pull/8730#discussion_r1357291336
## format/spec.md: ##
@@ -443,13 +443,13 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo
 | _optional_ | _optional_ | **`132 split_offsets`** | `list<133: long>`| Split offsets for the data file. For example, all row group offsets in a Parquet file. Must be sorted ascending |
 || _optional_ | **`135 equality_ids`** | `list<136: int>` | Field ids used to determine row equality in equality delete files. Required when `content=2` and should be null otherwise. Fields with ids listed in this column must be present in the delete file |
 | _optional_ | _optional_ | **`140 sort_order_id`** | `int` | ID representing sort order for this file [3]. |
-
+| _optional_ | _optional_ | **`141 spec_id`**| `int` | ID representing partition spec for this file [4]. |
Review Comment: How about keeping it blank? This means that the field should not be written.
```suggestion
| | | **`141 spec_id`**| `int`| ID representing partition spec for this file [4]. |
```
Re: [PR] Flink: flink/*: replaced .size() > 0 with isEmpty() [iceberg]
Fokko commented on code in PR #8819: URL: https://github.com/apache/iceberg/pull/8819#discussion_r1357327064
## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/DataStatisticsCoordinator.java: ##
@@ -340,7 +340,7 @@ private void unregisterSubtaskGateway(int subtaskIndex, int attemptNumber) {
 private OperatorCoordinator.SubtaskGateway getSubtaskGateway(int subtaskIndex) {
 Preconditions.checkState(
- gateways[subtaskIndex].size() > 0,
+ !gateways[subtaskIndex].isEmpty(),
Review Comment:
```suggestion
 !gateways[subtaskIndex].isEmpty(),
```
Re: [PR] rewrite v2 tables by skipping deletes planning and join deletes data tables [iceberg]
singhpk234 commented on PR #8807: URL: https://github.com/apache/iceberg/pull/8807#issuecomment-1760327041 Interesting, this is an approach the Impala folks took too: https://docs.google.com/document/d/1WF_UOanQ61RUuQlM4LaiRWI0YXpPKZ2VEJ8gyJdDyoY/edit# Wondering if reads in general could benefit from it as well? Also, do you have crisper benchmarks demonstrating that this would always help? Have you tried the solution @aokolnychyi is working on, which caches delete files on executors, and integrating with it?
Re: [PR] Fix column rename doc example to reflect correct API [iceberg-python]
Fokko merged PR #59: URL: https://github.com/apache/iceberg-python/pull/59
[PR] Add Snapshot logic and Summary generation [iceberg-python]
Fokko opened a new pull request, #61: URL: https://github.com/apache/iceberg-python/pull/61 (no comment)
[PR] Make `next_sequence_number` private [iceberg-python]
Fokko opened a new pull request, #62: URL: https://github.com/apache/iceberg-python/pull/62 We should only use this in the table module. Follow up on https://github.com/apache/iceberg-python/pull/60#discussion_r1355656751
Re: [PR] Spark 3.5: Fix specific field values treated as unequal while comparing rows for carry-over removal [iceberg]
flyrain commented on PR #8799: URL: https://github.com/apache/iceberg/pull/8799#issuecomment-1760415613 I will take a look.
Re: [I] De-Duping Rows While Compacting [iceberg]
dramaticlly commented on issue #8702: URL: https://github.com/apache/iceberg/issues/8702#issuecomment-1760455077 Data compaction only changes the physical file layout, not the data visible to users. Suppose you originally have 1000 records with 10 duplicates: after deduplication there would be 990 records, in addition to the changed file layout. I think deduplication (which needs the ability to identify a row by primary key or unique row identifier) probably needs its own action/procedure instead of relying on data compaction.
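To make the distinction concrete, here is a minimal sketch in plain Python. It does not use any Iceberg API: `compact` and `deduplicate` are hypothetical helpers, and files are modeled as lists of `(id, value)` tuples. Compaction only regroups rows into fewer files, while deduplication changes the set of visible rows.

```python
from itertools import chain


def compact(files, target_rows_per_file):
    """Regroup rows into larger files without changing which rows exist."""
    rows = list(chain.from_iterable(files))
    return [rows[i:i + target_rows_per_file]
            for i in range(0, len(rows), target_rows_per_file)]


def deduplicate(files, key):
    """Keep the first row seen for each key; this changes the visible data."""
    seen = set()
    kept = []
    for row in chain.from_iterable(files):
        k = key(row)
        if k not in seen:
            seen.add(k)
            kept.append(row)
    return [kept]


# 1000 rows spread over 100 small files; 10 rows repeat an existing id.
rows = [(i, f"v{i}") for i in range(990)] + [(i, f"v{i}") for i in range(10)]
small_files = [rows[i:i + 10] for i in range(0, len(rows), 10)]

compacted = compact(small_files, target_rows_per_file=250)
deduped = deduplicate(small_files, key=lambda row: row[0])

row_count_after_compaction = sum(len(f) for f in compacted)  # 1000
row_count_after_dedup = sum(len(f) for f in deduped)  # 990
```

The row count is unchanged by `compact` and drops by 10 under `deduplicate`, which is why the two operations arguably belong in separate procedures.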
Re: [I] Add outputFile() for FileAppender [iceberg]
github-actions[bot] commented on issue #7231: URL: https://github.com/apache/iceberg/issues/7231#issuecomment-1760563302 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'
Re: [I] Add outputFile() for FileAppender [iceberg]
github-actions[bot] closed issue #7231: Add outputFile() for FileAppender URL: https://github.com/apache/iceberg/issues/7231
Re: [PR] rewrite v2 tables by skipping deletes planning and join deletes data tables [iceberg]
zinking commented on PR #8807: URL: https://github.com/apache/iceberg/pull/8807#issuecomment-1760653401 > wondering if we could benefit from reads in general as well ? Yes. As mentioned in the distributed planning work, once metadata becomes large, hand-crafted parallel code is no longer optimal. If reads were planned optimally, these delete files would be read concurrently instead of the way they are read now. > Also do you have more crisp benchmarks demonstrating this would benefit always ? I don't think this always helps. It is easy to imagine that with only a couple of delete files, a join would certainly not outperform the current approach. But as metadata grows larger, it would always help, since in theory fewer files are read. I don't have more numbers at the moment, and the benchmark above isn't fully optimized. > have you tried the caching of delete files on executor solution which @aokolnychyi is working on and integrating with it ? Not yet.
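The trade-off can be illustrated with a small, self-contained sketch. It is not the PR's implementation: position deletes are modeled as `(file_path, position)` pairs, and `apply_deletes_by_join` is a hypothetical helper that builds one hash set over all deletes and anti-joins the data rows against it, rather than opening delete files once per data file.

```python
def apply_deletes_by_join(data_rows, position_deletes):
    """Anti-join data rows against position deletes.

    data_rows: iterable of (file_path, position, payload)
    position_deletes: iterable of (file_path, position)
    """
    deleted = set(position_deletes)  # built once for the whole rewrite
    return [row for row in data_rows if (row[0], row[1]) not in deleted]


data_rows = [
    ("a.parquet", 0, "r0"),
    ("a.parquet", 1, "r1"),
    ("b.parquet", 0, "r2"),
    ("b.parquet", 1, "r3"),
]
position_deletes = [("a.parquet", 1), ("b.parquet", 0)]

live_rows = apply_deletes_by_join(data_rows, position_deletes)
live_payloads = [row[2] for row in live_rows]  # ["r0", "r3"]
```

With only a handful of deletes, building the set is pure overhead, which matches the caveat above that a join would not win in the small case.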
Re: [I] struct value design [iceberg-rust]
ZENOTME commented on issue #77: URL: https://github.com/apache/iceberg-rust/issues/77#issuecomment-1760655157 cc @JanKaul @Fokko @Xuanwo @liurenjie1024
Re: [I] How to read data in the order in which files are commited? [iceberg]
MarsKT commented on issue #8802: URL: https://github.com/apache/iceberg/issues/8802#issuecomment-1760659443 > Thanks @pvary, I have a maybe naive question @MarsKT > > > want the data read from iceberg to be in the same order every time. > > Can I ask what's driving this need? Users of data analytics software that uses a data lake as its storage layer want a consistent ordering when they view the data.
[I] [BUG] string row filter ignore 2nd (and onwards And) [iceberg-python]
puchengy opened a new issue, #64: URL: https://github.com/apache/iceberg-python/issues/64 ### Apache Iceberg version None ### Please describe the bug 🐞 ``` tasks = table.scan(row_filter="dt='2023-08-20' AND view_type=1 AND hr='00' ").plan_files() ``` Only the filters on `dt` and `view_type` are applied; the filter on `hr` is ignored.
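For reference, a chained `AND` is expected to fold into nested binary conjunctions so that every predicate survives. The sketch below is plain Python with hypothetical `Eq` and `And` classes. It is not pyiceberg's parser and does not identify the cause of the bug; it only shows the expected shape for three predicates.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Eq:
    column: str
    value: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


def fold_and(predicates):
    """Left-fold a list of predicates into nested binary And nodes."""
    return reduce(And, predicates)


def columns(expr):
    """Collect every column referenced by the expression tree."""
    if isinstance(expr, And):
        return columns(expr.left) | columns(expr.right)
    return {expr.column}


expr = fold_and([Eq("dt", "2023-08-20"), Eq("view_type", 1), Eq("hr", "00")])
referenced = columns(expr)  # all three columns, including "hr"
```

A check like `columns(expr)` is one way a regression test could assert that no term of the filter is dropped.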
Re: [I] How to read data in the order in which files are commited? [iceberg]
Zhanxiao-Ma commented on issue #8802: URL: https://github.com/apache/iceberg/issues/8802#issuecomment-1760700741 > Currently there is no way to order the scan task. The planning side specifically makes sure that even the planning could be done by parallel threads (reading manifests files parallel) > > Sometimes we need to do similar thing in Flink Source, and we ended up creating our own comparator for this which compares Iceberg splits (which are a wrapper above ScanTasks). > > You can do something similar like this in java code with one serious caveat: For a big table you might not want/able to keep all of the tasks in memory, which is needed for sorting. What we do in flink is limit the number of snapshots to read once. > > I hope this helps, Peter
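A rough sketch of the idea in the quoted reply, written in Python rather than Java to stay self-contained. `Task`, its fields, and `ordered_tasks` are hypothetical stand-ins, not Iceberg or Flink classes: tasks are sorted with an explicit key, and only a bounded number of snapshots is held in memory at a time.

```python
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Task:
    sequence_number: int  # assumed to increase with commit order
    file_path: str


def ordered_tasks(tasks_by_snapshot, max_snapshots_per_batch):
    """Yield tasks in a deterministic order, one bounded batch at a time.

    tasks_by_snapshot: iterable of task lists, oldest snapshot first.
    """
    snapshots = iter(tasks_by_snapshot)
    while True:
        batch = list(islice(snapshots, max_snapshots_per_batch))
        if not batch:
            return
        flat = [task for snapshot in batch for task in snapshot]
        # The sort key plays the role of the custom comparator.
        yield from sorted(flat, key=lambda t: (t.sequence_number, t.file_path))


snapshots = [
    [Task(1, "b.parquet"), Task(1, "a.parquet")],
    [Task(2, "c.parquet")],
    [Task(3, "e.parquet"), Task(3, "d.parquet")],
]
paths = [t.file_path for t in ordered_tasks(snapshots, max_snapshots_per_batch=2)]
```

Because each batch is sorted independently, the overall order is only stable if snapshots arrive oldest first, which this sketch assumes.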
Re: [I] How to read data in the order in which files are commited? [iceberg]
Zhanxiao-Ma closed issue #8802: How to read data in the order in which files are commited? URL: https://github.com/apache/iceberg/issues/8802
[I] How to read data in the order in which files are commited? [iceberg]
Zhanxiao-Ma opened a new issue, #8802: URL: https://github.com/apache/iceberg/issues/8802 ### Query engine Iceberg Java API (version 0.14.1) ### Question I want the data read from Iceberg to be in the same order every time, but I can't find an attribute that would make FileScanTask ordered. Is there a way I can implement this?
Re: [I] How to read data in the order in which files are commited? [iceberg]
Zhanxiao-Ma commented on issue #8802: URL: https://github.com/apache/iceberg/issues/8802#issuecomment-1760704067 > Sometimes we need to do similar thing in Flink Source, and we ended up creating our own comparator for this which compares Iceberg splits (which are a wrapper above ScanTasks). I'm sorry, I didn't quite understand this point. Could you please explain it in more detail?
[PR] feat: First version of rest catalog. [iceberg-rust]
liurenjie1024 opened a new pull request, #78: URL: https://github.com/apache/iceberg-rust/pull/78 In this PR we add initial support for the REST catalog, covering the simple REST APIs. More complex APIs such as create table, update table, and commits will be added in follow-up PRs so that each PR stays a reasonable size.
Re: [PR] feat: First version of rest catalog. [iceberg-rust]
liurenjie1024 commented on PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#issuecomment-1760707832 cc @JanKaul @Xuanwo @Fokko @ZENOTME PTAL
Re: [PR] feat: First version of rest catalog. [iceberg-rust]
Xuanwo commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1357725856 ## crates/iceberg/src/catalog/rest.rs: ## @@ -0,0 +1,912 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +//! This module contains rest catalog implementation. 
+ +use std::collections::HashMap; + +use async_trait::async_trait; +use reqwest::header::{self, HeaderMap, HeaderName, HeaderValue}; +use reqwest::{Client, Request}; +use serde::de::DeserializeOwned; +use urlencoding::encode; + +use crate::error::Result; +use crate::table::Table; +use crate::{ +Catalog, Error, ErrorKind, Namespace, NamespaceIdent, TableCommit, TableCreation, TableIdent, +}; + +use self::_serde::{ +CatalogConfig, ErrorModel, ErrorResponse, ListNamespaceResponse, ListTableResponse, +NamespaceSerde, RenameTableRequest, NO_CONTENT, OK, +}; + +const ICEBERG_REST_SPEC_VERSION: &str = "0.14.1"; +const PATH_V1: &str = "v1"; + +#[derive(Debug, Builder)] +pub struct RestCatalogConfig { +uri: String, +#[builder(default)] +warehouse: Option, + +#[builder(default)] +props: HashMap, +} + +impl RestCatalogConfig { +fn config_endpoint(&self) -> String { +[&self.uri, PATH_V1, "config"].join("/") +} + +fn namespaces_endpoint(&self) -> String { +[&self.uri, PATH_V1, "namespaces"].join("/") +} + +fn namespace_endpoint(&self, ns: &NamespaceIdent) -> Result { +Ok([&self.uri, PATH_V1, "namespaces", &ns.encode_in_url()?].join("/")) +} + +fn tables_endpoint(&self, ns: &NamespaceIdent) -> Result { +Ok([ +&self.uri, +PATH_V1, +"namespaces", +&ns.encode_in_url()?, +"tables", +] +.join("/")) +} + +fn rename_table_endpoint(&self) -> Result { +Ok([&self.uri, PATH_V1, "tables", "rename"].join("/")) +} + +fn table_endpoint(&self, table: &TableIdent) -> Result { +Ok([ +&self.uri, +PATH_V1, +"namespaces", +&table.namespace.encode_in_url()?, +"tables", +encode(&table.name).as_ref(), +] +.join("/")) +} + +fn try_create_rest_client(&self) -> Result { +//TODO: We will add oauth, ssl config, sigv4 later +let mut headers = HeaderMap::new(); +headers.insert( +header::CONTENT_TYPE, +HeaderValue::from_static("application/json"), +); +headers.insert( +HeaderName::from_static("x-client-version"), +HeaderValue::from_static(ICEBERG_REST_SPEC_VERSION), +); +headers.insert( +header::USER_AGENT, 
+HeaderValue::from_str(&format!("iceberg-rs/{}", env!("CARGO_PKG_VERSION"))).unwrap(), +); + +Ok(HttpClient( +Client::builder().default_headers(headers).build()?, +)) +} +} + +impl NamespaceIdent { +/// Returns url encoded format. +pub fn encode_in_url(&self) -> Result { +if self.0.is_empty() { Review Comment: It's better to ensure that a `NamespaceIdent` is always valid so that we don't have to check it when using it. This change would remove a lot of `Result`s from the related APIs. ## crates/iceberg/src/catalog/rest.rs: ## @@ -0,0 +1,912 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +//! This module contains rest catalog implementation. + +use std::collection
Re: [I] struct value design [iceberg-rust]
liurenjie1024 commented on issue #77: URL: https://github.com/apache/iceberg-rust/issues/77#issuecomment-1760723094 > Another solution is pass struct type as another parameter when we need it, e.g. I prefer this approach. It feels odd to me to store types with values, and we can always attach the type when necessary.
Re: [PR] feat: First version of rest catalog. [iceberg-rust]
liurenjie1024 commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1357732820 ## crates/iceberg/Cargo.toml: ## @@ -41,20 +41,24 @@ either = "1" futures = "0.3" itertools = "0.11" lazy_static = "1" +log = "^0.4" murmur3 = "0.5.2" once_cell = "1" opendal = "0.40" ordered-float = "4.0.0" +reqwest = { version = "^0.11", features = ["json"] } Review Comment: Maybe we should make it a feature? I'm not sure it deserves another crate. cc @JanKaul @ZENOTME What do you think?
Re: [PR] feat: First version of rest catalog. [iceberg-rust]
Xuanwo commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1357738084 ## crates/iceberg/Cargo.toml: ## @@ -41,20 +41,24 @@ either = "1" futures = "0.3" itertools = "0.11" lazy_static = "1" +log = "^0.4" murmur3 = "0.5.2" once_cell = "1" opendal = "0.40" ordered-float = "4.0.0" +reqwest = { version = "^0.11", features = ["json"] } Review Comment: The problem with features is that they are additive, which makes it difficult for users to disable them.
Re: [I] rewrite_position_delete_files leads to error [iceberg]
atifiu commented on issue #8045: URL: https://github.com/apache/iceberg/issues/8045#issuecomment-1760787847 @szehon-ho Thanks for the fix. I am facing the same issue on Iceberg 1.3.0 while trying to remove delete files using the `rewrite_position_delete_files` procedure. The reason I need to remove the delete files is that aggregate pushdown fails with this message: `SparkScanBuilder: Skipping aggregate pushdown: detected row level deletes`. https://github.com/apache/iceberg/pull/6252#issuecomment-1757848584 I am also still not sure how the delete files were created, given that I have defined Merge on Read for DML operations. https://github.com/apache/iceberg/pull/6252#issuecomment-1758873680 So my questions are: how can we remove delete files while still on 1.3.0? Is it somehow possible to manually remove the references to delete files without corrupting the metadata? Thanks for your help. ``` 23/10/13 00:16:56 ERROR RewritePositionDeleteFilesSparkAction: Failure during rewrite group FileGroupInfo{globalIndex=1, partitionIndex=1, partition=org.apache.iceberg.util.StructProjection@3162902b} org.apache.spark.sql.AnalysisException: cannot resolve '(partition.`page_view_dtm_day` = 18384)' due to data type mismatch: differing types in '(partition.`page_view_dtm_day` = 18384)' (date and int).; 'Filter (partition#4925.page_view_dtm_day = 18384) +- RelationV2[content#4921, file_path#4922, file_format#4923, spec_id#4924, partition#4925, record_count#4926L, file_size_in_bytes#4927L, column_sizes#4928, value_counts#4929, null_value_counts#4930, nan_value_counts#4931, lower_bounds#4932, upper_bounds#4933, key_metadata#4934, split_offsets#4935, equality_ids#4936, sort_order_id#4937, readable_metrics#4938] ```
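The `(date and int)` mismatch in the error above is consistent with the literal `18384` being an integer day ordinal (days since 1970-01-01) compared against a column typed as a date. A small sketch of the conversion in plain Python; `days_to_date` and `date_to_days` are illustrative helpers, not Iceberg or Spark APIs:

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def days_to_date(days):
    """Convert a days-since-epoch ordinal to a calendar date."""
    return EPOCH + timedelta(days=days)


def date_to_days(d):
    """Convert a calendar date back to a days-since-epoch ordinal."""
    return (d - EPOCH).days


partition_date = days_to_date(18384)  # 2020-05-02
round_trip = date_to_days(partition_date)  # 18384
```

In other words, the failing filter appears to target the `page_view_dtm_day` partition for 2020-05-02, expressed as an integer rather than a date literal.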
Re: [PR] push down min/max/count to iceberg [iceberg]
atifiu commented on PR #6252: URL: https://github.com/apache/iceberg/pull/6252#issuecomment-1760825445 @huaxingao What are the possible reasons for aggregate pushdown not working when filters are used? If you can give me some ideas or hints, I will look into it further.
Re: [PR] feat: First version of rest catalog. [iceberg-rust]
liurenjie1024 commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1357791341 ## crates/iceberg/Cargo.toml: ## @@ -41,20 +41,24 @@ either = "1" futures = "0.3" itertools = "0.11" lazy_static = "1" +log = "^0.4" murmur3 = "0.5.2" once_cell = "1" opendal = "0.40" ordered-float = "4.0.0" +reqwest = { version = "^0.11", features = ["json"] } Review Comment: Sorry, I don't get your point. Could you give a concrete example? One concern with the separate-crate approach is that it makes loading a catalog dynamically, as Python does, difficult.
[I] Flaky test: TestRemoveOrphanFilesAction3 > orphanedFileRemovedWithParallelTasks [iceberg]
ajantha-bhat opened a new issue, #8824: URL: https://github.com/apache/iceberg/issues/8824 PR: https://github.com/apache/iceberg/pull/8822 Build: https://github.com/apache/iceberg/actions/runs/6499875599/job/17655030618?pr=8822 ``` TestRemoveOrphanFilesAction3 > orphanedFileRemovedWithParallelTasks FAILED java.lang.AssertionError: Should delete 4 files expected:<4> but was:<3> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at org.apache.iceberg.spark.actions.TestRemoveOrphanFilesAction.orphanedFileRemovedWithParallelTasks(TestRemoveOrphanFilesAction.java:307) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at 
org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80) at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86) at org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86) at 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:110) at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:90) at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:85) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:62) at
Re: [I] Flaky test: TestRemoveOrphanFilesAction3 > orphanedFileRemovedWithParallelTasks [iceberg]
ajantha-bhat commented on issue #8824: URL: https://github.com/apache/iceberg/issues/8824#issuecomment-1760921944 Looks like a regression: https://github.com/apache/iceberg/pull/4859 It seems we tried to fix this a long time ago but did not fix it properly.
[PR] feat: suport read/write Manifest [iceberg-rust]
ZENOTME opened a new pull request, #79: URL: https://github.com/apache/iceberg-rust/pull/79 This PR prepares support for reading and writing manifests. Related issue: #36
Re: [PR] feat: suport read/write Manifest [iceberg-rust]
ZENOTME commented on PR #79: URL: https://github.com/apache/iceberg-rust/pull/79#issuecomment-1760940516 For now, this is not complete. Only the basic design is done, and I want to confirm that the design is sound first. If it looks good, I will complete it and add tests later. cc @JanKaul @Fokko @Xuanwo @liurenjie1024
Re: [PR] feat: First version of rest catalog. [iceberg-rust]
ZENOTME commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1357818399
## crates/iceberg/src/catalog/rest.rs: ##
@@ -0,0 +1,900 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! This module contains rest catalog implementation.
+
+use std::collections::HashMap;
+
+use async_trait::async_trait;
+use reqwest::header::{self, HeaderMap, HeaderName, HeaderValue};
+use reqwest::{Client, Request};
+use serde::de::DeserializeOwned;
+use urlencoding::encode;
+
+use crate::error::Result;
+use crate::table::Table;
+use crate::{
+    Catalog, Error, ErrorKind, Namespace, NamespaceIdent, TableCommit, TableCreation, TableIdent,
+};
+
+use self::_serde::{
+    CatalogConfig, ErrorModel, ErrorResponse, ListNamespaceResponse, ListTableResponse,
+    NamespaceSerde, RenameTableRequest, NO_CONTENT, OK,
+};
+
+const ICEBERG_REST_SPEC_VERSION: &str = "0.14.1";
+const PATH_V1: &str = "v1";
+
+#[derive(Debug, Builder)]
+pub struct RestCatalogConfig {
+    uri: String,
+    #[builder(default)]
+    warehouse: Option<String>,
+
+    #[builder(default)]
+    props: HashMap<String, String>,
+}
+
+impl RestCatalogConfig {
+    fn config_endpoint(&self) -> String {
+        [&self.uri, PATH_V1, "config"].join("/")
+    }
+
+    fn namespaces_endpoint(&self) -> String {
+        [&self.uri, PATH_V1, "namespaces"].join("/")
+    }
+
+    fn namespace_endpoint(&self, ns: &NamespaceIdent) -> String {
+        [&self.uri, PATH_V1, "namespaces", &ns.encode_in_url()].join("/")
+    }
+
+    fn tables_endpoint(&self, ns: &NamespaceIdent) -> String {
+        [
+            &self.uri,
+            PATH_V1,
+            "namespaces",
+            &ns.encode_in_url(),
+            "tables",
+        ]
+        .join("/")
+    }
+
+    fn rename_table_endpoint(&self) -> String {
+        [&self.uri, PATH_V1, "tables", "rename"].join("/")
+    }
+
+    fn table_endpoint(&self, table: &TableIdent) -> String {
+        [
+            &self.uri,
+            PATH_V1,
+            "namespaces",
+            &table.namespace.encode_in_url(),
+            "tables",
+            encode(&table.name).as_ref(),
+        ]
+        .join("/")
+    }
+
+    fn try_create_rest_client(&self) -> Result<HttpClient> {
+        //TODO: We will add oauth, ssl config, sigv4 later
+        let mut headers = HeaderMap::new();
+        headers.insert(
+            header::CONTENT_TYPE,
+            HeaderValue::from_static("application/json"),
+        );
+        headers.insert(
+            HeaderName::from_static("x-client-version"),
+            HeaderValue::from_static(ICEBERG_REST_SPEC_VERSION),
+        );
+        headers.insert(
+            header::USER_AGENT,
+            HeaderValue::from_str(&format!("iceberg-rs/{}", env!("CARGO_PKG_VERSION"))).unwrap(),
+        );
+
+        Ok(HttpClient(
+            Client::builder().default_headers(headers).build()?,
+        ))
+    }
+}
+
+impl NamespaceIdent {
+    /// Returns url encoded format.
+    pub fn encode_in_url(&self) -> String {
+        encode(&self.0.join("\u{1F}")).to_string()
+    }
+}
+
+struct HttpClient(Client);
+
+impl HttpClient {
+    async fn execute<
+        R: DeserializeOwned,
+        E: DeserializeOwned + Into<Error>,
+        const SUCCESS_CODE: u16,
+    >(
+        &self,
+        request: Request,
+    ) -> Result<R> {
+        let resp = self.0.execute(request).await?;
+
+        if resp.status().as_u16() == SUCCESS_CODE {
+            let text = resp.bytes().await?;
+            Ok(serde_json::from_slice::<R>(&text).map_err(|e| {
+                Error::new(
+                    ErrorKind::Unexpected,
+                    "Failed to parse response from rest catalog server!",
+                )
+                .with_context("json", String::from_utf8_lossy(&text))
+                .with_source(e)
+            })?)
+        } else {
+            let text = resp.bytes().await?;
+            let e = serde_json::from_slice::<E>(&text).map_err(|e| {
+                Error::new(
+                    ErrorKind::Unexpected,
+                    "Failed to parse response from rest catalog server!",
+                )
+                .with_context("json", String::from_utf8_lossy(&text))
+                .with_source(