Re: [I] Spark: add IcebergConnectHiveDelegationTokenProvider [iceberg]

2025-05-26 Thread via GitHub


pvary commented on issue #13116:
URL: https://github.com/apache/iceberg/issues/13116#issuecomment-2909014482

   @zhangwl9: It would be nice to have both the Spark and the Flink delegation 
tokens supported by the HiveCatalog.





Re: [PR] feat: Introduce snapshot summary properties [iceberg-rust]

2025-05-26 Thread via GitHub


dentiny commented on PR #1336:
URL: https://github.com/apache/iceberg-rust/pull/1336#issuecomment-2909013780

   @Xuanwo friendly ping :) 





Re: [PR] OpenAPI: Add missing schema field for TableMetadata [iceberg]

2025-05-26 Thread via GitHub


nastra commented on PR #13152:
URL: https://github.com/apache/iceberg/pull/13152#issuecomment-2908991182

   Since this is a Spec change, can you please mention this on the 
[d...@iceberg.apache.org](mailto:d...@iceberg.apache.org) list so that people 
are aware and can vote on this change?





Re: [PR] AWS: update test cases to verify credentials for the prefixed S3 client [iceberg]

2025-05-26 Thread via GitHub


nastra commented on code in PR #13118:
URL: https://github.com/apache/iceberg/pull/13118#discussion_r2106652089


##
aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIO.java:
##
@@ -862,6 +919,110 @@ public void multipleStorageCredentialsConfigured() {
 s3FileIOProperties2
 .extracting(S3FileIOProperties::sessionToken)
 .isEqualTo("sessionTokenFromCredential2");
+// verify that the credentials identity gets the correct credentials for 
the prefixed S3 client
+assertThat(fileIO.client("s3://custom-uri/2").serviceClientConfiguration())
+.extracting(AwsServiceClientConfiguration::credentialsProvider)
+.extracting(IdentityProvider::resolveIdentity)
+.satisfies(
+x -> {
+  AwsSessionCredentialsIdentity identity = 
(AwsSessionCredentialsIdentity) x.get();
+  assertThat(identity).isNotNull();
+  
assertThat(identity.accessKeyId()).isEqualTo("keyIdFromCredential2");
+  
assertThat(identity.secretAccessKey()).isEqualTo("accessKeyFromCredential2");
+  
assertThat(identity.sessionToken()).isEqualTo("sessionTokenFromCredential2");
+});
+  }
+
+  @Test
+  public void noStorageCredentialConfiguredWithoutCredentialsInProperties() {
+S3FileIO fileIO = new S3FileIO();
+fileIO.initialize(ImmutableMap.of("client.region", "us-east-1"));
+
+S3ServiceClientConfiguration actualConfiguration = 
fileIO.client().serviceClientConfiguration();
+assertThat(actualConfiguration).isNotNull();
+assertThatThrownBy(() -> 
actualConfiguration.credentialsProvider().resolveIdentity())
+.isInstanceOf(SdkClientException.class)
+.hasMessageContaining("Unable to load credentials from any of the 
providers");
+  }
+
+  @Test
+  public void storageCredentialsConfiguredOverwriteCredentialsInProperties()
+  throws ExecutionException, InterruptedException {
+StorageCredential s3Credential =
+StorageCredential.create(
+"s3",
+ImmutableMap.of(
+"s3.access-key-id",
+"updateKeyIdFromCredential",
+"s3.secret-access-key",
+"updateAccessKeyFromCredential",
+"s3.session-token",
+"updateSessionTokenFromCredential"));
+
+S3FileIO fileIO = new S3FileIO();
+fileIO.setCredentials(ImmutableList.of(s3Credential));
+fileIO.initialize(
+ImmutableMap.of(
+"client.region",
+"us-east-1",
+"s3.access-key-id",
+"keyIdFromProperties",
+"s3.secret-access-key",
+"accessKeyFromProperties",
+"s3.session-token",
+"sessionTokenFromProperties"));
+
+S3ServiceClientConfiguration actualConfiguration = 
fileIO.client().serviceClientConfiguration();
+assertThat(actualConfiguration).isNotNull();
+AwsSessionCredentialsIdentity actualCredIdentity =
+(AwsSessionCredentialsIdentity)
+actualConfiguration.credentialsProvider().resolveIdentity().get();
+
assertThat(actualCredIdentity.accessKeyId()).isEqualTo("updateKeyIdFromCredential");
+
assertThat(actualCredIdentity.secretAccessKey()).isEqualTo("updateAccessKeyFromCredential");
+
assertThat(actualCredIdentity.sessionToken()).isEqualTo("updateSessionTokenFromCredential");
+  }
+
+  @Test
+  public void storageCredentialsConfiguredWithoutCredentialsInProperties() {

Review Comment:
   this is already covered in existing tests






Re: [I] Spark: add IcebergConnectHiveDelegationTokenProvider [iceberg]

2025-05-26 Thread via GitHub


gaborgsomogyi commented on issue #13116:
URL: https://github.com/apache/iceberg/issues/13116#issuecomment-2909024138

   +1. Since I'm a Flink and Spark committer and the author of the delegation 
token frameworks on both sides, I'm happy to help if there are any questions.





Re: [PR] Core: Add basic classes for writing table format-version 4 [iceberg]

2025-05-26 Thread via GitHub


nastra commented on code in PR #13123:
URL: https://github.com/apache/iceberg/pull/13123#discussion_r2106866848


##
api/src/test/java/org/apache/iceberg/TestHelpers.java:
##
@@ -54,7 +54,7 @@ public class TestHelpers {
 
   private TestHelpers() {}
 
-  public static final int MAX_FORMAT_VERSION = 3;
+  public static final int MAX_FORMAT_VERSION = 4;

Review Comment:
   By the time this code is released to users, we should have the v4 Spec items 
documented. As I mentioned earlier, we took the same approach with v3 as well.






Re: [PR] feat: support decompress gzip metadata [iceberg-cpp]

2025-05-26 Thread via GitHub


yingcai-cy commented on code in PR #108:
URL: https://github.com/apache/iceberg-cpp/pull/108#discussion_r2106756398


##
src/iceberg/util/gzip_internal.cc:
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include "iceberg/util/gzip_internal.h"
+
+#include 
+
+#include 
+
+#include "iceberg/util/macros.h"
+
+namespace iceberg {
+
+class ZlibImpl {

Review Comment:
   IMHO, if this code is only used for a one-time decompression, introducing 
those classes might be unnecessarily complex. Maybe one util method is enough.



##
test/metadata_io_test.cc:
##
@@ -78,4 +83,66 @@ TEST_F(MetadataIOTest, ReadWriteMetadata) {
   EXPECT_EQ(*metadata_read, metadata);
 }
 
+TEST_F(MetadataIOTest, ReadWriteCompressedMetadata) {
+  std::vector schema_fields;
+  schema_fields.emplace_back(/*field_id=*/1, "x", std::make_shared(),
+ /*optional=*/false);
+  auto schema = std::make_shared(std::move(schema_fields), 
/*schema_id=*/1);
+
+  TableMetadata metadata{.format_version = 1,

Review Comment:
   Extract to a helper method to get TableMetadata?






Re: [PR] Core: Add basic classes for writing table format-version 4 [iceberg]

2025-05-26 Thread via GitHub


nastra commented on code in PR #13123:
URL: https://github.com/apache/iceberg/pull/13123#discussion_r2106866848


##
api/src/test/java/org/apache/iceberg/TestHelpers.java:
##
@@ -54,7 +54,7 @@ public class TestHelpers {
 
   private TestHelpers() {}
 
-  public static final int MAX_FORMAT_VERSION = 3;
+  public static final int MAX_FORMAT_VERSION = 4;

Review Comment:
   By the time this code is released to users, we should have the [v4 Spec 
items](https://github.com/apache/iceberg/milestone/58) documented. As I 
mentioned earlier, we took the same approach with v3 as well.






Re: [PR] Experiment implementation for catalog builder [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 commented on PR #1231:
URL: https://github.com/apache/iceberg-rust/pull/1231#issuecomment-2909044061

   > Are we sure that we need the CatalogBuilder trait? What would be its 
purpose?
   
   Hi, @c-thiel Sorry for the confusion. For more background, please refer 
to https://github.com/apache/iceberg-rust/issues/1228. In short, we are trying 
to develop a catalog loader that can be used by applications such as 
iceberg-playground or a data-driven integration test framework. 
   
   > My initial idea would be to just have different typesafe builders for 
different catalogs. The current CatalogBuilder trait currently contains fields 
that are not required for some catalogs. For example in-memory or dynamo don't 
even need a uri.
   
   I've refined the trait definition as proposed by @Xuanwo in #1372 .
   
   > On the other hand we have the rest catalog that needs [significantly more 
configurations](https://py.iceberg.apache.org/configuration/#rest-catalog), and 
I don't think we want to make those key value pairs. I would be in favor to 
have the interfaces as typesafe as possible.
   
   I think a purely type-safe approach may not be practical, since we also need 
to pass FileIO configurations through the catalog, and the actual FileIO used is 
determined at runtime.





[I] Incrementally computing partition stats can miss deleted files [iceberg]

2025-05-26 Thread via GitHub


lirui-apache opened a new issue, #13155:
URL: https://github.com/apache/iceberg/issues/13155

   ### Apache Iceberg version
   
   main (development)
   
   ### Query engine
   
   None
   
   ### Please describe the bug 🐞
   
   It seems the stats diff is computed only from the current snapshot, which may 
not track deleted files. For example, the following test case would fail:
   ```java
 @Test
 public void test() throws Exception {
   Table testTable =
   TestTables.create(tempDir("my_test"), "my_test", SCHEMA, SPEC, 2, 
fileFormatProperty);
   
   DataFile dataFile1 =
   DataFiles.builder(SPEC)
   .withPath("/df1.parquet")
   .withPartitionPath("c2=a/c3=a")
   .withFileSizeInBytes(10)
   .withRecordCount(1)
   .build();
   DataFile dataFile2 =
   DataFiles.builder(SPEC)
   .withPath("/df2.parquet")
   .withPartitionPath("c2=b/c3=b")
   .withFileSizeInBytes(10)
   .withRecordCount(1)
   .build();
   
   
testTable.newAppend().appendFile(dataFile1).appendFile(dataFile2).commit();
   
   testTable
   .updatePartitionStatistics()
   
.setPartitionStatistics(PartitionStatsHandler.computeAndWriteStatsFile(testTable))
   .commit();
   
   testTable.newDelete().deleteFile(dataFile1).commit();
   testTable.newDelete().deleteFile(dataFile2).commit();
   
   PartitionStatisticsFile statsFile = 
PartitionStatsHandler.computeAndWriteStatsFile(testTable);
   
   assertThat(
           PartitionStatsHandler.readPartitionStatsFile(
                   PartitionStatsHandler.schema(Partitioning.partitionType(testTable)),
                   Files.localInput(statsFile.path())))
       .allMatch(s -> s.dataRecordCount() == 0);
 }
   ```
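   For illustration, a rough sketch (not tested; the helper class and the 
`lastStatsSequenceNumber` parameter are hypothetical) of the traversal an 
incremental recomputation would need: instead of diffing only the current 
snapshot, it has to walk every snapshot committed after the one the previous 
stats file was computed for and subtract the record counts of the files those 
snapshots removed. The `Snapshot#addedDataFiles`/`#removedDataFiles` calls are 
existing Iceberg APIs.
   ```java
import java.util.Map;
import org.apache.iceberg.DataFile;
import org.apache.iceberg.Snapshot;
import org.apache.iceberg.StructLike;
import org.apache.iceberg.Table;
import org.apache.iceberg.relocated.com.google.common.collect.Maps;

// Hypothetical helper, not part of PartitionStatsHandler: collects per-partition
// record-count deltas for snapshots newer than the last stats computation.
class IncrementalPartitionStatsSketch {

  static Map<StructLike, Long> recordCountDeltas(Table table, long lastStatsSequenceNumber) {
    // NOTE: real code would wrap partition keys in a type with proper equals/hashCode
    Map<StructLike, Long> deltas = Maps.newHashMap();
    for (Snapshot snapshot : table.snapshots()) {
      if (snapshot.sequenceNumber() <= lastStatsSequenceNumber) {
        continue; // already reflected in the previous partition stats file
      }

      // files added after the last stats computation increase the counts
      for (DataFile added : snapshot.addedDataFiles(table.io())) {
        deltas.merge(added.partition(), added.recordCount(), Long::sum);
      }

      // files removed after the last stats computation must decrease the counts;
      // this is exactly what a diff of the current snapshot alone misses
      for (DataFile removed : snapshot.removedDataFiles(table.io())) {
        deltas.merge(removed.partition(), -removed.recordCount(), Long::sum);
      }
    }

    return deltas;
  }
}
   ```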
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time





Re: [PR] Hive: Throw exception for when listing a non-existing namespace [iceberg]

2025-05-26 Thread via GitHub


pvary commented on code in PR #13130:
URL: https://github.com/apache/iceberg/pull/13130#discussion_r2106791809


##
hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java:
##
@@ -1209,11 +1208,4 @@ public void 
testDatabaseLocationWithSlashInWarehouseDir() {
 
 assertThat(database.getLocationUri()).isEqualTo("s3://bucket/database.db");
   }
-
-  @Test
-  @Override
-  @Disabled("Hive currently returns an empty list instead of throwing a 
NoSuchNamespaceException")
-  public void testListNonExistingNamespace() {
-super.testListNonExistingNamespace();
-  }

Review Comment:
   Should we highlight this behavioral change in some doc?






Re: [I] Incrementally computing partition stats can miss deleted files [iceberg]

2025-05-26 Thread via GitHub


lirui-apache commented on issue #13155:
URL: https://github.com/apache/iceberg/issues/13155#issuecomment-2909052640

   Hi @ajantha-bhat, could you help verify the issue? Thanks!





Re: [PR] feat: support decompress gzip metadata [iceberg-cpp]

2025-05-26 Thread via GitHub


yingcai-cy commented on code in PR #108:
URL: https://github.com/apache/iceberg-cpp/pull/108#discussion_r2103886619


##
src/iceberg/table_metadata.cc:
##
@@ -153,14 +154,70 @@ Result 
TableMetadataUtil::CodecFromFileName(
   return MetadataFileCodecType::kNone;
 }
 
+class GZipDecompressor {
+ public:
+  GZipDecompressor() : initialized_(false) {}
+
+  ~GZipDecompressor() {
+if (initialized_) {
+  inflateEnd(&stream_);
+}
+  }
+
+  Status Init() {
+int ret = inflateInit2(&stream_, 15 + 32);
+if (ret != Z_OK) {
+  return IOError("inflateInit2 failed, result:{}", ret);
+}
+initialized_ = true;
+return {};
+  }
+
+  Result<std::string> Decompress(const std::string& compressed_data) {
+    if (compressed_data.empty()) {
+      return {};
+    }
+    if (!initialized_) {
+      ICEBERG_RETURN_UNEXPECTED(Init());
+    }
+    stream_.avail_in = static_cast<uInt>(compressed_data.size());
+    stream_.next_in = reinterpret_cast<Bytef*>(const_cast<char*>(compressed_data.data()));
+
+    // TODO(xiao.dong) magic buffer 16k, can we get an estimated size from compressed data?
+    std::vector<char> outBuffer(32 * 1024);
+    std::string result;
+    int ret = 0;
+    do {
+      outBuffer.resize(outBuffer.size());
+      stream_.avail_out = static_cast<uInt>(outBuffer.size());
+      stream_.next_out = reinterpret_cast<Bytef*>(outBuffer.data());
+      ret = inflate(&stream_, Z_NO_FLUSH);
+      if (ret != Z_OK && ret != Z_STREAM_END) {
+        return IOError("inflate failed, result:{}", ret);
+      }
+      result.append(outBuffer.data(), outBuffer.size() - stream_.avail_out);
+    } while (ret != Z_STREAM_END);
+    return result;
+  }
+
+ private:
+  bool initialized_ = false;
+  z_stream stream_;
+};
+
 Result> TableMetadataUtil::Read(
 FileIO& io, const std::string& location, std::optional length) {
   ICEBERG_ASSIGN_OR_RAISE(auto codec_type, CodecFromFileName(location));
+
+  ICEBERG_ASSIGN_OR_RAISE(auto content, io.ReadFile(location, length));
   if (codec_type == MetadataFileCodecType::kGzip) {
-return NotImplemented("Reading gzip-compressed metadata files is not 
supported yet");
+    auto gzip_decompressor = std::make_unique<GZipDecompressor>();

Review Comment:
   I think using functions like gzopen can simplify this code a bit.






Re: [PR] chore: introduce `nightly` feature flag to provide error backtrace [iceberg-rust]

2025-05-26 Thread via GitHub


Xuanwo commented on PR #1340:
URL: https://github.com/apache/iceberg-rust/pull/1340#issuecomment-2908909319

   Detecting nightly features is not recommended by the Rust community, as 
nightly features can be renamed or removed at any time.
   
   For example: https://github.com/tkaitchuck/aHash/
   
   > Fundamentally this is caused by logic in build.rs that auto-enables a 
nightly feature when it detects that it is built with a nightly rustc. Such 
logic is fragile and prone to errors as nightly features evolve before 
stabilization.
   >
   > Crates should never automatically enable nightly features, this should 
generally be opt-in. If they try to do it automatically they need to be very 
careful and account for the fact that nightly features are not stable, and 
might look very different in future nightly versions.
   >
   > I see aHash still does the same kind of auto-detection with the specialize 
feature. That's a similar issue just waiting to happen.
   
   Similar things have happened many times before. We don't want to disrupt our 
users, especially those using the nightly version of rustc. I understand the 
motivation behind this change, but since we have already provided an 
alternative, perhaps we can wait until this feature is stabilized.





Re: [PR] Flink: Migrate Flink TableSchema for IcebergSource [iceberg]

2025-05-26 Thread via GitHub


pvary commented on code in PR #13072:
URL: https://github.com/apache/iceberg/pull/13072#discussion_r2106783588


##
flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/FlinkSchemaUtil.java:
##
@@ -192,7 +222,9 @@ public static Type convert(LogicalType flinkType) {
*
* @param rowType a RowType
* @return Flink TableSchema
+   * @deprecated use {@link #toResolvedSchema(RowType)} instead
*/
+  @Deprecated
   public static TableSchema toSchema(RowType rowType) {

Review Comment:
   What are these validations in `build`?
   - IIUC, we don't set watermarks, so those checks are not relevant
   - IIUC, we don't set primary keys, so those checks are not relevant
   
   By my understanding, the only relevant check is for duplicated row names.
   Could you please check if there is anything else we need to check?
   
   Thanks,
   Peter






Re: [PR] Flink: Support compact in iceberg sink v2 [iceberg]

2025-05-26 Thread via GitHub


pvary commented on code in PR #12979:
URL: https://github.com/apache/iceberg/pull/12979#discussion_r2106809132


##
flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/sink/CommittableToTableChangeConverter.java:
##
@@ -0,0 +1,181 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.flink.sink;
+
+import java.io.IOException;
+import java.util.List;
+import org.apache.flink.api.common.functions.OpenContext;
+import org.apache.flink.api.common.state.CheckpointListener;
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.core.io.SimpleVersionedSerialization;
+import org.apache.flink.runtime.state.FunctionInitializationContext;
+import org.apache.flink.runtime.state.FunctionSnapshotContext;
+import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
+import org.apache.flink.streaming.api.connector.sink2.CommittableMessage;
+import org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage;
+import org.apache.flink.streaming.api.functions.ProcessFunction;
+import org.apache.flink.util.Collector;
+import org.apache.iceberg.DataFile;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.ManifestFile;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.flink.TableLoader;
+import org.apache.iceberg.flink.maintenance.operator.TableChange;
+import org.apache.iceberg.io.WriteResult;
+import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
+import org.apache.iceberg.relocated.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class CommittableToTableChangeConverter
+    extends ProcessFunction<CommittableMessage<IcebergCommittable>, TableChange>
+    implements CheckpointedFunction, CheckpointListener {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(CommittableToTableChangeConverter.class);
+
+  private final TableLoader tableLoader;
+  private transient Table table;
+  private transient ListState<ManifestFile> manifestFilesToRemoveState;
+  private transient List<ManifestFile> manifestFilesToRemoveList;
+  private transient long lastCompletedCheckpointId = -1L;
+  private transient String flinkJobId;
+
+  public CommittableToTableChangeConverter(TableLoader tableLoader) {
+Preconditions.checkNotNull(tableLoader, "TableLoader should not be null");
+this.tableLoader = tableLoader;
+  }
+
+  @Override
+  public void initializeState(FunctionInitializationContext context) throws 
Exception {
+this.manifestFilesToRemoveList = Lists.newArrayList();
+this.manifestFilesToRemoveState =
+context
+.getOperatorStateStore()
+.getListState(new ListStateDescriptor<>("manifests-to-remove", 
ManifestFile.class));
+if (context.isRestored()) {
+  manifestFilesToRemoveList = 
Lists.newArrayList(manifestFilesToRemoveState.get());
+}
+  }
+
+  @Override
+  public void open(OpenContext openContext) throws Exception {
+super.open(openContext);
+this.flinkJobId = getRuntimeContext().getJobId().toString();
+if (!tableLoader.isOpen()) {
+  tableLoader.open();
+}
+this.table = tableLoader.loadTable();
+  }
+
+  @Override
+  public void snapshotState(FunctionSnapshotContext context) throws Exception {
+manifestFilesToRemoveState.update(manifestFilesToRemoveList);
+  }
+
+  @Override
+  public void processElement(
+      CommittableMessage<IcebergCommittable> value,
+      ProcessFunction<CommittableMessage<IcebergCommittable>, TableChange>.Context ctx,
+      Collector<TableChange> out)
+      throws Exception {
+    if (value instanceof CommittableWithLineage) {
+      CommittableWithLineage<IcebergCommittable> committable =
+          (CommittableWithLineage<IcebergCommittable>) value;
+      TableChange tableChange = convertToTableChange(committable.getCommittable());
+      out.collect(tableChange);
+    }
+  }
+
+  private TableChange convertToTableChange(IcebergCommittable 
icebergCommittable)
+  throws IOException {
+if (icebergCommittable == null || icebergCommittable.manifest().length == 
0) {
+  return TableChange.empty();
+}
+
+DeltaManifests deltaManifests =
+SimpleVersionedSerialization.readVersionAndDeSerialize(
+DeltaManifestsSerializer.INSTANCE, i

[I] Add docs of Spark SQL functions for Iceberg transforms [iceberg]

2025-05-26 Thread via GitHub


manuzhang opened a new issue, #13156:
URL: https://github.com/apache/iceberg/issues/13156

   ### Feature Request / Improvement
   
   Add docs for the following Spark SQL functions for Iceberg transforms (a usage sketch follows the list):
   - system.iceberg_version
   - system.bucket
   - system.years
   - system.month
   - system.days
   - system.hours
   - system.truncate
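   A minimal usage sketch (assumptions: the Iceberg Spark extensions and runtime 
are on the classpath and an Iceberg catalog named `local` is configured; function 
names and argument order should be verified against the Spark module when writing 
the docs):
   ```java
import org.apache.spark.sql.SparkSession;

public class IcebergSystemFunctionsExample {
  public static void main(String[] args) {
    // "local" is an assumed Iceberg catalog name; any configured Iceberg catalog works.
    SparkSession spark =
        SparkSession.builder()
            .appName("iceberg-system-functions")
            .master("local[*]")
            .config("spark.sql.extensions",
                "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
            .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
            .config("spark.sql.catalog.local.type", "hadoop")
            .config("spark.sql.catalog.local.warehouse", "/tmp/warehouse")
            .getOrCreate();

    // Reports the Iceberg version bundled with the runtime
    spark.sql("SELECT local.system.iceberg_version()").show();

    // Transform functions mirror Iceberg partition transforms
    spark.sql("SELECT local.system.bucket(16, 42), local.system.truncate(4, 'iceberg')").show();
    spark.sql(
            "SELECT local.system.years(TIMESTAMP '2025-05-26 00:00:00'), "
                + "local.system.days(TIMESTAMP '2025-05-26 00:00:00')")
        .show();
  }
}
   ```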
   
   ### Query engine
   
   Spark
   
   ### Willingness to contribute
   
   - [ ] I can contribute this improvement/feature independently
   - [ ] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time





Re: [PR] demo: an easy to use catalog loader [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 commented on code in PR #1372:
URL: https://github.com/apache/iceberg-rust/pull/1372#discussion_r2106849970


##
crates/iceberg/src/catalog/mod.rs:
##
@@ -36,7 +37,19 @@ use crate::spec::{
 use crate::table::Table;
 use crate::{Error, ErrorKind, Result};
 
+/// The CatalogLoader trait is used to load a catalog from a given name and 
properties.
+#[async_trait]
+pub trait CatalogLoader: Debug + Send + Sync {
+/// Load a catalog from the given name and properties.
+async fn load(properties: HashMap) -> Result>;

Review Comment:
   There are two problems with this approach:
   1. It is not easy to use when we know the concrete type of the catalog, for 
example when the user just wants to create a RestCatalog. It forces the user to 
downcast, which matters when the catalog has some extra functionality.
   2. It is not easy to use when the catalog builder has an advanced builder 
method, see https://github.com/apache/iceberg-rust/pull/1231/files#r2106848332
   
   I think I can simplify the methods in my original proposal like yours, while 
keeping everything else the same. WDYT?
   



##
crates/iceberg/src/catalog/mod.rs:
##
@@ -36,7 +37,19 @@ use crate::spec::{
 use crate::table::Table;
 use crate::{Error, ErrorKind, Result};
 
+/// The CatalogLoader trait is used to load a catalog from a given name and 
properties.
+#[async_trait]
+pub trait CatalogLoader: Debug + Send + Sync {
+/// Load a catalog from the given name and properties.
+async fn load(properties: HashMap) -> Result>;

Review Comment:
   ```suggestion
   async fn load(name: String, properties: HashMap) -> 
Result>;
   ```






[PR] Build: Bump huggingface-hub from 0.31.4 to 0.32.1 [iceberg-python]

2025-05-26 Thread via GitHub


dependabot[bot] opened a new pull request, #2046:
URL: https://github.com/apache/iceberg-python/pull/2046

   Bumps [huggingface-hub](https://github.com/huggingface/huggingface_hub) from 
0.31.4 to 0.32.1.
   
   Release notes
   Sourced from huggingface-hub's releases 
(https://github.com/huggingface/huggingface_hub/releases):
   
   v0.32.1: hot-fix: Fix tiny agents on Windows
   Patch release to fix #3116 
(https://redirect.github.com/huggingface/huggingface_hub/issues/3116)
   Full Changelog: 
https://github.com/huggingface/huggingface_hub/compare/v0.32.0...v0.32.1
   
   v0.32.0: MCP Client, Tiny Agents CLI and more!
   🤖 Powering LLMs with Tools: MCP Client & Tiny Agents CLI
   ✨ The huggingface_hub library now includes an MCP Client, designed to 
empower Large Language Models (LLMs) with the ability to interact with external 
Tools via the Model Context Protocol (MCP, https://modelcontextprotocol.io/). 
This client extends the InferenceClient and provides a seamless way to connect 
LLMs to both local and remote tool servers.
   
   pip install -U huggingface_hub[mcp]
   
   In the following example, we use the Qwen/Qwen2.5-72B-Instruct model 
(https://huggingface.co/Qwen/Qwen2.5-72B-Instruct) via the Nebius inference 
provider (https://huggingface.co/docs/inference-providers/providers/nebius). We 
then add a remote MCP server, in this case an SSE server which makes the Flux 
image generation tool available to the LLM:
   
   import os
   from huggingface_hub import ChatCompletionInputMessage, ChatCompletionStreamOutput, MCPClient
   
   async def main():
       async with MCPClient(
           provider="nebius",
           model="Qwen/Qwen2.5-72B-Instruct",
           api_key=os.environ["HF_TOKEN"],
       ) as client:
           await client.add_mcp_server(type="sse", url="https://evalstate-flux1-schnell.hf.space/gradio_api/mcp/sse")
           messages = [
               {
                   "role": "user",
                   "content": "Generate a picture of a cat on the moon",
               }
           ]
           async for chunk in client.process_single_turn_with_tools(messages):
               # Log messages
               if isinstance(chunk, ChatCompletionStreamOutput):
                   delta = chunk.choices[0].delta
                   if delta.content:
                       print(delta.content, end="")
               # Or tool calls
               elif isinstance(chunk, ChatCompletionInputMessage):
                   print(
                       f"\nCalled tool '{chunk.name}'. Result: '{chunk.content if len(chunk.content) < 1000 else chunk.content[:1000] + '...'}'"
                   )
   
   if __name__ == "__main__":
       import asyncio
       asyncio.run(main())
   
   ... (truncated)
   
   Commits
   - 5add979 Release: v0.32.1
   - 7ea6e48 [MCP] fix tiny-agents on Windows (#3116)
   - a97424f Release: v0.32.0
   - e9177eb Release: v0.32.0.rc1
   - 0b9c9ee [MCP] better tiny-agents CLI help (#3105)
   - 3765848 Release: v0.32.0.rc0
   - 9f34fb8 Release: v0.32.O.rc0
   - 92619e4 [Internal] make hf-xet (again) a required dependency (#3103)
   - cadb7a9 [MCP] Add documentation (#3102)
   - 417ad89 [Inference Providers] Fix structured output schema in chat completion (#3082)
   - Additional commits viewable in the compare view: 
https://github.com/huggingface/huggingface_hub/compare/v0.31.4...v0.32.1
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=huggingface-hub&package-manager=pip&previous-version=0.31.4&new-version=0.32.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve

[PR] Build: Bump mkdocstrings-python from 1.16.10 to 1.16.11 [iceberg-python]

2025-05-26 Thread via GitHub


dependabot[bot] opened a new pull request, #2044:
URL: https://github.com/apache/iceberg-python/pull/2044

   Bumps [mkdocstrings-python](https://github.com/mkdocstrings/python) from 
1.16.10 to 1.16.11.
   
   Release notes
   Sourced from mkdocstrings-python's releases 
(https://github.com/mkdocstrings/python/releases):
   
   1.16.11 - 2025-05-24
   Compare with 1.16.10: 
https://github.com/mkdocstrings/python/compare/1.16.10...1.16.11
   
   Bug Fixes
   - Fix highlighting for signature with known special names like __init__ 
(7f95686 by Timothée Mazzucotelli). Issue-mkdocstrings-757
   - Use default font-size for parameter headings (0a35b20 by Timothée 
Mazzucotelli). Issue-mkdocstrings-697
   - Prevent uppercasing H5 titles (by Material for MkDocs) (ba66969 by 
Timothée Mazzucotelli). Issue-mkdocstrings-697, Issue-276
   - Use configured heading even when signature is not separated (096960a by 
Timothée Mazzucotelli). Issue-mkdocstrings-767, PR-278
   - Render attribute names without full path in ToC (d4e618a by David Lee). 
Issue-271, PR-272
   
   Changelog
   Sourced from mkdocstrings-python's changelog 
(https://github.com/mkdocstrings/python/blob/main/CHANGELOG.md); it lists the 
same entries as the release notes above.
   
   Commits
   - 5d2ba0a chore: Prepare release 1.16.11
   - 7f95686 fix: Fix highlighting for signature with known special names like __init__
   - 0a35b20 fix: Use default font-size for parameter headings
   - ba66969 fix: Prevent uppercasing H5 titles (by Material for MkDocs)
   - 096960a fix: Use configured heading even when signature is not separated
   - d4e618a fix: Render attribute names without full path in ToC

[PR] Build: Bump coverage from 7.8.0 to 7.8.2 [iceberg-python]

2025-05-26 Thread via GitHub


dependabot[bot] opened a new pull request, #2047:
URL: https://github.com/apache/iceberg-python/pull/2047

   Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.8.0 to 7.8.2.
   
   Changelog
   Sourced from coverage's changelog 
(https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst):
   
   Version 7.8.2 — 2025-05-23
   - Wheels are provided for Windows ARM64 on Python 3.11, 3.12, and 3.13. 
Thanks, Finn Womack.
   
   .. _issue 1971: https://redirect.github.com/nedbat/coveragepy/pull/1971
   .. _pull 1972: https://redirect.github.com/nedbat/coveragepy/pull/1972
   .. _changes_7-8-1:
   
   Version 7.8.1 — 2025-05-21
   - A number of EncodingWarnings were fixed that could appear if you've 
enabled PYTHONWARNDEFAULTENCODING, fixing issue 1966. Thanks, Henry Schreiner.
   - Fixed a race condition when using sys.monitoring with free-threading 
Python, closing issue 1970.
   
   .. _issue 1966: https://redirect.github.com/nedbat/coveragepy/issues/1966
   .. _pull 1967: https://redirect.github.com/nedbat/coveragepy/pull/1967
   .. _issue 1970: https://redirect.github.com/nedbat/coveragepy/issues/1970
   .. _changes_7-8-0:
   
   Commits
   - 51ab2e5 build: have to keep expected dist counts in sync
   - be7bbf2 docs: sample HTML for 7.8.2
   - 3cee850 docs: prep for 7.8.2
   - 39bc6b0 docs: provide more details if the kit matrix is edited.
   - a608fb3 build: add support for Windows arm64 (#1972)
   - 2fe6225 build: run tox lint if actions have changed
   - 3d93a78 docs: docs need scriv for making github releases
   - 0c443a2 build: bump version to 7.8.2
   - ed98b87 docs: sample HTML for 7.8.1
   - b98bc9b docs: prep for 7.8.1
   - Additional commits viewable in the compare view: 
https://github.com/nedbat/coveragepy/compare/7.8.0...7.8.2
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=coverage&package-manager=pip&previous-version=7.8.0&new-version=7.8.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show <dependency name> ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910782152

   https://rust.iceberg.apache.org/release.html#official-release
   ```
   git checkout v0.5.0-rc.2
   git tag -s "v0.5.0"
   git push origin "v0.5.0"
   ```
   
   Tag push triggered
   - [Security 
audit](https://github.com/apache/iceberg-rust/actions/runs/15263485037)
   - [Publish Python 🐍 distribution 📦 to 
PyPI](https://github.com/apache/iceberg-rust/actions/runs/15263485039)
   - 
[Publish](https://github.com/apache/iceberg-rust/actions/runs/15263485043/job/42925109437)
   
   The `Publish` workflow fails
   ```
   Run cargo publish --all-features
 cargo publish --all-features
 shell: /usr/bin/bash -e {0}
 env:
   rust_msrv: 1.85
   CARGO_REGISTRY_TOKEN: 
   Updating crates.io index
   Updating crates.io index
  Packaging iceberg v0.5.0 
(/home/runner/work/iceberg-rust/iceberg-rust/crates/iceberg)
   Updating crates.io index
   error: failed to prepare local package for uploading
   
   Caused by:
 failed to select a version for the requirement `iceberg-catalog-memory = 
"^0.5.0"`
  candidate versions found which didn't match: 0.4.0, 0.3.0
  location searched: crates.io index
  required by package `iceberg v0.5.0 (/home/runner/work/iceberg-rust/iceberg-rust/crates/iceberg)`
   Error: Process completed with exit code 101.
   ```
   
   I think it might be due to `crates/iceberg/Cargo.toml` having 
`iceberg-catalog-memory = { workspace = true }` in `dev-dependencies`
   
https://github.com/apache/iceberg-rust/blob/fbf9b92e0a201d1e37d74a568687174eac823539/crates/iceberg/Cargo.toml#L93
   





Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910783263

   I was able to verify `cargo publish` can run successfully with the 
`iceberg-catalog-memory` dev dependency commented out
   ```
   cargo publish --all-features --dry-run --allow-dirty
   ```





Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910809436

   hm, these are hardcoded. should we add more packages here? 
   
   
https://github.com/apache/iceberg-rust/blob/fbf9b92e0a201d1e37d74a568687174eac823539/.github/workflows/publish.yml#L36-L45
   





Re: [I] ADLSFileIO cache DefaultAzureCredentials? [iceberg]

2025-05-26 Thread via GitHub


github-actions[bot] closed issue #11523: ADLSFileIO cache 
DefaultAzureCredentials?
URL: https://github.com/apache/iceberg/issues/11523





Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910894995

   looks like `pyiceberg-core 0.5.0` was already published on PyPI:
   https://pypi.org/project/pyiceberg-core/0.5.0/
   
   I think we need to yank this, fix the underlying issue, and re-release as 
0.5.1





Re: [PR] feat: support decompress gzip metadata [iceberg-cpp]

2025-05-26 Thread via GitHub


wgtmac commented on code in PR #108:
URL: https://github.com/apache/iceberg-cpp/pull/108#discussion_r2108002007


##
src/iceberg/util/gzip_internal.cc:
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include "iceberg/util/gzip_internal.h"
+
+#include 
+
+#include 
+
+#include "iceberg/util/macros.h"
+
+namespace iceberg {
+
+class ZlibImpl {
+ public:
+  ZlibImpl() { memset(&stream_, 0, sizeof(stream_)); }
+
+  ~ZlibImpl() {
+if (initialized_) {
+  inflateEnd(&stream_);
+}
+  }
+
+  Status Init() {
+// Maximum window size
+static int kWindowBits = 15;

Review Comment:
   ```suggestion
   constexpr int kWindowBits = 15;
   ```



##
src/iceberg/util/gzip_internal.cc:
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include "iceberg/util/gzip_internal.h"
+
+#include 
+
+#include 
+
+#include "iceberg/util/macros.h"
+
+namespace iceberg {
+
+class ZlibImpl {

Review Comment:
   Perhaps we can define a nested class in that function so the dtor is called 
when the function exits. I think @yingcai-cy's point is that we can simply 
declare `Result<std::string> Decompress(std::string_view input)` in the header 
file to make it easy to use.



##
src/iceberg/result.h:
##
@@ -38,6 +38,7 @@ enum class ErrorKind {
   kJsonParseError,
   kNoSuchNamespace,
   kNoSuchTable,
+  kDecompressError,

Review Comment:
   nit: sort alphabetically



##
src/iceberg/result.h:
##
@@ -80,6 +81,7 @@ DEFINE_ERROR_FUNCTION(IOError)
 DEFINE_ERROR_FUNCTION(JsonParseError)
 DEFINE_ERROR_FUNCTION(NoSuchNamespace)
 DEFINE_ERROR_FUNCTION(NoSuchTable)
+DEFINE_ERROR_FUNCTION(DecompressError)

Review Comment:
   nit: sort alphabetically



##
src/iceberg/util/gzip_internal.cc:
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include "iceberg/util/gzip_internal.h"
+
+#include 
+
+#include 
+
+#include "iceberg/util/macros.h"
+
+namespace iceberg {
+
+class ZlibImpl {
+ public:
+  ZlibImpl() { memset(&stream_, 0, sizeof(stream_)); }
+
+  ~ZlibImpl() {
+if (initialized_) {
+  inflateEnd(&stream_);
+}
+  }
+
+  Status Init() {
+// Maximum window size
+static int kWindowBits = 15;
+// Determine if this is libz or gzip from header.
+static int kDetectCodec = 32;

Review Comment:
   ```suggestion
   constexpr int kDetectCodec = 32;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

Re: [I] Spark: add IcebergConnectHiveDelegationTokenProvider [iceberg]

2025-05-26 Thread via GitHub


zhangwl9 commented on issue #13116:
URL: https://github.com/apache/iceberg/issues/13116#issuecomment-2910897592

   @pvary @gaborgsomogyi 
   For Spark, I'm going to use the scheme shown above to implement an 
IcebergHiveConnectorDelegationTokenProvider within the Iceberg_Spark module 
based on Spark's interface 
[HadoopDelegationTokenProvider](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/security/HadoopDelegationTokenProvider.scala).
 For Flink, I'm going to implement an 
IcebergHiveConnectorDelegationTokenProvider within the Iceberg_Flink module 
based on Flink's interface 
[DelegationTokenProvider](https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/security/token/DelegationTokenProvider.java).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[PR] static table metadata access support [iceberg-cpp]

2025-05-26 Thread via GitHub


lishuxu opened a new pull request, #111:
URL: https://github.com/apache/iceberg-cpp/pull/111

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[I] Flink: add IcebergConnectHiveDelegationTokenProvider [iceberg]

2025-05-26 Thread via GitHub


zhangwl9 opened a new issue, #13159:
URL: https://github.com/apache/iceberg/issues/13159

   ### Feature Request / Improvement
   
   test
   
   ### Query engine
   
   Flink
   
   ### Willingness to contribute
   
   - [x] I can contribute this improvement/feature independently
   - [x] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[I] Add IcebergHiveConnectorDelegationTokenProvider for Iceberg [iceberg]

2025-05-26 Thread via GitHub


zhangwl9 opened a new issue, #13158:
URL: https://github.com/apache/iceberg/issues/13158

   ### Feature Request / Improvement
   
   I want to implement both Spark and the Flink delegation token supported by 
the HiveCatalog;
   
   
   ### Query engine
   
   None
   
   ### Willingness to contribute
   
   - [x] I can contribute this improvement/feature independently
   - [x] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Add IcebergHiveConnectorDelegationTokenProvider for Iceberg [iceberg]

2025-05-26 Thread via GitHub


zhangwl9 commented on issue #13158:
URL: https://github.com/apache/iceberg/issues/13158#issuecomment-2910909964

   - [ ] Add IcebergConnectHiveDelegationTokenProvider for HiveCatalog in 
Iceberg_Spark module
   - [ ] Add IcebergConnectHiveDelegationTokenProvider for HiveCatalog in 
Iceberg_Flink module


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[PR] fix 0.5.x release `cargo publish` [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu opened a new pull request, #1379:
URL: https://github.com/apache/iceberg-rust/pull/1379

   ## Which issue does this PR close?
   
   
   
   - Closes #.
   
   ## What changes are included in this PR?
   
   
   This PR removes `iceberg-catalog-memory` as a dev-dependency of 
`crates/iceberg`. The dependency is not used and caused `cargo publish` to fail 
during the 0.5.0 release (See 
https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910782152)
   
   This PR also changes `.github/workflows/release_python.yml` to depend on 
`publish.yml` since we only want to publish `pyiceberg-core` to pypi after the 
crates are successfully published. 
   
   
https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#running-a-workflow-based-on-the-conclusion-of-another-workflow
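
   As an illustration of that trigger, a minimal sketch of the `workflow_run` wiring 
(not the actual contents of `release_python.yml`; the workflow and job names are assumptions):

   ```
name: Publish Python 🐍 distribution 📦 to PyPI
on:
  workflow_run:
    # Run only after the crate-publishing workflow ("Publish") finishes.
    workflows: ["Publish"]
    types: [completed]
jobs:
  publish-pypi:
    # Skip publishing to PyPI unless the crates were published successfully.
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "build and upload pyiceberg-core here"
   ```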
   
   ## Are these changes tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910937265

   Seems we need a way to detect all crates.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spec: add sort order to spec [iceberg]

2025-05-26 Thread via GitHub


Nicholasjl commented on PR #2055:
URL: https://github.com/apache/iceberg/pull/2055#issuecomment-2910943224

   Hi,
   
   I noticed that in Iceberg, the expression -0 < 0 evaluates to true. However, 
according to the IEEE 754 standard, -0.0 and 0.0 are considered numerically 
equal, and -0.0 < 0.0 should return false.
   
   Could you please clarify why this behavior occurs? Is it intentional or a 
potential issue?
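
   To make the two comparison semantics concrete, here is a minimal Java illustration 
(plain IEEE 754 comparison versus the total ordering of `Double.compare`, which is one 
common way sort keys are implemented; whether Iceberg's comparator relies on it here is 
the open question):

   ```
public class NegativeZeroOrdering {
  public static void main(String[] args) {
    System.out.println(-0.0 < 0.0);                 // false: IEEE 754 numeric comparison
    System.out.println(-0.0 == 0.0);                // true: numerically equal
    System.out.println(Double.compare(-0.0, 0.0));  // -1: total ordering puts -0.0 first
  }
}
   ```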
   
   Thanks in advance!
   
   Best regards,
   J


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] fix 0.5.x release `cargo publish` [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 commented on PR #1379:
URL: https://github.com/apache/iceberg-rust/pull/1379#issuecomment-2910950841

   The ut failure is caused by generating doc. I think we could remove those 
docs for now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2911104289

   ASF release 
   ```
   git checkout v0.5.1-rc.1
   ICEBERG_VERSION=0.5.1 ICEBERG_VERSION_RC=1 ./scripts/release.sh
   
   svn co https://dist.apache.org/repos/dist/dev/iceberg/ /tmp/iceberg-dist-dev
   release_version=0.5.1-rc.1
   mkdir /tmp/iceberg-dist-dev/apache-iceberg-rust-${release_version}
   cp dist/* /tmp/iceberg-dist-dev/apache-iceberg-rust-${release_version}/
   cd /tmp/iceberg-dist-dev/
   
   svn status
   svn add apache-iceberg-rust-${release_version}
   svn status
   svn commit -m "Prepare for iceberg-rust ${release_version}"
   ```
   
   Uploaded to dev dist
   
https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-rust-0.5.1-rc.1/
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2911096775

   > Do we need another vote for 0.5.1?
   
   yep, we should
   
   push a new tag for `v0.5.1-rc.1` 
   https://github.com/apache/iceberg-rust/releases/tag/v0.5.1-rc.1
   
   Triggered:
   - [Security 
audit](https://github.com/apache/iceberg-rust/actions/runs/15266930402)
   - [Publish](https://github.com/apache/iceberg-rust/actions/runs/15266930404)
   - [Publish Python 🐍 distribution 📦 to 
PyPI](https://github.com/apache/iceberg-rust/actions/runs/15266948994)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2911090349

   > Bump iceberg-rust version to 0.5.1 
[#1380](https://github.com/apache/iceberg-rust/pull/1380)
   
   Do we need another vote for 0.5.1?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] fix: add metadata_properties to _construct_parameters when update hive table [iceberg-python]

2025-05-26 Thread via GitHub


geruh commented on code in PR #2013:
URL: https://github.com/apache/iceberg-python/pull/2013#discussion_r2108177674


##
pyiceberg/catalog/hive.py:
##
@@ -211,11 +211,18 @@ def _construct_hive_storage_descriptor(
 DEFAULT_PROPERTIES = {TableProperties.PARQUET_COMPRESSION: 
TableProperties.PARQUET_COMPRESSION_DEFAULT}
 
 
-def _construct_parameters(metadata_location: str, previous_metadata_location: 
Optional[str] = None) -> Dict[str, Any]:
+def _construct_parameters(
+metadata_location: str, previous_metadata_location: Optional[str] = None, 
metadata_properties: Optional[Properties] = None
+) -> Dict[str, Any]:
 properties = {PROP_EXTERNAL: "TRUE", PROP_TABLE_TYPE: "ICEBERG", 
PROP_METADATA_LOCATION: metadata_location}
 if previous_metadata_location:
 properties[PROP_PREVIOUS_METADATA_LOCATION] = 
previous_metadata_location
 
+if metadata_properties:
+for key, value in metadata_properties.items():

Review Comment:
   Ahh I see, nice! We are just resetting the properties based on the metadata 
now
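
   For illustration, a rough sketch of the merged parameters this produces (the concrete 
`PROP_*` string values and the property names are assumptions, not taken from the diff):

   ```
from pyiceberg.catalog.hive import _construct_parameters

params = _construct_parameters(
    metadata_location="s3://bucket/tbl/metadata/00001.metadata.json",
    previous_metadata_location="s3://bucket/tbl/metadata/00000.metadata.json",
    metadata_properties={"owner": "analytics", "comment": "events table"},
)
# The reserved entries (EXTERNAL, table_type, metadata_location,
# previous_metadata_location) are set first, then the table properties carried
# in the Iceberg metadata are merged in, so `params` ends up roughly as:
# {"EXTERNAL": "TRUE", "table_type": "ICEBERG",
#  "metadata_location": ".../00001.metadata.json",
#  "previous_metadata_location": ".../00000.metadata.json",
#  "owner": "analytics", "comment": "events table"}
   ```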



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-296409

   0.5.1 RC1 devlist thread, 
https://lists.apache.org/thread/h1qw21cdc56vltt9mzkn0sq36mp9tmpw
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[I] Literals.LongLiteral conversion issue to TimestampNanoLiteral [iceberg]

2025-05-26 Thread via GitHub


xxlaykxx opened a new issue, #13160:
URL: https://github.com/apache/iceberg/issues/13160

   ### Apache Iceberg version
   
   1.9.0 (latest release)
   
   ### Query engine
   
   Dremio
   
   ### Please describe the bug 🐞
   
   Because for the conversion to TimestampNanoLiteral, LongLiteral uses
   
   ```
   case TIMESTAMP_NANO:
 // assume micros and convert to nanos to match the behavior in the 
timestamp case above
 return new TimestampLiteral(value()).to(type);
   ```
   and if the long value is already a nanosecond-precision timestamp, it causes
   
   ```
   SYSTEM ERROR: ArithmeticException: long overflow
 (java.lang.ArithmeticException) long overflow
   java.lang.Math.multiplyExact():1032
   org.apache.iceberg.util.DateTimeUtil.microsToNanos():101
   org.apache.iceberg.expressions.Literals$TimestampLiteral.to():445
   org.apache.iceberg.expressions.Literals$LongLiteral.to():304
   ```
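
   A minimal sketch of a reproduction, assuming the 1.9.0 expression API named in the 
trace (`Literal.of(long)` and `Types.TimestampNanoType`); the epoch value is only illustrative:

   ```
import org.apache.iceberg.expressions.Literal;
import org.apache.iceberg.types.Types;

public class NanoLiteralOverflow {
  public static void main(String[] args) {
    // A long that already holds nanoseconds since epoch (~2024-01-01T00:00:00Z).
    long epochNanos = 1_704_067_200_000_000_000L;

    // LongLiteral.to(TIMESTAMP_NANO) assumes the long holds microseconds and
    // multiplies by 1_000 via DateTimeUtil.microsToNanos, which overflows here.
    Literal.of(epochNanos).to(Types.TimestampNanoType.withoutZone());
    // -> java.lang.ArithmeticException: long overflow
  }
}
   ```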
   
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [x] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Bump iceberg-rust version to 0.5.1 [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on code in PR #1380:
URL: https://github.com/apache/iceberg-rust/pull/1380#discussion_r2108118748


##
bindings/python/pyproject.toml:
##
@@ -33,7 +33,7 @@ classifiers = [
 name = "pyiceberg-core"
 readme = "project-description.md"
 requires-python = "~=3.9"
-version = "0.5.0"
+dynamic = ["version"]

Review Comment:
   based on `bindings/python/Cargo.toml`'s `version` 
   See https://www.maturin.rs/metadata.html#dynamic-metadata
   
   I also verified locally



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[PR] feat: basic table scan planning [iceberg-cpp]

2025-05-26 Thread via GitHub


gty404 opened a new pull request, #112:
URL: https://github.com/apache/iceberg-cpp/pull/112

   Introducing basic scan table data interface


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Flink: Migrate Flink TableSchema for IcebergSource [iceberg]

2025-05-26 Thread via GitHub


liamzwbao commented on code in PR #13072:
URL: https://github.com/apache/iceberg/pull/13072#discussion_r2108061992


##
flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/FlinkSchemaUtil.java:
##
@@ -192,7 +222,9 @@ public static Type convert(LogicalType flinkType) {
*
* @param rowType a RowType
* @return Flink TableSchema
+   * @deprecated use {@link #toResolvedSchema(RowType)} instead
*/
+  @Deprecated
   public static TableSchema toSchema(RowType rowType) {

Review Comment:
   Hi Peter! Yes, we don't need validations for watermarks and primary keys for 
this method, the only thing we need is duplication check.
   
   But there is another 
[toSchema](https://github.com/apache/iceberg/blob/5d2c014ad51028dccd049b447619e08f90bb560e/flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/FlinkSchemaUtil.java#L260-L283)
 in this util that I plan to deprecate in the next PR. That one does require a 
primary key check, and I’m wondering whether we should implement that check 
manually or rely on something else.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910963966

   `iceberg_catalog_memory` is used in `crates/iceberg`'s docstring. For 
example, 
https://github.com/apache/iceberg-rust/blob/fbf9b92e0a201d1e37d74a568687174eac823539/crates/iceberg/src/lib.rs#L28
   
   We also cannot reorder publishing since `iceberg-catalog-memory` depends on 
`crates/iceberg`
   
https://github.com/apache/iceberg-rust/blob/fbf9b92e0a201d1e37d74a568687174eac823539/crates/catalog/memory/Cargo.toml#L34


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] static table metadata access support [iceberg-cpp]

2025-05-26 Thread via GitHub


lishuxu closed pull request #111: static table metadata access support
URL: https://github.com/apache/iceberg-cpp/pull/111


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] chore(deps): Bump aws-sdk-glue from 1.94.0 to 1.97.0 [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 merged PR #1376:
URL: https://github.com/apache/iceberg-rust/pull/1376


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910985626

   #1379 to fix forward the issue and release under 0.5.1 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] fix 0.5.x release `cargo publish` [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on code in PR #1379:
URL: https://github.com/apache/iceberg-rust/pull/1379#discussion_r2108078879


##
crates/iceberg/src/lib.rs:
##
@@ -21,7 +21,9 @@
 //!
 //! ## Scan A Table
 //!
-//! ```rust, no_run
+//! ```rust, ignore

Review Comment:
   @liurenjie1024 wdyt of using `ignore` here for now? i didnt want to remove 
all the docs 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] fix 0.5.x release `cargo publish` [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 commented on code in PR #1379:
URL: https://github.com/apache/iceberg-rust/pull/1379#discussion_r2108084363


##
crates/iceberg/src/lib.rs:
##
@@ -21,7 +21,9 @@
 //!
 //! ## Scan A Table
 //!
-//! ```rust, no_run
+//! ```rust, ignore

Review Comment:
   Yes, it's fine to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[PR] Build: Bump getdaft from 0.4.15 to 0.4.16 [iceberg-python]

2025-05-26 Thread via GitHub


dependabot[bot] opened a new pull request, #2042:
URL: https://github.com/apache/iceberg-python/pull/2042

   Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.4.15 to 0.4.16.
   
   Release notes
   Sourced from getdaft's releases (https://github.com/Eventual-Inc/Daft/releases).
   
   v0.4.16
   What's Changed 🚀
   
   ✨ Features
   - feat: Add put_multipart to s3_like by @rohitkulshreshtha (#4360)
   - feat: adds partitioning classes for python by @rchowell (#4366)
   - feat: Flotilla scheduler by @colin-ho (#4349)
   - feat: Add a native local parquet writer by @desmondcheongzx (#4260)
   - feat: Flotilla plan result by @colin-ho (#4275)
   - feat: Add optional spark dependency for pyspark connector by @desmondcheongzx (#4368)
   - feat: add a repr_json for logical plans by @universalmind303 (#4354)
   - feat: Flotilla utils by @colin-ho (#4345)
   
   🐛 Bug Fixes
   - fix: More mypy fixes by @colin-ho (#4388)
   - fix: skip flaky actor pool GPU test by @kevinzwang (#4397)
   - fix: Remove botocore dependency when working with deltalake by @desmondcheongzx (#4369)
   - fix: mypy catalog fixes by @rchowell (#4384)
   - fix(mypy): part of the mypy strict mode errors by @kevinzwang (#4386)
   - fix: Add Rust Testing Import by @srilman (#4382)
   - fix: adds HTTP and HF retry logic based upon exist GCS retry logic by @rchowell (#4371)
   - fix: Remove dbg! in url download by @colin-ho (#4374)
   - fix: ignores tests/integration for make test target by @rchowell (#4365)
   - fix: mirror raw.githubusercontent via S3 to avoid CI throttling by @rchowell (#4367)
   - fix: CSV read with disjoint predicate pushdown by @kevinzwang (#4363)
   
   🚀 Performance
   - perf: Split projections with expressions that need granular batching by @srilman (#4329)
   - perf: Use url_download max_connections for projection batch size by @srilman (#4328)
   - perf: Morsel size ranges for project and filter operators by @srilman (#4344)
   
   ♻️ Refactor
   - refactor(exprs): move all list exprs to new expr by @universalmind303 (#4340)
   - refactor(exprs): 6 of 6 move all utf8 exprs to own crate by @universalmind303 (#4312)
   - refactor(exprs): proc macro for parsing arguments into structs by @kevinzwang (#4348)
   - refactor(exprs): remove unused sql json module by @universalmind303 (#4364)
   - refactor(exprs): url upload/download to new pattern by @universalmind303 (#4352)
   - refactor(ordinals): bound expressions in table statistics by @kevinzwang (#4342)

[PR] Build: Bump thrift from 0.21.0 to 0.22.0 [iceberg-python]

2025-05-26 Thread via GitHub


dependabot[bot] opened a new pull request, #2043:
URL: https://github.com/apache/iceberg-python/pull/2043

   Bumps [thrift](https://github.com/apache/thrift) from 0.21.0 to 0.22.0.
   
   Release notes
   Sourced from thrift's releases (https://github.com/apache/thrift/releases).
   
   Version 0.22.0
   Please head over to the official release download source:
   http://thrift.apache.org/download
   The assets listed below are added by GitHub based on the release tag, and
   they will therefore not match the checksums published on the Thrift project
   website.
   
   Commits
   
   See full diff in the compare view:
   https://github.com/apache/thrift/compare/v0.21.0...v0.22.0
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=thrift&package-manager=pip&previous-version=0.21.0&new-version=0.22.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show  ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Avoid Avro recursive schema for Variant schema. [iceberg]

2025-05-26 Thread via GitHub


github-actions[bot] closed pull request #12459: Avoid Avro recursive schema for 
Variant schema.
URL: https://github.com/apache/iceberg/pull/12459


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Avoid Avro recursive schema for Variant schema. [iceberg]

2025-05-26 Thread via GitHub


github-actions[bot] commented on PR #12459:
URL: https://github.com/apache/iceberg/pull/12459#issuecomment-2910801756

   This pull request has been closed due to lack of activity. This is not a 
judgement on the merit of the PR in any way. It is just a way of keeping the PR 
queue manageable. If you think that is incorrect, or the pull request requires 
review, you can revive the PR at any time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910995021

   @liurenjie1024 / @Xuanwo / @sungwy are any of you owners for 
https://pypi.org/project/pyiceberg-core/? Only owners can yank a release 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910979921

   
   
   
   
   > `iceberg_catalog_memory` is used in `crates/iceberg`'s docstring. For example,
   > [iceberg-rust/crates/iceberg/src/lib.rs#L28](https://github.com/apache/iceberg-rust/blob/fbf9b92e0a201d1e37d74a568687174eac823539/crates/iceberg/src/lib.rs#L28):
   > `//! use iceberg_catalog_memory::MemoryCatalog;`
   >
   > We also cannot reorder publishing since `iceberg-catalog-memory` depends on `crates/iceberg`
   > [iceberg-rust/crates/catalog/memory/Cargo.toml#L34](https://github.com/apache/iceberg-rust/blob/fbf9b92e0a201d1e37d74a568687174eac823539/crates/catalog/memory/Cargo.toml#L34):
   > `iceberg = { workspace = true }`
   
   Hi @kevinjqliu, I think it's fine to remove the doc for now. We could add 
the doc later on the Rust docs website.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] java.io.IOException: can not read class org.apache.iceberg.shaded.org.apache.parquet.format.PageHeader: Required field 'num_values' was not found in serialized data [iceberg]

2025-05-26 Thread via GitHub


github-actions[bot] commented on issue #11614:
URL: https://github.com/apache/iceberg/issues/11614#issuecomment-2910801702

   This issue has been automatically marked as stale because it has been open 
for 180 days with no activity. It will be closed in next 14 days if no further 
activity occurs. To permanently prevent this issue from being considered stale, 
add the label 'not-stale', but commenting on the issue is preferred when 
possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: SnapshotTableSparkAction add validation for non-overlapping source/dest table paths. [iceberg]

2025-05-26 Thread via GitHub


github-actions[bot] closed pull request #12779: Spark: SnapshotTableSparkAction 
add validation for non-overlapping source/dest table paths.
URL: https://github.com/apache/iceberg/pull/12779


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: SnapshotTableSparkAction add validation for non-overlapping source/dest table paths. [iceberg]

2025-05-26 Thread via GitHub


github-actions[bot] commented on PR #12779:
URL: https://github.com/apache/iceberg/pull/12779#issuecomment-2910801791

   This pull request has been closed due to lack of activity. This is not a 
judgement on the merit of the PR in any way. It is just a way of keeping the PR 
queue manageable. If you think that is incorrect, or the pull request requires 
review, you can revive the PR at any time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] ADLSFileIO cache DefaultAzureCredentials? [iceberg]

2025-05-26 Thread via GitHub


github-actions[bot] commented on issue #11523:
URL: https://github.com/apache/iceberg/issues/11523#issuecomment-2910801639

   This issue has been closed because it has not received any activity in the 
last 14 days since being marked as 'stale'


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] How to run streaming upserts and maintenance simultaneously? [iceberg]

2025-05-26 Thread via GitHub


github-actions[bot] commented on issue #11530:
URL: https://github.com/apache/iceberg/issues/11530#issuecomment-2910801666

   This issue has been closed because it has not received any activity in the 
last 14 days since being marked as 'stale'


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] How to run streaming upserts and maintenance simultaneously? [iceberg]

2025-05-26 Thread via GitHub


github-actions[bot] closed issue #11530: How to run streaming upserts and 
maintenance simultaneously?
URL: https://github.com/apache/iceberg/issues/11530


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Enhance `catalog.create_table` API to enable creation of table with matching `field_ids` to provided Schema [iceberg-python]

2025-05-26 Thread via GitHub


github-actions[bot] commented on issue #1284:
URL: 
https://github.com/apache/iceberg-python/issues/1284#issuecomment-2910804096

   This issue has been automatically marked as stale because it has been open 
for 180 days with no activity. It will be closed in next 14 days if no further 
activity occurs. To permanently prevent this issue from being considered stale, 
add the label 'not-stale', but commenting on the issue is preferred when 
possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[PR] Build: Bump mypy-boto3-glue from 1.38.18 to 1.38.22 [iceberg-python]

2025-05-26 Thread via GitHub


dependabot[bot] opened a new pull request, #2038:
URL: https://github.com/apache/iceberg-python/pull/2038

   Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 
1.38.18 to 1.38.22.
   
   Release notes
   Sourced from mypy-boto3-glue's releases (https://github.com/youtype/mypy_boto3_builder/releases).
   
   8.8.0 - Python 3.8 runtime is back
   
   Changed
   - [services] install_requires section is calculated based on dependencies in use, so typing-extensions version is set properly
   - [all] Replaced typing imports with collections.abc with a fallback to typing for Python <3.9
   - [all] Added aliases for builtins.list, builtins.set, builtins.dict, and builtins.type, so Python 3.8 runtime should work as expected again (reported by @YHallouard in #340 and @Omri-Ben-Yair in #336)
   - [all] Unions use the same type annotations as the rest of the structures due to proper fallbacks
   
   Fixed
   - [services] Universal input/output shapes were not replaced properly in service subresources
   - [docs] Simplified doc links rendering for services
   - [services] Cleaned up unnecessary imports in client.pyi
   - [builder] Import records with fallback are always rendered
   
   Commits
   
   See full diff in the compare view: https://github.com/youtype/mypy_boto3_builder/commits
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=mypy-boto3-glue&package-manager=pip&previous-version=1.38.18&new-version=1.38.22)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show  ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[PR] Build: Bump mkdocs-autorefs from 1.4.1 to 1.4.2 [iceberg-python]

2025-05-26 Thread via GitHub


dependabot[bot] opened a new pull request, #2048:
URL: https://github.com/apache/iceberg-python/pull/2048

   Bumps [mkdocs-autorefs](https://github.com/mkdocstrings/autorefs) from 1.4.1 
to 1.4.2.
   
   Release notes
   Sourced from mkdocs-autorefs's releases (https://github.com/mkdocstrings/autorefs/releases).
   
   1.4.2
   1.4.2 - 2025-05-20 (https://github.com/mkdocstrings/autorefs/releases/tag/1.4.2)
   Compare with 1.4.1: https://github.com/mkdocstrings/autorefs/compare/1.4.1...1.4.2
   
   Build
   - Exclude mypy cache from dists (5e77f7f by Timothée Mazzucotelli). Issue-71
   
   Changelog
   Sourced from mkdocs-autorefs's changelog (https://github.com/mkdocstrings/autorefs/blob/main/CHANGELOG.md).
   
   1.4.2 - 2025-05-20 (https://github.com/mkdocstrings/autorefs/releases/tag/1.4.2)
   Compare with 1.4.1: https://github.com/mkdocstrings/autorefs/compare/1.4.1...1.4.2
   
   Build
   - Exclude mypy cache from dists (5e77f7f by Timothée Mazzucotelli). Issue-71
   
   Commits
   - ca304f0 chore: Prepare release 1.4.2
   - 5e77f7f build: Exclude mypy cache from dists
   - 47238d4 chore: Template upgrade
   - See full diff in the compare view: https://github.com/mkdocstrings/autorefs/compare/1.4.1...1.4.2
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=mkdocs-autorefs&package-manager=pip&previous-version=1.4.1&new-version=1.4.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show  ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910764412

   vote passed! 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] chore(deps): Bump uuid from 1.16.0 to 1.17.0 [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 merged PR #1375:
URL: https://github.com/apache/iceberg-rust/pull/1375


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Bump iceberg-rust version to 0.5.0 [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on PR #1345:
URL: https://github.com/apache/iceberg-rust/pull/1345#issuecomment-2911019657

   thanks for the review @liurenjie1024 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] fix 0.5.x release `cargo publish` [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu merged PR #1379:
URL: https://github.com/apache/iceberg-rust/pull/1379


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2911035129

   Bump iceberg-rust version to 0.5.1 #1380
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] chore(deps): Bump ordered-float from 2.10.1 to 4.6.0 [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 merged PR #1374:
URL: https://github.com/apache/iceberg-rust/pull/1374


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Spark: add IcebergHiveConnectorDelegationTokenProvider for HiveCatalog [iceberg]

2025-05-26 Thread via GitHub


gaborgsomogyi commented on issue #13116:
URL: https://github.com/apache/iceberg/issues/13116#issuecomment-2911197379

   Happy to hear experiences when it's in place.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[PR] Build: Bump pydantic from 2.11.4 to 2.11.5 [iceberg-python]

2025-05-26 Thread via GitHub


dependabot[bot] opened a new pull request, #2045:
URL: https://github.com/apache/iceberg-python/pull/2045

   Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.11.4 to 2.11.5.
   
   Release notes
   Sourced from [pydantic's releases](https://github.com/pydantic/pydantic/releases).
   
   v2.11.5 2025-05-22
   
   What's Changed
   Fixes
   
   - Check if `FieldInfo` is complete after applying type variable map by [@Viicos](https://github.com/Viicos) in [#11855](https://redirect.github.com/pydantic/pydantic/pull/11855)
   - Do not delete mock validator/serializer in `model_rebuild()` by [@Viicos](https://github.com/Viicos) in [#11890](https://redirect.github.com/pydantic/pydantic/pull/11890)
   - Do not duplicate metadata on model rebuild by [@Viicos](https://github.com/Viicos) in [#11902](https://redirect.github.com/pydantic/pydantic/pull/11902)
   
   Full Changelog: https://github.com/pydantic/pydantic/compare/v2.11.4...v2.11.5
   
   Changelog
   Sourced from [pydantic's changelog](https://github.com/pydantic/pydantic/blob/main/HISTORY.md).
   
   v2.11.5 (2025-05-22)
   [GitHub release](https://github.com/pydantic/pydantic/releases/tag/v2.11.5)
   
   What's Changed
   Fixes
   
   - Check if `FieldInfo` is complete after applying type variable map by [@Viicos](https://github.com/Viicos) in [#11855](https://redirect.github.com/pydantic/pydantic/pull/11855)
   - Do not delete mock validator/serializer in `model_rebuild()` by [@Viicos](https://github.com/Viicos) in [#11890](https://redirect.github.com/pydantic/pydantic/pull/11890)
   - Do not duplicate metadata on model rebuild by [@Viicos](https://github.com/Viicos) in [#11902](https://redirect.github.com/pydantic/pydantic/pull/11902)
   
   Commits
   
   - [5e6d1dc](https://github.com/pydantic/pydantic/commit/5e6d1dc71fe9bd832635cb2e9b4af92286fd00b8) Prepare release v2.11.5
   - [1b63218](https://github.com/pydantic/pydantic/commit/1b63218c42b515bd1f6b0dd323190236ead14bdb) Do not duplicate metadata on model rebuild ([#11902](https://redirect.github.com/pydantic/pydantic/issues/11902))
   - [5aefad8](https://github.com/pydantic/pydantic/commit/5aefad873b3dfd60c419bd081ffaf0ac197c7b60) Do not delete mock validator/serializer in model_rebuild()
   - [8fbe658](https://github.com/pydantic/pydantic/commit/8fbe6585f4d6179e5234ab61de00059c52e57975) Check if FieldInfo is complete after applying type variable map
   - [12b371a](https://github.com/pydantic/pydantic/commit/12b371a0f7f800bf65daa3eaada1b4348348d9c4) Update documentation about @dataclass_transform support
   - [3a6aef4](https://github.com/pydantic/pydantic/commit/3a6aef4400afe6ac1fcaab4f31774c1ee4aadcb3) Fix missing link in documentation
   - [0506b9c](https://github.com/pydantic/pydantic/commit/0506b9cd8b3d544f135c624f4a7584dd53098cb7) Fix light/dark mode documentation toggle
   - [58078c8](https://github.com/pydantic/pydantic/commit/58078c8b5624d800ec80dff295972737149f8080) Fix typo in documentation
   - See full diff in [compare view](https://github.com/pydantic/pydantic/compare/v2.11.4...v2.11.5)
   
   [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pydantic&package-manager=pip&previous-version=2.11.4&new-version=2.11.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show <dependency name> ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Depen

[PR] Build: Bump moto from 5.1.4 to 5.1.5 [iceberg-python]

2025-05-26 Thread via GitHub


dependabot[bot] opened a new pull request, #2039:
URL: https://github.com/apache/iceberg-python/pull/2039

   Bumps [moto](https://github.com/getmoto/moto) from 5.1.4 to 5.1.5.
   
   Changelog
   Sourced from [moto's changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md).
   
   5.1.5
   Docker Digest for 5.1.5: sha256:b9dbd12d211c88e5799d023db15ec809bca4cc6df93a8aa78f26ccbfb073d18a
   New Services:
   * Connect Campaign:
   * create_campaign()
   * delete_campaign()
   * describe_campaign()
   * get_connect_instance_config()
   * start_instance_onboarding_job()
   * CloudDirectory:
   * create_directory()
   * delete_directory()
   * get_directory()
   * list_directories()
   * list_tags_for_resource()
   * tag_resource()
   * untag_resource()
   
   * Network Firewall:
   * create_firewall()
   * describe_firewall()
   * describe_logging_configuration()
   * list_firewalls()
   * update_logging_configuration()
   
   * ServiceCatalog-AppRegistry:
   * associate_resource()
   * create_application()
   * list_applications()
   * list_associated_resources()
   
   New Methods:
   * ACM PCA:
   * list_certificate_authorities()
   * CloudWatch:
   * delete_insight_rules()
   * describe_insight_rules()
   * disable_insight_rules()
   * enable_insight_rules()
   * put_insight_rule()
   
   * CodeDeploy:
   * list_tags_for_resource()
   * tag_resource()
   * untag_resource()
   
   
   
   
   ... (truncated)
   
   
   Commits
   
   - [b7c66df](https://github.com/getmoto/moto/commit/b7c66df079edcc1dff895ace82dd0bb457d73793) Pre-Release: Up Version Number
   - [5b0a4ff](https://github.com/getmoto/moto/commit/5b0a4ff6573ac063cd5e6ba07afd5f2585c79170) Prep release 5.1.5 ([#8924](https://redirect.github.com/getmoto/moto/issues/8924))
   - [6001586](https://github.com/getmoto/moto/commit/60015862bdade2178a2f3dbe487b7e173dbdcaa7) [EC2] Add modify_instance_metadata_options support ([#8922](https://redirect.github.com/getmoto/moto/issues/8922))
   - [865b9a7](https://github.com/getmoto/moto/commit/865b9a7f4613413c81dd8ecbdea052da1fab890f) SQS: Return MessageGroupId (and similar attributes) on receive_messages ([#8902](https://redirect.github.com/getmoto/moto/issues/8902))
   - [f04c3a6](https://github.com/getmoto/moto/commit/f04c3a6ec2eb4d70787f678b9e8a2f1206ff5d5d) EventBridge: Update describe_event_bus schema ([#8893](https://redirect.github.com/getmoto/moto/issues/8893))
   - [3ef1ae7](https://github.com/getmoto/moto/commit/3ef1ae7ae7a058c2cfd99f09889bae9147743b91) [Connect Campaign] - Config, Job, Campaign ([#8905](https://redirect.github.com/getmoto/moto/issues/8905))
   - [2f6807f](https://github.com/getmoto/moto/commit/2f6807fdcfc2702217f8d70a02a4c1e9b7bf245c) Add tagging support for CodeDeploy resources ([#8878](https://redirect.github.com/getmoto/moto/issues/8878))
   - [a981fe0](https://github.com/getmoto/moto/commit/a981fe0b889806eb2b6ca665a904f3faec00499b) ResourceGroups: support retrieval by the name or the ARN of the resource grou...
   - [ca847fa](https://github.com/getmoto/moto/commit/ca847fab14ff15d6692a3fed052f6b77bc23c474) [APIGateway] Add ApiKeyNotFoundException for update/delete_api_key ([#8918](https://redirect.github.com/getmoto/moto/issues/8918))
   - [c04116b](https://github.com/getmoto/moto/commit/c04116b3556f8c5d309934ec322ca3ae93aaa34f) EC2: Fix Flaky Test ([#8914](https://redirect.github.com/getmoto/moto/issues/8914))
   - Additional commits viewable in [compare view](https://github.com/getmoto/moto/compare/5.1.4...5.1.5)
   
   [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=moto&package-manager=pip&previous-version=5.1.4&new-version=5.1.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show  ignore condition

Re: [PR] Bump iceberg-rust version to 0.5.1 [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on PR #1380:
URL: https://github.com/apache/iceberg-rust/pull/1380#issuecomment-2911043321

   Double checked publishing:
   ```
   cd crates/iceberg
   cargo publish --all-features --dry-run
   ```
   ran successfully


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Tracking issues of Iceberg Rust 0.5.0 Release (May 2025) [iceberg-rust]

2025-05-26 Thread via GitHub


kevinjqliu commented on issue #1325:
URL: https://github.com/apache/iceberg-rust/issues/1325#issuecomment-2910986496

   @liurenjie1024 should we yank pyiceberg-core 0.5.0 
(https://pypi.org/project/pyiceberg-core/0.5.0/) since it's not an official 
release? 
   I don't have the proper permission to do that on PyPI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Bump iceberg-rust version to 0.5.1 [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 merged PR #1380:
URL: https://github.com/apache/iceberg-rust/pull/1380


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: support decompress gzip metadata [iceberg-cpp]

2025-05-26 Thread via GitHub


dongxiao1198 commented on code in PR #108:
URL: https://github.com/apache/iceberg-cpp/pull/108#discussion_r2107001574


##
src/iceberg/util/gzip_internal.cc:
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include "iceberg/util/gzip_internal.h"
+
+#include 
+
+#include 
+
+#include "iceberg/util/macros.h"
+
+namespace iceberg {
+
+class ZlibImpl {

Review Comment:
   Since zlib exposes a C interface, it is better to use the destructor to make 
sure there is no memory leak.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] test: Add missing tests for update_namespace method in sql catalog [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 commented on PR #1373:
URL: https://github.com/apache/iceberg-rust/pull/1373#issuecomment-2909262129

   > I'm looking at the behaviour of the method, and it seems as though an edge 
case may cause unwanted behaviour.
   > 
   > The method is structured into two parts
   > 
   > 1. Read the table of namespace properties to separate the property edits 
into inserts and updates
   > 2. Make a transaction to do all inserts and updates at the same time
   > 
   > Consider two calls both trying to add different values to the same 
previously non-existent property: If their reads both happen before either of 
their transactions, they will both decide they must insert their properties. 
Then, one will successfully insert it and the other's transaction will fail 
because the property is already present, returning an SQLX error.
   > 
   > Is this acceptable behaviour, or should we move to using `UPSERT` instead?
   
   I think the current behavior is acceptable; using `UPSERT` may lead to lost 
updates from the first transaction.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] AWS: Close the S3SeekableInputStreamFactory before removing from cache [iceberg]

2025-05-26 Thread via GitHub


nastra merged PR #12891:
URL: https://github.com/apache/iceberg/pull/12891


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Feature/write to branch [iceberg-python]

2025-05-26 Thread via GitHub


SebastienN15 closed pull request #2009: Feature/write to branch
URL: https://github.com/apache/iceberg-python/pull/2009


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Incrementally computing partition stats can miss deleted files [iceberg]

2025-05-26 Thread via GitHub


ajantha-bhat commented on issue #13155:
URL: https://github.com/apache/iceberg/issues/13155#issuecomment-2909461143

   Hi @lirui-apache, thanks for reporting this. 
   
   Looks like this is an edge case with copy-on-write deletes (the test cases 
cover only merge-on-read!). 
   
   I assumed that the current snapshot tracks/reuses the manifests from the 
previous snapshot, so that `PartitionStatsHandler.computeStats` would include 
the previous manifests, since the predicate includes manifests with older 
snapshot ids. 
   
   But in this case `snapshot.allManifests()` gives only one manifest file, 
containing the second delete. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[PR] feat(catalog/glue): add option to customize IO loader function [iceberg-go]

2025-05-26 Thread via GitHub


vbekiaris opened a new pull request, #441:
URL: https://github.com/apache/iceberg-go/pull/441

   Adds an option to customize the function used to determine which `io.IO` 
implementation to load when constructing a Glue catalog.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Docs: add Tinybird to the list of vendors and blog posts [iceberg]

2025-05-26 Thread via GitHub


nastra commented on code in PR #13128:
URL: https://github.com/apache/iceberg/pull/13128#discussion_r2106662135


##
site/docs/vendors.md:
##
@@ -124,10 +124,18 @@ Starburst is a commercial offering for the [Trino query 
engine](https://trino.io
 
 [Tabular](https://tabular.io/) is a managed warehouse and automation platform. 
Tabular offers a central store for analytic data that can be used with any 
query engine or processing framework that supports Iceberg. Tabular warehouses 
add role-based access control and automatic optimization, clustering, and 
compaction to Iceberg tables.
 
+### [Tinybird](https://tinybird.co)
+
+[Tinybird](https://tinybird.co) is a real-time data platform that lets 
developers and data teams build fast APIs on top of analytical data using SQL. 
It now offers native support for Apache Iceberg through ClickHouse’s [iceberg() 
table 
function](https://www.tinybird.co/docs/forward/get-data-in/table-functions/iceberg),
 allowing seamless querying of Iceberg tables stored in S3.
+
+This integration enables low-latency, high-concurrency access to Iceberg data, 
with Tinybird handling ingestion, transformation, and API publishing. 
Developers can now leverage Iceberg for open storage and governance, while 
using Tinybird for blazing-fast query performance and real-time delivery.
+
+Learn more in the [Tinybird 
documentation](https://www.tinybird.co/docs/forward/get-data-in/table-functions/iceberg).
+
 ### [Upsolver](https://upsolver.com)
 
 [Upsolver](https://upsolver.com) is a streaming data ingestion and table 
management solution for Apache Iceberg. With Upsolver, users can easily ingest 
batch and streaming data from files, streams and databases (CDC) into [Iceberg 
tables](https://docs.upsolver.com/content/reference-1/sql-commands/iceberg-tables/upsolver-managed-tables).
 In addition, Upsolver connects to your existing REST and Hive catalogs, and 
[analyzes the 
health](https://docs.upsolver.com/content/how-to-guides-1/apache-iceberg/optimize-your-iceberg-tables)
 of your tables. Use Upsolver to continuously optimize tables by compacting 
small files, sorting and compressing, repartitioning, and cleaning up dangling 
files and expired manifests. Upsolver is available from the [Upsolver 
Cloud](https://www.upsolver.com/sqlake-signup-wp) or can be deployed in your 
AWS VPC.
 
 ### [VeloDB](https://velodb.io)
 
-[VeloDB](https://www.velodb.io/) is a commercial data warehouse powered by 
[Apache Doris](https://doris.apache.org/), an open-source, real-time data 
warehouse. It also provides powerful [query acceleration for Iceberg tables and 
efficient data 
writeback](https://doris.apache.org/docs/dev/lakehouse/catalogs/iceberg-catalog).
 VeloDB offers [enterprise version](https://www.velodb.io/enterprise) and 
[cloud service](https://www.velodb.io/cloud), which are fully compatible with 
open-source Apache Doris. Quick start with Apache Doris and Apache Iceberg 
[here](https://doris.apache.org/docs/lakehouse/lakehouse-best-practices/doris-iceberg).
+[VeloDB](https://www.velodb.io/) is a commercial data warehouse powered by 
[Apache Doris](https://doris.apache.org/), an open-source, real-time data 
warehouse. It also provides powerful [query acceleration for Iceberg tables and 
efficient data 
writeback](https://doris.apache.org/docs/dev/lakehouse/catalogs/iceberg-catalog).
 VeloDB offers [enterprise version](https://www.velodb.io/enterprise) and 
[cloud service](https://www.velodb.io/cloud), which are fully compatible with 
open-source Apache Doris. Quick start with Apache Doris and Apache Iceberg 
[here](https://doris.apache.org/docs/lakehouse/lakehouse-best-practices/doris-iceberg).

Review Comment:
   seems like this removed a newline at the end of the file



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Flink: Backport fix npe in TaskResultAggregator when job recovery to Flink 1.19 and 1.20 [iceberg]

2025-05-26 Thread via GitHub


pvary merged PR #13140:
URL: https://github.com/apache/iceberg/pull/13140


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] test: Add missing tests for update_namespace method in sql catalog [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 merged PR #1373:
URL: https://github.com/apache/iceberg-rust/pull/1373


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Flink: Support compact in iceberg sink v2 [iceberg]

2025-05-26 Thread via GitHub


pvary commented on code in PR #12979:
URL: https://github.com/apache/iceberg/pull/12979#discussion_r2106806912


##
flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/sink/CommittableToTableChangeConverter.java:
##
@@ -0,0 +1,181 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.flink.sink;
+
+import java.io.IOException;
+import java.util.List;
+import org.apache.flink.api.common.functions.OpenContext;
+import org.apache.flink.api.common.state.CheckpointListener;
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.core.io.SimpleVersionedSerialization;
+import org.apache.flink.runtime.state.FunctionInitializationContext;
+import org.apache.flink.runtime.state.FunctionSnapshotContext;
+import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
+import org.apache.flink.streaming.api.connector.sink2.CommittableMessage;
+import org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage;
+import org.apache.flink.streaming.api.functions.ProcessFunction;
+import org.apache.flink.util.Collector;
+import org.apache.iceberg.DataFile;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.ManifestFile;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.flink.TableLoader;
+import org.apache.iceberg.flink.maintenance.operator.TableChange;
+import org.apache.iceberg.io.WriteResult;
+import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
+import org.apache.iceberg.relocated.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class CommittableToTableChangeConverter
+extends ProcessFunction, 
TableChange>
+implements CheckpointedFunction, CheckpointListener {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(CommittableToTableChangeConverter.class);
+
+  private final TableLoader tableLoader;
+  private transient Table table;
+  private transient ListState manifestFilesToRemoveState;
+  private transient List manifestFilesToRemoveList;

Review Comment:
   What happens when a committable is retried? There is some logic in the 
`IcebergCommitter` to avoid duplicate commits, and we use 
`signalAlreadyCommitted` to mark them as they are already committed. Do we 
still emit them to the `PostCommitTopology`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Spark: add IcebergConnectHiveDelegationTokenProvider [iceberg]

2025-05-26 Thread via GitHub


gaborgsomogyi commented on issue #13116:
URL: https://github.com/apache/iceberg/issues/13116#issuecomment-2909026174

   This can significantly decrease the number of authentication requests, 
allowing more scalable applications to be deployed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: implement initial MemoryCatalog functionality with namespace and table support [iceberg-cpp]

2025-05-26 Thread via GitHub


gty404 commented on PR #80:
URL: https://github.com/apache/iceberg-cpp/pull/80#issuecomment-2909485546

   @Xuanwo  @Fokko  Could you help review this? Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Incrementally computing partition stats can miss deleted files [iceberg]

2025-05-26 Thread via GitHub


ajantha-bhat commented on issue #13155:
URL: https://github.com/apache/iceberg/issues/13155#issuecomment-2909762512

   > I think when a snapshot is being created, previous manifests without live 
entries will not be tracked. So maybe we have to apply the incremental 
snapshots one by one? And probably limit the max number of such snapshots, i.e. 
if there're too many, a full computation could be a better choice.
   
   We can apply them one by one and throw an exception if any snapshots in 
between were removed (expired), so the user falls back to a full computation? 
That's what makes sense to me. 
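   
   As a rough illustration of that "apply one by one, otherwise recompute" idea, 
here is a minimal Java sketch that only uses the public `Table`/`Snapshot` API; 
the helper name, the `lastStatsSnapshotId` parameter, and the exception wording 
are made up for the example and are not part of `PartitionStatsHandler`:
   
   ```java
   import java.util.ArrayDeque;
   import java.util.Deque;
   import org.apache.iceberg.Snapshot;
   import org.apache.iceberg.Table;
   
   public class IncrementalStatsPlanner {
   
     private IncrementalStatsPlanner() {}
   
     // Collects the snapshots after the last stats snapshot up to the current one,
     // oldest first, so a caller could apply them incrementally one by one.
     // Throws if the ancestry is broken (e.g. snapshots were expired), in which
     // case a full recomputation is the safer fallback.
     static Deque<Snapshot> snapshotsToApply(Table table, long lastStatsSnapshotId) {
       Deque<Snapshot> toApply = new ArrayDeque<>();
       Snapshot current = table.currentSnapshot();
       while (current != null && current.snapshotId() != lastStatsSnapshotId) {
         toApply.addFirst(current); // keep oldest-first order
         Long parentId = current.parentId();
         current = parentId != null ? table.snapshot(parentId) : null;
       }
   
       if (current == null) {
         throw new IllegalStateException(
             "Snapshot " + lastStatsSnapshotId + " is not reachable from the current snapshot "
                 + "(likely expired); fall back to a full partition stats computation");
       }
   
       return toApply;
     }
   }
   ```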


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Flink: Support compact in iceberg sink v2 [iceberg]

2025-05-26 Thread via GitHub


Guosmilesmile commented on code in PR #12979:
URL: https://github.com/apache/iceberg/pull/12979#discussion_r2107397429


##
flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/sink/CommittableToTableChangeConverter.java:
##
@@ -0,0 +1,181 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.flink.sink;
+
+import java.io.IOException;
+import java.util.List;
+import org.apache.flink.api.common.functions.OpenContext;
+import org.apache.flink.api.common.state.CheckpointListener;
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.core.io.SimpleVersionedSerialization;
+import org.apache.flink.runtime.state.FunctionInitializationContext;
+import org.apache.flink.runtime.state.FunctionSnapshotContext;
+import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
+import org.apache.flink.streaming.api.connector.sink2.CommittableMessage;
+import org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage;
+import org.apache.flink.streaming.api.functions.ProcessFunction;
+import org.apache.flink.util.Collector;
+import org.apache.iceberg.DataFile;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.ManifestFile;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.flink.TableLoader;
+import org.apache.iceberg.flink.maintenance.operator.TableChange;
+import org.apache.iceberg.io.WriteResult;
+import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
+import org.apache.iceberg.relocated.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class CommittableToTableChangeConverter
+extends ProcessFunction, 
TableChange>
+implements CheckpointedFunction, CheckpointListener {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(CommittableToTableChangeConverter.class);
+
+  private final TableLoader tableLoader;
+  private transient Table table;
+  private transient ListState manifestFilesToRemoveState;
+  private transient List manifestFilesToRemoveList;
+  private transient long lastCompletedCheckpointId = -1L;
+  private transient String flinkJobId;
+
+  public CommittableToTableChangeConverter(TableLoader tableLoader) {
+Preconditions.checkNotNull(tableLoader, "TableLoader should not be null");
+this.tableLoader = tableLoader;
+  }
+
+  @Override
+  public void initializeState(FunctionInitializationContext context) throws 
Exception {
+this.manifestFilesToRemoveList = Lists.newArrayList();
+this.manifestFilesToRemoveState =
+context
+.getOperatorStateStore()
+.getListState(new ListStateDescriptor<>("manifests-to-remove", 
ManifestFile.class));
+if (context.isRestored()) {
+  manifestFilesToRemoveList = 
Lists.newArrayList(manifestFilesToRemoveState.get());
+}
+  }
+
+  @Override
+  public void open(OpenContext openContext) throws Exception {
+super.open(openContext);
+this.flinkJobId = getRuntimeContext().getJobId().toString();
+if (!tableLoader.isOpen()) {
+  tableLoader.open();
+}
+this.table = tableLoader.loadTable();
+  }
+
+  @Override
+  public void snapshotState(FunctionSnapshotContext context) throws Exception {
+manifestFilesToRemoveState.update(manifestFilesToRemoveList);
+  }
+
+  @Override
+  public void processElement(
+  CommittableMessage value,
+  ProcessFunction, 
TableChange>.Context ctx,
+  Collector out)
+  throws Exception {
+if (value instanceof CommittableWithLineage) {
+  CommittableWithLineage committable =
+  (CommittableWithLineage) value;
+  TableChange tableChange = 
convertToTableChange(committable.getCommittable());
+  out.collect(tableChange);
+}

Review Comment:
   Good suggestion! Got it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Flink: Support compact in iceberg sink v2 [iceberg]

2025-05-26 Thread via GitHub


Guosmilesmile commented on code in PR #12979:
URL: https://github.com/apache/iceberg/pull/12979#discussion_r2107399213


##
flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/RewriteDataFilesConfig.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.flink.maintenance.api;
+
+import java.util.Map;
+import org.apache.iceberg.actions.RewriteDataFiles;
+import org.apache.iceberg.relocated.com.google.common.collect.Maps;
+import org.apache.iceberg.util.PropertyUtil;
+
+public class RewriteDataFilesConfig {
+  public static final String CONFIG_PREFIX = 
TableMaintenanceConfig.CONFIG_PREFIX + "rewrite.";
+
+  private final Map properties;
+
+  public RewriteDataFilesConfig(Map newProperties) {
+this.properties = Maps.newHashMap();
+newProperties.forEach(
+(key, value) -> {
+  if (key.startsWith(CONFIG_PREFIX)) {
+properties.put(key.substring(CONFIG_PREFIX.length()), value);
+  }
+});
+  }
+
+  public static final String PARTIAL_PROGRESS_ENABLE =
+  org.apache.iceberg.actions.RewriteDataFiles.PARTIAL_PROGRESS_ENABLED;
+
+  public static final String PARTIAL_PROGRESS_MAX_COMMITS =
+  org.apache.iceberg.actions.RewriteDataFiles.PARTIAL_PROGRESS_MAX_COMMITS;

Review Comment:
   I removed it and used this variable directly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Flink: Support compact in iceberg sink v2 [iceberg]

2025-05-26 Thread via GitHub


Guosmilesmile commented on code in PR #12979:
URL: https://github.com/apache/iceberg/pull/12979#discussion_r2107402061


##
flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/TableMaintenanceConfig.java:
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.flink.maintenance.api;
+
+import java.util.Map;
+import org.apache.iceberg.util.PropertyUtil;
+
+public class TableMaintenanceConfig {

Review Comment:
   I added FlinkConfParser to FlinkMaintenanceConfig and use it like 
`FlinkWriteConf` and `FlinkReadConf`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: Add basic classes for writing table format-version 4 [iceberg]

2025-05-26 Thread via GitHub


nastra commented on code in PR #13123:
URL: https://github.com/apache/iceberg/pull/13123#discussion_r2106692536


##
api/src/test/java/org/apache/iceberg/TestHelpers.java:
##
@@ -54,7 +54,7 @@ public class TestHelpers {
 
   private TestHelpers() {}
 
-  public static final int MAX_FORMAT_VERSION = 3;
+  public static final int MAX_FORMAT_VERSION = 4;

Review Comment:
   @ajantha-bhat are you referring to the version bump here in `TestHelpers` or 
to the entire diff in this PR?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: Add basic classes for writing table format-version 4 [iceberg]

2025-05-26 Thread via GitHub


ajantha-bhat commented on code in PR #13123:
URL: https://github.com/apache/iceberg/pull/13123#discussion_r2106699014


##
api/src/test/java/org/apache/iceberg/TestHelpers.java:
##
@@ -54,7 +54,7 @@ public class TestHelpers {
 
   private TestHelpers() {}
 
-  public static final int MAX_FORMAT_VERSION = 3;
+  public static final int MAX_FORMAT_VERSION = 4;

Review Comment:
   Mainly in the `TableMetadata.java`, 
   
   I am not against these changes. I would like to quickly jump on v4 
development as well. Just that it is odd for the users that Iceberg don't have 
spec v4 documented or approved but allows table creation with v4. Maybe better 
if we get one of the v4 feature spec approved and start enabling v4 table 
creation with that feature enabled. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Docs, Flink: Fix equality fields requirement in upsert mode [iceberg]

2025-05-26 Thread via GitHub


pvary commented on code in PR #13127:
URL: https://github.com/apache/iceberg/pull/13127#discussion_r2106716484


##
docs/docs/flink-writes.md:
##
@@ -75,7 +75,7 @@ Iceberg supports `UPSERT` based on the primary key when 
writing data into v2 tab
 ```
 
 !!! info
-OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the table 
is partitioned, the partition fields should be included in equality fields.
+OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the table 
is partitioned, the source columns of partition fields should be included in 
equality fields. For partition field `days(ts)`, the source column `ts` should 
be included in equality fields.

Review Comment:
   Maybe:
   
   > OVERWRITE and UPSERT modes are mutually exclusive and cannot be enabled at 
the same time. When using UPSERT mode with a partitioned table, all of the 
source columns corresponding to the partition fields must be included in the 
equality fields. For example, if the partition field is `days(ts)`, then the id 
of the column `ts` must be part of the equality fields.
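   
   To make the requirement concrete, a minimal DataStream sketch assuming a 
table partitioned by `days(ts)` with a logical key column `id`; the column 
names and the surrounding setup are assumptions for the example, not part of 
the documentation change itself:
   
   ```java
   import java.util.Arrays;
   import org.apache.flink.streaming.api.datastream.DataStream;
   import org.apache.flink.table.data.RowData;
   import org.apache.iceberg.flink.TableLoader;
   import org.apache.iceberg.flink.sink.FlinkSink;
   
   public class UpsertSinkExample {
   
     // Upsert sink for a table partitioned by days(ts): the partition source column
     // "ts" must be listed in the equality fields together with the key column "id".
     public static void appendUpsertSink(DataStream<RowData> rows, TableLoader tableLoader) {
       FlinkSink.forRowData(rows)
           .tableLoader(tableLoader)
           .equalityFieldColumns(Arrays.asList("id", "ts")) // include the partition source column
           .upsert(true) // upsert mode; overwrite must stay disabled
           .append();
     }
   }
   ```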



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Flink: Backport fix npe in TaskResultAggregator when job recovery to Flink 1.19 and 1.20 [iceberg]

2025-05-26 Thread via GitHub


pvary commented on PR #13140:
URL: https://github.com/apache/iceberg/pull/13140#issuecomment-2908797656

   Merged to main.
   Thanks for the backport @Guosmilesmile!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[I] doc: Add how to verify release section in doc. [iceberg-rust]

2025-05-26 Thread via GitHub


liurenjie1024 opened a new issue, #1378:
URL: https://github.com/apache/iceberg-rust/issues/1378

   ### Is your feature request related to a problem or challenge?
   
   Add a section to the docs on how to verify a release.
   
   ### Describe the solution you'd like
   
   _No response_
   
   ### Willingness to contribute
   
   I can contribute to this feature independently


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org


