date:20240308

Re: [I] About /v1/config REST API endpoint [iceberg]

2024-03-08 Thread via GitHub



nastra commented on issue #9880:
URL: https://github.com/apache/iceberg/issues/9880#issuecomment-1985238077

   The `warehouse` query parameter for that endpoint refers to 
https://github.com/apache/iceberg/blob/5b84f34a5386fc61b17bfe7dc7c1cbe565550958/core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java#L896-L906,
 indicating that the client can send the used warehouse location to the server 
to keep the location in-sync.
   
   The response in the config endpoint then has defaults and overrides, which 
can control how the catalog is configured. I don't think any other catalog 
properties make sense to pass as a query param, because `warehouse` might be 
the only one that can be configured on the client-side and might be used on the 
server-side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#discussion_r1517380056


##
core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java:
##
@@ -394,20 +430,24 @@ public static class AuthSession {
 private volatile boolean keepRefreshed = true;
 private final String oauth2ServerUri;
 
+private volatile Map optionalOAuthParams = 
ImmutableMap.of();

Review Comment:
   why does this have to be `volatile`? The others are `volatile` because they 
are being updated when the session token is refreshed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#discussion_r1517381648


##
core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java:
##
@@ -1444,7 +1484,7 @@ public void testCatalogTokenRefresh(String 
oauth2ServerUri) {
   eq(HTTPMethod.POST),
   eq(oauth2ServerUri),
   any(),
-  Mockito.argThat(firstRefreshRequest::equals),

Review Comment:
   unrelated change



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#discussion_r1517389925


##
core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java:
##
@@ -471,7 +515,13 @@ static void setTokenRefreshNumRetries(int retries) {
  */
 public static AuthSession empty() {
   return new AuthSession(
-  ImmutableMap.of(), null, null, null, OAuth2Properties.CATALOG_SCOPE, 
null);
+  ImmutableMap.of(),
+  null,
+  null,
+  null,
+  OAuth2Properties.CATALOG_SCOPE,
+  null,
+  ImmutableMap.of());

Review Comment:
   it has become quite painful to add new functionality/parameters to 
`AuthSession`, since it requires adding a single param at a bunch of places. 
The most recent change was done by #8976.
   
   A while ago I experimented with an 
[`AuthConfig`](https://github.com/apache/iceberg/commit/3bed59907f4230d10a97742c58ab4b85ce7b1ebf#diff-d783470ff004c360395125d467f1aadd776c8c9e925ba52639a405a03a72b870R647),
 which would carry all parameters required for an `AuthSession`, so that it 
would be easier to add new configuration parameters to it. So I wonder if we 
should refactor `AuthSession` and its configuration so that it's less painful 
to add new stuff to it?
   
   /cc @danielcweeks @amogh-jahagirdar @rdblue 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Add test for renaming table to a non-existing namespace [iceberg]

2024-03-08 Thread via GitHub



nastra commented on PR #9895:
URL: https://github.com/apache/iceberg/pull/9895#issuecomment-1985258825

   thanks everyone for reviewing


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Add test for renaming table to a non-existing namespace [iceberg]

2024-03-08 Thread via GitHub



nastra merged PR #9895:
URL: https://github.com/apache/iceberg/pull/9895


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#discussion_r1517380056


##
core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java:
##
@@ -394,20 +430,24 @@ public static class AuthSession {
 private volatile boolean keepRefreshed = true;
 private final String oauth2ServerUri;
 
+private volatile Map optionalOAuthParams = 
ImmutableMap.of();

Review Comment:
   why does this have to be `volatile`? The others are `volatile` because they 
are being updated when the session token is refreshed but I don't think this 
appleis for that map



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



himadripal commented on code in PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#discussion_r1517402474


##
core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java:
##
@@ -471,7 +515,13 @@ static void setTokenRefreshNumRetries(int retries) {
  */
 public static AuthSession empty() {
   return new AuthSession(
-  ImmutableMap.of(), null, null, null, OAuth2Properties.CATALOG_SCOPE, 
null);
+  ImmutableMap.of(),
+  null,
+  null,
+  null,
+  OAuth2Properties.CATALOG_SCOPE,
+  null,
+  ImmutableMap.of());

Review Comment:
   +1 to the painful to add parameters. With this PR, I have tried to add all 
optional oauth parameters (including scope) in a map but having a class like 
`[AuthConfig](https://github.com/apache/iceberg/commit/3bed59907f4230d10a97742c58ab4b85ce7b1ebf#diff-d783470ff004c360395125d467f1aadd776c8c9e925ba52639a405a03a72b870R647),`
 which holds all params with validation logic and passing it around would be 
best. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



himadripal commented on code in PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#discussion_r1517402474


##
core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java:
##
@@ -471,7 +515,13 @@ static void setTokenRefreshNumRetries(int retries) {
  */
 public static AuthSession empty() {
   return new AuthSession(
-  ImmutableMap.of(), null, null, null, OAuth2Properties.CATALOG_SCOPE, 
null);
+  ImmutableMap.of(),
+  null,
+  null,
+  null,
+  OAuth2Properties.CATALOG_SCOPE,
+  null,
+  ImmutableMap.of());

Review Comment:
   +1 to the painful to add parameters. With this PR, I have tried to add all 
optional oauth parameters (including scope) in a map but having a class like 
`[AuthConfig](https://github.com/apache/iceberg/commit/3bed59907f4230d10a97742c58ab4b85ce7b1ebf#diff-d783470ff004c360395125d467f1aadd776c8c9e925ba52639a405a03a72b870R647)`
 which holds all params with validation logic and passing it around would be 
best. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



himadripal commented on code in PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#discussion_r1517402474


##
core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java:
##
@@ -471,7 +515,13 @@ static void setTokenRefreshNumRetries(int retries) {
  */
 public static AuthSession empty() {
   return new AuthSession(
-  ImmutableMap.of(), null, null, null, OAuth2Properties.CATALOG_SCOPE, 
null);
+  ImmutableMap.of(),
+  null,
+  null,
+  null,
+  OAuth2Properties.CATALOG_SCOPE,
+  null,
+  ImmutableMap.of());

Review Comment:
   +1 to the painful to add parameters. With this PR, I have tried to add all 
optional oauth parameters (including scope) in a map but having a class like 
`AuthConfig` which holds all params with validation logic and passing it around 
would be best. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



nastra commented on PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#issuecomment-1985279889

   overall I think adding these optional OAuth2 params make sense to me. I'll 
also link to the [OAuth2 
spec](https://datatracker.ietf.org/doc/html/rfc8693#name-token-exchange-request-and-)
 for other reviewers to better understand what those optional parameters do.
   I think we should decide how we want to deal with `AuthSession` 
configuration paramters as it causes lots of changes to add a single parameter 
to `AuthSession`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



himadripal commented on code in PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#discussion_r1517414956


##
core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java:
##
@@ -257,8 +292,9 @@ private static Map tokenExchangeRequest(
   formData.put(ACTOR_TOKEN, actorToken);
   formData.put(ACTOR_TOKEN_TYPE, actorTokenType);
 }
+formData.putAll(optionalParams);
 
-return formData.build();
+return formData.buildKeepingLast();

Review Comment:
   please note, changed it to `buildKeepingLast()` , I noticed everywhere else 
`build` is used which throws exception if it encounters duplicate keys.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] [feature request] easier API to set table properties [iceberg-python]

2024-03-08 Thread via GitHub



HonahX commented on issue #502:
URL: https://github.com/apache/iceberg-python/issues/502#issuecomment-1985304133

   Thanks @syun64 @kevinjqliu for the thoughts and explanation. Since users 
seem to benefit from dictionaries and kwargs, I think it is good idea to 
support both (as in #503 )


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Set table properties with dictionary [iceberg-python]

2024-03-08 Thread via GitHub



HonahX merged PR #503:
URL: https://github.com/apache/iceberg-python/pull/503


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] [feature request] easier API to set table properties [iceberg-python]

2024-03-08 Thread via GitHub



HonahX closed issue #502: [feature request] easier API to set table properties 
URL: https://github.com/apache/iceberg-python/issues/502


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Imports decouple [iceberg-python]

2024-03-08 Thread via GitHub



HonahX merged PR #505:
URL: https://github.com/apache/iceberg-python/pull/505


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] About /v1/config REST API endpoint [iceberg]

2024-03-08 Thread via GitHub



ajantha-bhat commented on issue #9880:
URL: https://github.com/apache/iceberg/issues/9880#issuecomment-1985360044

   Thanks. But may be we should use a separate PATCH/PUT instead of sending in 
a GET request as we want to update the resource here. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Free disk space before running action in Spark CI [iceberg]

2024-03-08 Thread via GitHub



nastra commented on PR #9786:
URL: https://github.com/apache/iceberg/pull/9786#issuecomment-1985450508

   Seems to be still an issue in the latest runs: 
https://github.com/apache/iceberg/actions/runs/8200693411/job/22427953001


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



tomtongue commented on PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#issuecomment-1985474689

   @nastra Update the snapshot files. Could you review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Handle listing tables/views in empty namespaces [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9890:
URL: https://github.com/apache/iceberg/pull/9890#discussion_r1517567287


##
core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java:
##
@@ -168,6 +168,14 @@ protected boolean supportsNamesWithDot() {
 return true;
   }
 
+  protected boolean supportsCreatingEmptyNamespace() {

Review Comment:
   currently, the only reason that two flags exist is because Nessie's behavior 
is different from JDBC + InMemory. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Handle listing tables/views in empty namespaces [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9890:
URL: https://github.com/apache/iceberg/pull/9890#discussion_r1517571160


##
core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java:
##
@@ -987,6 +995,61 @@ public void testListTables() {
 assertEmpty("Should not contain ns_2.table_1 after drop", catalog, ns2);
   }
 
+  @Test
+  public void listNamespacesWithEmptyNamespace() {

Review Comment:
   this behavior applies for all catalogs



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Handle listing tables/views in empty namespaces [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9890:
URL: https://github.com/apache/iceberg/pull/9890#discussion_r1517613054


##
core/src/main/java/org/apache/iceberg/inmemory/InMemoryCatalog.java:
##
@@ -395,7 +395,9 @@ public void doCommit(TableMetadata base, TableMetadata 
metadata) {
   String oldLocation = base == null ? null : base.metadataFileLocation();
 
   synchronized (InMemoryCatalog.this) {
-if (null == base && !namespaceExists(tableIdentifier.namespace())) {
+if (null == base
+&& !namespaceExists(tableIdentifier.namespace())
+&& !tableIdentifier.namespace().isEmpty()) {

Review Comment:
   I've reverted this change as it should actually require that the empty 
namespace has been created first



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517614107


##
core/src/test/java/org/apache/iceberg/LocalTableOperations.java:
##
@@ -63,7 +65,9 @@ public String metadataFileLocation(String fileName) {
 fileName,
 name -> {
   try {
-return temp.newFile(name).getAbsolutePath();
+return java.nio.file.Files.createTempDirectory(temp, "junit")

Review Comment:
   shouldn't this create a temp file with the given name rather than a temp 
directory?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517614411


##
core/src/test/java/org/apache/iceberg/TestSnapshotJson.java:
##
@@ -19,23 +19,24 @@
 package org.apache.iceberg;
 
 import static org.apache.iceberg.Files.localInput;
+import static org.assertj.core.api.Assertions.assertThat;
 
 import java.io.File;
 import java.io.IOException;
+import java.nio.file.Path;
 import java.util.List;
-import java.util.UUID;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
-import org.assertj.core.api.Assertions;
-import org.junit.Assert;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.TemporaryFolder;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
 
 public class TestSnapshotJson {
-  @Rule public TemporaryFolder temp = new TemporaryFolder();
+  @TempDir private static Path temp;

Review Comment:
   ```suggestion
 @TempDir private Path temp;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517614955


##
core/src/test/java/org/apache/iceberg/TestSnapshotJson.java:
##
@@ -19,23 +19,24 @@
 package org.apache.iceberg;
 
 import static org.apache.iceberg.Files.localInput;
+import static org.assertj.core.api.Assertions.assertThat;
 
 import java.io.File;
 import java.io.IOException;
+import java.nio.file.Path;
 import java.util.List;
-import java.util.UUID;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
-import org.assertj.core.api.Assertions;
-import org.junit.Assert;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.TemporaryFolder;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
 
 public class TestSnapshotJson {
-  @Rule public TemporaryFolder temp = new TemporaryFolder();
+  @TempDir private static Path temp;
 
-  public TableOperations ops = new LocalTableOperations(temp);
+  public TableOperations ops =
+  new LocalTableOperations(java.nio.file.Files.createTempDirectory(temp, 
"junit"));

Review Comment:
   why does this create a new temp directory? I think it should just pass `temp`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517615186


##
core/src/test/java/org/apache/iceberg/TestSnapshotJson.java:
##
@@ -19,23 +19,24 @@
 package org.apache.iceberg;
 
 import static org.apache.iceberg.Files.localInput;
+import static org.assertj.core.api.Assertions.assertThat;
 
 import java.io.File;
 import java.io.IOException;
+import java.nio.file.Path;
 import java.util.List;
-import java.util.UUID;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
-import org.assertj.core.api.Assertions;
-import org.junit.Assert;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.TemporaryFolder;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
 
 public class TestSnapshotJson {
-  @Rule public TemporaryFolder temp = new TemporaryFolder();
+  @TempDir private static Path temp;
 
-  public TableOperations ops = new LocalTableOperations(temp);
+  public TableOperations ops =
+  new LocalTableOperations(java.nio.file.Files.createTempDirectory(temp, 
"junit"));
+
+  public TestSnapshotJson() throws IOException {}

Review Comment:
   why is the constructor here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517616912


##
core/src/test/java/org/apache/iceberg/TestSnapshotJson.java:
##
@@ -149,29 +182,34 @@ public void testJsonConversionWithV1Manifests() {
 timestampMillis);
 
 String json = SnapshotParser.toJson(expected, true);
-Assertions.assertThat(json).isEqualTo(expectedJson);
+assertThat(json).isEqualTo(expectedJson);
 Snapshot snapshot = SnapshotParser.fromJson(json);
-Assertions.assertThat(snapshot).isEqualTo(expected);
-
-Assert.assertEquals("Sequence number should default to 0 for v1", 0, 
snapshot.sequenceNumber());
-Assert.assertEquals("Snapshot ID should match", expected.snapshotId(), 
snapshot.snapshotId());
-Assert.assertEquals(
-"Timestamp should match", expected.timestampMillis(), 
snapshot.timestampMillis());
-Assert.assertEquals("Parent ID should match", expected.parentId(), 
snapshot.parentId());
-Assert.assertEquals(
-"Manifest list should match",
-expected.manifestListLocation(),
-snapshot.manifestListLocation());
-Assert.assertEquals(
-"Files should match", expected.allManifests(ops.io()), 
snapshot.allManifests(ops.io()));
-Assert.assertEquals("Operation should match", expected.operation(), 
snapshot.operation());
-Assert.assertEquals("Summary should match", expected.summary(), 
snapshot.summary());
-Assert.assertEquals("Schema ID should match", expected.schemaId(), 
snapshot.schemaId());
+assertThat(snapshot).isEqualTo(expected);
+
+assertThat(snapshot.sequenceNumber()).isEqualTo(0);
+assertThat(snapshot.snapshotId()).isEqualTo(expected.snapshotId());
+
assertThat(snapshot.timestampMillis()).isEqualTo(expected.timestampMillis());
+assertThat(snapshot.parentId()).isEqualTo(expected.parentId());
+
assertThat(snapshot.manifestListLocation()).isEqualTo(expected.manifestListLocation());
+
+assertThat(snapshot.allManifests(ops.io())).hasSize(2);
+
+for (int i = 0; i < snapshot.allManifests(ops.io()).size(); i++) {
+  ManifestFile actualManifestFile = snapshot.allManifests(ops.io()).get(i);
+  ManifestFile expectedManifestFile = 
expected.allManifests(ops.io()).get(i);
+
+  
assertThat(actualManifestFile.content()).isEqualTo(expectedManifestFile.content());
+  
assertThat(actualManifestFile.path()).isEqualTo(expectedManifestFile.path());
+}
+
+assertThat(snapshot.operation()).isEqualTo(expected.operation());
+assertThat(snapshot.summary()).isEqualTo(expected.summary());
+assertThat(snapshot.schemaId()).isEqualTo(expected.schemaId());
   }
 
   private String createManifestListWithManifestFiles(long snapshotId, Long 
parentSnapshotId)
   throws IOException {
-File manifestList = temp.newFile("manifests" + UUID.randomUUID());
+File manifestList = java.nio.file.Files.createTempDirectory(temp, 
"junit").toFile();

Review Comment:
   this should create a new temp file instead of a temp directory



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517617834


##
core/src/test/java/org/apache/iceberg/TestSnapshotJson.java:
##
@@ -49,12 +50,23 @@ public void testJsonConversion() throws IOException {
 String json = SnapshotParser.toJson(expected);
 Snapshot snapshot = SnapshotParser.fromJson(json);
 
-Assert.assertEquals("Snapshot ID should match", expected.snapshotId(), 
snapshot.snapshotId());
-Assert.assertEquals(
-"Files should match", expected.allManifests(ops.io()), 
snapshot.allManifests(ops.io()));
-Assert.assertNull("Operation should be null", snapshot.operation());
-Assert.assertNull("Summary should be null", snapshot.summary());
-Assert.assertEquals("Schema ID should match", Integer.valueOf(1), 
snapshot.schemaId());
+assertThat(snapshot.snapshotId()).isEqualTo(expected.snapshotId());
+assertThat(snapshot.allManifests(ops.io())).hasSize(2);
+
+for (int i = 0; i < snapshot.allManifests(ops.io()).size(); i++) {

Review Comment:
   are these new checks? can you elaborate why those are being added?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517620107


##
core/src/test/java/org/apache/iceberg/TestTableMetadata.java:
##
@@ -87,7 +86,7 @@ public class TestTableMetadata {
   .desc(Expressions.bucket("z", 4), NullOrder.NULLS_LAST)
   .build();
 
-  @Rule public TemporaryFolder temp = new TemporaryFolder();
+  @TempDir protected static Path temp;

Review Comment:
   ```suggestion
 @TempDir private Path temp;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517620401


##
core/src/test/java/org/apache/iceberg/TestTableMetadata.java:
##
@@ -1731,7 +1730,7 @@ public void testNoTrailingLocationSlash() {
 
   private String createManifestListWithManifestFile(
   long snapshotId, Long parentSnapshotId, String manifestFile) throws 
IOException {
-File manifestList = temp.newFile("manifests" + UUID.randomUUID());
+File manifestList = java.nio.file.Files.createTempDirectory(temp, 
"junit").toFile();

Review Comment:
   same as above, shouldn't this create a temp file with the given name rather 
than a temp folder?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517621786


##
core/src/test/java/org/apache/iceberg/TestSnapshotManager.java:
##
@@ -177,120 +177,115 @@ public void testCherryPickFromBranch() {
 long lastSnapshotId = table.currentSnapshot().snapshotId();
 
 // pick the snapshot into the current state
-Assertions.assertThatThrownBy(
-() -> 
table.manageSnapshots().cherrypick(replaceSnapshotId).commit())
+assertThatThrownBy(() -> 
table.manageSnapshots().cherrypick(replaceSnapshotId).commit())
 .isInstanceOf(ValidationException.class)
 .hasMessageStartingWith(
 "Cannot cherry-pick overwrite not based on an ancestor of the 
current state");
 
-Assert.assertEquals(
-"Failed cherry-pick should not change the table state",
-lastSnapshotId,
-table.currentSnapshot().snapshotId());
+assertThat(table.currentSnapshot().snapshotId())
+.as("Failed cherry-pick should not change the table state")
+.isEqualTo(lastSnapshotId);
 validateTableFiles(table, FILE_A);
   }
 
-  @Test
+  @TestTemplate
   public void testCherryPickOverwrite() {
 table.newAppend().appendFile(FILE_A).commit();
 
 // stage an overwrite to replace FILE_A
 
table.newOverwrite().deleteFile(FILE_A).addFile(REPLACEMENT_FILE_A).stageOnly().commit();
 
 Snapshot staged = Iterables.getLast(table.snapshots());
-Assert.assertEquals(
-"Should find the staged overwrite snapshot", DataOperations.OVERWRITE, 
staged.operation());
+assertThat(staged.operation())
+.as("Should find the staged overwrite snapshot")
+.isEqualTo(DataOperations.OVERWRITE);
 
 // add another append so that the original commit can't be fast-forwarded
 table.newAppend().appendFile(FILE_B).commit();
 long lastSnapshotId = table.currentSnapshot().snapshotId();
 
 // pick the snapshot into the current state
-Assertions.assertThatThrownBy(
-() -> 
table.manageSnapshots().cherrypick(staged.snapshotId()).commit())
+assertThatThrownBy(() -> 
table.manageSnapshots().cherrypick(staged.snapshotId()).commit())
 .isInstanceOf(ValidationException.class)
 .hasMessageEndingWith("not append, dynamic overwrite, or 
fast-forward");
 
-Assert.assertEquals(
-"Failed cherry-pick should not change the table state",
-lastSnapshotId,
-table.currentSnapshot().snapshotId());
+assertThat(table.currentSnapshot().snapshotId())
+.as("Failed cherry-pick should not change the table state")
+.isEqualTo(lastSnapshotId);
 validateTableFiles(table, FILE_A, FILE_B);
   }
 
-  @Test
+  @TestTemplate
   public void testCreateBranch() {
 table.newAppend().appendFile(FILE_A).commit();
 long snapshotId = table.currentSnapshot().snapshotId();
 // Test a basic case of creating a branch
 table.manageSnapshots().createBranch("branch1", snapshotId).commit();
 SnapshotRef expectedBranch = table.ops().refresh().ref("branch1");
-Assert.assertTrue(
-expectedBranch != null
-&& 
expectedBranch.equals(SnapshotRef.branchBuilder(snapshotId).build()));
+assertThat(expectedBranch).isNotNull();
+
assertThat(expectedBranch).isEqualTo(SnapshotRef.branchBuilder(snapshotId).build());

Review Comment:
   ```suggestion
   
assertThat(expectedBranch).isNotNull().isEqualTo(SnapshotRef.branchBuilder(snapshotId).build());
   ```



##
core/src/test/java/org/apache/iceberg/TestSnapshotManager.java:
##
@@ -177,120 +177,115 @@ public void testCherryPickFromBranch() {
 long lastSnapshotId = table.currentSnapshot().snapshotId();
 
 // pick the snapshot into the current state
-Assertions.assertThatThrownBy(
-() -> 
table.manageSnapshots().cherrypick(replaceSnapshotId).commit())
+assertThatThrownBy(() -> 
table.manageSnapshots().cherrypick(replaceSnapshotId).commit())
 .isInstanceOf(ValidationException.class)
 .hasMessageStartingWith(
 "Cannot cherry-pick overwrite not based on an ancestor of the 
current state");
 
-Assert.assertEquals(
-"Failed cherry-pick should not change the table state",
-lastSnapshotId,
-table.currentSnapshot().snapshotId());
+assertThat(table.currentSnapshot().snapshotId())
+.as("Failed cherry-pick should not change the table state")
+.isEqualTo(lastSnapshotId);
 validateTableFiles(table, FILE_A);
   }
 
-  @Test
+  @TestTemplate
   public void testCherryPickOverwrite() {
 table.newAppend().appendFile(FILE_A).commit();
 
 // stage an overwrite to replace FILE_A
 
table.newOverwrite().deleteFile(FILE_A).addFile(REPLACEMENT_FILE_A).stageOnly().commit();
 
 Snapshot staged = Iterables.getLast(table.snapshots());
-Assert.assertEquals(
-"Should find the staged overwrite snapshot", DataOperations.OVERWRITE, 
staged.oper

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517622216


##
core/src/test/java/org/apache/iceberg/TestSnapshotManager.java:
##
@@ -177,120 +177,115 @@ public void testCherryPickFromBranch() {
 long lastSnapshotId = table.currentSnapshot().snapshotId();
 
 // pick the snapshot into the current state
-Assertions.assertThatThrownBy(
-() -> 
table.manageSnapshots().cherrypick(replaceSnapshotId).commit())
+assertThatThrownBy(() -> 
table.manageSnapshots().cherrypick(replaceSnapshotId).commit())
 .isInstanceOf(ValidationException.class)
 .hasMessageStartingWith(
 "Cannot cherry-pick overwrite not based on an ancestor of the 
current state");
 
-Assert.assertEquals(
-"Failed cherry-pick should not change the table state",
-lastSnapshotId,
-table.currentSnapshot().snapshotId());
+assertThat(table.currentSnapshot().snapshotId())
+.as("Failed cherry-pick should not change the table state")
+.isEqualTo(lastSnapshotId);
 validateTableFiles(table, FILE_A);
   }
 
-  @Test
+  @TestTemplate
   public void testCherryPickOverwrite() {
 table.newAppend().appendFile(FILE_A).commit();
 
 // stage an overwrite to replace FILE_A
 
table.newOverwrite().deleteFile(FILE_A).addFile(REPLACEMENT_FILE_A).stageOnly().commit();
 
 Snapshot staged = Iterables.getLast(table.snapshots());
-Assert.assertEquals(
-"Should find the staged overwrite snapshot", DataOperations.OVERWRITE, 
staged.operation());
+assertThat(staged.operation())
+.as("Should find the staged overwrite snapshot")
+.isEqualTo(DataOperations.OVERWRITE);
 
 // add another append so that the original commit can't be fast-forwarded
 table.newAppend().appendFile(FILE_B).commit();
 long lastSnapshotId = table.currentSnapshot().snapshotId();
 
 // pick the snapshot into the current state
-Assertions.assertThatThrownBy(
-() -> 
table.manageSnapshots().cherrypick(staged.snapshotId()).commit())
+assertThatThrownBy(() -> 
table.manageSnapshots().cherrypick(staged.snapshotId()).commit())
 .isInstanceOf(ValidationException.class)
 .hasMessageEndingWith("not append, dynamic overwrite, or 
fast-forward");
 
-Assert.assertEquals(
-"Failed cherry-pick should not change the table state",
-lastSnapshotId,
-table.currentSnapshot().snapshotId());
+assertThat(table.currentSnapshot().snapshotId())
+.as("Failed cherry-pick should not change the table state")
+.isEqualTo(lastSnapshotId);
 validateTableFiles(table, FILE_A, FILE_B);
   }
 
-  @Test
+  @TestTemplate
   public void testCreateBranch() {
 table.newAppend().appendFile(FILE_A).commit();
 long snapshotId = table.currentSnapshot().snapshotId();
 // Test a basic case of creating a branch
 table.manageSnapshots().createBranch("branch1", snapshotId).commit();
 SnapshotRef expectedBranch = table.ops().refresh().ref("branch1");
-Assert.assertTrue(
-expectedBranch != null
-&& 
expectedBranch.equals(SnapshotRef.branchBuilder(snapshotId).build()));
+assertThat(expectedBranch).isNotNull();
+
assertThat(expectedBranch).isEqualTo(SnapshotRef.branchBuilder(snapshotId).build());
   }
 
-  @Test
+  @TestTemplate
   public void testCreateBranchWithoutSnapshotId() {
 table.newAppend().appendFile(FILE_A).commit();
 long snapshotId = table.currentSnapshot().snapshotId();
 // Test a basic case of creating a branch
 table.manageSnapshots().createBranch("branch1").commit();
 SnapshotRef actualBranch = table.ops().refresh().ref("branch1");
-Assertions.assertThat(actualBranch).isNotNull();
-
Assertions.assertThat(actualBranch).isEqualTo(SnapshotRef.branchBuilder(snapshotId).build());
+assertThat(actualBranch).isNotNull();
+
assertThat(actualBranch).isEqualTo(SnapshotRef.branchBuilder(snapshotId).build());

Review Comment:
   both checks can be combined 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517622548


##
core/src/test/java/org/apache/iceberg/TestSnapshotManager.java:
##
@@ -301,32 +296,31 @@ public void testCreateBranchFailsWhenRefAlreadyExists() {
 .hasMessage("Ref branch2 already exists");
   }
 
-  @Test
+  @TestTemplate
   public void testCreateTag() {
 table.newAppend().appendFile(FILE_A).commit();
 long snapshotId = table.currentSnapshot().snapshotId();
 // Test a basic case of creating a tag
 table.manageSnapshots().createTag("tag1", snapshotId).commit();
 SnapshotRef expectedTag = table.ops().refresh().ref("tag1");
 
-Assert.assertTrue(
-expectedTag != null && 
expectedTag.equals(SnapshotRef.tagBuilder(snapshotId).build()));
+assertThat(expectedTag).isNotNull();
+
assertThat(expectedTag).isEqualTo(SnapshotRef.tagBuilder(snapshotId).build());

Review Comment:
   same as above



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517624708


##
core/src/test/java/org/apache/iceberg/TestSnapshotManager.java:
##
@@ -700,81 +689,82 @@ public void testAttemptToRollbackToCurrentSnapshot() {
 table.manageSnapshots().rollbackTo(currentSnapshotId).commit();
   }
 
-  @Test
+  @TestTemplate
   public void testSnapshotManagerThroughTransaction() {
 table.newAppend().appendFile(FILE_A).commit();
 Snapshot snapshotAfterFirstAppend = readMetadata().currentSnapshot();
 validateSnapshot(null, snapshotAfterFirstAppend, FILE_A);
 
 table.newAppend().appendFile(FILE_B).commit();
 validateSnapshot(snapshotAfterFirstAppend, 
readMetadata().currentSnapshot(), FILE_B);
-Assert.assertEquals("Table should be on version 2 after appending twice", 
2, (int) version());
+assertThat(version()).as("Table should be on version 2 after appending 
twice").isEqualTo(2);
 
 TableMetadata base = readMetadata();
 Transaction txn = table.newTransaction();
 
-Assert.assertSame(
-"Base metadata should not change when transaction is created", base, 
readMetadata());
-Assert.assertEquals(
-"Table should be on version 2 after creating transaction", 2, (int) 
version());
+assertThat(readMetadata())
+.as("Base metadata should not change when transaction is created")
+.isEqualTo(base);
+assertThat(version())
+.as("Table should be on version 2 after creating transaction")
+.isEqualTo(2);
 
 ManageSnapshots manageSnapshots = txn.manageSnapshots();
-Assert.assertNotNull(manageSnapshots);
+assertThat(manageSnapshots).isNotNull();
 
-Assert.assertSame(
-"Base metadata should not change when manageSnapshots is created", 
base, readMetadata());
-Assert.assertEquals(
-"Table should be on version 2 after creating manageSnapshots", 2, 
(int) version());
+assertThat(readMetadata())
+.as("Base metadata should not change when manageSnapshots is created")
+.isEqualTo(base);

Review Comment:
   this should be `isSameAs()` as this check actually verifies that both 
objects are referentially equal. Same for the other places that previously used 
`assertSame()`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Handle listing tables/views in empty namespaces [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9890:
URL: https://github.com/apache/iceberg/pull/9890#discussion_r1517627456


##
core/src/main/java/org/apache/iceberg/inmemory/InMemoryCatalog.java:
##
@@ -288,6 +288,7 @@ public List listNamespaces(Namespace namespace) 
throws NoSuchNamespac
 
 List filteredNamespaces =
 namespaces.keySet().stream()
+.filter(n -> !n.isEmpty())

Review Comment:
   this would previously fail in L303 when calling 
`listNamespaces(Namespace.empty())`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Library public api isolation and import decoupling [iceberg-python]

2024-03-08 Thread via GitHub



ndrluis commented on issue #499:
URL: https://github.com/apache/iceberg-python/issues/499#issuecomment-1985722458

   @kevinjqliu Agreed, I haven't found anything that addresses this issue or 
provides a warning when it occurs. It seems like something the community has 
accepted, but I believe it's a problem. I will open an issue on Ruff to ask 
about this and understand how we can write a rule.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.43.0 [iceberg]

2024-03-08 Thread via GitHub



nastra closed pull request #9858: Build: Bump 
com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.43.0
URL: https://github.com/apache/iceberg/pull/9858


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.43.0 [iceberg]

2024-03-08 Thread via GitHub



dependabot[bot] commented on PR #9858:
URL: https://github.com/apache/iceberg/pull/9858#issuecomment-1985726060

   OK, I won't notify you again about this release, but will get in touch when 
a new version is available. If you'd rather skip all updates until the next 
major or minor version, let me know by commenting `@dependabot ignore this 
major version` or `@dependabot ignore this minor version`. You can also ignore 
all major, minor, or patch releases for a dependency by adding an [`ignore` 
condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore)
 with the desired `update_types` to your config file.
   
   If you change your mind, just re-open this PR and I'll resolve any conflicts 
on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump spring-boot from 2.5.4 to 3.2.3 [iceberg]

2024-03-08 Thread via GitHub



nastra closed pull request #9796: Build: Bump spring-boot from 2.5.4 to 3.2.3
URL: https://github.com/apache/iceberg/pull/9796


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump spring-boot from 2.5.4 to 3.2.3 [iceberg]

2024-03-08 Thread via GitHub



dependabot[bot] commented on PR #9796:
URL: https://github.com/apache/iceberg/pull/9796#issuecomment-1985729696

   OK, I won't notify you again about this release, but will get in touch when 
a new version is available. You can also ignore all major, minor, or patch 
releases for a dependency by adding an [`ignore` 
condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore)
 with the desired `update_types` to your config file.
   
   If you change your mind, just re-open this PR and I'll resolve any conflicts 
on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump com.google.cloud:libraries-bom from 26.28.0 to 26.33.0 [iceberg]

2024-03-08 Thread via GitHub



nastra closed pull request #9794: Build: Bump com.google.cloud:libraries-bom 
from 26.28.0 to 26.33.0
URL: https://github.com/apache/iceberg/pull/9794


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump com.google.cloud:libraries-bom from 26.28.0 to 26.33.0 [iceberg]

2024-03-08 Thread via GitHub



dependabot[bot] commented on PR #9794:
URL: https://github.com/apache/iceberg/pull/9794#issuecomment-1985744377

   OK, I won't notify you again about this release, but will get in touch when 
a new version is available. If you'd rather skip all updates until the next 
major or minor version, let me know by commenting `@dependabot ignore this 
major version` or `@dependabot ignore this minor version`. You can also ignore 
all major, minor, or patch releases for a dependency by adding an [`ignore` 
condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore)
 with the desired `update_types` to your config file.
   
   If you change your mind, just re-open this PR and I'll resolve any conflicts 
on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump net.snowflake:snowflake-jdbc from 3.14.5 to 3.15.0 [iceberg]

2024-03-08 Thread via GitHub



nastra closed pull request #9797: Build: Bump net.snowflake:snowflake-jdbc from 
3.14.5 to 3.15.0
URL: https://github.com/apache/iceberg/pull/9797


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump net.snowflake:snowflake-jdbc from 3.14.5 to 3.15.0 [iceberg]

2024-03-08 Thread via GitHub



dependabot[bot] commented on PR #9797:
URL: https://github.com/apache/iceberg/pull/9797#issuecomment-1985745044

   OK, I won't notify you again about this release, but will get in touch when 
a new version is available. If you'd rather skip all updates until the next 
major or minor version, let me know by commenting `@dependabot ignore this 
major version` or `@dependabot ignore this minor version`. You can also ignore 
all major, minor, or patch releases for a dependency by adding an [`ignore` 
condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore)
 with the desired `update_types` to your config file.
   
   If you change your mind, just re-open this PR and I'll resolve any conflicts 
on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump org.springframework:spring-web from 5.3.30 to 6.1.4 [iceberg]

2024-03-08 Thread via GitHub



nastra closed pull request #9748: Build: Bump org.springframework:spring-web 
from 5.3.30 to 6.1.4
URL: https://github.com/apache/iceberg/pull/9748


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump org.springframework:spring-web from 5.3.30 to 6.1.4 [iceberg]

2024-03-08 Thread via GitHub



dependabot[bot] commented on PR #9748:
URL: https://github.com/apache/iceberg/pull/9748#issuecomment-1985745769

   OK, I won't notify you again about this release, but will get in touch when 
a new version is available. If you'd rather skip all updates until the next 
major or minor version, let me know by commenting `@dependabot ignore this 
major version` or `@dependabot ignore this minor version`. You can also ignore 
all major, minor, or patch releases for a dependency by adding an [`ignore` 
condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore)
 with the desired `update_types` to your config file.
   
   If you change your mind, just re-open this PR and I'll resolve any conflicts 
on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump com.google.errorprone:error_prone_annotations from 2.24.1 to 2.25.0 [iceberg]

2024-03-08 Thread via GitHub



nastra closed pull request #9746: Build: Bump 
com.google.errorprone:error_prone_annotations from 2.24.1 to 2.25.0
URL: https://github.com/apache/iceberg/pull/9746


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump com.google.errorprone:error_prone_annotations from 2.24.1 to 2.25.0 [iceberg]

2024-03-08 Thread via GitHub



dependabot[bot] commented on PR #9746:
URL: https://github.com/apache/iceberg/pull/9746#issuecomment-1985746185

   OK, I won't notify you again about this release, but will get in touch when 
a new version is available. If you'd rather skip all updates until the next 
major or minor version, let me know by commenting `@dependabot ignore this 
major version` or `@dependabot ignore this minor version`. You can also ignore 
all major, minor, or patch releases for a dependency by adding an [`ignore` 
condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore)
 with the desired `update_types` to your config file.
   
   If you change your mind, just re-open this PR and I'll resolve any conflicts 
on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump org.openapitools:openapi-generator-gradle-plugin from 6.6.0 to 7.3.0 [iceberg]

2024-03-08 Thread via GitHub



nastra closed pull request #9703: Build: Bump 
org.openapitools:openapi-generator-gradle-plugin from 6.6.0 to 7.3.0
URL: https://github.com/apache/iceberg/pull/9703


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump org.openapitools:openapi-generator-gradle-plugin from 6.6.0 to 7.3.0 [iceberg]

2024-03-08 Thread via GitHub



dependabot[bot] commented on PR #9703:
URL: https://github.com/apache/iceberg/pull/9703#issuecomment-1985747945

   OK, I won't notify you again about this release, but will get in touch when 
a new version is available. If you'd rather skip all updates until the next 
major or minor version, let me know by commenting `@dependabot ignore this 
major version` or `@dependabot ignore this minor version`. You can also ignore 
all major, minor, or patch releases for a dependency by adding an [`ignore` 
condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore)
 with the desired `update_types` to your config file.
   
   If you change your mind, just re-open this PR and I'll resolve any conflicts 
on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] HIVE-28021: escape percent symbol [iceberg]

2024-03-08 Thread via GitHub



nastra commented on code in PR #9667:
URL: https://github.com/apache/iceberg/pull/9667#discussion_r1517758155


##
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java:
##
@@ -966,4 +966,41 @@ private String 
getCurrentSnapshotForHiveCatalogTable(org.apache.iceberg.Table ic
 return ((BaseMetastoreTableOperations) ((BaseTable) 
icebergTable).operations())
 .currentMetadataLocation();
   }
+
+  @Test
+  public void testCreateTableWithPercentInName() throws IOException {
+Assume.assumeTrue("This test requires Hive Version 4.", 
HiveVersion.min(HiveVersion.HIVE_4));

Review Comment:
   is this test even applicable then for the Iceberg codebase? Iceberg + Hive4 
are maintained by the Hive community in 
https://github.com/apache/hive/tree/master/iceberg



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Add hive metastore catalog support [iceberg-rust]

2024-03-08 Thread via GitHub

marvinlanhenke commented on issue #113:
URL: https://github.com/apache/iceberg-rust/issues/113#issuecomment-1985760454

@Xuanwo @liurenjie1024
...just a quick update and a request for feedback.

I 'finished' the work on all the namespace operations.
Since this would be my first contribution to this repo, I would appreciate
it if someone could take a quick look at the current status.

I just want to make sure it aligns with the rest of the codebase, before I
put in the effort to implement the rest of the catalog.

https://github.com/marvinlanhenke/iceberg-rust/blob/hms/crates/catalog/hms/src/catalog.rs#L260-L428

sidenote:
For the hive-metastore docker-compose, I used a file I had from a previous
project; however this needs some optimization since the connection to the mysql
backend slows down the integration tests. My approach would be to use `derby`
instead of mysql to speed things up - but I havent gotten around to take a look
at it.

Anyway best regards and thanks in advance for the feedback.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump com.diffplug.spotless:spotless-plugin-gradle from 6.13.0 to 6.25.0 [iceberg]

2024-03-08 Thread via GitHub



dependabot[bot] commented on PR #9569:
URL: https://github.com/apache/iceberg/pull/9569#issuecomment-1985765560

   OK, I won't notify you again about this release, but will get in touch when 
a new version is available. If you'd rather skip all updates until the next 
major or minor version, let me know by commenting `@dependabot ignore this 
major version` or `@dependabot ignore this minor version`. You can also ignore 
all major, minor, or patch releases for a dependency by adding an [`ignore` 
condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore)
 with the desired `update_types` to your config file.
   
   If you change your mind, just re-open this PR and I'll resolve any conflicts 
on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump com.diffplug.spotless:spotless-plugin-gradle from 6.13.0 to 6.25.0 [iceberg]

2024-03-08 Thread via GitHub



nastra closed pull request #9569: Build: Bump 
com.diffplug.spotless:spotless-plugin-gradle from 6.13.0 to 6.25.0
URL: https://github.com/apache/iceberg/pull/9569


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



tomtongue commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517821079


##
core/src/test/java/org/apache/iceberg/LocalTableOperations.java:
##
@@ -63,7 +65,9 @@ public String metadataFileLocation(String fileName) {
 fileName,
 name -> {
   try {
-return temp.newFile(name).getAbsolutePath();
+return java.nio.file.Files.createTempDirectory(temp, "junit")

Review Comment:
   thanks. yes, that should create a temp file. Will fix.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



tomtongue commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517827090


##
core/src/test/java/org/apache/iceberg/TestSnapshotJson.java:
##
@@ -19,23 +19,24 @@
 package org.apache.iceberg;
 
 import static org.apache.iceberg.Files.localInput;
+import static org.assertj.core.api.Assertions.assertThat;
 
 import java.io.File;
 import java.io.IOException;
+import java.nio.file.Path;
 import java.util.List;
-import java.util.UUID;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
-import org.assertj.core.api.Assertions;
-import org.junit.Assert;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.TemporaryFolder;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
 
 public class TestSnapshotJson {
-  @Rule public TemporaryFolder temp = new TemporaryFolder();
+  @TempDir private static Path temp;
 
-  public TableOperations ops = new LocalTableOperations(temp);
+  public TableOperations ops =
+  new LocalTableOperations(java.nio.file.Files.createTempDirectory(temp, 
"junit"));
+
+  public TestSnapshotJson() throws IOException {}

Review Comment:
   It doesn't need. As the above comment, I previously set 
`createTempDirectory` for `LocalTableOperations`, then the 
`createTempDirectory` needs the `IOException` handling. But as your feedback, 
`temp` should be set for `LocalTableOperations` and this `IOException` is not 
needed. Remove in the next commit.



##
core/src/test/java/org/apache/iceberg/TestSnapshotJson.java:
##
@@ -19,23 +19,24 @@
 package org.apache.iceberg;
 
 import static org.apache.iceberg.Files.localInput;
+import static org.assertj.core.api.Assertions.assertThat;
 
 import java.io.File;
 import java.io.IOException;
+import java.nio.file.Path;
 import java.util.List;
-import java.util.UUID;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
-import org.assertj.core.api.Assertions;
-import org.junit.Assert;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.TemporaryFolder;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
 
 public class TestSnapshotJson {
-  @Rule public TemporaryFolder temp = new TemporaryFolder();
+  @TempDir private static Path temp;
 
-  public TableOperations ops = new LocalTableOperations(temp);
+  public TableOperations ops =
+  new LocalTableOperations(java.nio.file.Files.createTempDirectory(temp, 
"junit"));
+
+  public TestSnapshotJson() throws IOException {}

Review Comment:
   It doesn't need. As the above comment, I previously set 
`createTempDirectory` for `LocalTableOperations`, then the 
`createTempDirectory` needs the `IOException` handling. But as your feedback, 
`temp` should be set for `LocalTableOperations` and this `IOException` is not 
needed. Remove it in the next commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[I] Parquet bloom filter doesn't work with nested fields [iceberg]

2024-03-08 Thread via GitHub



hussein-awala opened a new issue, #9898:
URL: https://github.com/apache/iceberg/issues/9898

   ### Apache Iceberg version
   
   1.4.3 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   I have an Iceberg table, and I want to create two bloom filters on a root 
string column and nested string column in a struct, I've set the properties 
`write.parquet.bloom-filter-enabled.column.a` and 
`write.parquet.bloom-filter-enabled.column.b.c` to `true`, and I checked with 
`parquet-cli`:
   ```bash
   $ parquet bloom-filter /path/to/file.parquet -c a -v 
   
   Row group 0:
   

   value  NOT exists.
   
   $ parquet bloom-filter /path/to/file.parquet -c a -v 
   
   Row group 0:
   

   value  maybe exists.
   
   $ parquet bloom-filter /path/to/file.parquet -c b.c -v 
   
   Row group 0:
   

   column b.c has no bloom filter
   
   # check if it's an issue with column name parsing:
   $ parquet bloom-filter /path/to/file.parquet -c b.d -v 
   Argument error: Schema doesn't have column: b.d
   ```
   
   However, I tried with Spark and parquet, and it worker without any issue:
   ```scala
   import org.apache.spark.sql.types._
   import org.apache.spark.sql.Row
   import spark.implicits._
   
   val schema = StructType(Array(
   StructField("a", StringType, true),
   StructField("b", StringType, true),
   StructField("nested", StructType(Array(
 StructField("c", StringType, true),
 StructField("d", StringType, true)
   )), true)
   ))
   
   val data = Seq(
   Row("1", "25", Row("100", "a")),
   Row("2", "30", Row("200", "b")),
   Row("3", "35", Row("300", "c")),
   Row("4", "40", Row("400", "d")),
   Row("5", "45", Row("500", "e"))
   )
   
   val df = spark.createDataFrame(
   spark.sparkContext.parallelize(data),
   schema
   )
   
   df.write.format("parquet")
   .option("parquet.bloom.filter.enabled#a", "true")
   .option("parquet.bloom.filter.enabled#nested.c", "true")
   .save("bloom_parquet")
   ```
   Check with `parquet-cli`
   ```bash
   $  github parquet bloom-filter 
bloom_parquet/part-2-9fac4c38-7113-45df-8db9-d96c3f6b6a8e-c000.snappy.parquet
 -c a -v "1"
   
   Row group 0:
   

   value 1 maybe exists.
   
   $  github parquet bloom-filter 
bloom_parquet/part-2-9fac4c38-7113-45df-8db9-d96c3f6b6a8e-c000.snappy.parquet
 -c nested.c -v "1"
   
   Row group 0:
   

   value 1 NOT exists.
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



tomtongue commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517838222


##
core/src/test/java/org/apache/iceberg/TestTableMetadata.java:
##
@@ -1731,7 +1730,7 @@ public void testNoTrailingLocationSlash() {
 
   private String createManifestListWithManifestFile(
   long snapshotId, Long parentSnapshotId, String manifestFile) throws 
IOException {
-File manifestList = temp.newFile("manifests" + UUID.randomUUID());
+File manifestList = java.nio.file.Files.createTempDirectory(temp, 
"junit").toFile();

Review Comment:
   yes it should be a temp file. Will fix this part.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



tomtongue commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517839588


##
core/src/test/java/org/apache/iceberg/TestSnapshotManager.java:
##
@@ -177,120 +177,115 @@ public void testCherryPickFromBranch() {
 long lastSnapshotId = table.currentSnapshot().snapshotId();
 
 // pick the snapshot into the current state
-Assertions.assertThatThrownBy(
-() -> 
table.manageSnapshots().cherrypick(replaceSnapshotId).commit())
+assertThatThrownBy(() -> 
table.manageSnapshots().cherrypick(replaceSnapshotId).commit())
 .isInstanceOf(ValidationException.class)
 .hasMessageStartingWith(
 "Cannot cherry-pick overwrite not based on an ancestor of the 
current state");
 
-Assert.assertEquals(
-"Failed cherry-pick should not change the table state",
-lastSnapshotId,
-table.currentSnapshot().snapshotId());
+assertThat(table.currentSnapshot().snapshotId())
+.as("Failed cherry-pick should not change the table state")
+.isEqualTo(lastSnapshotId);
 validateTableFiles(table, FILE_A);
   }
 
-  @Test
+  @TestTemplate
   public void testCherryPickOverwrite() {
 table.newAppend().appendFile(FILE_A).commit();
 
 // stage an overwrite to replace FILE_A
 
table.newOverwrite().deleteFile(FILE_A).addFile(REPLACEMENT_FILE_A).stageOnly().commit();
 
 Snapshot staged = Iterables.getLast(table.snapshots());
-Assert.assertEquals(
-"Should find the staged overwrite snapshot", DataOperations.OVERWRITE, 
staged.operation());
+assertThat(staged.operation())
+.as("Should find the staged overwrite snapshot")
+.isEqualTo(DataOperations.OVERWRITE);
 
 // add another append so that the original commit can't be fast-forwarded
 table.newAppend().appendFile(FILE_B).commit();
 long lastSnapshotId = table.currentSnapshot().snapshotId();
 
 // pick the snapshot into the current state
-Assertions.assertThatThrownBy(
-() -> 
table.manageSnapshots().cherrypick(staged.snapshotId()).commit())
+assertThatThrownBy(() -> 
table.manageSnapshots().cherrypick(staged.snapshotId()).commit())
 .isInstanceOf(ValidationException.class)
 .hasMessageEndingWith("not append, dynamic overwrite, or 
fast-forward");
 
-Assert.assertEquals(
-"Failed cherry-pick should not change the table state",
-lastSnapshotId,
-table.currentSnapshot().snapshotId());
+assertThat(table.currentSnapshot().snapshotId())
+.as("Failed cherry-pick should not change the table state")
+.isEqualTo(lastSnapshotId);
 validateTableFiles(table, FILE_A, FILE_B);
   }
 
-  @Test
+  @TestTemplate
   public void testCreateBranch() {
 table.newAppend().appendFile(FILE_A).commit();
 long snapshotId = table.currentSnapshot().snapshotId();
 // Test a basic case of creating a branch
 table.manageSnapshots().createBranch("branch1", snapshotId).commit();
 SnapshotRef expectedBranch = table.ops().refresh().ref("branch1");
-Assert.assertTrue(
-expectedBranch != null
-&& 
expectedBranch.equals(SnapshotRef.branchBuilder(snapshotId).build()));
+assertThat(expectedBranch).isNotNull();
+
assertThat(expectedBranch).isEqualTo(SnapshotRef.branchBuilder(snapshotId).build());
   }
 
-  @Test
+  @TestTemplate
   public void testCreateBranchWithoutSnapshotId() {
 table.newAppend().appendFile(FILE_A).commit();
 long snapshotId = table.currentSnapshot().snapshotId();
 // Test a basic case of creating a branch
 table.manageSnapshots().createBranch("branch1").commit();
 SnapshotRef actualBranch = table.ops().refresh().ref("branch1");
-Assertions.assertThat(actualBranch).isNotNull();
-
Assertions.assertThat(actualBranch).isEqualTo(SnapshotRef.branchBuilder(snapshotId).build());
+assertThat(actualBranch).isNotNull();
+
assertThat(actualBranch).isEqualTo(SnapshotRef.branchBuilder(snapshotId).build());

Review Comment:
   sure, thank you.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2024-03-08 Thread via GitHub



stevenzwu commented on code in PR #9321:
URL: https://github.com/apache/iceberg/pull/9321#discussion_r1517846647


##
flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java:
##
@@ -0,0 +1,378 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.flink.sink.shuffle;
+
+import java.util.Arrays;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.NavigableMap;
+import java.util.concurrent.ThreadLocalRandom;
+import java.util.concurrent.TimeUnit;
+import org.apache.flink.api.common.functions.Partitioner;
+import org.apache.flink.table.data.RowData;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.SortKey;
+import org.apache.iceberg.SortOrder;
+import org.apache.iceberg.SortOrderComparators;
+import org.apache.iceberg.StructLike;
+import org.apache.iceberg.flink.FlinkSchemaUtil;
+import org.apache.iceberg.flink.RowDataWrapper;
+import 
org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting;
+import org.apache.iceberg.relocated.com.google.common.base.MoreObjects;
+import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
+import org.apache.iceberg.relocated.com.google.common.collect.Lists;
+import org.apache.iceberg.relocated.com.google.common.collect.Maps;
+import org.apache.iceberg.util.Pair;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Internal partitioner implementation that supports MapDataStatistics, which 
is typically used for
+ * low-cardinality use cases. While MapDataStatistics can keep accurate 
counters, it can't be used
+ * for high-cardinality use cases. Otherwise, the memory footprint is too high.
+ *
+ * It is a greedy algorithm for bin packing. With close file cost, the 
calculation isn't always
+ * precise when calculating close cost for every file, target weight per 
subtask, padding residual
+ * weight, assigned weight without close cost.
+ *
+ * All actions should be executed in a single Flink mailbox thread. So 
there is no need to make
+ * it thread safe.
+ */
+class MapRangePartitioner implements Partitioner {
+  private static final Logger LOG = 
LoggerFactory.getLogger(MapRangePartitioner.class);
+
+  private final RowDataWrapper rowDataWrapper;
+  private final SortKey sortKey;
+  private final Comparator comparator;
+  private final Map mapStatistics;
+  private final double closeFileCostInWeightPercentage;
+
+  // Counter that tracks how many times a new key encountered
+  // where there is no traffic statistics learned about it.
+  private long newSortKeyCounter;
+  private long lastNewSortKeyLogTimeMilli;
+
+  // lazily computed due to the need of numPartitions
+  private Map assignment;
+  private NavigableMap sortedStatsWithCloseFileCost;
+
+  MapRangePartitioner(
+  Schema schema,
+  SortOrder sortOrder,
+  MapDataStatistics dataStatistics,
+  double closeFileCostInWeightPercentage) {
+dataStatistics
+.statistics()
+.entrySet()
+.forEach(
+entry ->
+Preconditions.checkArgument(
+entry.getValue() > 0,
+"Invalid statistics: weight is 0 for key %s",
+entry.getKey()));
+
+this.rowDataWrapper = new RowDataWrapper(FlinkSchemaUtil.convert(schema), 
schema.asStruct());
+this.sortKey = new SortKey(schema, sortOrder);
+this.comparator = SortOrderComparators.forSchema(schema, sortOrder);
+this.mapStatistics = dataStatistics.statistics();
+this.closeFileCostInWeightPercentage = closeFileCostInWeightPercentage;
+this.newSortKeyCounter = 0;
+this.lastNewSortKeyLogTimeMilli = System.currentTimeMillis();
+  }
+
+  @Override
+  public int partition(RowData row, int numPartitions) {
+// assignment table can only be built lazily when first referenced here,
+// because number of partitions (downstream subtasks) is needed
+Map assignmentMap = assignment(numPartitions);
+// reuse the sortKey and rowDataWrapper
+sortKey.wrap(rowDataWrapper.wrap(row));
+KeyAssignment keyAssignment = assignmentMap.get(sortKey);
+if (keyAssignment

Re: [I] Add hive metastore catalog support [iceberg-rust]

2024-03-08 Thread via GitHub



Xuanwo commented on issue #113:
URL: https://github.com/apache/iceberg-rust/issues/113#issuecomment-1985888081

   > I 'finished' the work on all the namespace operations.
   
   Awesome!
   
   > I just want to make sure it aligns with the rest of the codebase, before I 
put in the effort to implement the rest of the catalog.
   
   Let's merge this PR first and then start another one. There's no need to 
combine them into a single large PR.
   
   In short, submit a PR and then we can polish it together.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Rename `stat_table` in `Catalog` trait [iceberg-rust]

2024-03-08 Thread via GitHub



Xuanwo commented on issue #236:
URL: https://github.com/apache/iceberg-rust/issues/236#issuecomment-1985889243

   I remember that `stat_table` will also return some metadata of this table.
   
   cc @Fokko @liurenjie1024, any comments?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



tomtongue commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517865244


##
core/src/test/java/org/apache/iceberg/TestSnapshotJson.java:
##
@@ -49,12 +50,23 @@ public void testJsonConversion() throws IOException {
 String json = SnapshotParser.toJson(expected);
 Snapshot snapshot = SnapshotParser.fromJson(json);
 
-Assert.assertEquals("Snapshot ID should match", expected.snapshotId(), 
snapshot.snapshotId());
-Assert.assertEquals(
-"Files should match", expected.allManifests(ops.io()), 
snapshot.allManifests(ops.io()));
-Assert.assertNull("Operation should be null", snapshot.operation());
-Assert.assertNull("Summary should be null", snapshot.summary());
-Assert.assertEquals("Schema ID should match", Integer.valueOf(1), 
snapshot.schemaId());
+assertThat(snapshot.snapshotId()).isEqualTo(expected.snapshotId());
+assertThat(snapshot.allManifests(ops.io())).hasSize(2);
+
+for (int i = 0; i < snapshot.allManifests(ops.io()).size(); i++) {

Review Comment:
   @nastra I tried to check each element in the manifest list, but it's a bit 
redundant. I believe it's enough to check with `isEqualTo` between two lists. 
Let me fix this part and other part like this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



tomtongue commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517865244


##
core/src/test/java/org/apache/iceberg/TestSnapshotJson.java:
##
@@ -49,12 +50,23 @@ public void testJsonConversion() throws IOException {
 String json = SnapshotParser.toJson(expected);
 Snapshot snapshot = SnapshotParser.fromJson(json);
 
-Assert.assertEquals("Snapshot ID should match", expected.snapshotId(), 
snapshot.snapshotId());
-Assert.assertEquals(
-"Files should match", expected.allManifests(ops.io()), 
snapshot.allManifests(ops.io()));
-Assert.assertNull("Operation should be null", snapshot.operation());
-Assert.assertNull("Summary should be null", snapshot.summary());
-Assert.assertEquals("Schema ID should match", Integer.valueOf(1), 
snapshot.schemaId());
+assertThat(snapshot.snapshotId()).isEqualTo(expected.snapshotId());
+assertThat(snapshot.allManifests(ops.io())).hasSize(2);
+
+for (int i = 0; i < snapshot.allManifests(ops.io()).size(); i++) {

Review Comment:
   @nastra I tried to check each element in the manifest list, but it's a bit 
redundant. I believe it's enough to check with `isEqualTo` between two lists. 
Let me revert this part and other part like this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Library public api isolation and import decoupling [iceberg-python]

2024-03-08 Thread via GitHub



ndrluis commented on issue #499:
URL: https://github.com/apache/iceberg-python/issues/499#issuecomment-1985907062

   Issue to follow up https://github.com/astral-sh/ruff/issues/10300


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



tomtongue commented on code in PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#discussion_r1517877405


##
core/src/test/java/org/apache/iceberg/TestSnapshotManager.java:
##
@@ -700,81 +689,82 @@ public void testAttemptToRollbackToCurrentSnapshot() {
 table.manageSnapshots().rollbackTo(currentSnapshotId).commit();
   }
 
-  @Test
+  @TestTemplate
   public void testSnapshotManagerThroughTransaction() {
 table.newAppend().appendFile(FILE_A).commit();
 Snapshot snapshotAfterFirstAppend = readMetadata().currentSnapshot();
 validateSnapshot(null, snapshotAfterFirstAppend, FILE_A);
 
 table.newAppend().appendFile(FILE_B).commit();
 validateSnapshot(snapshotAfterFirstAppend, 
readMetadata().currentSnapshot(), FILE_B);
-Assert.assertEquals("Table should be on version 2 after appending twice", 
2, (int) version());
+assertThat(version()).as("Table should be on version 2 after appending 
twice").isEqualTo(2);
 
 TableMetadata base = readMetadata();
 Transaction txn = table.newTransaction();
 
-Assert.assertSame(
-"Base metadata should not change when transaction is created", base, 
readMetadata());
-Assert.assertEquals(
-"Table should be on version 2 after creating transaction", 2, (int) 
version());
+assertThat(readMetadata())
+.as("Base metadata should not change when transaction is created")
+.isEqualTo(base);
+assertThat(version())
+.as("Table should be on version 2 after creating transaction")
+.isEqualTo(2);
 
 ManageSnapshots manageSnapshots = txn.manageSnapshots();
-Assert.assertNotNull(manageSnapshots);
+assertThat(manageSnapshots).isNotNull();
 
-Assert.assertSame(
-"Base metadata should not change when manageSnapshots is created", 
base, readMetadata());
-Assert.assertEquals(
-"Table should be on version 2 after creating manageSnapshots", 2, 
(int) version());
+assertThat(readMetadata())
+.as("Base metadata should not change when manageSnapshots is created")
+.isEqualTo(base);

Review Comment:
   Okay sure. Thanks for the advice.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Rename `stat_table` in `Catalog` trait [iceberg-rust]

2024-03-08 Thread via GitHub



marvinlanhenke commented on issue #236:
URL: https://github.com/apache/iceberg-rust/issues/236#issuecomment-1985944355

   > I remember that `stat_table` will also return some metadata of this table.
   
   I only took a look at the java & python hive implementation and I couldn't 
find a similiar table operation - but perhaps `stat_table` is used in a 
different catalag implementation that does indeed return the metadata.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] Core: add missing `@Test` annotation in TestRESTCatalog [iceberg]

2024-03-08 Thread via GitHub



adutra opened a new pull request, #9899:
URL: https://github.com/apache/iceberg/pull/9899

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] Add hive metastore catalog support (part 1/2) [iceberg-rust]

2024-03-08 Thread via GitHub



marvinlanhenke opened a new pull request, #237:
URL: https://github.com/apache/iceberg-rust/pull/237

   ### Which issue does this PR close?
   Closes #113 partly.
   
   ### Rationale for this change
   Add hive metastore support, to reach feature parity with other iceberg 
implementations.
   
   ### What changes are included in this PR?
   Added test infra (hive-metastore, docker-compose) for integration test.
   Implemented all of the namespace operations from the `Catalog` trait.
   
   ### Are these changes tested?
   Yes. Unit and integration tests are included.
   
   ### Are there any user-facing changes?
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Add hive metastore catalog support [iceberg-rust]

2024-03-08 Thread via GitHub



marvinlanhenke commented on issue #113:
URL: https://github.com/apache/iceberg-rust/issues/113#issuecomment-1985986494

   > In short, submit a PR and then we can polish it together.
   
   Sounds good to me - looking forward to it.
   
   Best regards.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Allow setting non-string typed values in `set_properties` [iceberg-python]

2024-03-08 Thread via GitHub



kevinjqliu commented on PR #504:
URL: https://github.com/apache/iceberg-python/pull/504#issuecomment-1985996781

   rebased after #503 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518068800


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2068,6 +2162,145 @@ components:
   items:
 $ref: '#/components/schemas/PartitionStatisticsFile'
 
+PlanTask:
+  description:
+A JSON object that contains information provided by the server,
+to be utilized by clients for distributed planning, should be supplied
+as is for input in PlanTable operation.
+  type: object
+
+FileScanTask:
+  type: object
+  required:
+- schema
+- spec
+- start
+- length
+- data-file
+  properties:
+data-file:
+  $ref: '#/components/schemas/ContentFile'
+partition:
+  type: object
+  additionalProperties:
+type: string
+size-bytes:
+  type: number
+start:
+  type: number
+length:
+  type: number
+estimated-rows-count:
+  type: number
+delete-files:
+  type: array
+  items:
+$ref: '#/components/schemas/ContentFile'
+schema:
+  $ref: '#/components/schemas/Schema'
+spec:
+  $ref: '#/components/schemas/PartitionSpec'
+residual-filter:
+  $ref: '#/components/schemas/Expression'
+
+TypeValue:
+  oneOf:
+- $ref: '#/components/schemas/PrimitiveTypeValue'
+- $ref: '#/components/schemas/MapTypeValue'
+- $ref: '#/components/schemas/StructTypeValue'
+
+MapTypeValue:
+  type: object
+  properties:
+keys:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+values:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+StructTypeValue:
+  type: object
+  additionalProperties:
+oneOf:
+  - $ref: '#/components/schemas/TypeValue'
+
+PrimitiveTypeValue:
+  oneOf:
+- type: boolean
+- type: integer
+- type: integer
+  format: int64
+- type: number
+- type: string
+- type: string
+  format: byte
+- type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+ContentFile:
+  type: object
+  required:
+- spec-id
+- content
+- file-path
+- file-format
+- file-size-in-bytes
+- record-count
+  properties:
+spec-id:
+  type: integer
+content:
+  $ref: '#/components/schemas/FileContent'
+file-path:
+  type: string
+file-format:
+  type: string
+partition:
+  $ref: '#/components/schemas/StructTypeValue'
+file-size-in-bytes:
+  type: integer
+  format: int64
+record-count:
+  type: integer
+  format: int64
+column-sizes:
+  $ref: '#/components/schemas/MapTypeValue'
+value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+null-value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+nan-value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+lower-bounds:
+  $ref: '#/components/schemas/MapTypeValue'
+upper-bounds:
+  $ref: '#/components/schemas/MapTypeValue'
+key-metadata:
+  type: string
+  format: byte
+split-offsets:
+  type: array
+  items:
+type: integer
+format: int64
+equality-ids:
+  type: array
+  items:
+type: integer
+sort-order-id:

Review Comment:
   Looks like we kept this in the ContentFile pr: 
https://github.com/apache/iceberg/commit/bb53c3d4e0e27ac6706803c2371793ad2476ae04
 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518068800


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2068,6 +2162,145 @@ components:
   items:
 $ref: '#/components/schemas/PartitionStatisticsFile'
 
+PlanTask:
+  description:
+A JSON object that contains information provided by the server,
+to be utilized by clients for distributed planning, should be supplied
+as is for input in PlanTable operation.
+  type: object
+
+FileScanTask:
+  type: object
+  required:
+- schema
+- spec
+- start
+- length
+- data-file
+  properties:
+data-file:
+  $ref: '#/components/schemas/ContentFile'
+partition:
+  type: object
+  additionalProperties:
+type: string
+size-bytes:
+  type: number
+start:
+  type: number
+length:
+  type: number
+estimated-rows-count:
+  type: number
+delete-files:
+  type: array
+  items:
+$ref: '#/components/schemas/ContentFile'
+schema:
+  $ref: '#/components/schemas/Schema'
+spec:
+  $ref: '#/components/schemas/PartitionSpec'
+residual-filter:
+  $ref: '#/components/schemas/Expression'
+
+TypeValue:
+  oneOf:
+- $ref: '#/components/schemas/PrimitiveTypeValue'
+- $ref: '#/components/schemas/MapTypeValue'
+- $ref: '#/components/schemas/StructTypeValue'
+
+MapTypeValue:
+  type: object
+  properties:
+keys:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+values:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+StructTypeValue:
+  type: object
+  additionalProperties:
+oneOf:
+  - $ref: '#/components/schemas/TypeValue'
+
+PrimitiveTypeValue:
+  oneOf:
+- type: boolean
+- type: integer
+- type: integer
+  format: int64
+- type: number
+- type: string
+- type: string
+  format: byte
+- type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+ContentFile:
+  type: object
+  required:
+- spec-id
+- content
+- file-path
+- file-format
+- file-size-in-bytes
+- record-count
+  properties:
+spec-id:
+  type: integer
+content:
+  $ref: '#/components/schemas/FileContent'
+file-path:
+  type: string
+file-format:
+  type: string
+partition:
+  $ref: '#/components/schemas/StructTypeValue'
+file-size-in-bytes:
+  type: integer
+  format: int64
+record-count:
+  type: integer
+  format: int64
+column-sizes:
+  $ref: '#/components/schemas/MapTypeValue'
+value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+null-value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+nan-value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+lower-bounds:
+  $ref: '#/components/schemas/MapTypeValue'
+upper-bounds:
+  $ref: '#/components/schemas/MapTypeValue'
+key-metadata:
+  type: string
+  format: byte
+split-offsets:
+  type: array
+  items:
+type: integer
+format: int64
+equality-ids:
+  type: array
+  items:
+type: integer
+sort-order-id:

Review Comment:
   Looks like we kept this in Drews ContentFile pr: 
https://github.com/apache/iceberg/commit/bb53c3d4e0e27ac6706803c2371793ad2476ae04
 
   so I believe we can keep this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518075237


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2068,6 +2162,145 @@ components:
   items:
 $ref: '#/components/schemas/PartitionStatisticsFile'
 
+PlanTask:
+  description:
+A JSON object that contains information provided by the server,
+to be utilized by clients for distributed planning, should be supplied
+as is for input in PlanTable operation.
+  type: object
+
+FileScanTask:
+  type: object
+  required:
+- schema
+- spec
+- start
+- length
+- data-file
+  properties:
+data-file:
+  $ref: '#/components/schemas/ContentFile'
+partition:
+  type: object
+  additionalProperties:
+type: string
+size-bytes:
+  type: number
+start:
+  type: number
+length:
+  type: number
+estimated-rows-count:
+  type: number
+delete-files:
+  type: array
+  items:
+$ref: '#/components/schemas/ContentFile'
+schema:
+  $ref: '#/components/schemas/Schema'
+spec:
+  $ref: '#/components/schemas/PartitionSpec'
+residual-filter:
+  $ref: '#/components/schemas/Expression'
+
+TypeValue:
+  oneOf:
+- $ref: '#/components/schemas/PrimitiveTypeValue'
+- $ref: '#/components/schemas/MapTypeValue'
+- $ref: '#/components/schemas/StructTypeValue'
+
+MapTypeValue:
+  type: object
+  properties:
+keys:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+values:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+StructTypeValue:
+  type: object
+  additionalProperties:
+oneOf:
+  - $ref: '#/components/schemas/TypeValue'
+
+PrimitiveTypeValue:
+  oneOf:
+- type: boolean
+- type: integer
+- type: integer
+  format: int64
+- type: number
+- type: string
+- type: string
+  format: byte
+- type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+ContentFile:
+  type: object
+  required:
+- spec-id

Review Comment:
   in the recent content file pr that was merged: 
https://github.com/apache/iceberg/commit/bb53c3d4e0e27ac6706803c2371793ad2476ae04
   we have `spec-id` as required but not `partition` so can we resolve this 
thread?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518075237


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2068,6 +2162,145 @@ components:
   items:
 $ref: '#/components/schemas/PartitionStatisticsFile'
 
+PlanTask:
+  description:
+A JSON object that contains information provided by the server,
+to be utilized by clients for distributed planning, should be supplied
+as is for input in PlanTable operation.
+  type: object
+
+FileScanTask:
+  type: object
+  required:
+- schema
+- spec
+- start
+- length
+- data-file
+  properties:
+data-file:
+  $ref: '#/components/schemas/ContentFile'
+partition:
+  type: object
+  additionalProperties:
+type: string
+size-bytes:
+  type: number
+start:
+  type: number
+length:
+  type: number
+estimated-rows-count:
+  type: number
+delete-files:
+  type: array
+  items:
+$ref: '#/components/schemas/ContentFile'
+schema:
+  $ref: '#/components/schemas/Schema'
+spec:
+  $ref: '#/components/schemas/PartitionSpec'
+residual-filter:
+  $ref: '#/components/schemas/Expression'
+
+TypeValue:
+  oneOf:
+- $ref: '#/components/schemas/PrimitiveTypeValue'
+- $ref: '#/components/schemas/MapTypeValue'
+- $ref: '#/components/schemas/StructTypeValue'
+
+MapTypeValue:
+  type: object
+  properties:
+keys:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+values:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+StructTypeValue:
+  type: object
+  additionalProperties:
+oneOf:
+  - $ref: '#/components/schemas/TypeValue'
+
+PrimitiveTypeValue:
+  oneOf:
+- type: boolean
+- type: integer
+- type: integer
+  format: int64
+- type: number
+- type: string
+- type: string
+  format: byte
+- type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+ContentFile:
+  type: object
+  required:
+- spec-id

Review Comment:
   in the recent content file pr that was merged: 
https://github.com/apache/iceberg/commit/bb53c3d4e0e27ac6706803c2371793ad2476ae04
   we have `spec-id` only as required so can we resolve this thread?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518068800


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2068,6 +2162,145 @@ components:
   items:
 $ref: '#/components/schemas/PartitionStatisticsFile'
 
+PlanTask:
+  description:
+A JSON object that contains information provided by the server,
+to be utilized by clients for distributed planning, should be supplied
+as is for input in PlanTable operation.
+  type: object
+
+FileScanTask:
+  type: object
+  required:
+- schema
+- spec
+- start
+- length
+- data-file
+  properties:
+data-file:
+  $ref: '#/components/schemas/ContentFile'
+partition:
+  type: object
+  additionalProperties:
+type: string
+size-bytes:
+  type: number
+start:
+  type: number
+length:
+  type: number
+estimated-rows-count:
+  type: number
+delete-files:
+  type: array
+  items:
+$ref: '#/components/schemas/ContentFile'
+schema:
+  $ref: '#/components/schemas/Schema'
+spec:
+  $ref: '#/components/schemas/PartitionSpec'
+residual-filter:
+  $ref: '#/components/schemas/Expression'
+
+TypeValue:
+  oneOf:
+- $ref: '#/components/schemas/PrimitiveTypeValue'
+- $ref: '#/components/schemas/MapTypeValue'
+- $ref: '#/components/schemas/StructTypeValue'
+
+MapTypeValue:
+  type: object
+  properties:
+keys:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+values:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+StructTypeValue:
+  type: object
+  additionalProperties:
+oneOf:
+  - $ref: '#/components/schemas/TypeValue'
+
+PrimitiveTypeValue:
+  oneOf:
+- type: boolean
+- type: integer
+- type: integer
+  format: int64
+- type: number
+- type: string
+- type: string
+  format: byte
+- type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+ContentFile:
+  type: object
+  required:
+- spec-id
+- content
+- file-path
+- file-format
+- file-size-in-bytes
+- record-count
+  properties:
+spec-id:
+  type: integer
+content:
+  $ref: '#/components/schemas/FileContent'
+file-path:
+  type: string
+file-format:
+  type: string
+partition:
+  $ref: '#/components/schemas/StructTypeValue'
+file-size-in-bytes:
+  type: integer
+  format: int64
+record-count:
+  type: integer
+  format: int64
+column-sizes:
+  $ref: '#/components/schemas/MapTypeValue'
+value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+null-value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+nan-value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+lower-bounds:
+  $ref: '#/components/schemas/MapTypeValue'
+upper-bounds:
+  $ref: '#/components/schemas/MapTypeValue'
+key-metadata:
+  type: string
+  format: byte
+split-offsets:
+  type: array
+  items:
+type: integer
+format: int64
+equality-ids:
+  type: array
+  items:
+type: integer
+sort-order-id:

Review Comment:
   Looks like we kept this in the ContentFile pr: 
https://github.com/apache/iceberg/commit/bb53c3d4e0e27ac6706803c2371793ad2476ae04
 , so wondering if we can resolve this thread
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518113629


##
open-api/rest-catalog-open-api.yaml:
##
@@ -537,6 +537,108 @@ paths:
 5XX:
   $ref: '#/components/responses/ServerErrorResponse'
 
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/preplan:
+parameters:
+  - $ref: '#/components/parameters/prefix'
+  - $ref: '#/components/parameters/namespace'
+  - $ref: '#/components/parameters/table'
+post:
+  tags:
+- Catalog API
+  summary: Find plan-tasks based on a plan context.

Review Comment:
   Will fix this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518118829


##
open-api/rest-catalog-open-api.yaml:
##
@@ -3789,6 +4097,21 @@ components:
 EmptyResponse:
   $ref: '#/components/examples/ListNamespacesEmptyExample'
 
+PayLoadTooLargeResponse:

Review Comment:
   will fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Getting java.lang.NoSuchMethodError org.apache.iceberg.util.JsonUtil.mapper() With Hive catalog [iceberg]

2024-03-08 Thread via GitHub



ismailsimsek commented on issue #9848:
URL: https://github.com/apache/iceberg/issues/9848#issuecomment-1986226824

   Fixed https://github.com/memiiso/debezium-server-iceberg/issues/271


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Getting java.lang.NoSuchMethodError org.apache.iceberg.util.JsonUtil.mapper() With Hive catalog [iceberg]

2024-03-08 Thread via GitHub



ismailsimsek closed issue #9848: Getting java.lang.NoSuchMethodError 
org.apache.iceberg.util.JsonUtil.mapper() With Hive catalog
URL: https://github.com/apache/iceberg/issues/9848


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Feature make oauth `audience` configurable [iceberg]

2024-03-08 Thread via GitHub



flyrain commented on code in PR #9839:
URL: https://github.com/apache/iceberg/pull/9839#discussion_r1518181685


##
core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java:
##
@@ -471,7 +515,13 @@ static void setTokenRefreshNumRetries(int retries) {
  */
 public static AuthSession empty() {
   return new AuthSession(
-  ImmutableMap.of(), null, null, null, OAuth2Properties.CATALOG_SCOPE, 
null);
+  ImmutableMap.of(),
+  null,
+  null,
+  null,
+  OAuth2Properties.CATALOG_SCOPE,
+  null,
+  ImmutableMap.of());

Review Comment:
   Can we generate the request class(e.g., 
[OAuthClientCredentialsRequest](https://github.com/apache/iceberg/blob/bcbcbb263ea7e13ab22d0feb918e207c7e42dbbd/open-api/rest-catalog-open-api.yaml#L2919)
 ) from the spec so that we can pass a request object with all required and 
optional params? By doing that, we will always make sure the impl. synced with 
spec.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518181730


##
open-api/rest-catalog-open-api.yaml:
##
@@ -3789,6 +4097,21 @@ components:
 EmptyResponse:
   $ref: '#/components/examples/ListNamespacesEmptyExample'
 
+PayLoadTooLargeResponse:
+  description:
+Payload is too large, clients should leverage the preplan endpoint and 
provide a plan-task before doing a plan.

Review Comment:
   Will make description generic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518182591


##
open-api/rest-catalog-open-api.yaml:
##
@@ -537,6 +537,108 @@ paths:
 5XX:
   $ref: '#/components/responses/ServerErrorResponse'
 
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/preplan:
+parameters:
+  - $ref: '#/components/parameters/prefix'
+  - $ref: '#/components/parameters/namespace'
+  - $ref: '#/components/parameters/table'
+post:
+  tags:
+- Catalog API
+  summary: Find plan-tasks based on a plan context.
+  description:
+Scan pre-planning creates a set of opaque planning tasks for a set of 
scan configuration options. 
+Each task can be passed to the plan endpoint to fetch a (disjoint) 
subset of the file scan tasks for the scan.
+
+Scan pre-planning enables breaking scan planning across multiple 
tasks. 
+This can be used to parallelize scan planning requests, use fewer 
resources in each planning request,
+or to delay parts of scan planning that may not be needed.

Review Comment:
   will fix.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518183297


##
open-api/rest-catalog-open-api.yaml:
##
@@ -537,6 +537,108 @@ paths:
 5XX:
   $ref: '#/components/responses/ServerErrorResponse'
 
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/preplan:
+parameters:
+  - $ref: '#/components/parameters/prefix'
+  - $ref: '#/components/parameters/namespace'
+  - $ref: '#/components/parameters/table'
+post:
+  tags:
+- Catalog API
+  summary: Find plan-tasks based on a plan context.
+  description:
+Scan pre-planning creates a set of opaque planning tasks for a set of 
scan configuration options. 
+Each task can be passed to the plan endpoint to fetch a (disjoint) 
subset of the file scan tasks for the scan.
+
+Scan pre-planning enables breaking scan planning across multiple 
tasks. 
+This can be used to parallelize scan planning requests, use fewer 
resources in each planning request,
+or to delay parts of scan planning that may not be needed.
+  operationId: PreplanTable
+  requestBody:
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/PreplanTableRequest'
+  responses:
+200:
+  $ref: '#/components/responses/PreplanTableResponse'
+400:
+  $ref: '#/components/responses/BadRequestErrorResponse'
+401:
+  $ref: '#/components/responses/UnauthorizedResponse'
+403:
+  $ref: '#/components/responses/ForbiddenResponse'
+404:
+  description:
+Not Found
+- NoSuchTableException, the table does not exist
+- NoSuchNamespaceException, the namespace does not exist
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/ErrorModel'
+  examples:
+TableDoesNotExist:
+  $ref: '#/components/examples/NoSuchTableError'
+NamespaceDoesNotExist:
+  $ref: '#/components/examples/NoSuchNamespaceError'
+419:
+  $ref: '#/components/responses/AuthenticationTimeoutResponse'
+503:
+  $ref: '#/components/responses/ServiceUnavailableResponse'
+5XX:
+  $ref: '#/components/responses/ServerErrorResponse'
+
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan:
+parameters:
+  - $ref: '#/components/parameters/prefix'
+  - $ref: '#/components/parameters/namespace'
+  - $ref: '#/components/parameters/table'
+post:
+  tags:
+- Catalog API
+  summary: Initiate plan on table to find relevant file scan tasks for a 
specific query.

Review Comment:
   will fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518184139


##
open-api/rest-catalog-open-api.yaml:
##
@@ -537,6 +537,108 @@ paths:
 5XX:
   $ref: '#/components/responses/ServerErrorResponse'
 
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/preplan:
+parameters:
+  - $ref: '#/components/parameters/prefix'
+  - $ref: '#/components/parameters/namespace'
+  - $ref: '#/components/parameters/table'
+post:
+  tags:
+- Catalog API
+  summary: Find plan-tasks based on a plan context.
+  description:
+Scan pre-planning creates a set of opaque planning tasks for a set of 
scan configuration options. 
+Each task can be passed to the plan endpoint to fetch a (disjoint) 
subset of the file scan tasks for the scan.
+
+Scan pre-planning enables breaking scan planning across multiple 
tasks. 
+This can be used to parallelize scan planning requests, use fewer 
resources in each planning request,
+or to delay parts of scan planning that may not be needed.
+  operationId: PreplanTable
+  requestBody:
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/PreplanTableRequest'
+  responses:
+200:
+  $ref: '#/components/responses/PreplanTableResponse'
+400:
+  $ref: '#/components/responses/BadRequestErrorResponse'
+401:
+  $ref: '#/components/responses/UnauthorizedResponse'
+403:
+  $ref: '#/components/responses/ForbiddenResponse'
+404:
+  description:
+Not Found
+- NoSuchTableException, the table does not exist
+- NoSuchNamespaceException, the namespace does not exist
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/ErrorModel'
+  examples:
+TableDoesNotExist:
+  $ref: '#/components/examples/NoSuchTableError'
+NamespaceDoesNotExist:
+  $ref: '#/components/examples/NoSuchNamespaceError'
+419:
+  $ref: '#/components/responses/AuthenticationTimeoutResponse'
+503:
+  $ref: '#/components/responses/ServiceUnavailableResponse'
+5XX:
+  $ref: '#/components/responses/ServerErrorResponse'
+
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan:
+parameters:
+  - $ref: '#/components/parameters/prefix'
+  - $ref: '#/components/parameters/namespace'
+  - $ref: '#/components/parameters/table'
+post:
+  tags:
+- Catalog API
+  summary: Initiate plan on table to find relevant file scan tasks for a 
specific query.
+  operationId: PlanTable
+  description:
+When a user submits a query, a plan will be initiated to find the 
relevant file scan tasks
+based on the user's selected columns, and filters. For enhanced query 
performance, 
+users can provide a plan-task which will distribute the planning work.

Review Comment:
   will fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518068800


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2068,6 +2162,145 @@ components:
   items:
 $ref: '#/components/schemas/PartitionStatisticsFile'
 
+PlanTask:
+  description:
+A JSON object that contains information provided by the server,
+to be utilized by clients for distributed planning, should be supplied
+as is for input in PlanTable operation.
+  type: object
+
+FileScanTask:
+  type: object
+  required:
+- schema
+- spec
+- start
+- length
+- data-file
+  properties:
+data-file:
+  $ref: '#/components/schemas/ContentFile'
+partition:
+  type: object
+  additionalProperties:
+type: string
+size-bytes:
+  type: number
+start:
+  type: number
+length:
+  type: number
+estimated-rows-count:
+  type: number
+delete-files:
+  type: array
+  items:
+$ref: '#/components/schemas/ContentFile'
+schema:
+  $ref: '#/components/schemas/Schema'
+spec:
+  $ref: '#/components/schemas/PartitionSpec'
+residual-filter:
+  $ref: '#/components/schemas/Expression'
+
+TypeValue:
+  oneOf:
+- $ref: '#/components/schemas/PrimitiveTypeValue'
+- $ref: '#/components/schemas/MapTypeValue'
+- $ref: '#/components/schemas/StructTypeValue'
+
+MapTypeValue:
+  type: object
+  properties:
+keys:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+values:
+  type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+StructTypeValue:
+  type: object
+  additionalProperties:
+oneOf:
+  - $ref: '#/components/schemas/TypeValue'
+
+PrimitiveTypeValue:
+  oneOf:
+- type: boolean
+- type: integer
+- type: integer
+  format: int64
+- type: number
+- type: string
+- type: string
+  format: byte
+- type: array
+  items:
+$ref: '#/components/schemas/TypeValue'
+
+ContentFile:
+  type: object
+  required:
+- spec-id
+- content
+- file-path
+- file-format
+- file-size-in-bytes
+- record-count
+  properties:
+spec-id:
+  type: integer
+content:
+  $ref: '#/components/schemas/FileContent'
+file-path:
+  type: string
+file-format:
+  type: string
+partition:
+  $ref: '#/components/schemas/StructTypeValue'
+file-size-in-bytes:
+  type: integer
+  format: int64
+record-count:
+  type: integer
+  format: int64
+column-sizes:
+  $ref: '#/components/schemas/MapTypeValue'
+value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+null-value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+nan-value-counts:
+  $ref: '#/components/schemas/MapTypeValue'
+lower-bounds:
+  $ref: '#/components/schemas/MapTypeValue'
+upper-bounds:
+  $ref: '#/components/schemas/MapTypeValue'
+key-metadata:
+  type: string
+  format: byte
+split-offsets:
+  type: array
+  items:
+type: integer
+format: int64
+equality-ids:
+  type: array
+  items:
+type: integer
+sort-order-id:

Review Comment:
   Looks like we kept this in the ContentFile pr: 
https://github.com/apache/iceberg/commit/bb53c3d4e0e27ac6706803c2371793ad2476ae04
 , so wondering if we can resolve this thread?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1517180435


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2838,6 +3093,59 @@ components:
   additionalProperties:
 type: string
 
+PreplanTableRequest:
+  type: object
+  required:
+- select
+- filter
+- case-sensitive
+  properties:
+select:
+  description: A list of the selected column names
+  type: array
+  items:
+type: string
+filter:
+  $ref: '#/components/schemas/Expression'
+case-sensitive:
+  description: Indicates whether column selection and filtering should 
be case sensitive
+  type: boolean
+snapshot-id:
+  description: an int64 snapshot ID (if snapshot-range is not 
present); optional and defaults to the table's current snapshot
+  type: integer
+  format: int64
+snapshot-range:
+  description: a JSON list containing exactly 2 int64 snapshot IDs (if 
snapshot-id is not present) representing the start (exclusive) and end 
(inclusive) snapshots

Review Comment:
   > Isn't a range typically inclusive start and exclusive end? why it is 
reversed here? I am starting to feel maybe using start-snapshot-id and 
end-snapshot-id might be a better choice to be less ambiguous, and those values 
can also be null to indicate no start or no end.
   
   I think @stevenzwu had also brought this up for the `IncrementalScan` for 
having inclusive and exclusive for the from/start snapshot. 
https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/IncrementalScan.java#L34
   
   On further looking at the incrementalScan it seems that for to/end snapshot 
it is only inclusive 
https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/IncrementalScan.java#L87
   
   ```
 ThisT toSnapshot(long toSnapshotId);
   
 /**
  * Instructs this scan to look for changes up to a particular snapshot ref 
(inclusive
  * 
   ```
   
   So should we have a boolean flag to tell if from/start is inclusive or 
exclusive? And for end we always make it inclusive?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1517180435


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2838,6 +3093,59 @@ components:
   additionalProperties:
 type: string
 
+PreplanTableRequest:
+  type: object
+  required:
+- select
+- filter
+- case-sensitive
+  properties:
+select:
+  description: A list of the selected column names
+  type: array
+  items:
+type: string
+filter:
+  $ref: '#/components/schemas/Expression'
+case-sensitive:
+  description: Indicates whether column selection and filtering should 
be case sensitive
+  type: boolean
+snapshot-id:
+  description: an int64 snapshot ID (if snapshot-range is not 
present); optional and defaults to the table's current snapshot
+  type: integer
+  format: int64
+snapshot-range:
+  description: a JSON list containing exactly 2 int64 snapshot IDs (if 
snapshot-id is not present) representing the start (exclusive) and end 
(inclusive) snapshots

Review Comment:
   > Isn't a range typically inclusive start and exclusive end? why it is 
reversed here? I am starting to feel maybe using start-snapshot-id and 
end-snapshot-id might be a better choice to be less ambiguous, and those values 
can also be null to indicate no start or no end.
   
   I think @stevenzwu had also brought this up for the `IncrementalScan` for 
having inclusive and exclusive for the from/start snapshot. 
https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/IncrementalScan.java#L34
   
   On further looking at the incrementalScan it seems that for to/end snapshot 
it is only inclusive 
https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/IncrementalScan.java#L87
   
   ```
 ThisT toSnapshot(long toSnapshotId);
   
 /**
  * Instructs this scan to look for changes up to a particular snapshot ref 
(inclusive
  * 
   ```
   
   So should we have a boolean flag to tell if from/start is inclusive or 
exclusive? And for end we always make it inclusive? What do you think @rdblue ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



jackye1995 commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518328719


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2106,6 +2204,41 @@ components:
   items:
 $ref: '#/components/schemas/PartitionStatisticsFile'
 
+PlanTask:
+  description:
+A JSON object that contains information provided by the server,
+to be utilized by clients for distributed planning, should be supplied
+as is for input in PlanTable operation.

Review Comment:
   ```suggestion
   An opaque JSON object that contains information provided by the REST 
server,
   to be utilized by clients for distributed table scan planning, 
should be supplied
   as is for input in `PlanTable` operation.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



jackye1995 commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518329140


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2068,6 +2162,145 @@ components:
   items:
 $ref: '#/components/schemas/PartitionStatisticsFile'
 
+PlanTask:
+  description:
+A JSON object that contains information provided by the server,
+to be utilized by clients for distributed planning, should be supplied
+as is for input in PlanTable operation.
+  type: object
+
+FileScanTask:
+  type: object
+  required:
+- schema
+- spec
+- start
+- length
+- data-file
+  properties:
+data-file:
+  $ref: '#/components/schemas/ContentFile'
+partition:
+  type: object
+  additionalProperties:
+type: string
+size-bytes:
+  type: number
+start:
+  type: number
+length:

Review Comment:
   This comment seems not addressed? I agree that for scanning an entire file, 
length is not needed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



jackye1995 commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518330379


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2106,6 +2204,41 @@ components:
   items:
 $ref: '#/components/schemas/PartitionStatisticsFile'
 
+PlanTask:
+  description:
+A JSON object that contains information provided by the server,
+to be utilized by clients for distributed planning, should be supplied
+as is for input in PlanTable operation.
+  type: object
+
+FileScanTask:
+  type: object
+  required:
+- schema
+- spec
+- start
+- length
+- data-file
+  properties:
+data-file:
+  $ref: '#/components/schemas/ContentFile'
+start:
+  type: number
+length:
+  type: number
+delete-files:
+  type: array
+  items:
+$ref: '#/components/schemas/ContentFile'
+schema:
+  description: Table Schema
+  $ref: '#/components/schemas/Schema'
+spec:
+  $ref: '#/components/schemas/PartitionSpec'
+residual-filter:
+  description: An optional residual filter provided by a service, if 
not present clients shall calculate this residual or use the original filter.

Review Comment:
   ```suggestion
 description: filters that are not fully applied by the server and 
should be pushed down to the file reader when reading the file.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



jackye1995 commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518331988


##
open-api/rest-catalog-open-api.yaml:
##
@@ -3789,6 +3995,21 @@ components:
 EmptyResponse:
   $ref: '#/components/examples/ListNamespacesEmptyExample'
 
+PayloadTooLargeResponse:
+  description:
+Payload is too large.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/IcebergErrorResponse'
+  example: {
+"error": {
+  "message": " Payload is too large, client should obtain a 
plan-task by using the preplan endpoint.",

Review Comment:
   ```suggestion
 "message": " Payload is too large for PlanTable, please call 
PreplanTable to distribute table scan planning",
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



jackye1995 commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518332560


##
open-api/rest-catalog-open-api.yaml:
##
@@ -3804,6 +4025,21 @@ components:
 }
   }
 
+UnprocessableEntityResponse:
+  description:
+Clients are required to do preplanning and provide a plan-task, before 
initiating a plan.

Review Comment:
   Similar comment to PayloadTooLarge, this description should be generic. We 
can make the example specific to PlanTable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-08 Thread via GitHub



jackye1995 commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518375778


##
open-api/rest-catalog-open-api.yaml:
##
@@ -2838,6 +3093,59 @@ components:
   additionalProperties:
 type: string
 
+PreplanTableRequest:
+  type: object
+  required:
+- select
+- filter
+- case-sensitive
+  properties:
+select:
+  description: A list of the selected column names
+  type: array
+  items:
+type: string
+filter:
+  $ref: '#/components/schemas/Expression'
+case-sensitive:
+  description: Indicates whether column selection and filtering should 
be case sensitive
+  type: boolean
+snapshot-id:
+  description: an int64 snapshot ID (if snapshot-range is not 
present); optional and defaults to the table's current snapshot
+  type: integer
+  format: int64
+snapshot-range:
+  description: a JSON list containing exactly 2 int64 snapshot IDs (if 
snapshot-id is not present) representing the start (exclusive) and end 
(inclusive) snapshots

Review Comment:
   btw just to clarify my comment, I definitely agree exclusive start and 
inclusive end is the most common use case in incremental pipelines. I just 
think it might confuse user because the common "range" in programming language 
has been inclusive start exclusive end, and not everyone would read the 
description carefully.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] Minor: Parse rowGroupSize as a long [iceberg]

2024-03-08 Thread via GitHub



hussein-awala opened a new pull request, #9900:
URL: https://github.com/apache/iceberg/pull/9900

   `withRowGroupSize(int rowGroupSize)` is deprecated in `ParquetWriter` and 
currently, the max possible rowGroupSize in Iceberg is 2GB (max Java int) which 
will be fixed after this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Spark 3.4, 3.3 : support output-spec-id in rewrite (back port #9803) [iceberg]

2024-03-08 Thread via GitHub



himadripal commented on PR #9901:
URL: https://github.com/apache/iceberg/pull/9901#issuecomment-1986580223

   @RussellSpitzer @aokolnychyi @flyrain @szehon-ho 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] Spark 3.4, 3.3 : support output-spec-id in rewrite (back port #9803) [iceberg]

2024-03-08 Thread via GitHub



himadripal opened a new pull request, #9901:
URL: https://github.com/apache/iceberg/pull/9901

   Backport #9803 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Migrate Snapshot files in Core to JUnit5 [iceberg]

2024-03-08 Thread via GitHub



tomtongue commented on PR #9892:
URL: https://github.com/apache/iceberg/pull/9892#issuecomment-1986579621

   @nastra Thanks for the review. Reflected your comments. Could you review the 
new one?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Flink 1.11.1 Not Supoort timestamptz [iceberg]

2024-03-08 Thread via GitHub



github-actions[bot] commented on issue #1915:
URL: https://github.com/apache/iceberg/issues/1915#issuecomment-1986587888

   This issue has been automatically marked as stale because it has been open 
for 180 days with no activity. It will be closed in next 14 days if no further 
activity occurs. To permanently prevent this issue from being considered stale, 
add the label 'not-stale', but commenting on the issue is preferred when 
possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Spark Table With Timestamp Cannot Query By Flink [iceberg]

2024-03-08 Thread via GitHub



github-actions[bot] commented on issue #1916:
URL: https://github.com/apache/iceberg/issues/1916#issuecomment-1986587898

   This issue has been automatically marked as stale because it has been open 
for 180 days with no activity. It will be closed in next 14 days if no further 
activity occurs. To permanently prevent this issue from being considered stale, 
add the label 'not-stale', but commenting on the issue is preferred when 
possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

1 2 >

1 - 100 of 146 matches

Mail list logo