Re: [I] Convert row filter to arrow filter [iceberg-rust]

2024-03-18 Thread via GitHub


Dysprosium0626 commented on issue #265:
URL: https://github.com/apache/iceberg-rust/issues/265#issuecomment-2003096263

   Hi @liurenjie1024, I could work on this, but I have no idea where the current row filter is. So far I have only found this: 
https://github.com/apache/iceberg-rust/blob/d6703df40b24477d0a5a36939746bb1b36cc6933/crates/iceberg/src/scan.rs#L162
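
   For context, "converting the row filter to an arrow filter" means lowering the scan's Iceberg predicate into a boolean mask that arrow can apply to each RecordBatch. A minimal sketch of the batch-level mechanics only, assuming arrow's `filter_record_batch` kernel and a single `col >= literal` comparison on an Int64 column; the function names below are illustrative, not part of iceberg-rust:

   ```rust
   use arrow::array::{BooleanArray, Int64Array};
   use arrow::compute::filter_record_batch;
   use arrow::error::ArrowError;
   use arrow::record_batch::RecordBatch;

   // Hypothetical evaluator for a `col >= lit` row filter: builds a boolean
   // mask over one batch. NULL input values produce NULL mask slots.
   fn eval_ge(batch: &RecordBatch, col: &str, lit: i64) -> Option<BooleanArray> {
       let column = batch.column_by_name(col)?;
       let values = column.as_any().downcast_ref::<Int64Array>()?;
       Some(values.iter().map(|v| v.map(|x| x >= lit)).collect())
   }

   // Rows where the mask is true survive; NULL mask slots are dropped.
   fn apply_row_filter(
       batch: &RecordBatch,
       mask: &BooleanArray,
   ) -> Result<RecordBatch, ArrowError> {
       filter_record_batch(batch, mask)
   }
   ```

   A real conversion would walk the full predicate AST and dispatch on the Iceberg column types rather than hard-coding a single comparison; the sketch only shows how a computed mask is applied to a batch.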


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Build: Bump com.google.errorprone:error_prone_annotations from 2.24.1 to 2.26.1 [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9972:
URL: https://github.com/apache/iceberg/pull/9972





Re: [PR] Build: Bump org.awaitility:awaitility from 4.2.0 to 4.2.1 [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9970:
URL: https://github.com/apache/iceberg/pull/9970





Re: [PR] Add 13 Dremio Blogs + Fix a few incorrect dates [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9967:
URL: https://github.com/apache/iceberg/pull/9967





Re: [PR] Build: Fix ignoring major version update in dependabot [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9981:
URL: https://github.com/apache/iceberg/pull/9981





Re: [PR] Build: Bump com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.44.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #9978:
URL: https://github.com/apache/iceberg/pull/9978#issuecomment-2003110726

   Looks like com.palantir.baseline:gradle-baseline-java is no longer being 
updated by Dependabot, so this is no longer needed.





Re: [PR] Build: Bump com.github.ben-manes.caffeine:caffeine from 2.9.3 to 3.1.8 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #8784: Build: Bump 
com.github.ben-manes.caffeine:caffeine from 2.9.3 to 3.1.8
URL: https://github.com/apache/iceberg/pull/8784





Re: [PR] Build: Bump org.glassfish.jaxb:jaxb-runtime from 2.3.3 to 4.0.5 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #9908: Build: Bump 
org.glassfish.jaxb:jaxb-runtime from 2.3.3 to 4.0.5
URL: https://github.com/apache/iceberg/pull/9908





Re: [PR] Build: Bump com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.44.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #9978: Build: Bump 
com.palantir.baseline:gradle-baseline-java from 4.42.0 to 5.44.0
URL: https://github.com/apache/iceberg/pull/9978





Re: [PR] Build: Bump slf4j from 1.7.36 to 2.0.9 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #8737: Build: Bump slf4j from 1.7.36 to 
2.0.9
URL: https://github.com/apache/iceberg/pull/8737





Re: [PR] Build: Bump slf4j from 1.7.36 to 2.0.12 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #9688:
URL: https://github.com/apache/iceberg/pull/9688#issuecomment-2003110737

   Looks like these dependencies are no longer being updated by Dependabot, so 
this is no longer needed.





Re: [PR] Build: Bump org.glassfish.jaxb:jaxb-runtime from 2.3.3 to 4.0.4 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #8898:
URL: https://github.com/apache/iceberg/pull/8898#issuecomment-2003110724

   Looks like org.glassfish.jaxb:jaxb-runtime is no longer being updated by 
Dependabot, so this is no longer needed.





Re: [PR] Build: Bump slf4j from 1.7.36 to 2.0.9 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #8737:
URL: https://github.com/apache/iceberg/pull/8737#issuecomment-2003110730

   Looks like these dependencies are no longer being updated by Dependabot, so 
this is no longer needed.





Re: [PR] Build: Bump com.github.ben-manes.caffeine:caffeine from 2.9.3 to 3.1.8 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #8784:
URL: https://github.com/apache/iceberg/pull/8784#issuecomment-2003110719

   Looks like com.github.ben-manes.caffeine:caffeine is no longer being updated 
by Dependabot, so this is no longer needed.





Re: [PR] Build: Bump com.esotericsoftware:kryo from 4.0.2 to 5.6.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #9469:
URL: https://github.com/apache/iceberg/pull/9469#issuecomment-2003110725

   Looks like com.esotericsoftware:kryo is no longer being updated by 
Dependabot, so this is no longer needed.





Re: [PR] Build: Bump org.springframework:spring-web from 5.3.30 to 6.1.5 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #9969:
URL: https://github.com/apache/iceberg/pull/9969#issuecomment-2003110720

   Looks like org.springframework:spring-web is no longer being updated by 
Dependabot, so this is no longer needed.





Re: [PR] Build: Bump org.springframework:spring-web from 5.3.30 to 6.1.5 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #9969: Build: Bump 
org.springframework:spring-web from 5.3.30 to 6.1.5
URL: https://github.com/apache/iceberg/pull/9969





Re: [PR] Build: Bump com.adobe.testing:s3mock-junit5 from 2.11.0 to 3.5.2 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #9971:
URL: https://github.com/apache/iceberg/pull/9971#issuecomment-2003110723

   Looks like com.adobe.testing:s3mock-junit5 is no longer being updated by 
Dependabot, so this is no longer needed.





Re: [PR] Build: Bump jakarta.el:jakarta.el-api from 3.0.3 to 5.0.1 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #8791:
URL: https://github.com/apache/iceberg/pull/8791#issuecomment-2003110713

   Looks like jakarta.el:jakarta.el-api is no longer being updated by 
Dependabot, so this is no longer needed.





Re: [PR] Build: Bump orc from 1.9.2 to 2.0.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #9913:
URL: https://github.com/apache/iceberg/pull/9913#issuecomment-2003110740

   Looks like these dependencies are no longer being updated by Dependabot, so 
this is no longer needed.





Re: [PR] Build: Bump org.openapitools:openapi-generator-gradle-plugin from 6.6.0 to 7.4.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #9973: Build: Bump 
org.openapitools:openapi-generator-gradle-plugin from 6.6.0 to 7.4.0
URL: https://github.com/apache/iceberg/pull/9973





Re: [PR] Build: Bump jakarta.el:jakarta.el-api from 3.0.3 to 5.0.1 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #8791: Build: Bump 
jakarta.el:jakarta.el-api from 3.0.3 to 5.0.1
URL: https://github.com/apache/iceberg/pull/8791





Re: [PR] Build: Bump slf4j from 1.7.36 to 2.0.12 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #9688: Build: Bump slf4j from 1.7.36 to 
2.0.12
URL: https://github.com/apache/iceberg/pull/9688





Re: [PR] Build: Bump org.openapitools:openapi-generator-gradle-plugin from 6.6.0 to 7.4.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #9973:
URL: https://github.com/apache/iceberg/pull/9973#issuecomment-2003110722

   Looks like org.openapitools:openapi-generator-gradle-plugin is no longer 
being updated by Dependabot, so this is no longer needed.





Re: [PR] Build: Bump orc from 1.9.2 to 2.0.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #9913: Build: Bump orc from 1.9.2 to 2.0.0
URL: https://github.com/apache/iceberg/pull/9913





Re: [PR] Docs: Enhance Spark pages [iceberg]

2024-03-18 Thread via GitHub


manuzhang commented on PR #9920:
URL: https://github.com/apache/iceberg/pull/9920#issuecomment-2003111009

   @Fokko is fixing it in https://github.com/apache/iceberg/pull/9965





Re: [PR] Build: Bump com.esotericsoftware:kryo from 4.0.2 to 5.6.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #9469: Build: Bump 
com.esotericsoftware:kryo from 4.0.2 to 5.6.0
URL: https://github.com/apache/iceberg/pull/9469





Re: [PR] Build: Bump org.glassfish.jaxb:jaxb-runtime from 2.3.3 to 4.0.5 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #9908:
URL: https://github.com/apache/iceberg/pull/9908#issuecomment-2003110716

   Looks like org.glassfish.jaxb:jaxb-runtime is no longer being updated by 
Dependabot, so this is no longer needed.





Re: [PR] Build: Bump com.adobe.testing:s3mock-junit5 from 2.11.0 to 3.5.2 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #9971: Build: Bump 
com.adobe.testing:s3mock-junit5 from 2.11.0 to 3.5.2
URL: https://github.com/apache/iceberg/pull/9971





Re: [PR] Build: Bump org.glassfish.jaxb:jaxb-runtime from 2.3.3 to 4.0.4 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #8898: Build: Bump 
org.glassfish.jaxb:jaxb-runtime from 2.3.3 to 4.0.4
URL: https://github.com/apache/iceberg/pull/8898





[PR] Build: Bump jetty from 9.4.53.v20231009 to 9.4.54.v20240208 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] opened a new pull request, #9982:
URL: https://github.com/apache/iceberg/pull/9982

   Bumps `jetty` from 9.4.53.v20231009 to 9.4.54.v20240208.
   Updates `org.eclipse.jetty:jetty-server` from 9.4.53.v20231009 to 
9.4.54.v20240208
   
   Updates `org.eclipse.jetty:jetty-servlet` from 9.4.53.v20231009 to 
9.4.54.v20240208
   
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   





[PR] Build: Bump mkdocs-material from 9.5.9 to 9.5.14 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] opened a new pull request, #9983:
URL: https://github.com/apache/iceberg/pull/9983

   Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 
9.5.9 to 9.5.14.
   
   Release notes
   Sourced from mkdocs-material's releases (https://github.com/squidfunk/mkdocs-material/releases).

   mkdocs-material-9.5.14
   - Added support for hiding versions from selector when using mike
   - Added init system to improve signal handling in Docker image
   - Fixed edge cases in exclusion logic of info plugin
   - Fixed inability to reset pipeline in search plugin
   - Fixed syntax error in Finnish translations
   - Fixed #6917: UTF-8 encoding problems in blog plugin on Windows
   - Fixed #6889: Transparent iframes get background color

   Thanks to @kamilkrzyskow, @yubiuser and @todeveni for their contributions

   mkdocs-material-9.5.13
   - Updated Slovak translations
   - Improved info plugin interop with projects plugin
   - Improved info plugin inclusion/exclusion logic
   - Fixed info plugin not gathering files recursively
   - Fixed #6750: Ensure info plugin packs up all necessary files

   Thanks to @kamilkrzyskow and @scepka for their contributions

   mkdocs-material-9.5.12
   - Fixed #6846: Some meta tags removed on instant navigation (9.4.2 regression)
   - Fixed #6823: KaTex not rendering on instant navigation (9.5.5 regression)
   - Fixed #6821: Privacy plugin doesn't handle URLs with encoded characters

   mkdocs-material-9.5.11
   - Updated Finnish translation

   mkdocs-material-9.5.10
   - Updated Bahasa Malaysia translations
   - Fixed #6783: Hide continue reading link for blog posts without separators
   - Fixed #6779: Incorrect positioning of integrated table of contents

   Changelog
   Sourced from mkdocs-material's changelog (https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG).

   mkdocs-material-9.5.14 (2024-03-18)
   - Added support for hiding versions from selector when using mike
   - Added init system to improve signal handling in Docker image
   - Fixed edge cases in exclusion logic of info plugin
   - Fixed inability to reset pipeline in search plugin
   - Fixed syntax error in Finnish translations
   - Fixed #6917: UTF-8 encoding problems in blog plugin on Windows
   - Fixed #6889: Transparent iframes get background color

   mkdocs-material-9.5.13+insiders-4.53.1 (2024-03-06)
   - Fixed #6877: Projects plugin computes incorrect path to assets
   - Fixed #6869: Blog plugin should emit warning on invalid related link

   mkdocs-material-9.5.13 (2024-03-06)
   - Updated Slovak translations
   - Improved info plugin interop with projects plugin
   - Improved info plugin inclusion/exclusion logic
   - Fixed info plugin not gathering files recursively
   - Fixed #6750: Ensure info plugin packs up all necessary files

   mkdocs-material-9.5.12 (2024-02-29)
   - Fixed #6846: Some meta tags removed on instant navigation (9.4.2 regression)
   - Fixed #6823: KaTex not rendering on instant navigation (9.5.5 regression)
   - Fixed #6821: Privacy plugin doesn't handle URLs with encoded characters

   mkdocs-material-9.5.11+insiders-4.53.0 (2024-02-24)
   - Added support for automatic instant previews
   - Added support for pinned blog posts

   mkdocs-material-9.5.11 (2024-02-19)
   - Updated Finnish translation

   mkdocs-material-9.5.10+insiders-4.52.3 (2024-02-21)
   - Fixed resolution of URLs in instant previews
   - Fixed instant previews not mounting for same-page links

   mkdocs-material-9.5.10 (2024-02-19)
   - Updated Bahasa Malaysia translations
   - Fixed #6783: Hide continue reading link for blog posts without separators

Re: [PR] Build: Bump mkdocs-material from 9.5.9 to 9.5.13 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] closed pull request #9906: Build: Bump mkdocs-material from 
9.5.9 to 9.5.13
URL: https://github.com/apache/iceberg/pull/9906





Re: [PR] Build: Bump mkdocs-material from 9.5.9 to 9.5.13 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] commented on PR #9906:
URL: https://github.com/apache/iceberg/pull/9906#issuecomment-2003111838

   Superseded by #9983.





Re: [PR] Build: Bump nessie from 0.77.1 to 0.79.0 [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9976:
URL: https://github.com/apache/iceberg/pull/9976





[PR] Build: Bump com.esotericsoftware:kryo from 4.0.2 to 4.0.3 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] opened a new pull request, #9984:
URL: https://github.com/apache/iceberg/pull/9984

   Bumps [com.esotericsoftware:kryo](https://github.com/EsotericSoftware/kryo) 
from 4.0.2 to 4.0.3.
   
   Release notes
   Sourced from com.esotericsoftware:kryo's releases (https://github.com/EsotericSoftware/kryo/releases).

   kryo-4.0.3
   This is a maintenance release coming with bug fixes and performance improvements for chunked encoding.
   - Improved filling InputChunked buffer (#651)
   - Support skipping input chunks after a buffer underflow (#850)
   - Avoid flush repeatedly when has finished flushing (#978)

   The full list of changes can be found here: https://github.com/EsotericSoftware/kryo/compare/kryo-parent-4.0.2...kryo-parent-4.0.3
   Many thanks to all contributors!

   Compatibility
   - Serialization compatible
     - Standard IO: Yes
     - Unsafe-based IO: Yes
   - Binary compatible: Yes (https://rawgithub.com/EsotericSoftware/kryo/master/compat_reports/kryo/4.0.2_to_4.0.3/compat_report.html)
   - Source compatible: Yes (https://rawgithub.com/EsotericSoftware/kryo/master/compat_reports/kryo/4.0.2_to_4.0.3/compat_report.html#Source)

   Commits
   - 3622028 [maven-release-plugin] prepare release kryo-parent-4.0.3
   - 1f0b35e #834 Support skipping input chunks after a buffer underflow (#850)
   - 599cd11 Revert "use reflection instead of relying on sun.misc.Cleaner being available."
   - 2967a8e Set project version to 4.0.3-SNAPSHOT
   - f8bd119 avoid flush repeatedly when has finished flushing (#978)
   - 84189eb Improved filling InputChunked buffer.
   - 4d793a0 use reflection instead of relying on sun.misc.Cleaner being available.
   - cac0a17 Eclipse project files.
   - See full diff in compare view: https://github.com/EsotericSoftware/kryo/compare/kryo-parent-4.0.2...kryo-parent-4.0.3
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=com.esotericsoftware:kryo&package-manager=gradle&previous-version=4.0.2&new-version=4.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   



Re: [PR] Build: Bump datamodel-code-generator from 0.25.4 to 0.25.5 [iceberg]

2024-03-18 Thread via GitHub


Fokko merged PR #9979:
URL: https://github.com/apache/iceberg/pull/9979





[PR] Build: Bump spring-boot from 2.5.4 to 2.7.18 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] opened a new pull request, #9985:
URL: https://github.com/apache/iceberg/pull/9985

   Bumps `spring-boot` from 2.5.4 to 2.7.18.
   Updates `org.springframework.boot:spring-boot-starter-jetty` from 2.5.4 to 
2.7.18
   
   Release notes
   Sourced from org.springframework.boot:spring-boot-starter-jetty's releases (https://github.com/spring-projects/spring-boot/releases).

   v2.7.18
   ⚠️ Noteworthy Changes
   - Following the Paketo team's announcement (https://blog.paketo.io/posts/paketo-bionic-builder-is-unsafe/) that the Bionic CNB builders will be removed, the default builder used by bootBuildImage (Gradle) and spring-boot:build-image (Maven) has been changed to Paketo Jammy (#38477)

   :lady_beetle: Bug Fixes
   - App fails to start with a NoSuchMethodError when using Flyway 10.0.0 (#38164)
   - spring.webflux.multipart.max-disk-usage-per-part behaves incorrectly for values where the number of bytes overflows an int (#38146)
   - Mail health indicator fails when host is not set in properties (#38007)

   :notebook_with_decorative_cover: Documentation
   - Document supported SQL comment prefixes (#38385)
   - Fix link to Elasticsearch health indicator (#38330)
   - Improve --help and documentation for "encodepassword -a/--algorithm" in the Spring Boot CLI (#38203)
   - Document that TomcatConnectorCustomizers are not applied to additional connectors (#38183)
   - MyErrorWebExceptionHandler example in documentation isn't working (#38104)
   - Document that SerializationFeature.WRITE_DURATIONS_AS_TIMESTAMPS is disabled by default (#38083)
   - Update "Running Behind a Front-end Proxy Server" to include reactive and ForwardedHeaderTransformer (#37282)
   - Improve documentation of classpath.idx file and its generation by the Maven and Gradle plugins (#37125)
   - Document configuration for building images with Colima (#34522)
   - Code sample in "Developing Your First Spring Boot Application" does not work (#34513)
   - Document ConfigurationPropertyCaching (#34172)
   - Document that application.* banner variables require a packaged jar or the use of Boot's launcher (#33489)
   - Add section on AspectJ support (#32642)
   - Document server.servlet.encoding.* properties and server.servlet.encoding.mapping in particular (#32472)
   - Add a section on customizing embedded reactive servers (#31917)
   - Clarify that MVC components provided through WebMvcRegistrations are subject to subsequent processing and configuration by MVC (#31232)
   - Clarifying documentation on including a top-level @TestConfiguration class in a test (#30513)
   - Clarify that @AutoConfigureWebTestClient binds WebTestClient to mock infrastructure (#29890)
   - Improve systemd configuration documentation (#28453)
   - Document how to customize the basePackages that auto-configurations consider (for example Spring Data Repositories) (#27549)
   - Document additional user configuration that's required after setting spring.hateoas.use-hal-as-default-json-media-type to false (#26814)
   - Add how-to documentation for test-only database migrations with Flyway/Liquibase (#26796)

   :hammer: Dependency Upgrades
   - Upgrade to ActiveMQ 5.16.7 https://redirect.github.com/sp

[PR] Build: Bump com.adobe.testing:s3mock-junit5 from 2.11.0 to 2.17.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] opened a new pull request, #9986:
URL: https://github.com/apache/iceberg/pull/9986

   Bumps com.adobe.testing:s3mock-junit5 from 2.11.0 to 2.17.0.
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=com.adobe.testing:s3mock-junit5&package-manager=gradle&previous-version=2.11.0&new-version=2.17.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   





[PR] Build: Bump com.palantir.baseline:gradle-baseline-java from 4.42.0 to 4.192.0 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] opened a new pull request, #9987:
URL: https://github.com/apache/iceberg/pull/9987

   Bumps 
[com.palantir.baseline:gradle-baseline-java](https://github.com/palantir/gradle-baseline)
 from 4.42.0 to 4.192.0.
   
   Release notes
   Sourced from com.palantir.baseline:gradle-baseline-java's releases (https://github.com/palantir/gradle-baseline/releases).

   4.192.0
   Automated release, no documented user facing changes

   4.191.0
   - Feature: Add error-prone check JooqBatchWithoutBindArgs (palantir/gradle-baseline#2506)

   4.190.0
   - Feature: Added DangerousCollapseKeysUsage error prone check to disallow usage of collapseKeys() API of EntryStream. (palantir/gradle-baseline#2291)
   - Feature: Prefer common versions of annotations over other copies (palantir/gradle-baseline#2505)

   4.189.0
   - Improvement: Upgrade error_prone to 2.18.0 (from 2.16) (palantir/gradle-baseline#2472)

   4.188.0
   - Improvement: Increase javac heap to 2g by default (up from 512m). Existing overrides are not impacted. (palantir/gradle-baseline#2482)

   4.187.0
   No documented user facing changes

   4.186.0
   - Fix: add input properties for each task that uses moduleJvmArgs so that when the extension value changes, the task will no longer be up-to-date. (palantir/gradle-baseline#2477)

   4.185.0
   - Fix: Ensure that baseline-immutables configures immutables to work incrementally when the immutables annotationProcessor dependency is not a direct dependency (ie it is brought in transitively or by an extendsFrom). (palantir/gradle-baseline#2465)

   4.184.0
   - Fix: Bring IntelliJ in sync with ErrorProne on bad inner static class names (palantir/gradle-baseline#2447)
   - Fix: Suppress the JavaxInjectOnAbstractMethod check for projects that apply java-gradle-plugin. (palantir/gradle-baseline#2460)

   4.183.0
   Automated release, no documented user facing changes

   4.182.0
   - Improvement: Upgrade error-prone to 2.16, removing support for compilation with a jdk-15 target (palantir/gradle-baseline#2432)

   ... (truncated)

   Commits
   - f89dfd5 Excavator: Upgrade dependencies (#2513)
   - dd749cb Excavator: Upgrades Baseline to the latest version (#2520)
   - 4b658b1 Autorelease 4.191.0
   - 28e2069 add JooqBatchWithoutBindArgs check (#2506)
   - 7e938d1 Excavator: Update conjure plugins and dependencies (#2519)
   - 5155f7f Excavator: Update open-source publishing plugins (#2516)
   - 7cd13ba Excavator: Render CircleCI file using template specified in .circleci/templa...
   - 365f486 Excavator: Render CircleCI file using template specified in .circleci/templa...
   - https://github.com/p

[PR] Build: Bump org.glassfish.jaxb:jaxb-runtime from 2.3.3 to 2.3.9 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] opened a new pull request, #9988:
URL: https://github.com/apache/iceberg/pull/9988

   Bumps org.glassfish.jaxb:jaxb-runtime from 2.3.3 to 2.3.9.
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.glassfish.jaxb:jaxb-runtime&package-manager=gradle&previous-version=2.3.3&new-version=2.3.9)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   





[PR] Build: Bump org.springframework:spring-web from 5.3.30 to 5.3.33 [iceberg]

2024-03-18 Thread via GitHub


dependabot[bot] opened a new pull request, #9989:
URL: https://github.com/apache/iceberg/pull/9989

   Bumps 
[org.springframework:spring-web](https://github.com/spring-projects/spring-framework)
 from 5.3.30 to 5.3.33.
   
   Release notes
   Sourced from org.springframework:spring-web's releases (https://github.com/spring-projects/spring-framework/releases).

   v5.3.33
   :star: New Features
   - Extract reusable method for URI validations (#32442)
   - Allow UriTemplate to be built with an empty template (#32438)
   - Refine *HttpMessageConverter#getContentLength return value null safety (#32332)

   :lady_beetle: Bug Fixes
   - AopUtils.getMostSpecificMethod does not return original method for proxy-derived method anymore (#32369)
   - Better protect against concurrent error handling for async requests (#32342)
   - Restore Jetty 10 compatibility in JettyClientHttpResponse (#32337)
   - ContentCachingResponseWrapper no longer honors Content-Type and Content-Length (#32322)

   :notebook_with_decorative_cover: Documentation
   - Build KDoc against 5.3.x Spring Framework Javadoc (#32414)

   :hammer: Dependency Upgrades
   - Upgrade to Reactor 2020.0.42 (#32422)

   v5.3.32
   :star: New Features
   - Add CORS support for Private Network Access (#31974)
   - Avoid early getMostSpecificMethod resolution in CommonAnnotationBeanPostProcessor (#31969)

   :lady_beetle: Bug Fixes
   - Consistent parsing of user information in UriComponentsBuilder (#32247)
   - QualifierAnnotationAutowireCandidateResolver.checkQualifier does identity checks when comparing arrays used as qualifier fields (#32108)
   - Guard against multiple body subscriptions in Jetty and JDK reactive responses (#32101)
   - Static resources caching issues with ShallowEtagHeaderFilter and Jetty caching directives (#32051)
   - ChannelSendOperator.WriteBarrier race condition in request(long) method leads to response being dropped (#32021)
   - Spring AOP does not propagate arguments for dynamic prototype-scoped advice (#31964)
   - MergedAnnotation swallows IllegalAccessException for attribute method (#31961)
   - CronTrigger hard-codes default ZoneId instead of participating in scheduler-wide Clock setup (#31950)
   - MergedAnnotations finds duplicate annotations on method in multi-level interface hierarchy (#31825)
   - PathEditor cannot handle absolute Windows paths with forward slashes (#31728)
   - Include Hibernate's Query.scroll() in SharedEntityManagerCreator's queryTerminatingMethods set (#31684)
   - TypeDescriptor does not check generics in equals method (for ConversionService caching) (#31674)
   - Slow SpEL performance due to method sorting in ReflectiveMethodResolver (#31665)
   - Jackson encoder releases resources in wrong order (#31657)
   - WebSocketMessageBrokerStats has null stats for stompSubProtocolHandler since 5.3.2 (#31642)

   :notebook_with_decorative_cover: Documentation
   - Document cron-vs-quartz parsing convention for dayOfWeek part in CronExpression https://redirect.github.com/spring-projects/spring-framework/i

Re: [PR] Build: Bump mkdocs-material from 9.5.9 to 9.5.14 [iceberg]

2024-03-18 Thread via GitHub


Fokko merged PR #9983:
URL: https://github.com/apache/iceberg/pull/9983


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] docs: Add links checker [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9965:
URL: https://github.com/apache/iceberg/pull/9965#discussion_r1527998892


##
format/spec.md:
##
@@ -57,6 +57,7 @@ In addition to row-level deletes, version 2 makes some 
requirements stricter for
 
 ## Overview
 
+

Review Comment:
   should this link maybe have `https://iceberg.apache.org/` as its prefix as 
it's already being used in other places for images?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Build: Bump orc from 1.9.2 to 2.0.0 [iceberg]

2024-03-18 Thread via GitHub


manuzhang commented on PR #9913:
URL: https://github.com/apache/iceberg/pull/9913#issuecomment-2003126279

   @nastra Thanks, this worked immediately.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Migrate Manifest, FormatVersion and LocationProvider files in Core to JUnit5 [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9964:
URL: https://github.com/apache/iceberg/pull/9964#discussion_r1528013566


##
core/src/test/java/org/apache/iceberg/TestManifestWriter.java:
##
@@ -89,48 +80,44 @@ public void testManifestPartitionStats() throws IOException 
{
 manifestEntry(Status.DELETED, null, newFile(2, 
TestHelpers.Row.of(3;
 
 List partitions = 
manifest.partitions();
-Assert.assertEquals("Partition field summaries count should match", 1, 
partitions.size());
+assertThat(partitions).hasSize(1);
 ManifestFile.PartitionFieldSummary partitionFieldSummary = 
partitions.get(0);
-Assert.assertFalse("contains_null should be false", 
partitionFieldSummary.containsNull());
-Assert.assertFalse("contains_nan should be false", 
partitionFieldSummary.containsNaN());
-Assert.assertEquals(
-"Lower bound should match",
-Integer.valueOf(1),
-Conversions.fromByteBuffer(Types.IntegerType.get(), 
partitionFieldSummary.lowerBound()));
-Assert.assertEquals(
-"Upper bound should match",
-Integer.valueOf(3),
-Conversions.fromByteBuffer(Types.IntegerType.get(), 
partitionFieldSummary.upperBound()));
+assertThat(partitionFieldSummary.containsNull()).isFalse();
+assertThat(partitionFieldSummary.containsNaN()).isFalse();
+assertThat(
+(Integer)

Review Comment:
   is the casting necessary here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Migrate Manifest, FormatVersion and LocationProvider files in Core to JUnit5 [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9964:
URL: https://github.com/apache/iceberg/pull/9964#discussion_r1528020644


##
core/src/test/java/org/apache/iceberg/TestManifestWriter.java:
##
@@ -89,48 +80,44 @@ public void testManifestPartitionStats() throws IOException 
{
 manifestEntry(Status.DELETED, null, newFile(2, 
TestHelpers.Row.of(3;
 
 List partitions = 
manifest.partitions();
-Assert.assertEquals("Partition field summaries count should match", 1, 
partitions.size());
+assertThat(partitions).hasSize(1);
 ManifestFile.PartitionFieldSummary partitionFieldSummary = 
partitions.get(0);
-Assert.assertFalse("contains_null should be false", 
partitionFieldSummary.containsNull());
-Assert.assertFalse("contains_nan should be false", 
partitionFieldSummary.containsNaN());
-Assert.assertEquals(
-"Lower bound should match",
-Integer.valueOf(1),
-Conversions.fromByteBuffer(Types.IntegerType.get(), 
partitionFieldSummary.lowerBound()));
-Assert.assertEquals(
-"Upper bound should match",
-Integer.valueOf(3),
-Conversions.fromByteBuffer(Types.IntegerType.get(), 
partitionFieldSummary.upperBound()));
+assertThat(partitionFieldSummary.containsNull()).isFalse();
+assertThat(partitionFieldSummary.containsNaN()).isFalse();
+assertThat(
+(Integer)

Review Comment:
   nvm, it's necessary
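
   For readers following along, a hedged sketch of why the cast is needed: Conversions.fromByteBuffer is declared with an unbounded generic return, so without a target type the compiler cannot settle on a single AssertJ assertThat(...) overload; the explicit cast pins the inferred type. The surrounding class and the `buffer` value are illustrative only:
   
   ```
   import static org.assertj.core.api.Assertions.assertThat;
   
   import java.nio.ByteBuffer;
   import org.apache.iceberg.types.Conversions;
   import org.apache.iceberg.types.Types;
   
   class CastSketch {
     void example(ByteBuffer buffer) {
       // public static <T> T fromByteBuffer(Type type, ByteBuffer buffer)
       // leaves T unbounded; casting to Integer fixes the inferred type so a
       // single assertThat overload applies
       assertThat((Integer) Conversions.fromByteBuffer(Types.IntegerType.get(), buffer))
           .isEqualTo(1);
     }
   }
   ```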



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Migrate Manifest, FormatVersion and LocationProvider files in Core to JUnit5 [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9964:
URL: https://github.com/apache/iceberg/pull/9964


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Migrate Manifest, FormatVersion and LocationProvider files in Core to JUnit5 [iceberg]

2024-03-18 Thread via GitHub


tomtongue commented on PR #9964:
URL: https://github.com/apache/iceberg/pull/9964#issuecomment-2003146004

   @nastra Thanks for the quick review!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: Add a test to check if the bloom filters are added to the parquet files [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9902:
URL: https://github.com/apache/iceberg/pull/9902#discussion_r1528029346


##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderWithBloomFilter.java:
##
@@ -174,170 +156,62 @@ public static Object[][] parameters() {
 
   @BeforeAll
   public static void startMetastoreAndSpark() {
-metastore = new TestHiveMetastore();
-metastore.start();
-HiveConf hiveConf = metastore.hiveConf();
-
 spark =
 SparkSession.builder()
 .master("local[2]")
-.config("spark.hadoop." + METASTOREURIS.varname, 
hiveConf.get(METASTOREURIS.varname))
-.enableHiveSupport()
+.config("spark.sql.catalog.local", 
"org.apache.iceberg.spark.SparkCatalog")
+.config("spark.sql.catalog.local.type", "hadoop")
+.config("spark.sql.catalog.local.warehouse", temp.toString())
+.config("spark.sql.defaultCatalog", "local")
 .getOrCreate();
 
-catalog =
-(HiveCatalog)
-CatalogUtil.loadCatalog(
-HiveCatalog.class.getName(), "hive", ImmutableMap.of(), 
hiveConf);
-
-try {
-  catalog.createNamespace(Namespace.of("default"));
-} catch (AlreadyExistsException ignored) {
-  // the default namespace already exists. ignore the create error
-}
+spark.sql("CREATE DATABASE IF NOT EXISTS default");
+spark.sql("USE default");
   }
 
   @AfterAll
-  public static void stopMetastoreAndSpark() throws Exception {
-catalog = null;
-metastore.stop();
-metastore = null;
+  public static void stopMetastoreAndSpark() {
 spark.stop();
 spark = null;
   }
 
-  protected void createTable(String name, Schema schema) {
-table = catalog.createTable(TableIdentifier.of("default", name), schema);
-TableOperations ops = ((BaseTable) table).operations();
-TableMetadata meta = ops.current();
-ops.commit(meta, meta.upgradeToFormatVersion(2));
+  protected void createTable(String name) throws TableAlreadyExistsException {
+Dataset emptyDf = spark.createDataFrame(Lists.newArrayList(), schema);
+CreateTableWriter createTableWriter = emptyDf.writeTo("default." + 
name);
 
 if (useBloomFilter) {
-  table
-  .updateProperties()
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id", "true")
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id_long", "true")
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id_double", 
"true")
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id_float", "true")
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id_string", 
"true")
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id_boolean", 
"true")
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id_date", "true")
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id_int_decimal", 
"true")
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + "id_long_decimal", 
"true")
-  .set(PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + 
"id_fixed_decimal", "true")
-  .commit();
+  String[] columns = {
+"id",
+"id_long",
+"id_double",
+"id_float",
+"id_string",
+"id_boolean",
+"id_date",
+"id_int_decimal",
+"id_long_decimal",
+"id_fixed_decimal",
+"id_nested.nested_id"
+  };
+  for (String column : columns) {
+createTableWriter.tableProperty(
+PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX + column, "true");
+  }
 }
 
-table
-.updateProperties()
-.set(TableProperties.PARQUET_ROW_GROUP_SIZE_BYTES, "100") // to have 
multiple row groups
-.commit();
-if (vectorized) {
-  table
-  .updateProperties()
-  .set(TableProperties.PARQUET_VECTORIZATION_ENABLED, "true")
-  .set(TableProperties.PARQUET_BATCH_SIZE, "4")
-  .commit();
-}
-  }
-
-  protected void dropTable(String name) {
-catalog.dropTable(TableIdentifier.of("default", name));
-  }
+createTableWriter.tableProperty(PARQUET_ROW_GROUP_SIZE_BYTES, "100");
 
-  private DataFile writeDataFile(OutputFile out, StructLike partition, 
List rows)

Review Comment:
   this seems like too many changes just to add a single test. This makes it 
quite difficult to review the diffset



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[I] CI looks like broken [iceberg-rust]

2024-03-18 Thread via GitHub


viirya opened a new issue, #279:
URL: https://github.com/apache/iceberg-rust/issues/279

   
https://github.com/apache/iceberg-rust/actions/runs/8323042113/job/22771908103?pr=258
   
   ```
   warning: use of deprecated function `arrow_arith::temporal::month_dyn`: Use 
`date_part` instead
 --> crates/iceberg/src/transform/temporal.rs:22:16
  |
   22 | temporal::{month_dyn, year_dyn},
  |^
  |
  = note: `#[warn(deprecated)]` on by default
   
   warning: use of deprecated function `arrow_arith::temporal::year_dyn`: Use 
`date_part` instead
 --> crates/iceberg/src/transform/temporal.rs:22:27
  |
   22 | temporal::{month_dyn, year_dyn},
  |   
   
   warning: use of deprecated function `arrow_arith::temporal::year_dyn`: Use 
`date_part` instead
 --> crates/iceberg/src/transform/temporal.rs:47:13
  |
   47 | year_dyn(&input).map_err(|err| 
Error::new(ErrorKind::Unexpected, format!("{err}")))?;
  | 
   
   warning: use of deprecated function `arrow_arith::temporal::year_dyn`: Use 
`date_part` instead
 --> crates/iceberg/src/transform/temporal.rs:65:13
  |
   65 | year_dyn(&input).map_err(|err| 
Error::new(ErrorKind::Unexpected, format!("{err}")))?;
  | 
   
   warning: use of deprecated function `arrow_arith::temporal::month_dyn`: Use 
`date_part` instead
 --> crates/iceberg/src/transform/temporal.rs:72:13
  |
   72 | month_dyn(&input).map_err(|err| 
Error::new(ErrorKind::Unexpected, format!("{err}")))?;
  | ^
   
   error[E0308]: mismatched types
  --> crates/iceberg/src/writer/file_writer/parquet_writer.rs:118:13
   |
   116 | let writer = AsyncArrowWriter::try_new(
   |  - arguments to this 
function are incorrect
   117 | inner_writer,
   118 | self.schema.clone(),
   | ^^^ expected 
`arrow_schema::schema::Schema`, found `arrow_schema::Schema`
   |
   = note: `arrow_schema::Schema` and `arrow_schema::schema::Schema` have 
similar names, but are actually distinct types
   note: `arrow_schema::Schema` is defined in crate `arrow_schema`
  --> 
/home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-schema-51.0.0/src/schema.rs:187:1
   |
   187 | pub struct Schema {
   | ^
   note: `arrow_schema::schema::Schema` is defined in crate `arrow_schema`
  --> 
/home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-schema-50.0.0/src/schema.rs:181:1
   |
   181 | pub struct Schema {
   | ^
   = note: perhaps two different versions of crate `arrow_schema` are being 
used?
   note: associated function defined here
  --> 
/home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/parquet-50.0.0/src/arrow/async_writer/mod.rs:95:12
   |
   95  | pub fn try_new(
   |^^^
   
   error[E0308]: mismatched types
  --> crates/iceberg/src/writer/file_writer/parquet_writer.rs:220:27
   |
   220 | self.writer.write(batch).await.map_err(|err| {
   | - ^ expected `RecordBatch`, found a 
different `RecordBatch`
   | |
   | arguments to this method are incorrect
   |
   = note: `RecordBatch` and `RecordBatch` have similar names, but are 
actually distinct types
   note: `RecordBatch` is defined in crate `arrow_array`
  --> 
/home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-array-51.0.0/src/record_batch.rs:72:1
   |
   72  | pub struct RecordBatch {
   | ^^
   note: `RecordBatch` is defined in crate `arrow_array`
  --> 
/home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-array-50.0.0/src/record_batch.rs:72:1
   |
   72  | pub struct RecordBatch {
   | ^^
   = note: perhaps two different versions of crate `arrow_array` are being 
used?
   note: method defined here
  --> 
/home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/parquet-50.0.0/src/arrow/async_writer/mod.rs:116:18
   |
   116 | pub async fn write(&mut self, batch: &RecordBatch) -> Result<()> {
   |  ^
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: Add a test to check if the bloom filters are added to the parquet files [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9902:
URL: https://github.com/apache/iceberg/pull/9902#discussion_r1528029935


##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderWithBloomFilter.java:
##
@@ -18,56 +18,43 @@
  */
 package org.apache.iceberg.spark.source;
 
-import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.METASTOREURIS;
-import static org.apache.iceberg.TableProperties.DEFAULT_FILE_FORMAT;
-import static org.apache.iceberg.TableProperties.DEFAULT_FILE_FORMAT_DEFAULT;
 import static 
org.apache.iceberg.TableProperties.PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX;
 import static org.apache.iceberg.TableProperties.PARQUET_ROW_GROUP_SIZE_BYTES;
-import static 
org.apache.iceberg.TableProperties.PARQUET_ROW_GROUP_SIZE_BYTES_DEFAULT;
 import static org.assertj.core.api.Assertions.assertThat;
+import static org.junit.jupiter.api.Assertions.assertEquals;

Review Comment:
   please use AssertJ assertions (which are already available from the import on 
the line just above this)
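
   A minimal sketch of the suggestion, with a hypothetical `items` list; both assertions check the same thing, but the file already standardizes on AssertJ:
   
   ```
   import static org.assertj.core.api.Assertions.assertThat;
   import static org.junit.jupiter.api.Assertions.assertEquals;
   
   import java.util.List;
   
   class AssertionStyleSketch {
     void check(List<String> items) {
       assertEquals(2, items.size());  // JUnit Jupiter style, to be replaced
       assertThat(items).hasSize(2);   // AssertJ style used across the codebase
     }
   }
   ```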



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: Add a test to check if the bloom filters are added to the parquet files [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9902:
URL: https://github.com/apache/iceberg/pull/9902#discussion_r1528030672


##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderWithBloomFilter.java:
##
@@ -116,54 +114,38 @@ public class TestSparkReaderWithBloomFilter {
   private static final float FLOAT_BASE = 10F;
   private static final String BINARY_PREFIX = "BINARY测试_";
 
-  @TempDir private Path temp;
+  @TempDir private static Path temp;

Review Comment:
   why is this change necessary?
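
   (Presumably because the directory is now read from the static `@BeforeAll` setup. A minimal sketch of that JUnit 5 constraint, with hypothetical names: an instance `@TempDir` field is not yet injected when `@BeforeAll` runs, so a field used there must be static.)
   
   ```
   import java.nio.file.Path;
   import org.junit.jupiter.api.BeforeAll;
   import org.junit.jupiter.api.io.TempDir;
   
   class TempDirSketch {
     @TempDir private static Path temp; // static, so it is injected before @BeforeAll
   
     @BeforeAll
     static void startSpark() {
       // a static method can only see static fields,
       // e.g. to build a warehouse path for the catalog config
       String warehouse = temp.toString();
     }
   }
   ```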



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: Add a test to check if the bloom filters are added to the parquet files [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9902:
URL: https://github.com/apache/iceberg/pull/9902#discussion_r1528031076


##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderWithBloomFilter.java:
##
@@ -116,54 +114,38 @@ public class TestSparkReaderWithBloomFilter {
   private static final float FLOAT_BASE = 10F;
   private static final String BINARY_PREFIX = "BINARY测试_";
 
-  @TempDir private Path temp;
+  @TempDir private static Path temp;
 
   @BeforeEach
-  public void writeTestDataFile() throws IOException {
+  public void writeData() throws NoSuchTableException, 
TableAlreadyExistsException {
 this.tableName = "test";
-createTable(tableName, SCHEMA);
-this.records = Lists.newArrayList();
-
-// records all use IDs that are in bucket id_bucket=0
-GenericRecord record = GenericRecord.create(table.schema());
+createTable(tableName);
+this.rowList = Lists.newArrayList();
 
 for (int i = 0; i < INT_VALUE_COUNT; i += 1) {
-  records.add(
-  record.copy(
-  ImmutableMap.of(
-  "id",
-  INT_MIN_VALUE + i,
-  "id_long",
-  LONG_BASE + INT_MIN_VALUE + i,
-  "id_double",
-  DOUBLE_BASE + INT_MIN_VALUE + i,
-  "id_float",
-  FLOAT_BASE + INT_MIN_VALUE + i,
-  "id_string",
-  BINARY_PREFIX + (INT_MIN_VALUE + i),
-  "id_boolean",
-  i % 2 == 0,
-  "id_date",
-  LocalDate.parse("2021-09-05"),
-  "id_int_decimal",
-  new BigDecimal(String.valueOf(77.77)),
-  "id_long_decimal",
-  new BigDecimal(String.valueOf(88.88)),
-  "id_fixed_decimal",
-  new BigDecimal(String.valueOf(99.99);
+  Row row =

Review Comment:
   why are all these changes necessary?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: Add a test to check if the bloom filters are added to the parquet files [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9902:
URL: https://github.com/apache/iceberg/pull/9902#discussion_r1528034527


##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderWithBloomFilter.java:
##
@@ -95,18 +80,31 @@ public class TestSparkReaderWithBloomFilter {
   protected boolean useBloomFilter;
 
   // Schema passed to create tables
-  public static final Schema SCHEMA =
-  new Schema(
-  Types.NestedField.required(1, "id", Types.IntegerType.get()),
-  Types.NestedField.required(2, "id_long", Types.LongType.get()),
-  Types.NestedField.required(3, "id_double", Types.DoubleType.get()),
-  Types.NestedField.required(4, "id_float", Types.FloatType.get()),
-  Types.NestedField.required(5, "id_string", Types.StringType.get()),
-  Types.NestedField.optional(6, "id_boolean", Types.BooleanType.get()),
-  Types.NestedField.optional(7, "id_date", Types.DateType.get()),
-  Types.NestedField.optional(8, "id_int_decimal", 
Types.DecimalType.of(8, 2)),
-  Types.NestedField.optional(9, "id_long_decimal", 
Types.DecimalType.of(14, 2)),
-  Types.NestedField.optional(10, "id_fixed_decimal", 
Types.DecimalType.of(31, 2)));
+  public static final StructType schema =
+  new StructType(
+  new StructField[] {

Review Comment:
   why are these changes necessary?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] CI looks like broken [iceberg-rust]

2024-03-18 Thread via GitHub


Fokko commented on issue #279:
URL: https://github.com/apache/iceberg-rust/issues/279#issuecomment-2003161284

   Thanks for reporting this @viirya. I'm seeing something similar on my local 
machine:
   
   ```
   error[E0308]: mismatched types
  --> 
/Users/fokkodriesprong/.cargo/git/checkouts/iceberg-rust-d49e83c40ef4cf40/d6703df/crates/iceberg/src/writer/file_writer/parquet_writer.rs:118:13
   |
   116 | let writer = AsyncArrowWriter::try_new(
   |  - arguments to this 
function are incorrect
   117 | inner_writer,
   118 | self.schema.clone(),
   | ^^^ expected 
`arrow_schema::schema::Schema`, found `arrow_schema::Schema`
   |
   = note: `arrow_schema::Schema` and `arrow_schema::schema::Schema` have 
similar names, but are actually distinct types
   note: `arrow_schema::Schema` is defined in crate `arrow_schema`
  --> 
/Users/fokkodriesprong/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-schema-51.0.0/src/schema.rs:187:1
   |
   187 | pub struct Schema {
   | ^
   note: `arrow_schema::schema::Schema` is defined in crate `arrow_schema`
  --> 
/Users/fokkodriesprong/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-schema-50.0.0/src/schema.rs:181:1
   |
   181 | pub struct Schema {
   | ^
   = note: perhaps two different versions of crate `arrow_schema` are being 
used?
   note: associated function defined here
  --> 
/Users/fokkodriesprong/.cargo/registry/src/index.crates.io-6f17d22bba15001f/parquet-50.0.0/src/arrow/async_writer/mod.rs:95:12
   |
   95  | pub fn try_new(
   |^^^
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: Add a test to check if the bloom filters are added to the parquet files [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9902:
URL: https://github.com/apache/iceberg/pull/9902#discussion_r1528037506


##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderWithBloomFilter.java:
##
@@ -351,12 +225,13 @@ public void testReadWithFilter() {
 .filter(
 "id = 30 AND id_long = 1030 AND id_double = 10030.0 AND 
id_float = 100030.0"
 + " AND id_string = 'BINARY测试_30' AND id_boolean = true 
AND id_date = '2021-09-05'"
-+ " AND id_int_decimal = 77.77 AND id_long_decimal = 88.88 
AND id_fixed_decimal = 99.99");
++ " AND id_int_decimal = 77.77 AND id_long_decimal = 88.88 
AND id_fixed_decimal = 99.99"
++ " AND id_nested.nested_id = 30");

Review Comment:
   this change doesn't seem related to the PR title
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528044340


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   a while ago we introduced `CleanableFailure`. Maybe cleanup in Hadoop should 
only happen when a `CleanableFailure` is detected



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Implement the conversion from Arrow Schema to Iceberg Schema [iceberg-rust]

2024-03-18 Thread via GitHub


Fokko commented on code in PR #258:
URL: https://github.com/apache/iceberg-rust/pull/258#discussion_r1528045350


##
crates/iceberg/src/arrow.rs:
##
@@ -106,3 +114,560 @@ impl ArrowReader {
 ProjectionMask::all()
 }
 }
+
+/// A post order arrow schema visitor.
+///
+/// For order of methods called, please refer to [`visit_schema`].
+pub trait ArrowSchemaVisitor {
+/// Return type of this visitor on arrow field.
+type T;
+
+/// Return type of this visitor on arrow schema.
+type U;
+
+/// Called before struct/list/map field.
+fn before_field(&mut self, _field: &Field) -> Result<()> {
+Ok(())
+}
+
+/// Called after struct/list/map field.
+fn after_field(&mut self, _field: &Field) -> Result<()> {
+Ok(())
+}
+
+/// Called before list element.
+fn before_list_element(&mut self, _field: &Field) -> Result<()> {
+Ok(())
+}
+
+/// Called after list element.
+fn after_list_element(&mut self, _field: &Field) -> Result<()> {
+Ok(())
+}
+
+/// Called before map key.
+fn before_map_key(&mut self, _field: &Field) -> Result<()> {
+Ok(())
+}
+
+/// Called after map key.
+fn after_map_key(&mut self, _field: &Field) -> Result<()> {
+Ok(())
+}
+
+/// Called before map value.
+fn before_map_value(&mut self, _field: &Field) -> Result<()> {
+Ok(())
+}
+
+/// Called after map value.
+fn after_map_value(&mut self, _field: &Field) -> Result<()> {
+Ok(())
+}
+
+/// Called after schema's type visited.
+fn schema(&mut self, schema: &ArrowSchema, values: Vec) -> 
Result;
+
+/// Called after struct's fields visited.
+fn r#struct(&mut self, fields: &Fields, results: Vec) -> 
Result;
+
+/// Called after list fields visited.
+fn list(&mut self, list: &DataType, value: Self::T) -> Result;
+
+/// Called after map's key and value fields visited.
+fn map(&mut self, map: &DataType, key_value: Self::T, value: Self::T) -> 
Result;
+
+/// Called when see a primitive type.
+fn primitive(&mut self, p: &DataType) -> Result;
+}
+
+/// Visiting a type in post order.
+fn visit_type(r#type: &DataType, visitor: &mut V) -> 
Result {
+match r#type {
+p if p.is_primitive()
+|| matches!(
+p,
+DataType::Boolean
+| DataType::Utf8
+| DataType::Binary
+| DataType::FixedSizeBinary(_)
+) =>
+{
+visitor.primitive(p)
+}
+DataType::List(element_field) => {
+visitor.before_list_element(element_field)?;
+let value = visit_type(element_field.data_type(), visitor)?;
+visitor.after_list_element(element_field)?;
+visitor.list(r#type, value)
+}
+DataType::Map(field, _) => match field.data_type() {
+DataType::Struct(fields) => {
+if fields.len() != 2 {
+return Err(Error::new(
+ErrorKind::DataInvalid,
+"Map field must have exactly 2 fields",
+));
+}
+
+let key_field = &fields[0];
+let value_field = &fields[1];
+
+let key_result = {
+visitor.before_map_key(key_field)?;
+let ret = visit_type(key_field.data_type(), visitor)?;
+visitor.after_map_key(key_field)?;
+ret
+};
+
+let value_result = {
+visitor.before_map_value(value_field)?;
+let ret = visit_type(value_field.data_type(), visitor)?;
+visitor.after_map_value(value_field)?;
+ret
+};
+
+visitor.map(r#type, key_result, value_result)
+}
+_ => Err(Error::new(
+ErrorKind::DataInvalid,
+"Map field must have struct type",
+)),
+},
+DataType::Struct(fields) => visit_struct(fields, visitor),
+other => Err(Error::new(
+ErrorKind::DataInvalid,
+format!("Cannot visit Arrow data type: {other}"),
+)),
+}
+}
+
+/// Visit struct type in post order.
+#[allow(dead_code)]
+fn visit_struct(fields: &Fields, visitor: &mut V) -> 
Result {
+let mut results = Vec::with_capacity(fields.len());
+for field in fields {
+visitor.before_field(field)?;
+let result = visit_type(field.data_type(), visitor)?;
+visitor.after_field(field)?;
+results.push(result);
+}
+
+visitor.r#struct(fields, results)
+}
+
+/// Visit schema in post order.
+#[allow(dead_code)]
+fn visit_schema(schema: &ArrowSchema, visitor: &mut V) 
-> Result {
+let mut results = Vec::with_capacity(schema.fields().len());
+for field in schema.fields() {

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528044340


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   a while ago we introduced `CleanableFailure`. Maybe cleanup in Hadoop should 
only happen when a `CleanableFailure` is detected. See also #8397 
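
   A rough sketch of that idea against the diff above, assuming the `CleanableFailure` marker interface from #8397; the fragment is meant to live inside `HadoopTableOperations.commit()` and reuses the names from the diff, with error handling simplified:
   
   ```
   import org.apache.iceberg.exceptions.CleanableFailure;
   
   // only remove the temp metadata file when the failure is known to be
   // cleanable, i.e. the commit definitely did not go through
   try {
     commitNewVersion(fs, tempMetadataFile, finalMetadataFile, nextVersion);
   } catch (RuntimeException e) {
     if (e instanceof CleanableFailure) {
       io().deleteFile(tempMetadataFile.toString());
     }
     throw e;
   }
   ```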



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Snapshots table metadata [iceberg-python]

2024-03-18 Thread via GitHub


Fokko commented on code in PR #524:
URL: https://github.com/apache/iceberg-python/pull/524#discussion_r1528049941


##
tests/integration/test_writes.py:
##
@@ -664,3 +668,55 @@ def test_table_properties_raise_for_none_value(
 session_catalog, identifier, {"format-version": format_version, 
**property_with_none}, [arrow_table_with_null]
 )
 assert "None type is not a supported value in properties: property_name" 
in str(exc_info.value)
+
+
+@pytest.mark.integration
+@pytest.mark.parametrize("format_version", [1, 2])
+def test_inspect_snapshots(
+spark: SparkSession, session_catalog: Catalog, arrow_table_with_null: 
pa.Table, format_version: int
+) -> None:
+identifier = "default.table_metadata_snapshots"
+tbl = _create_table(session_catalog, identifier, 
properties={"format-version": format_version})
+
+tbl.overwrite(arrow_table_with_null)
+# should produce a DELETE entry
+tbl.overwrite(arrow_table_with_null)
+# Since we don't rewrite, this should produce a new manifest with an ADDED 
entry
+tbl.append(arrow_table_with_null)
+
+df = tbl.inspect.snapshots()

Review Comment:
   Ooh, I like that!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] build: Restore CI by making parquet and arrow version consistent [iceberg-rust]

2024-03-18 Thread via GitHub


viirya commented on code in PR #280:
URL: https://github.com/apache/iceberg-rust/pull/280#discussion_r1528050445


##
crates/iceberg/src/writer/file_writer/parquet_writer.rs:
##
@@ -112,20 +103,14 @@ impl 
FileWriterBuilder for ParquetWr
 
.generate_location(&self.file_name_generator.generate_file_name()),
 )?;
 let inner_writer = TrackWriter::new(out_file.writer().await?, 
written_size.clone());
-let init_buffer_size = max(Self::MIN_BUFFER_SIZE, 
self.init_buffer_size);
-let writer = AsyncArrowWriter::try_new(
-inner_writer,
-self.schema.clone(),
-init_buffer_size,
-Some(self.props),
-)
-.map_err(|err| {
-Error::new(
-crate::ErrorKind::Unexpected,
-"Failed to build parquet writer.",
-)
-.with_source(err)
-})?;
+let writer = AsyncArrowWriter::try_new(inner_writer, 
self.schema.clone(), Some(self.props))

Review Comment:
   See https://github.com/apache/arrow-rs/pull/5485



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add partition stats in snapshot summary [iceberg-python]

2024-03-18 Thread via GitHub


Fokko merged PR #521:
URL: https://github.com/apache/iceberg-python/pull/521


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add partition stats in snapshot summary [iceberg-python]

2024-03-18 Thread via GitHub


Fokko commented on PR #521:
URL: https://github.com/apache/iceberg-python/pull/521#issuecomment-2003194763

   This is great, thanks for working on this @jqin61 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] CI looks like broken [iceberg-rust]

2024-03-18 Thread via GitHub


Xuanwo commented on issue #279:
URL: https://github.com/apache/iceberg-rust/issues/279#issuecomment-2003224806

   I believe we should commit `Cargo.lock` to ensure we test against the same 
versions. And we need to establish an `MSAV` (Minimum Supported Arrow Version) to 
make sure our users stay aligned with our dependencies.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: Add a test to check if the bloom filters are added to the parquet files [iceberg]

2024-03-18 Thread via GitHub


hussein-awala commented on PR #9902:
URL: https://github.com/apache/iceberg/pull/9902#issuecomment-2003228773

   > there are a bunch of changes that seem unrelated to what's being proposed 
to be done (aka adding a check to see if bloom filters are added)
   
   Thanks @nastra for the review. The data was written using an Avro writer, so 
if there is a problem with the Spark writer, the bug would not be detectable. I 
refactored the tests as @amogh-jahagirdar suggested 
(https://github.com/apache/iceberg/pull/9902#discussion_r1518451041) and I 
completely agree with him on this point.
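
   In other words, the rewrite routes the test data through the Spark writer itself. A hedged sketch of the difference, reusing the variables from the diff (the table name is illustrative):
   
   ```
   import org.apache.spark.sql.Dataset;
   import org.apache.spark.sql.Row;
   
   // before: rows were materialized with a standalone record writer, so a
   // regression in the Spark write path (e.g. bloom filters silently dropped)
   // would never surface in the test
   // after: the same rows are appended through Spark, exercising the real writer
   Dataset<Row> df = spark.createDataFrame(rowList, schema);
   df.writeTo("default." + tableName).append();
   ```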


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


BsoBird commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528103119


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   @nastra No problem, the ultimate goal of this PR is to control the types of 
exceptions thrown so that the data can be handled correctly. Adapting the 
exception types is therefore a breeze.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


BsoBird commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528109766


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   @nastra Currently, we throw a CommitStateUnknownException to circumvent file 
cleanup. We could also tweak that, but I guess that's another topic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


BsoBird commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528142417


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   @nastra Sir, if this PR is merged, I will open a follow-up PR to adjust the 
cleanup strategy of HadoopTable to fit CleanableFailure.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Spark can not delete table metadata and data when drop table [iceberg]

2024-03-18 Thread via GitHub


tomfans commented on issue #9990:
URL: https://github.com/apache/iceberg/issues/9990#issuecomment-2003359267

   More information: the metadata is managed by HMS, not HDFS. The drop works 
when the metadata is managed by HDFS.
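
   A minimal sketch of the scenario (the catalog and table names are 
illustrative only, not from my actual setup):

   ```java
import org.apache.spark.sql.SparkSession;

public class DropTableRepro {
  public static void main(String[] args) {
    // Assumes an Iceberg catalog backed by HMS named "hive_catalog"
    // is already configured for this Spark session.
    SparkSession spark = SparkSession.builder().appName("drop-repro").getOrCreate();
    spark.sql("CREATE TABLE hive_catalog.db.t (id INT) USING iceberg");
    // With HMS-managed metadata the data/metadata files reportedly remain:
    spark.sql("DROP TABLE hive_catalog.db.t PURGE");
  }
}
   ```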


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528220605


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   > @nastra Sir. If this PR is merged, I will initiate the next PR and I will 
adjust the cleanup strategy of HadoopTable to fit CleanableFailure.
   
   Why make a separate PR when we could handle it in this PR by catching 
`CleanableFailure`?
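   
   For reference, a minimal sketch of what catching it could look like 
(illustrative only, assuming a cleanup callback; this is not the PR's actual 
code):
   
   ```java
import org.apache.iceberg.exceptions.CleanableFailure;
import org.apache.iceberg.exceptions.CommitStateUnknownException;

class CommitCleanupSketch {
  // Clean up uncommitted metadata only when the failure is known to be
  // cleanable; never clean up when the commit state is unknown.
  void commitAndMaybeCleanUp(Runnable commitOp, Runnable cleanUp) {
    try {
      commitOp.run();
    } catch (CommitStateUnknownException e) {
      throw e; // state unknown: keep the files so the commit can be inspected
    } catch (RuntimeException e) {
      if (e instanceof CleanableFailure) {
        cleanUp.run(); // e.g. delete the temporary metadata file
      }
      throw e;
    }
  }
}
   ```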



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528227235


##
core/src/test/java/org/apache/iceberg/hadoop/TestHadoopCommits.java:
##
@@ -206,6 +210,133 @@ public void testFailedCommit() throws Exception {
 Assertions.assertThat(manifests).as("Should contain 0 Avro manifest 
files").isEmpty();
   }
 
+  @Test
+  public void testCommitFailedBeforeChangeVersionHint() throws IOException {
+table.newFastAppend().appendFile(FILE_A).commit();
+BaseTable baseTable = (BaseTable) table;
+HadoopTableOperations tableOperations = (HadoopTableOperations) 
baseTable.operations();
+
+HadoopTableOperations spyOps2 = spy(tableOperations);
+doReturn(1).when(spyOps2).findVersionWithOutVersionHint(any());
+TableMetadata metadataV1 = spyOps2.current();
+SortOrder dataSort = 
SortOrder.builderFor(baseTable.schema()).asc("data").build();
+TableMetadata metadataV2 = metadataV1.replaceSortOrder(dataSort);
+assertThatThrownBy(() -> spyOps2.commit(metadataV1, metadataV2))
+.isInstanceOf(CommitFailedException.class)
+.hasMessageContaining("as the latest version is currently");
+
+HadoopTableOperations spyOps3 = spy(tableOperations);
+doReturn(false).when(spyOps3).nextVersionIsLatest(any(), any());
+assertCommitNotChangeVersion(
+baseTable, spyOps3, CommitFailedException.class, "as the latest 
version is currently");
+
+HadoopTableOperations spyOps4 = spy(tableOperations);
+doThrow(new RuntimeException("FileSystem crash!"))
+.when(spyOps4)
+.renameMetaDataFileAndCheck(any(), any(), any());
+assertCommitNotChangeVersion(
+baseTable, spyOps4, CommitFailedException.class, "FileSystem crash!");
+  }
+
+  @Test
+  public void testCommitFailedAndCheckFailed() throws IOException {
+table.newFastAppend().appendFile(FILE_A).commit();
+BaseTable baseTable = (BaseTable) table;
+HadoopTableOperations tableOperations = (HadoopTableOperations) 
baseTable.operations();
+HadoopTableOperations spyOps = spy(tableOperations);
+doThrow(new RuntimeException("FileSystem crash!"))
+.when(spyOps)
+.renameMetaDataFile(any(), any(), any());
+doThrow(new RuntimeException("Can not check new Metadata!"))
+.when(spyOps)
+.checkMetaDataFileRenameSuccess(any(), any(), any());
+assertCommitNotChangeVersion(
+baseTable, spyOps, CommitStateUnknownException.class, "FileSystem 
crash!");
+  }
+
+  @Test
+  public void testCommitFailedAndRenameNotSuccess() throws IOException {
+table.newFastAppend().appendFile(FILE_A).commit();
+BaseTable baseTable = (BaseTable) table;
+HadoopTableOperations tableOperations = (HadoopTableOperations) 
baseTable.operations();
+HadoopTableOperations spyOps = spy(tableOperations);
+doThrow(new RuntimeException("FileSystem crash!"))
+.when(spyOps)
+.renameMetaDataFile(any(), any(), any());
+doReturn(false).when(spyOps).checkMetaDataFileRenameSuccess(any(), any(), 
any());
+assertCommitNotChangeVersion(
+baseTable, spyOps, CommitFailedException.class, "Can not commit 
newMetaData.");
+  }
+
+  @Test
+  public void testCommitFailedButActualSuccess() throws IOException {
+table.newFastAppend().appendFile(FILE_A).commit();
+BaseTable baseTable = (BaseTable) table;
+HadoopTableOperations tableOperations = (HadoopTableOperations) 
baseTable.operations();
+HadoopTableOperations spyOps = spy(tableOperations);
+doThrow(new RuntimeException("FileSystem crash!"))
+.when(spyOps)
+.renameMetaDataFile(any(), any(), any());
+doReturn(true).when(spyOps).checkMetaDataFileRenameSuccess(any(), any(), 
any());
+int versionBefore = spyOps.findVersion();
+TableMetadata metadataV1 = spyOps.current();
+SortOrder dataSort = 
SortOrder.builderFor(baseTable.schema()).asc("data").build();
+TableMetadata metadataV2 = metadataV1.replaceSortOrder(dataSort);
+spyOps.commit(metadataV1, metadataV2);
+int versionAfter = spyOps.findVersion();
+assert versionAfter - versionBefore == 1;
+  }
+
+  private void assertCommitNotChangeVersion(
+  BaseTable baseTable,
+  HadoopTableOperations spyOps,
+  Class<? extends Exception> exceptionClass,
+  String msg) {
+int versionBefore = spyOps.findVersion();
+TableMetadata metadataV1 = spyOps.current();
+SortOrder dataSort = 
SortOrder.builderFor(baseTable.schema()).asc("data").build();
+TableMetadata metadataV2 = metadataV1.replaceSortOrder(dataSort);
+assertThatThrownBy(() -> spyOps.commit(metadataV1, metadataV2))
+.isInstanceOf(exceptionClass)
+.hasMessageContaining(msg);
+int versionAfter = spyOps.findVersion();
+assert versionBefore == versionAfter;

Review Comment:
   please use `assertThat()` from AssertJ to verify conditions. See also 
https://iceberg.apache.org/contribute/#assertj
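   
   For example (a sketch of the suggested style, reusing the locals from the 
test above):
   
   ```java
// assuming: import static org.assertj.core.api.Assertions.assertThat;
assertThat(versionAfter)
    .as("Failed commit should not change the version")
    .isEqualTo(versionBefore);
   ```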



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Bump spring-boot from 2.5.4 to 2.7.18 [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9985:
URL: https://github.com/apache/iceberg/pull/9985


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-18 Thread via GitHub


nk1506 commented on code in PR #9852:
URL: https://github.com/apache/iceberg/pull/9852#discussion_r1528242070


##
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java:
##
@@ -222,53 +231,203 @@ public boolean dropTable(TableIdentifier identifier, 
boolean purge) {
 
   @Override
   public void renameTable(TableIdentifier from, TableIdentifier originalTo) {
-if (!isValidIdentifier(from)) {
-  throw new NoSuchTableException("Invalid identifier: %s", from);
+renameEntity(from, originalTo, "Table");
+  }
+
+  @Override
+  public boolean dropView(TableIdentifier identifier) {
+if (!isValidIdentifier(identifier)) {
+  return false;
+}
+
+try {
+  String database = identifier.namespace().level(0);
+  String viewName = identifier.name();
+  Table table = clients.run(client -> client.getTable(database, viewName));
+  HiveOperationsBase.validateTableIsIcebergView(
+  table, CatalogUtil.fullTableName(name, identifier));
+
+  HiveViewOperations ops = (HiveViewOperations) newViewOps(identifier);
+  ViewMetadata lastViewMetadata = null;
+
+  try {
+lastViewMetadata = ops.current();
+  } catch (NotFoundException e) {
+LOG.warn(
+"Failed to load table metadata for table: {}, continuing drop 
without purge",
+identifier,
+e);
+  }
+
+  clients.run(
+  client -> {
+client.dropTable(database, viewName);
+return null;
+  });
+
+  if (lastViewMetadata != null) {
+CatalogUtil.dropViewMetaData(ops.io(), lastViewMetadata);
+  }
+
+  LOG.info("Dropped View: {}", identifier);
+  return true;
+
+} catch (NoSuchViewException | NoSuchObjectException e) {
+  LOG.info("Skipping drop, View does not exist: {}", identifier, e);
+  return false;
+} catch (TException e) {
+  throw new RuntimeException("Failed to drop " + identifier, e);
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+  throw new RuntimeException("Interrupted in call to dropView", e);
+}
+  }
+
+  @Override
+  public List<TableIdentifier> listViews(Namespace namespace) {
+Preconditions.checkArgument(
+isValidateNamespace(namespace), "Missing database in namespace: %s", 
namespace);
+
+try {
+  return listTablesByType(
+  namespace, TableType.VIRTUAL_VIEW, 
HiveOperationsBase.ICEBERG_VIEW_TYPE_VALUE);
+} catch (UnknownDBException e) {
+  throw new NoSuchNamespaceException("Namespace does not exist: %s", 
namespace);
+
+} catch (TException e) {
+  throw new RuntimeException("Failed to list all views under namespace " + 
namespace, e);
+
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+  throw new RuntimeException("Interrupted in call to listViews", e);
 }
+  }
+
+  private List<TableIdentifier> listTablesByType(
+  Namespace namespace, TableType tableType, String tableTypeProp)
+  throws TException, InterruptedException {
+String database = namespace.level(0);
+List<String> tableNames = clients.run(client -> client.getTables(database, 
"*", tableType));
+
+// Retrieving the Table objects from HMS in batches to avoid OOM
+List<TableIdentifier> filteredTableIdentifiers = Lists.newArrayList();
+Iterable<List<String>> tableNameSets = Iterables.partition(tableNames, 
100);
+
+for (List<String> tableNameSet : tableNameSets) {
+  filteredTableIdentifiers.addAll(filterIcebergTables(tableNameSet, 
namespace, tableTypeProp));
+}
+
+return filteredTableIdentifiers;
+  }
+
+  private List<TableIdentifier> filterIcebergTables(
+  List<String> tableNames, Namespace namespace, String tableTypeProp)
+  throws TException, InterruptedException {
+List<Table> tableObjects =
+clients.run(client -> client.getTableObjectsByName(namespace.level(0), 
tableNames));
+return tableObjects.stream()
+.filter(
+table ->
+table.getParameters() != null
+&& tableTypeProp.equalsIgnoreCase(
+
table.getParameters().get(BaseMetastoreTableOperations.TABLE_TYPE_PROP)))
+.map(table -> TableIdentifier.of(namespace, table.getTableName()))
+.collect(Collectors.toList());
+  }
+
+  @Override
+  @SuppressWarnings("FormatStringAnnotation")
+  public void renameView(TableIdentifier from, TableIdentifier originalTo) {
+if (!namespaceExists(originalTo.namespace())) {
+  throw new NoSuchNamespaceException(
+  "Cannot rename %s to %s. Namespace does not exist: %s",
+  from, originalTo, originalTo.namespace());
+}
+renameEntity(from, originalTo, "View");
+  }
 
-TableIdentifier to = removeCatalogName(originalTo);
+  private void renameEntity(
+  TableIdentifier fromIdentifierEntity, TableIdentifier 
toIdentifierEntity, String entityType) {
+if (!isValidIdentifier(fromIdentifierEntity)) {
+  throw new NoSuchViewException("Invalid identifier: %s", 
fromIdentifierEntity);

Re: [PR] Spark: Add a test to check if the bloom filters are added to the parquet files [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9902:
URL: https://github.com/apache/iceberg/pull/9902#discussion_r1528243529


##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderWithBloomFilter.java:
##
@@ -81,12 +68,10 @@ public class TestSparkReaderWithBloomFilter {
 
   protected String tableName = null;
   protected Table table = null;
-  protected List records = null;
+  protected List rowList = null;

Review Comment:
   ```suggestion
 protected List rows = null;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Spark: Add a test to check if the bloom filters are added to the parquet files [iceberg]

2024-03-18 Thread via GitHub


nastra commented on PR #9902:
URL: https://github.com/apache/iceberg/pull/9902#issuecomment-2003444989

   > > there are a bunch of changes that seem unrelated to what's being 
proposed to be done (aka adding a check to see if bloom filters are added)
   > 
   > Thanks @nastra for the review. The data was written using an Avro writer, 
so if there is a problem with the Spark writer, the bug will not be detectable. 
I refactored the tests as @amogh-jahagirdar suggested ([#9902 
(comment)](https://github.com/apache/iceberg/pull/9902#discussion_r1518451041)) 
and I completely agree with him on this point.
   
   In that case I would probably just add a new test class where reading and 
writing is done through Spark. I think the purpose of 
`TestSparkReaderWithBloomFilter` was to actually only read through Spark.
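   
   For illustration, a rough sketch of what such a test class could look like 
(the table name, catalog, and session setup are assumptions, not actual code 
from this PR):
   
   ```java
import static org.assertj.core.api.Assertions.assertThat;

import java.util.List;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.junit.jupiter.api.Test;

public class TestSparkBloomFilterRoundTrip {
  // Assumes a Spark session with an Iceberg catalog named "local" is configured.
  private final SparkSession spark = SparkSession.active();

  @Test
  public void writeAndReadThroughSpark() {
    spark.sql(
        "CREATE TABLE local.db.bloom_test (id INT, data STRING) USING iceberg "
            + "TBLPROPERTIES ('write.parquet.bloom-filter-enabled.column.id'='true')");
    spark.sql("INSERT INTO local.db.bloom_test VALUES (1, 'a'), (2, 'b')");

    // Reading back through Spark exercises the Spark write and read paths together
    List<Row> rows =
        spark.sql("SELECT * FROM local.db.bloom_test WHERE id = 1").collectAsList();
    assertThat(rows).hasSize(1);
  }
}
   ```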


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-18 Thread via GitHub


nk1506 commented on code in PR #9852:
URL: https://github.com/apache/iceberg/pull/9852#discussion_r1528253765


##
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveOperationsBase.java:
##
@@ -181,4 +279,230 @@ default Table newHmsTable(String hmsTableOwner) {
 
 return newTable;
   }
+
+  @SuppressWarnings("checkstyle:CyclomaticComplexity")
+  default void commitWithLocking(
+  Configuration conf,
+  BaseMetadata base,
+  BaseMetadata metadata,
+  String baseMetadataLocation,
+  String newMetadataLocation,
+  FileIO io) {
+boolean newTable = base == null;
+boolean hiveEngineEnabled = hiveEngineEnabled(conf, metadata);
+boolean keepHiveStats = conf.getBoolean(ConfigProperties.KEEP_HIVE_STATS, 
false);
+
+BaseMetastoreOperations.CommitStatus commitStatus =
+BaseMetastoreOperations.CommitStatus.FAILURE;
+boolean updateHiveTable = false;
+HiveLock lock = lockObject(metadata, conf, catalogName());
+try {
+  lock.lock();
+  Table tbl = loadHmsTable();
+
+  if (tbl != null) {
+String tableType = tbl.getTableType();
+if (!tableType.equalsIgnoreCase(tableType().name())) {
+  throw new AlreadyExistsException(
+  "%s with same name already exists: %s.%s",
+  tableType.equalsIgnoreCase(TableType.VIRTUAL_VIEW.name()) ? 
"View" : "Table",
+  tbl.getDbName(),
+  tbl.getTableName());
+}
+
+// If we try to create the table but the metadata location is already 
set, then we had a
+// concurrent commit
+if (newTable
+&& 
tbl.getParameters().get(BaseMetastoreTableOperations.METADATA_LOCATION_PROP)
+!= null) {
+  throw new AlreadyExistsException(
+  "%s already exists: %s.%s", entityType(), database(), table());
+}
+
+updateHiveTable = true;
+LOG.debug("Committing existing {}: {}", entityType().toLowerCase(), 
fullName());
+  } else {
+tbl =
+newHmsTable(
+metadata
+.properties()
+.getOrDefault(HiveCatalog.HMS_TABLE_OWNER, 
HiveHadoopUtil.currentUser()));
+LOG.debug("Committing new {}: {}", entityType().toLowerCase(), 
fullName());
+  }
+
+  tbl.setSd(storageDescriptor(metadata, hiveEngineEnabled)); // set to 
pickup any schema changes
+
+  String metadataLocation =
+  
tbl.getParameters().get(BaseMetastoreTableOperations.METADATA_LOCATION_PROP);
+
+  if (!Objects.equals(baseMetadataLocation, metadataLocation)) {
+throw new CommitFailedException(
+"Cannot commit: Base metadata location '%s' is not same as the 
current %s metadata location '%s' for %s.%s",
+baseMetadataLocation,
+entityType().toLowerCase(),
+metadataLocation,
+database(),
+table());
+  }
+
+  setHmsParameters(
+  metadata,
+  tbl,
+  newMetadataLocation,
+  obsoleteProps(conf, base, metadata),
+  hiveEngineEnabled);
+
+  if (!keepHiveStats) {
+tbl.getParameters().remove(StatsSetupConst.COLUMN_STATS_ACCURATE);
+  }
+
+  lock.ensureActive();
+
+  try {
+persistTable(
+tbl, updateHiveTable, hiveLockEnabled(conf, metadata) ? null : 
baseMetadataLocation);
+lock.ensureActive();
+
+commitStatus = BaseMetastoreOperations.CommitStatus.SUCCESS;
+  } catch (LockException le) {
+commitStatus = BaseMetastoreOperations.CommitStatus.UNKNOWN;
+throw new CommitStateUnknownException(
+"Failed to heartbeat for hive lock while "
++ "committing changes. This can lead to a concurrent commit 
attempt be able to overwrite this commit. "
++ "Please check the commit history. If you are running into 
this issue, try reducing "
++ "iceberg.hive.lock-heartbeat-interval-ms.",
+le);
+  } catch (org.apache.hadoop.hive.metastore.api.AlreadyExistsException e) {
+throw new AlreadyExistsException(
+"%s already exists: %s.%s", entityType(), tbl.getDbName(), 
tbl.getTableName());
+  } catch (InvalidObjectException e) {
+throw new ValidationException(e, "Invalid Hive object for %s.%s", 
database(), table());
+  } catch (CommitFailedException | CommitStateUnknownException e) {
+throw e;
+  } catch (Throwable e) {
+if (e.getMessage()
+.contains(
+"The table has been modified. The parameter value for key '"
++ BaseMetastoreTableOperations.METADATA_LOCATION_PROP
++ "' is")) {
+  throw new CommitFailedException(
+  e, "The table %s.%s has been modified concurrently", database(), 
table());
+}
+
+if (e.getMessage() != null
+&& e.getMessage().contains("Table/View 'HIVE_LOCKS'

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-18 Thread via GitHub


nk1506 commented on code in PR #9852:
URL: https://github.com/apache/iceberg/pull/9852#discussion_r1528257514


##
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveOperationsBase.java:
##
@@ -181,4 +279,230 @@ default Table newHmsTable(String hmsTableOwner) {
 
 return newTable;
   }
+
+  @SuppressWarnings("checkstyle:CyclomaticComplexity")
+  default void commitWithLocking(
+  Configuration conf,
+  BaseMetadata base,
+  BaseMetadata metadata,
+  String baseMetadataLocation,
+  String newMetadataLocation,
+  FileIO io) {
+boolean newTable = base == null;
+boolean hiveEngineEnabled = hiveEngineEnabled(conf, metadata);
+boolean keepHiveStats = conf.getBoolean(ConfigProperties.KEEP_HIVE_STATS, 
false);
+
+BaseMetastoreOperations.CommitStatus commitStatus =
+BaseMetastoreOperations.CommitStatus.FAILURE;
+boolean updateHiveTable = false;
+HiveLock lock = lockObject(metadata, conf, catalogName());
+try {
+  lock.lock();
+  Table tbl = loadHmsTable();
+
+  if (tbl != null) {
+String tableType = tbl.getTableType();
+if (!tableType.equalsIgnoreCase(tableType().name())) {
+  throw new AlreadyExistsException(
+  "%s with same name already exists: %s.%s",
+  tableType.equalsIgnoreCase(TableType.VIRTUAL_VIEW.name()) ? 
"View" : "Table",
+  tbl.getDbName(),
+  tbl.getTableName());
+}
+
+// If we try to create the table but the metadata location is already 
set, then we had a
+// concurrent commit
+if (newTable
+&& 
tbl.getParameters().get(BaseMetastoreTableOperations.METADATA_LOCATION_PROP)
+!= null) {
+  throw new AlreadyExistsException(
+  "%s already exists: %s.%s", entityType(), database(), table());
+}
+
+updateHiveTable = true;
+LOG.debug("Committing existing {}: {}", entityType().toLowerCase(), 
fullName());
+  } else {
+tbl =
+newHmsTable(
+metadata
+.properties()
+.getOrDefault(HiveCatalog.HMS_TABLE_OWNER, 
HiveHadoopUtil.currentUser()));
+LOG.debug("Committing new {}: {}", entityType().toLowerCase(), 
fullName());
+  }
+
+  tbl.setSd(storageDescriptor(metadata, hiveEngineEnabled)); // set to 
pickup any schema changes
+
+  String metadataLocation =
+  
tbl.getParameters().get(BaseMetastoreTableOperations.METADATA_LOCATION_PROP);
+
+  if (!Objects.equals(baseMetadataLocation, metadataLocation)) {
+throw new CommitFailedException(
+"Cannot commit: Base metadata location '%s' is not same as the 
current %s metadata location '%s' for %s.%s",
+baseMetadataLocation,
+entityType().toLowerCase(),
+metadataLocation,
+database(),
+table());
+  }
+
+  setHmsParameters(
+  metadata,
+  tbl,
+  newMetadataLocation,
+  obsoleteProps(conf, base, metadata),
+  hiveEngineEnabled);
+
+  if (!keepHiveStats) {
+tbl.getParameters().remove(StatsSetupConst.COLUMN_STATS_ACCURATE);
+  }
+
+  lock.ensureActive();
+
+  try {
+persistTable(
+tbl, updateHiveTable, hiveLockEnabled(conf, metadata) ? null : 
baseMetadataLocation);
+lock.ensureActive();
+
+commitStatus = BaseMetastoreOperations.CommitStatus.SUCCESS;
+  } catch (LockException le) {
+commitStatus = BaseMetastoreOperations.CommitStatus.UNKNOWN;
+throw new CommitStateUnknownException(
+"Failed to heartbeat for hive lock while "
++ "committing changes. This can lead to a concurrent commit 
attempt be able to overwrite this commit. "
++ "Please check the commit history. If you are running into 
this issue, try reducing "
++ "iceberg.hive.lock-heartbeat-interval-ms.",
+le);
+  } catch (org.apache.hadoop.hive.metastore.api.AlreadyExistsException e) {
+throw new AlreadyExistsException(
+"%s already exists: %s.%s", entityType(), tbl.getDbName(), 
tbl.getTableName());
+  } catch (InvalidObjectException e) {
+throw new ValidationException(e, "Invalid Hive object for %s.%s", 
database(), table());
+  } catch (CommitFailedException | CommitStateUnknownException e) {
+throw e;
+  } catch (Throwable e) {
+if (e.getMessage()
+.contains(
+"The table has been modified. The parameter value for key '"
++ BaseMetastoreTableOperations.METADATA_LOCATION_PROP
++ "' is")) {
+  throw new CommitFailedException(
+  e, "The table %s.%s has been modified concurrently", database(), 
table());
+}
+
+if (e.getMessage() != null
+&& e.getMessage().contains("Table/View 'HIVE_LOCKS'

Re: [I] Metadata not found for table created by flink [iceberg]

2024-03-18 Thread via GitHub


nastra closed issue #9958: Metadata not found for table created by flink
URL: https://github.com/apache/iceberg/issues/9958


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


BsoBird commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528260699


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   @nastra Sir. If I catch CleanableFailure, I need to change the exception 
handling logic of the commit method, and I need to re-test it.
   
   Since I've already made quite a few changes in this version, I don't want to 
expand on them in this PR. This PR is the basis for fixing the processing logic 
in HadoopTable, and I think the introduction of other adaptation logic should 
be done on top of fixing the basic logic, step by step.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


BsoBird commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528264110


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   That way, code review will be easier.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9852:
URL: https://github.com/apache/iceberg/pull/9852#discussion_r1528265184


##
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveOperationsBase.java:
##
@@ -181,4 +279,230 @@ default Table newHmsTable(String hmsTableOwner) {
 
 return newTable;
   }
+
+  @SuppressWarnings("checkstyle:CyclomaticComplexity")
+  default void commitWithLocking(
+  Configuration conf,
+  BaseMetadata base,
+  BaseMetadata metadata,
+  String baseMetadataLocation,
+  String newMetadataLocation,
+  FileIO io) {
+boolean newTable = base == null;
+boolean hiveEngineEnabled = hiveEngineEnabled(conf, metadata);
+boolean keepHiveStats = conf.getBoolean(ConfigProperties.KEEP_HIVE_STATS, 
false);
+
+BaseMetastoreOperations.CommitStatus commitStatus =
+BaseMetastoreOperations.CommitStatus.FAILURE;
+boolean updateHiveTable = false;
+HiveLock lock = lockObject(metadata, conf, catalogName());
+try {
+  lock.lock();
+  Table tbl = loadHmsTable();
+
+  if (tbl != null) {
+String tableType = tbl.getTableType();
+if (!tableType.equalsIgnoreCase(tableType().name())) {
+  throw new AlreadyExistsException(
+  "%s with same name already exists: %s.%s",
+  tableType.equalsIgnoreCase(TableType.VIRTUAL_VIEW.name()) ? 
"View" : "Table",
+  tbl.getDbName(),
+  tbl.getTableName());
+}
+
+// If we try to create the table but the metadata location is already 
set, then we had a
+// concurrent commit
+if (newTable
+&& 
tbl.getParameters().get(BaseMetastoreTableOperations.METADATA_LOCATION_PROP)
+!= null) {
+  throw new AlreadyExistsException(
+  "%s already exists: %s.%s", entityType(), database(), table());
+}
+
+updateHiveTable = true;
+LOG.debug("Committing existing {}: {}", entityType().toLowerCase(), 
fullName());
+  } else {
+tbl =
+newHmsTable(
+metadata
+.properties()
+.getOrDefault(HiveCatalog.HMS_TABLE_OWNER, 
HiveHadoopUtil.currentUser()));
+LOG.debug("Committing new {}: {}", entityType().toLowerCase(), 
fullName());
+  }
+
+  tbl.setSd(storageDescriptor(metadata, hiveEngineEnabled)); // set to 
pickup any schema changes
+
+  String metadataLocation =
+  
tbl.getParameters().get(BaseMetastoreTableOperations.METADATA_LOCATION_PROP);
+
+  if (!Objects.equals(baseMetadataLocation, metadataLocation)) {
+throw new CommitFailedException(
+"Cannot commit: Base metadata location '%s' is not same as the 
current %s metadata location '%s' for %s.%s",
+baseMetadataLocation,
+entityType().toLowerCase(),
+metadataLocation,
+database(),
+table());
+  }
+
+  setHmsParameters(
+  metadata,
+  tbl,
+  newMetadataLocation,
+  obsoleteProps(conf, base, metadata),
+  hiveEngineEnabled);
+
+  if (!keepHiveStats) {
+tbl.getParameters().remove(StatsSetupConst.COLUMN_STATS_ACCURATE);
+  }
+
+  lock.ensureActive();
+
+  try {
+persistTable(
+tbl, updateHiveTable, hiveLockEnabled(conf, metadata) ? null : 
baseMetadataLocation);
+lock.ensureActive();
+
+commitStatus = BaseMetastoreOperations.CommitStatus.SUCCESS;
+  } catch (LockException le) {
+commitStatus = BaseMetastoreOperations.CommitStatus.UNKNOWN;
+throw new CommitStateUnknownException(
+"Failed to heartbeat for hive lock while "
++ "committing changes. This can lead to a concurrent commit 
attempt be able to overwrite this commit. "
++ "Please check the commit history. If you are running into 
this issue, try reducing "
++ "iceberg.hive.lock-heartbeat-interval-ms.",
+le);
+  } catch (org.apache.hadoop.hive.metastore.api.AlreadyExistsException e) {
+throw new AlreadyExistsException(
+"%s already exists: %s.%s", entityType(), tbl.getDbName(), 
tbl.getTableName());
+  } catch (InvalidObjectException e) {
+throw new ValidationException(e, "Invalid Hive object for %s.%s", 
database(), table());
+  } catch (CommitFailedException | CommitStateUnknownException e) {
+throw e;
+  } catch (Throwable e) {
+if (e.getMessage()
+.contains(
+"The table has been modified. The parameter value for key '"
++ BaseMetastoreTableOperations.METADATA_LOCATION_PROP
++ "' is")) {
+  throw new CommitFailedException(
+  e, "The table %s.%s has been modified concurrently", database(), 
table());
+}
+
+if (e.getMessage() != null
+&& e.getMessage().contains("Table/View 'HIVE_LOCKS'

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


BsoBird commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528264110


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   That way, code review will be easier. In case I create a bug while adapting 
CleanableFailure, we can at least keep the first half of the work.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


BsoBird commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528264110


##
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java:
##
@@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 String codecName =
 metadata.property(
 TableProperties.METADATA_COMPRESSION, 
TableProperties.METADATA_COMPRESSION_DEFAULT);
+// TODO:This is not compatible with the scenario where the user modifies 
the metadata file
+// compression codec arbitrarily.
+// We can inform the user about this bug first, and fix it later.(Do not 
modify the compressed
+// format after the table is created.)
 TableMetadataParser.Codec codec = 
TableMetadataParser.Codec.fromName(codecName);
 String fileExtension = TableMetadataParser.getFileExtension(codec);
-Path tempMetadataFile = metadataPath(UUID.randomUUID().toString() + 
fileExtension);
+Path tempMetadataFile = metadataPath(UUID.randomUUID() + fileExtension);
 TableMetadataParser.write(metadata, 
io().newOutputFile(tempMetadataFile.toString()));
 
 int nextVersion = (current.first() != null ? current.first() : 0) + 1;
 Path finalMetadataFile = metadataFilePath(nextVersion, codec);
 FileSystem fs = getFileSystem(tempMetadataFile, conf);
-
-// this rename operation is the atomic commit operation
-renameToFinal(fs, tempMetadataFile, finalMetadataFile, nextVersion);
-
-LOG.info("Committed a new metadata file {}", finalMetadataFile);
-
-// update the best-effort version pointer
-writeVersionHint(nextVersion);
-
-deleteRemovedMetadataFiles(base, metadata);
-
-this.shouldRefresh = true;
+boolean versionCommitSuccess = false;
+try {
+  deleteOldVersionHint(fs, versionHintFile(), nextVersion);
+  versionCommitSuccess = commitNewVersion(fs, tempMetadataFile, 
finalMetadataFile, nextVersion);
+  if (!versionCommitSuccess) {
+// Users should clean up orphaned files after job fail.
+// This may be too heavy. But it can stay that way for now.
+String msg =
+String.format(
+"Can not write versionHint. commitVersion = %s.Is there a 
problem with the file system?",
+nextVersion);
+throw new RuntimeException(msg);
+  } else {
+this.shouldRefresh = versionCommitSuccess;
+// In fact, we don't really care if the metadata cleanup/update 
succeeds or not,
+// if it fails this time, we can execute it in the next commit method 
call.
+// So we should fix the shouldRefresh flag first.
+if (this.firstRun) {
+  this.firstRun = false;
+}
+LOG.info("Committed a new metadata file {}", finalMetadataFile);
+// update the best-effort version pointer
+boolean writeVersionHintSuccess = writeVersionHint(fs, nextVersion);
+if (!writeVersionHintSuccess) {
+  LOG.warn(
+  "Failed to write a new versionHintFile,commit version is [{}], 
is there a problem with the file system?",
+  nextVersion);
+}
+deleteRemovedMetadataFiles(base, metadata);
+  }
+} catch (CommitStateUnknownException | CommitFailedException e) {

Review Comment:
   That way, code review will be easier. In case I create a bug while adapting 
CleanableFailure, we can at least keep the first half of the work.
   I'm sure Russell would approve of that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-18 Thread via GitHub


nastra commented on code in PR #9852:
URL: https://github.com/apache/iceberg/pull/9852#discussion_r1528270018


##
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java:
##
@@ -113,6 +124,24 @@ public void initialize(String inputName, Map<String, String> properties) {
 this.clients = new CachedClientPool(conf, properties);
   }
 
+  @Override
+  public TableBuilder buildTable(TableIdentifier identifier, Schema schema) {
+if (viewExists(identifier)) {

Review Comment:
   yes this is the wrong place to do this validation. You're already doing this 
validation in 
https://github.com/apache/iceberg/pull/9852/files#diff-5ecfc223b311b523a12be7482b6235318fdd7535c371f899daf7e09eacddad77R322



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Open-API: How about adding table change detection in Iceberg Catalog [iceberg]

2024-03-18 Thread via GitHub


mingnuj commented on issue #9942:
URL: https://github.com/apache/iceberg/issues/9942#issuecomment-2003484990

   > There's an option that you can use to only fetch the latest snapshot via 
`snapshot-loading-mode=refs` (defaults to `all`) when the table is loaded 
instead of loading all snapshots.
   
   Thank you for the response! I'll try it.
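   
   For reference, a hedged sketch of setting that option when initializing a 
REST catalog from Java (the endpoint URI is illustrative):
   
   ```java
import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
import org.apache.iceberg.rest.RESTCatalog;

public class RefsOnlyCatalog {
  public static void main(String[] args) {
    RESTCatalog catalog = new RESTCatalog();
    // "snapshot-loading-mode=refs" loads only snapshots referenced by
    // branches and tags instead of the full snapshot history.
    catalog.initialize(
        "rest",
        ImmutableMap.of(
            "uri", "http://localhost:8181", // illustrative endpoint
            "snapshot-loading-mode", "refs"));
  }
}
   ```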


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Snapshots table metadata [iceberg-python]

2024-03-18 Thread via GitHub


Fokko commented on code in PR #524:
URL: https://github.com/apache/iceberg-python/pull/524#discussion_r1528301582


##
tests/integration/test_writes.py:
##
@@ -664,3 +668,55 @@ def test_table_properties_raise_for_none_value(
 session_catalog, identifier, {"format-version": format_version, 
**property_with_none}, [arrow_table_with_null]
 )
 assert "None type is not a supported value in properties: property_name" 
in str(exc_info.value)
+
+
+@pytest.mark.integration
+@pytest.mark.parametrize("format_version", [1, 2])
+def test_inspect_snapshots(
+spark: SparkSession, session_catalog: Catalog, arrow_table_with_null: 
pa.Table, format_version: int
+) -> None:
+identifier = "default.table_metadata_snapshots"
+tbl = _create_table(session_catalog, identifier, 
properties={"format-version": format_version})
+
+tbl.overwrite(arrow_table_with_null)
+# should produce a DELETE entry
+tbl.overwrite(arrow_table_with_null)
+# Since we don't rewrite, this should produce a new manifest with an ADDED 
entry
+tbl.append(arrow_table_with_null)
+
+df = tbl.inspect.snapshots()

Review Comment:
   That was an excellent suggestion and actually caught a bug 🙌 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Build: Bump org.springframework:spring-web from 5.3.30 to 5.3.33 [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9989:
URL: https://github.com/apache/iceberg/pull/9989


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Build: Bump jetty from 9.4.53.v20231009 to 9.4.54.v20240208 [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9982:
URL: https://github.com/apache/iceberg/pull/9982


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Build: Bump guava from 33.0.0-jre to 33.1.0-jre [iceberg]

2024-03-18 Thread via GitHub


nastra commented on PR #9977:
URL: https://github.com/apache/iceberg/pull/9977#issuecomment-2003593823

   @dependabot rebase


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] [WIP] Implement project for Transform. #264 [iceberg-rust]

2024-03-18 Thread via GitHub


marvinlanhenke commented on code in PR #269:
URL: https://github.com/apache/iceberg-rust/pull/269#discussion_r1528347743


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,50 @@ impl Transform {
 _ => self == other,
 }
 }
+/// Projects predicate to `Transform`
+pub fn project(&self, name: String, pred: &BoundPredicate) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match self {
+Transform::Bucket(_) => match pred {
+BoundPredicate::Unary(expr) => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+BoundPredicate::Binary(expr) => {
+if expr.op() != PredicateOperator::Eq {
+return Ok(None);
+}
+
+let result = match expr.as_primitive_literal() {
+PrimitiveLiteral::Int(v) => func
+.transform(Arc::new(Int32Array::from_value(v, 1)))?
+.as_any()
+.downcast_ref::<Int32Array>()
+.ok_or_else(|| Error::new(ErrorKind::Unexpected, 
"Failed to downcast"))?
+.value(0),
+PrimitiveLiteral::Long(v) => func
+.transform(Arc::new(Int64Array::from_value(v, 1)))?
+.as_any()
+.downcast_ref::<Int64Array>()

Review Comment:
   @sdd 
   If I understood correctly, the `fn transform` implementation of `Bucket` 
always returns `arrow_array::Int32Array`.
   
https://github.com/apache/iceberg-rust/blob/main/crates/iceberg/src/transform/bucket.rs#L89



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Discussion: How to handle S3 Config in Catalog [iceberg-rust]

2024-03-18 Thread via GitHub


marvinlanhenke closed issue #273: Discussion: How to handle S3 Config in Catalog
URL: https://github.com/apache/iceberg-rust/issues/273


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Discussion: How to handle S3 Config in Catalog [iceberg-rust]

2024-03-18 Thread via GitHub


marvinlanhenke commented on issue #273:
URL: https://github.com/apache/iceberg-rust/issues/273#issuecomment-2003673001

   @odysa @Xuanwo 
   This is the way I implemented it for now, and I think it's fine. The PR will 
be up for review once #271 has been resolved.
   Thanks again for your feedback.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Loading table from metadata file directly. [iceberg-rust]

2024-03-18 Thread via GitHub


liurenjie1024 closed issue #246: Loading table from metadata file directly.
URL: https://github.com/apache/iceberg-rust/issues/246


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [I] Loading table from metadata file directly. [iceberg-rust]

2024-03-18 Thread via GitHub


liurenjie1024 commented on issue #246:
URL: https://github.com/apache/iceberg-rust/issues/246#issuecomment-2003753550

   Closed by #259. Feel free to reopen if necessary.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


BsoBird commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528470303


##
core/src/test/java/org/apache/iceberg/hadoop/TestHadoopCommits.java:
##
@@ -206,6 +210,133 @@ public void testFailedCommit() throws Exception {
     Assertions.assertThat(manifests).as("Should contain 0 Avro manifest files").isEmpty();
   }
 
+  @Test
+  public void testCommitFailedBeforeChangeVersionHint() throws IOException {
+    table.newFastAppend().appendFile(FILE_A).commit();
+    BaseTable baseTable = (BaseTable) table;
+    HadoopTableOperations tableOperations = (HadoopTableOperations) baseTable.operations();
+
+    HadoopTableOperations spyOps2 = spy(tableOperations);
+    doReturn(1).when(spyOps2).findVersionWithOutVersionHint(any());
+    TableMetadata metadataV1 = spyOps2.current();
+    SortOrder dataSort = SortOrder.builderFor(baseTable.schema()).asc("data").build();
+    TableMetadata metadataV2 = metadataV1.replaceSortOrder(dataSort);
+    assertThatThrownBy(() -> spyOps2.commit(metadataV1, metadataV2))
+        .isInstanceOf(CommitFailedException.class)
+        .hasMessageContaining("as the latest version is currently");
+
+    HadoopTableOperations spyOps3 = spy(tableOperations);
+    doReturn(false).when(spyOps3).nextVersionIsLatest(any(), any());
+    assertCommitNotChangeVersion(
+        baseTable, spyOps3, CommitFailedException.class, "as the latest version is currently");
+
+    HadoopTableOperations spyOps4 = spy(tableOperations);
+    doThrow(new RuntimeException("FileSystem crash!"))
+        .when(spyOps4)
+        .renameMetaDataFileAndCheck(any(), any(), any());
+    assertCommitNotChangeVersion(
+        baseTable, spyOps4, CommitFailedException.class, "FileSystem crash!");
+  }
+
+  @Test
+  public void testCommitFailedAndCheckFailed() throws IOException {
+    table.newFastAppend().appendFile(FILE_A).commit();
+    BaseTable baseTable = (BaseTable) table;
+    HadoopTableOperations tableOperations = (HadoopTableOperations) baseTable.operations();
+    HadoopTableOperations spyOps = spy(tableOperations);
+    doThrow(new RuntimeException("FileSystem crash!"))
+        .when(spyOps)
+        .renameMetaDataFile(any(), any(), any());
+    doThrow(new RuntimeException("Can not check new Metadata!"))
+        .when(spyOps)
+        .checkMetaDataFileRenameSuccess(any(), any(), any());
+    assertCommitNotChangeVersion(
+        baseTable, spyOps, CommitStateUnknownException.class, "FileSystem crash!");
+  }
+
+  @Test
+  public void testCommitFailedAndRenameNotSuccess() throws IOException {
+    table.newFastAppend().appendFile(FILE_A).commit();
+    BaseTable baseTable = (BaseTable) table;
+    HadoopTableOperations tableOperations = (HadoopTableOperations) baseTable.operations();
+    HadoopTableOperations spyOps = spy(tableOperations);
+    doThrow(new RuntimeException("FileSystem crash!"))
+        .when(spyOps)
+        .renameMetaDataFile(any(), any(), any());
+    doReturn(false).when(spyOps).checkMetaDataFileRenameSuccess(any(), any(), any());
+    assertCommitNotChangeVersion(
+        baseTable, spyOps, CommitFailedException.class, "Can not commit newMetaData.");
+  }
+
+  @Test
+  public void testCommitFailedButActualSuccess() throws IOException {
+    table.newFastAppend().appendFile(FILE_A).commit();
+    BaseTable baseTable = (BaseTable) table;
+    HadoopTableOperations tableOperations = (HadoopTableOperations) baseTable.operations();
+    HadoopTableOperations spyOps = spy(tableOperations);
+    doThrow(new RuntimeException("FileSystem crash!"))
+        .when(spyOps)
+        .renameMetaDataFile(any(), any(), any());
+    doReturn(true).when(spyOps).checkMetaDataFileRenameSuccess(any(), any(), any());
+    int versionBefore = spyOps.findVersion();
+    TableMetadata metadataV1 = spyOps.current();
+    SortOrder dataSort = SortOrder.builderFor(baseTable.schema()).asc("data").build();
+    TableMetadata metadataV2 = metadataV1.replaceSortOrder(dataSort);
+    spyOps.commit(metadataV1, metadataV2);
+    int versionAfter = spyOps.findVersion();
+    assert versionAfter - versionBefore == 1;
+  }
+
+  private void assertCommitNotChangeVersion(
+      BaseTable baseTable,
+      HadoopTableOperations spyOps,
+      Class<? extends Exception> exceptionClass,
+      String msg) {
+    int versionBefore = spyOps.findVersion();
+    TableMetadata metadataV1 = spyOps.current();
+    SortOrder dataSort = SortOrder.builderFor(baseTable.schema()).asc("data").build();
+    TableMetadata metadataV2 = metadataV1.replaceSortOrder(dataSort);
+    assertThatThrownBy(() -> spyOps.commit(metadataV1, metadataV2))
+        .isInstanceOf(exceptionClass)
+        .hasMessageContaining(msg);
+    int versionAfter = spyOps.findVersion();
+    assert versionBefore == versionAfter;

Review Comment:
   got it




[PR] AWS: Add Option to don't write non current columns in glue schema closes #7584 [iceberg]

2024-03-18 Thread via GitHub


Raphael-Vignes opened a new pull request, #9420:
URL: https://github.com/apache/iceberg/pull/9420

   This PR aims to close this [issue](https://github.com/apache/iceberg/issues/7584) and would resolve this [issue](https://github.com/apache/iceberg/issues/6340) too.
   
   We want to provide an optional parameter named `glueCatalogWriteNonCurrentColumns` (false by default) to `IcebergToGlueConverter`'s `setTableInputInformation` method. It will allow skipping columns that were deleted in a previous version of the Glue schema, instead of writing them with `iceberg.field.current` set to false.





Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-03-18 Thread via GitHub


BsoBird commented on code in PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#discussion_r1528501169


##
core/src/test/java/org/apache/iceberg/hadoop/TestHadoopCommits.java:
##
@@ -206,6 +210,133 @@ public void testFailedCommit() throws Exception {
     Assertions.assertThat(manifests).as("Should contain 0 Avro manifest files").isEmpty();
   }
 
+  @Test
+  public void testCommitFailedBeforeChangeVersionHint() throws IOException {
+    table.newFastAppend().appendFile(FILE_A).commit();
+    BaseTable baseTable = (BaseTable) table;
+    HadoopTableOperations tableOperations = (HadoopTableOperations) baseTable.operations();
+
+    HadoopTableOperations spyOps2 = spy(tableOperations);
+    doReturn(1).when(spyOps2).findVersionWithOutVersionHint(any());
+    TableMetadata metadataV1 = spyOps2.current();
+    SortOrder dataSort = SortOrder.builderFor(baseTable.schema()).asc("data").build();
+    TableMetadata metadataV2 = metadataV1.replaceSortOrder(dataSort);
+    assertThatThrownBy(() -> spyOps2.commit(metadataV1, metadataV2))
+        .isInstanceOf(CommitFailedException.class)
+        .hasMessageContaining("as the latest version is currently");
+
+    HadoopTableOperations spyOps3 = spy(tableOperations);
+    doReturn(false).when(spyOps3).nextVersionIsLatest(any(), any());
+    assertCommitNotChangeVersion(
+        baseTable, spyOps3, CommitFailedException.class, "as the latest version is currently");
+
+    HadoopTableOperations spyOps4 = spy(tableOperations);
+    doThrow(new RuntimeException("FileSystem crash!"))
+        .when(spyOps4)
+        .renameMetaDataFileAndCheck(any(), any(), any());
+    assertCommitNotChangeVersion(
+        baseTable, spyOps4, CommitFailedException.class, "FileSystem crash!");
+  }
+
+  @Test
+  public void testCommitFailedAndCheckFailed() throws IOException {
+    table.newFastAppend().appendFile(FILE_A).commit();
+    BaseTable baseTable = (BaseTable) table;
+    HadoopTableOperations tableOperations = (HadoopTableOperations) baseTable.operations();
+    HadoopTableOperations spyOps = spy(tableOperations);
+    doThrow(new RuntimeException("FileSystem crash!"))
+        .when(spyOps)
+        .renameMetaDataFile(any(), any(), any());
+    doThrow(new RuntimeException("Can not check new Metadata!"))
+        .when(spyOps)
+        .checkMetaDataFileRenameSuccess(any(), any(), any());
+    assertCommitNotChangeVersion(
+        baseTable, spyOps, CommitStateUnknownException.class, "FileSystem crash!");
+  }
+
+  @Test
+  public void testCommitFailedAndRenameNotSuccess() throws IOException {
+    table.newFastAppend().appendFile(FILE_A).commit();
+    BaseTable baseTable = (BaseTable) table;
+    HadoopTableOperations tableOperations = (HadoopTableOperations) baseTable.operations();
+    HadoopTableOperations spyOps = spy(tableOperations);
+    doThrow(new RuntimeException("FileSystem crash!"))
+        .when(spyOps)
+        .renameMetaDataFile(any(), any(), any());
+    doReturn(false).when(spyOps).checkMetaDataFileRenameSuccess(any(), any(), any());
+    assertCommitNotChangeVersion(
+        baseTable, spyOps, CommitFailedException.class, "Can not commit newMetaData.");
+  }
+
+  @Test
+  public void testCommitFailedButActualSuccess() throws IOException {
+    table.newFastAppend().appendFile(FILE_A).commit();
+    BaseTable baseTable = (BaseTable) table;
+    HadoopTableOperations tableOperations = (HadoopTableOperations) baseTable.operations();
+    HadoopTableOperations spyOps = spy(tableOperations);
+    doThrow(new RuntimeException("FileSystem crash!"))
+        .when(spyOps)
+        .renameMetaDataFile(any(), any(), any());
+    doReturn(true).when(spyOps).checkMetaDataFileRenameSuccess(any(), any(), any());
+    int versionBefore = spyOps.findVersion();
+    TableMetadata metadataV1 = spyOps.current();
+    SortOrder dataSort = SortOrder.builderFor(baseTable.schema()).asc("data").build();
+    TableMetadata metadataV2 = metadataV1.replaceSortOrder(dataSort);
+    spyOps.commit(metadataV1, metadataV2);
+    int versionAfter = spyOps.findVersion();
+    assert versionAfter - versionBefore == 1;
+  }
+
+  private void assertCommitNotChangeVersion(
+      BaseTable baseTable,
+      HadoopTableOperations spyOps,
+      Class<? extends Exception> exceptionClass,
+      String msg) {
+    int versionBefore = spyOps.findVersion();
+    TableMetadata metadataV1 = spyOps.current();
+    SortOrder dataSort = SortOrder.builderFor(baseTable.schema()).asc("data").build();
+    TableMetadata metadataV2 = metadataV1.replaceSortOrder(dataSort);
+    assertThatThrownBy(() -> spyOps.commit(metadataV1, metadataV2))
+        .isInstanceOf(exceptionClass)
+        .hasMessageContaining(msg);
+    int versionAfter = spyOps.findVersion();
+    assert versionBefore == versionAfter;

Review Comment:
   @nastra Sir. I modified it.




Re: [PR] Build: Bump guava from 33.0.0-jre to 33.1.0-jre [iceberg]

2024-03-18 Thread via GitHub


nastra merged PR #9977:
URL: https://github.com/apache/iceberg/pull/9977





Re: [PR] [WIP] Implement project for Transform. #264 [iceberg-rust]

2024-03-18 Thread via GitHub


marvinlanhenke commented on PR #269:
URL: https://github.com/apache/iceberg-rust/pull/269#issuecomment-2003886408

   @liurenjie1024 @sdd @Xuanwo @ZENOTME 
   I went ahead and implemented `fn project` for `Transform::Bucket` with some design assumptions. PTAL and tell me what you think before we move on and implement the missing transforms.
   Thank you so much for your effort.
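   
   For anyone skimming the thread, here is the core idea behind projecting a predicate through `bucket[n]` as a self-contained sketch: equality and IN predicates survive projection by transforming each literal, while range predicates cannot, because bucketing does not preserve ordering. The types below are simplified stand-ins for the crate's `Predicate`/`Datum`, and a plain cast stands in for murmur3.
   
   ```rust
   // Simplified model of predicate projection through bucket[n].
   #[derive(Debug, PartialEq)]
   enum Pred {
       Eq(i64),
       In(Vec<i64>),
       LessThan(i64),
   }
   
   #[derive(Debug, PartialEq)]
   enum BucketPred {
       Eq(u32),
       In(Vec<u32>),
   }
   
   // Iceberg buckets with (murmur3(v) & i32::MAX) % n; a trivial cast stands
   // in for the murmur3 hash to keep the sketch dependency-free.
   fn bucket(n: u32, v: i64) -> u32 {
       (((v as i32) & i32::MAX) % n as i32) as u32
   }
   
   // Eq/In project into bucket space by hashing each literal; ranges return
   // None, since neighbouring source values land in unrelated buckets.
   fn project(n: u32, pred: &Pred) -> Option<BucketPred> {
       match pred {
           Pred::Eq(v) => Some(BucketPred::Eq(bucket(n, *v))),
           Pred::In(vs) => Some(BucketPred::In(vs.iter().map(|v| bucket(n, *v)).collect())),
           Pred::LessThan(_) => None,
       }
   }
   
   fn main() {
       assert_eq!(project(16, &Pred::Eq(34)), Some(BucketPred::Eq(bucket(16, 34))));
       assert_eq!(project(16, &Pred::LessThan(100)), None);
   }
   ```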




