Re: [PR] Core: Use ParallelIterable in Deletes::toPositionIndex (6387) [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8805: URL: https://github.com/apache/iceberg/pull/8805#discussion_r1366551865 ## core/src/test/java/org/apache/iceberg/deletes/TestPositionFilter.java: ## @@ -282,6 +286,16 @@ public void testPositionSetRowFilter() { @Test public void test

Re: [PR] Spark 3.2: Use Awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8882: URL: https://github.com/apache/iceberg/pull/8882 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apac

Re: [PR] Spark 3.4: Use Awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8884: URL: https://github.com/apache/iceberg/pull/8884

Re: [PR] Spark 3.3: Use Awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8883: URL: https://github.com/apache/iceberg/pull/8883

Re: [I] I cannot package my application as uberjar using maven shade plugin. [iceberg]

2023-10-19 Thread via GitHub
nastra commented on issue #7953: URL: https://github.com/apache/iceberg/issues/7953#issuecomment-1772179150 `Unsupported class file major version 63` indicates that you're using JDK19 instead of JDK11, so that's most likely the issue
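
The version number in that error can be checked by hand. A class file starts with the magic number `0xCAFEBABE`, followed by a 2-byte minor and a 2-byte major version, and for Java 5 and later the release is the major version minus 44. The following is a minimal sketch of that arithmetic; the helper names are made up for illustration and are not part of Iceberg or the shade plugin.

```python
import struct


def class_file_major_version(header: bytes) -> int:
    """Return the major version stored in the first 8 bytes of a .class file."""
    # Layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major


def java_release(major: int) -> int:
    """Map a class file major version to a Java release (Java 5 and later)."""
    if major < 49:
        raise ValueError("mapping only holds for major versions >= 49")
    return major - 44


# Synthetic header for illustration; a real check would read the bytes
# from a .class file inside the jar being shaded.
header = struct.pack(">IHH", 0xCAFEBABE, 0, 63)
print(java_release(class_file_major_version(header)))
```

With major version 63 this yields release 19, and a JDK 11 build would produce major version 55.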

Re: [I] Docs: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
mobley-trent commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1772102614 Yes I will look into this 👍 @Fokko

Re: [PR] Doc: fix iceberg-javadoc link [iceberg]

2023-10-19 Thread via GitHub
jbonofre commented on PR #8885: URL: https://github.com/apache/iceberg/pull/8885#issuecomment-1772081205 Sorry about that, I forgot to update in my previous change. Thanks for the fix!

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-19 Thread via GitHub
fengjiajie commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1772038159 > My thoughts were more like, if we have a column defined as "double" we may allow "float" in the file definition but we wouldn't allow Binary. So how is this different? @Russel

Re: [PR] Core: Reduce unnecessary add operations in deletedPaths set [iceberg]

2023-10-19 Thread via GitHub
bknbkn commented on PR #8868: URL: https://github.com/apache/iceberg/pull/8868#issuecomment-1772037756 > What about the check on 422? sorry, I made a mistake in the previous comment. The change being discussed pertains to the `deletedPath` and only affects the code starting from line

Re: [PR] Core: Reduce unnecessary add operations in deletedPaths set [iceberg]

2023-10-19 Thread via GitHub
RussellSpitzer commented on PR #8868: URL: https://github.com/apache/iceberg/pull/8868#issuecomment-1771989413 What about the check on 422?

Re: [PR] Core: Reduce unnecessary add operations in deletedPaths set [iceberg]

2023-10-19 Thread via GitHub
bknbkn commented on PR #8868: URL: https://github.com/apache/iceberg/pull/8868#issuecomment-1771963125 > Not sure I understand this change. Seems like you are removing an optimization to avoid re computing whether a path is deleted if we already determined it is deleted? In the original

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-19 Thread via GitHub
RussellSpitzer commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1771952316 My thoughts were more like, if we have a column defined as "double" we may allow "float" in the file definition but we wouldn't allow Binary. So how is this different?

Re: [I] I cannot package my application as uberjar using maven shade plugin. [iceberg]

2023-10-19 Thread via GitHub
vinitamaloo-asu commented on issue #7953: URL: https://github.com/apache/iceberg/issues/7953#issuecomment-1771943066 I am getting the same issue. Were you able to resolve it?

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-19 Thread via GitHub
fengjiajie commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1771924888 > how are we guaranteed that the binary is parsable as UTF8 bytes? @RussellSpitzer Thank you for participating in the review. If a column is not encoded in UTF-8, it should not

Re: [PR] JDBC catalog fix namespaceExists check [iceberg]

2023-10-19 Thread via GitHub
rdblue commented on PR #8340: URL: https://github.com/apache/iceberg/pull/8340#issuecomment-1771876081 @ismailsimsek it looks like this has unnecessary changes and refactors quite a bit. Can you revert the unnecessary changes and add to the PR description how this solves the problem? Thank

Re: [PR] JDBC catalog fix namespaceExists check [iceberg]

2023-10-19 Thread via GitHub
rdblue commented on code in PR #8340: URL: https://github.com/apache/iceberg/pull/8340#discussion_r1366272265 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -345,38 +345,44 @@ public static String deletePropertiesStatement(Set properties) { static boole

Re: [PR] JDBC catalog fix namespaceExists check [iceberg]

2023-10-19 Thread via GitHub
rdblue commented on code in PR #8340: URL: https://github.com/apache/iceberg/pull/8340#discussion_r1366270788 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -345,38 +345,44 @@ public static String deletePropertiesStatement(Set properties) { static boole

Re: [PR] JDBC catalog fix namespaceExists check [iceberg]

2023-10-19 Thread via GitHub
rdblue commented on code in PR #8340: URL: https://github.com/apache/iceberg/pull/8340#discussion_r1366270299 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -135,7 +135,7 @@ final class JdbcUtil { + CATALOG_NAME + " = ? AND "

Re: [I] iceberg supports incremental computation based on iceberg tags and data changes between tags in computing engines such as spark and presto. [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] commented on issue #7193: URL: https://github.com/apache/iceberg/issues/7193#issuecomment-1771870665 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Migrate catalog from hive catalog to jdbc/rest catalog [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] commented on issue #7208: URL: https://github.com/apache/iceberg/issues/7208#issuecomment-1771870633 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] iceberg supports incremental computation based on iceberg tags and data changes between tags in computing engines such as spark and presto. [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] closed issue #7193: iceberg supports incremental computation based on iceberg tags and data changes between tags in computing engines such as spark and presto. URL: https://github.com/apache/iceberg/issues/7193

Re: [I] Migrate catalog from hive catalog to jdbc/rest catalog [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] closed issue #7208: Migrate catalog from hive catalog to jdbc/rest catalog URL: https://github.com/apache/iceberg/issues/7208

Re: [I] Some question about zorder [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] commented on issue #7405: URL: https://github.com/apache/iceberg/issues/7405#issuecomment-1771870475 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-19 Thread via GitHub
bitsondatadev commented on PR #8659: URL: https://github.com/apache/iceberg/pull/8659#issuecomment-1771793120 @rdblue I have this now in a good state relative to the 1.4.0 docs in nightly, could you PTAL? Thanks!

Re: [PR] Doc: fix iceberg-javadoc link [iceberg]

2023-10-19 Thread via GitHub
Fokko merged PR #8885: URL: https://github.com/apache/iceberg/pull/8885

[PR] Doc: fix iceberg-javadoc link [iceberg]

2023-10-19 Thread via GitHub
dramaticlly opened a new pull request, #8885: URL: https://github.com/apache/iceberg/pull/8885 latest javadoc can be found in https://iceberg.apache.org/javadoc/latest Can you help take a look @nastra @Fokko @amogh-jahagirdar

Re: [I] Docs: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1771667298 Are you interested in providing a PR? :)

Re: [PR] Make to_arrow function capable of handling parquet files with sanitized name due to Avro restirction [iceberg-python]

2023-10-19 Thread via GitHub
Fokko merged PR #83: URL: https://github.com/apache/iceberg-python/pull/83

Re: [I] [BUG] to_arrow conversion does not support iceberg table column name containing slash [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on issue #81: URL: https://github.com/apache/iceberg-python/issues/81#issuecomment-1771666165 Thanks for fixing this @puchengy

Re: [I] [BUG] to_arrow conversion does not support iceberg table column name containing slash [iceberg-python]

2023-10-19 Thread via GitHub
Fokko closed issue #81: [BUG] to_arrow conversion does not support iceberg table column name containing slash URL: https://github.com/apache/iceberg-python/issues/81

Re: [PR] Add Refurb to ruff [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on PR #87: URL: https://github.com/apache/iceberg-python/pull/87#issuecomment-1771655439 @jayceslesar Thanks for chiming in here. I believe that all checks are enabled when the plugin is enabled. In the current codebase, there are no violations. Enabling this will make sure

Re: [I] Add view support for Hive catalog [iceberg]

2023-10-19 Thread via GitHub
pvary commented on issue #8698: URL: https://github.com/apache/iceberg/issues/8698#issuecomment-1771644929 @ajantha-bhat: The question is whether the Hive 3 HMS API is enough for the integration, or not. I would prefer if it would be enough, but we should definitely try to involve the Hive

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-19 Thread via GitHub
RussellSpitzer commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1771569346 I'm also a little nervous about this change, how are we guaranteed that the binary is parsable as UTF8 bytes? Seems like we should just be fixing the type annotations rather than c
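
The concern about binary values that are not valid UTF-8 can be illustrated with a strict decode, which rejects invalid byte sequences instead of silently replacing them. This is only a sketch of the general check in Python; it is not how the Flink Parquet reader in the PR handles the column, and `decode_utf8_or_none` is a hypothetical helper.

```python
from typing import Optional


def decode_utf8_or_none(raw: bytes) -> Optional[str]:
    """Return the decoded string, or None if the bytes are not valid UTF-8."""
    try:
        # bytes.decode is strict by default and raises on invalid sequences.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


print(decode_utf8_or_none("iceberg".encode("utf-8")))  # valid UTF-8
print(decode_utf8_or_none(b"\xff\xfe\x00"))            # 0xFF never appears in UTF-8
```

A reader that converts BINARY to String has to decide what to do in the `None` case: fail, substitute replacement characters, or leave the column as binary.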

Re: [PR] Add Refurb to ruff [iceberg-python]

2023-10-19 Thread via GitHub
jayceslesar commented on PR #87: URL: https://github.com/apache/iceberg-python/pull/87#issuecomment-1771566427 You might want to have a config here that enables and disables certain checks but that can come with trial and error

Re: [PR] Core: Reduce unnecessary add operations in deletedPaths set [iceberg]

2023-10-19 Thread via GitHub
RussellSpitzer commented on PR #8868: URL: https://github.com/apache/iceberg/pull/8868#issuecomment-1771557957 Not sure I understand this change. Seems like you are removing an optimization to avoid re computing whether a path is deleted if we already determined it is deleted?

Re: [PR] Make to_arrow function capable of handling parquet files with sanitized name due to Avro restirction [iceberg-python]

2023-10-19 Thread via GitHub
puchengy commented on code in PR #83: URL: https://github.com/apache/iceberg-python/pull/83#discussion_r1365869931 ## pyiceberg/schema.py: ## @@ -1273,6 +1273,102 @@ def primitive(self, primitive: PrimitiveType) -> PrimitiveType: return primitive +# Implementation

Re: [PR] Make to_arrow function capable of handling parquet files with sanitized name due to Avro restirction [iceberg-python]

2023-10-19 Thread via GitHub
puchengy commented on code in PR #83: URL: https://github.com/apache/iceberg-python/pull/83#discussion_r1365868994 ## pyiceberg/schema.py: ## @@ -1273,6 +1273,102 @@ def primitive(self, primitive: PrimitiveType) -> PrimitiveType: return primitive +# Implementation

[I] fast_forward command not merging branches within AWS Glue [iceberg]

2023-10-19 Thread via GitHub
lime-squeeze opened a new issue, #8881: URL: https://github.com/apache/iceberg/issues/8881 ### Query engine Spark 3.3 within AWS Glue 4.0 and using iceberg-spark-runtime-3.3_2.12-1.4.0.jar ### Question I am attempting to test branching via SparkSQL within AWS Glue. I am

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1771214572 I have the same feeling, I think this should be done once, and not every time you initialize the catalog (if you run a lot of jobs in parallel using Airflow, these calls can add up)

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
gkaretka commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1771206442 Maybe this issue can be renamed into: _documentation_ rather than _bug_. What do you think @mobley-trent

Re: [PR] Build: Replace deprecated command with environment file [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8666: URL: https://github.com/apache/iceberg/pull/8666

Re: [I] Replace deprecated `set-output` command with environment file [iceberg]

2023-10-19 Thread via GitHub
nastra closed issue #8665: Replace deprecated `set-output` command with environment file URL: https://github.com/apache/iceberg/issues/8665
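
For context, GitHub Actions replaced the `::set-output` workflow command with an environment file: a step appends `name=value` lines to the file whose path is in the `GITHUB_OUTPUT` environment variable. The sketch below shows that mechanism from a Python script; `set_step_output` is a hypothetical helper and is not taken from the Iceberg workflows, which may well do this directly in shell.

```python
import os


def set_step_output(name: str, value: str) -> None:
    """Append a single-line step output to the GitHub Actions environment file."""
    # GITHUB_OUTPUT is set by the Actions runner; outside a workflow it must
    # be provided by the caller (for example, pointing at a scratch file).
    output_path = os.environ["GITHUB_OUTPUT"]
    with open(output_path, "a", encoding="utf-8") as output_file:
        output_file.write(f"{name}={value}\n")
```

Multi-line values need the delimiter syntax described in the GitHub Actions documentation, which this sketch does not cover.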

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1365689434 ## api/src/test/java/org/apache/iceberg/metrics/TestDefaultTimer.java: ## @@ -20,6 +20,7 @@ import static java.util.concurrent.Executors.newFixedThreadPool; import s
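
The idea behind replacing a fixed `Thread.sleep()` with Awaitility is to poll a condition until it holds or a timeout expires, so a test neither waits longer than needed nor fails when the system is slow. The following is a language-neutral sketch of that pattern in Python; `wait_until` is a hypothetical helper and does not mirror Awaitility's actual API.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.05):
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Example: the condition becomes true on the third poll.
polls = []
wait_until(lambda: polls.append(1) or len(polls) >= 3, timeout=1.0, poll_interval=0.01)
print(len(polls))
```

Compared with a fixed sleep, the wait ends as soon as the condition is true, and a failure reports a timeout instead of a confusing assertion error later in the test.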

Re: [PR] Flink 1.16: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8880: URL: https://github.com/apache/iceberg/pull/8880

Re: [PR] Flink 1.15: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8877: URL: https://github.com/apache/iceberg/pull/8877

Re: [PR] Flink 1.15: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nk1506 commented on PR #8877: URL: https://github.com/apache/iceberg/pull/8877#issuecomment-1771122305 @nastra , Please take a look.

Re: [PR] Flink 1.16: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nk1506 commented on PR #8880: URL: https://github.com/apache/iceberg/pull/8880#issuecomment-1771122007 @nastra , Please take a look.

Re: [I] Iceberg Rest Catalog Support for a Separate OIDC Authorization Server URI [iceberg]

2023-10-19 Thread via GitHub
syun64 commented on issue #8869: URL: https://github.com/apache/iceberg/issues/8869#issuecomment-1771122573 > @syun64 In my org, we have very similar situation where we, unfortunately, can only use an internal procedure to grab auth token (that is quite different from OIDC flow). Based on w

[PR] Add Refurb to ruff [iceberg-python]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #87: URL: https://github.com/apache/iceberg-python/pull/87 Seems to do some nice checks: https://docs.astral.sh/ruff/rules/#refurb-furb

[PR] Add flake8-pie to ruff [iceberg-python]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #86: URL: https://github.com/apache/iceberg-python/pull/86 (no comment)

Re: [PR] API, Core: Add UUID API to Table [iceberg]

2023-10-19 Thread via GitHub
amogh-jahagirdar commented on code in PR #8800: URL: https://github.com/apache/iceberg/pull/8800#discussion_r1365617906 ## api/src/main/java/org/apache/iceberg/Table.java: ## @@ -333,6 +333,15 @@ default UpdateStatistics updateStatistics() { */ Map refs(); + /** + *

Re: [PR] AWS: Glue catalog strip trailing slash on DB URI [iceberg]

2023-10-19 Thread via GitHub
amogh-jahagirdar merged PR #8870: URL: https://github.com/apache/iceberg/pull/8870

[PR] Update pre-commit [iceberg-python]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #85: URL: https://github.com/apache/iceberg-python/pull/85 (no comment)

Re: [PR] Update release template [iceberg]

2023-10-19 Thread via GitHub
Fokko commented on PR #8879: URL: https://github.com/apache/iceberg/pull/8879#issuecomment-1770987327 Some historical context: https://github.com/apache/iceberg-docs/pull/187#discussion_r1086933656

[PR] Update release template [iceberg]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #8879: URL: https://github.com/apache/iceberg/pull/8879 I think we should remove the excluding weekends part of the wait period. With a patch release like we're doing now, I don't think we want to be constrained by this. In practice, I don't think many relea

Re: [I] Rest Catalog UpdateTableRequest IOException handling could cause data discrepancy in case of response getting lost [iceberg]

2023-10-19 Thread via GitHub
nastra commented on issue #6778: URL: https://github.com/apache/iceberg/issues/6778#issuecomment-1770949997 I believe this issue should be fixed by https://github.com/apache/iceberg/pull/8397 and https://github.com/apache/iceberg/pull/8599, where we only perform cleanup on exceptions that

[PR] Disable merging explicitly [iceberg]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #8878: URL: https://github.com/apache/iceberg/pull/8878 This was done earlier before there was a `.asf.yaml` through an INFRA ticket, but I think it is good to also add it to the `yaml`.

Re: [I] Rest Catalog UpdateTableRequest IOException handling could cause data discrepancy in case of response getting lost [iceberg]

2023-10-19 Thread via GitHub
nastra closed issue #6778: Rest Catalog UpdateTableRequest IOException handling could cause data discrepancy in case of response getting lost URL: https://github.com/apache/iceberg/issues/6778

Re: [PR] Add sort_order_id to SCAN_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on PR #8873: URL: https://github.com/apache/iceberg/pull/8873#issuecomment-1770931874 Thanks @nastra @Fokko for review!

Re: [PR] Build: Bump org.apache.pig:pig from 0.14.0 to 0.17.0 [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8774: URL: https://github.com/apache/iceberg/pull/8774

Re: [PR] AWS: Glue catalog strip trailing slash on DB URI [iceberg]

2023-10-19 Thread via GitHub
amogh-jahagirdar commented on code in PR #8870: URL: https://github.com/apache/iceberg/pull/8870#discussion_r1365470046 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -281,6 +281,7 @@ protected String defaultWarehouseLocation(TableIdentifier tableIdent

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365453843 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -57,7 +57,8 @@ abstract class BaseScan> "nan_value_counts", "lower_bounds",

Re: [PR] AWS: Glue catalog strip trailing slash on DB URI [iceberg]

2023-10-19 Thread via GitHub
amogh-jahagirdar commented on PR #8870: URL: https://github.com/apache/iceberg/pull/8870#issuecomment-1770872534 > Just a wild guess, but could it be possible that we're missing to strip a trailing slash in > > https://github.com/apache/iceberg/blob/81bf8d30766b1b129b87abde15239645cb
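
The problem this PR title describes is easy to reproduce in isolation: if a database location URI ends in `/` and the table name is appended with another `/`, the resulting path contains a double slash. The following is a minimal sketch of the normalization, assuming a default table location of the form `<db location>/<table name>`; both function names are hypothetical and this is not the `GlueCatalog` code itself.

```python
def strip_trailing_slash(location: str) -> str:
    """Remove trailing slashes, but keep a bare scheme such as 's3://' intact."""
    while location.endswith("/") and not location.endswith("://"):
        location = location[:-1]
    return location


def default_table_location(db_location_uri: str, table_name: str) -> str:
    """Join a database location and a table name with exactly one slash."""
    return f"{strip_trailing_slash(db_location_uri)}/{table_name}"


print(default_table_location("s3://bucket/db/", "events"))
print(default_table_location("s3://bucket/db", "events"))
```

Both calls produce the same location, so the result no longer depends on whether the database URI was registered with a trailing slash.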

Re: [PR] Update CachingCatalog to use expireAfterWrite instead of expireAfterAccess [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on PR #8844: URL: https://github.com/apache/iceberg/pull/8844#issuecomment-1770804416 @nastra Can you please take a final review? 😺

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
gkaretka commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1770748027 Thanks 👍 was actually trying to investigate a little further and came to the same conclusion. Creating tables by calling: `catalog.create_tables()` solved my problem

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on PR #8873: URL: https://github.com/apache/iceberg/pull/8873#issuecomment-1770739966 @nastra Could you please take a look again? Thanks.

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365382850 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -57,7 +57,8 @@ abstract class BaseScan> "nan_value_counts", "lower_bounds",

Re: [PR] Nessie: Adapt to Nessie 0.71.1 release [iceberg]

2023-10-19 Thread via GitHub
Fokko commented on PR #8798: URL: https://github.com/apache/iceberg/pull/8798#issuecomment-1770736345 Certainly @ajantha-bhat, thanks for pinging me. Thanks for the PR @nk1506 👍 and @dimas-b for the review

Re: [PR] Nessie: Adapt to Nessie 0.71.1 release [iceberg]

2023-10-19 Thread via GitHub
Fokko merged PR #8798: URL: https://github.com/apache/iceberg/pull/8798

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1770729964 @mobley-trent @gkaretka Thanks for reaching out here. The tables are not created by default, but I think that might be the wrong behaviour since you both expected them to be created

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365374541 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -57,7 +57,8 @@ abstract class BaseScan> "nan_value_counts", "lower_bounds",

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
gkaretka commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1770718118 👍 same issue

Re: [PR] Nessie: Adapt to Nessie 0.71.1 release [iceberg]

2023-10-19 Thread via GitHub
ajantha-bhat commented on PR #8798: URL: https://github.com/apache/iceberg/pull/8798#issuecomment-1770707163 can this be merged? @nastra, @Fokko

Re: [I] Add view support for Hive catalog [iceberg]

2023-10-19 Thread via GitHub
ajantha-bhat commented on issue #8698: URL: https://github.com/apache/iceberg/issues/8698#issuecomment-1770694291 > Please share your thoughts. CC: @ajantha-bhat , @jbonofre . Also tag the relevant people. I think we can use `VIRTUAL_VIEW` type to avoid adding custom properties to

Re: [PR] feat: First version of rest catalog. [iceberg-rust]

2023-10-19 Thread via GitHub
liurenjie1024 commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1365327942 ## crates/catalog/rest/src/catalog.rs: ## @@ -0,0 +1,845 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreem

Re: [PR] Doc: Fix "Verifying Checksums" script in verify-release.md [iceberg-python]

2023-10-19 Thread via GitHub
Fokko merged PR #82: URL: https://github.com/apache/iceberg-python/pull/82

Re: [PR] Make to_arrow function capable of handling parquet files with sanitized name due to Avro restirction [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on code in PR #83: URL: https://github.com/apache/iceberg-python/pull/83#discussion_r1363628652 ## pyiceberg/schema.py: ## @@ -1273,6 +1273,102 @@ def primitive(self, primitive: PrimitiveType) -> PrimitiveType: return primitive +# Implementation cop

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on PR #8857: URL: https://github.com/apache/iceberg/pull/8857#issuecomment-1770502216 > I suspect, the setProperties + removeProperties methods need the same changes? Yes... that was lurking around since the beginning, I guess I'm in to change them too :-D

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365256655 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -223,27 +284,57 @@ namespace, getRef().getName()), } public boolean dropNamespa

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365251661 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,23 +182,83 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365252557 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,23 +182,83 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365245858 ## core/src/test/java/org/apache/iceberg/ScanTestBase.java: ## @@ -224,4 +224,34 @@ public void testReAddingPartitionField() throws Exception { } } } + +

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365243595 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -57,7 +57,8 @@ abstract class BaseScan> "nan_value_counts", "lower_bounds",

Re: [PR] Doc: Fix "Verifying Checksums" script in verify-release.md [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on PR #82: URL: https://github.com/apache/iceberg-python/pull/82#issuecomment-1770436467 I just checked the Iceberg Java 1.4.1 release and noticed this as well.

Re: [PR] Build: Bump urllib3 from 1.26.17 to 1.26.18 [iceberg-python]

2023-10-19 Thread via GitHub
Fokko merged PR #84: URL: https://github.com/apache/iceberg-python/pull/84

Re: [I] Add view support for Hive catalog [iceberg]

2023-10-19 Thread via GitHub
pvary commented on issue #8698: URL: https://github.com/apache/iceberg/issues/8698#issuecomment-1770431955 @deniskuzZ: Is there something like this ongoing in the Hive codebase?

Re: [I] Iceberg vs Parquet [iceberg]

2023-10-19 Thread via GitHub
pvary closed issue #8876: Iceberg vs Parquet URL: https://github.com/apache/iceberg/issues/8876

Re: [I] Iceberg vs Parquet [iceberg]

2023-10-19 Thread via GitHub
pvary commented on issue #8876: URL: https://github.com/apache/iceberg/issues/8876#issuecomment-1770427907 @hieuLapTop77: Apache Parquet is a file format - it describes how the data is written to files, while Apache Iceberg is a table format - while it defines how the data is written to fil

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365197840 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -186,12 +189,23 @@ public PartitionData copy() { this.recordCount = toCopy.recordCount; this.fileS

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365196782 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -174,8 +176,9 @@ public PartitionData copy() { * * @param toCopy a generic data file to copy. *

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365196098 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -165,6 +166,19 @@ default Long fileSequenceNumber() { */ F copyWithoutStats(); + /** + * Copie

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365195690 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -177,4 +191,26 @@ default Long fileSequenceNumber() { default F copy(boolean withStats) { return w

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365196518 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -121,6 +121,10 @@ protected boolean shouldReturnColumnStats() { return context().returnColumnStats();

Re: [PR] feat: support ser/deser of value [iceberg-rust]

2023-10-19 Thread via GitHub
liurenjie1024 commented on code in PR #82: URL: https://github.com/apache/iceberg-rust/pull/82#discussion_r1365064528 ## crates/iceberg/src/avro/schema.rs: ## @@ -203,8 +197,8 @@ impl SchemaVisitor for SchemaToAvroSchema { PrimitiveType::Timestamp => AvroSchema::Tim

Re: [PR] Core: Improvements around View catalog tests [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8865: URL: https://github.com/apache/iceberg/pull/8865#discussion_r1365171588 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -225,8 +243,9 @@ public void createViewErrorCases() { .withQuery(trino.di

Re: [PR] Core: Improvements around View catalog tests [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8865: URL: https://github.com/apache/iceberg/pull/8865#discussion_r1365154084 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -1446,7 +1544,14 @@ public void updateViewLocationConflict() { // the view was already

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nk1506 commented on PR #8852: URL: https://github.com/apache/iceberg/pull/8852#issuecomment-1770333911 Yeah sure @nastra . I will create PR with other versions.

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
snazy commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365087588 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -223,27 +284,57 @@ namespace, getRef().getName()), } public boolean dropNamespac
