[GitHub] [iceberg] hililiwei opened a new pull request, #6253: Flink: Write watermark to the snapshot summary

2022-11-22 Thread GitBox
hililiwei opened a new pull request, #6253: URL: https://github.com/apache/iceberg/pull/6253 In some scenarios, the task needs to determine that all data of a certain period has been written based on the watermark. The PR writes the watermark of the task to the snapshot summary, like
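
One way to read the use case in #6253: a downstream job compares the watermark recorded in a snapshot summary with the end of the period it is waiting for. The sketch below is only an illustration under stated assumptions. It treats the summary as a plain dict and assumes the watermark is stored as epoch milliseconds under a key named `watermark`. Both the key name and the helper `period_is_complete` are hypothetical, since the actual property name is cut off in this preview.

```python
from datetime import datetime, timezone

# Hypothetical summary key; the real property name used by the PR is not shown here.
WATERMARK_KEY = "watermark"


def period_is_complete(summary: dict, period_end: datetime) -> bool:
    """Return True if the snapshot's watermark has reached the end of the period.

    Assumes the watermark is stored as epoch milliseconds (string or int).
    A summary without a watermark is treated as "not complete".
    """
    raw = summary.get(WATERMARK_KEY)
    if raw is None:
        return False
    period_end_ms = int(period_end.timestamp() * 1000)
    return int(raw) >= period_end_ms


if __name__ == "__main__":
    end = datetime(2022, 11, 23, tzinfo=timezone.utc)
    summary = {WATERMARK_KEY: str(int(end.timestamp() * 1000))}
    print(period_is_complete(summary, end))
```

Comparing integers in milliseconds avoids floating-point rounding when the watermark sits exactly on a period boundary.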

[GitHub] [iceberg] lirui-apache commented on pull request #6175: Hive: Add UGI to the key in CachedClientPool

2022-11-22 Thread GitBox
lirui-apache commented on PR #6175: URL: https://github.com/apache/iceberg/pull/6175#issuecomment-1324633275 It seems we generally agree to support a pluggable cache. How about allowing users to specify the ClientPool implementation class via a catalog property, and HiveCatalog can instantiate Clien

[GitHub] [iceberg] manuzhang commented on issue #1026: Add an action to rewrite equality deletes as position deletes

2022-11-22 Thread GitBox
manuzhang commented on issue #1026: URL: https://github.com/apache/iceberg/issues/1026#issuecomment-1324585193 @rdblue @openinx @chenjunjiedada has this been implemented? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [iceberg] djouallah commented on issue #5801: Read iceberg table into a PyArrow Dataset

2022-11-22 Thread GitBox
djouallah commented on issue #5801: URL: https://github.com/apache/iceberg/issues/5801#issuecomment-1324555305 @dungdm93 trust me, I made sure everyone knows about it :), please don't close until the Python binary is released :) https://twitter.com/mim_djo/status/1594821677904715776

[GitHub] [iceberg] dungdm93 commented on issue #5801: Read iceberg table into a PyArrow Dataset

2022-11-22 Thread GitBox
dungdm93 commented on issue #5801: URL: https://github.com/apache/iceberg/issues/5801#issuecomment-1324546747 Implemented in #6233

[GitHub] [iceberg] rdblue commented on a diff in pull request #6072: Core: Add scan report for incremental Table scans

2022-11-22 Thread GitBox
rdblue commented on code in PR #6072: URL: https://github.com/apache/iceberg/pull/6072#discussion_r1029942996 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -72,13 +72,16 @@ protected CloseableIterable doPlanFiles( .filter(manifest

[GitHub] [iceberg] rdblue commented on a diff in pull request #6072: Core: Add scan report for incremental Table scans

2022-11-22 Thread GitBox
rdblue commented on code in PR #6072: URL: https://github.com/apache/iceberg/pull/6072#discussion_r1029942733 ## core/src/main/java/org/apache/iceberg/IncrementalDataTableScan.java: ## @@ -105,6 +105,7 @@ public CloseableIterable planFiles() { snapshotIds.co

[GitHub] [iceberg] rdblue commented on a diff in pull request #6072: Core: Add scan report for incremental Table scans

2022-11-22 Thread GitBox
rdblue commented on code in PR #6072: URL: https://github.com/apache/iceberg/pull/6072#discussion_r1029941914 ## core/src/main/java/org/apache/iceberg/IncrementalDataTableScan.java: ## @@ -105,6 +105,7 @@ public CloseableIterable planFiles() { snapshotIds.co

[GitHub] [iceberg] github-actions[bot] commented on issue #4862: Webpage breaks at medium width

2022-11-22 Thread GitBox
github-actions[bot] commented on issue #4862: URL: https://github.com/apache/iceberg/issues/4862#issuecomment-1324390491 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] commented on issue #4743: flink1.14.4+iceberg0.13.1+hive-metastore3.1.2+minio(S3) error!

2022-11-22 Thread GitBox
github-actions[bot] commented on issue #4743: URL: https://github.com/apache/iceberg/issues/4743#issuecomment-1324390519 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs.

[GitHub] [iceberg] rdblue opened a new pull request, #6251: Spec: Clarify auth responses in the REST spec

2022-11-22 Thread GitBox
rdblue opened a new pull request, #6251: URL: https://github.com/apache/iceberg/pull/6251 This is a minor clarification to the REST catalog spec. When a request is unauthenticated or unauthorized, the REST service must respond with a correct HTTP error code and must not return fake success
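
The rule in #6251 can be pictured as a small decision function. The sketch below assumes the usual HTTP semantics, 401 for a request without valid credentials and 403 for a caller that is authenticated but not permitted. This preview does not list the exact codes the spec settles on, so the mapping is an assumption, and `auth_status` is a hypothetical helper rather than part of any Iceberg API.

```python
from http import HTTPStatus


def auth_status(authenticated: bool, authorized: bool) -> HTTPStatus:
    """Pick the response status for a request based on its auth state.

    The point of the sketch: a failed auth check never maps to a 2xx status.
    """
    if not authenticated:
        # No valid credentials were presented (assumed mapping: 401).
        return HTTPStatus.UNAUTHORIZED
    if not authorized:
        # Credentials are valid but lack permission (assumed mapping: 403).
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK


if __name__ == "__main__":
    for state in [(False, False), (True, False), (True, True)]:
        print(state, int(auth_status(*state)))
```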

[GitHub] [iceberg] nastra closed issue #6003: Vectorized Read

2022-11-22 Thread GitBox
nastra closed issue #6003: Vectorized Read URL: https://github.com/apache/iceberg/issues/6003

[GitHub] [iceberg] asheeshgarg commented on issue #6003: Vectorized Read

2022-11-22 Thread GitBox
asheeshgarg commented on issue #6003: URL: https://github.com/apache/iceberg/issues/6003#issuecomment-1324047236 @nastra added my configuration above

[GitHub] [iceberg] asheeshgarg commented on issue #6003: Vectorized Read

2022-11-22 Thread GitBox
asheeshgarg commented on issue #6003: URL: https://github.com/apache/iceberg/issues/6003#issuecomment-1324041867 Adding combination of Jars and Command ` org.apache.iceberg iceberg-common 1.0.0 org.apache.iceberg iceberg-core

[GitHub] [iceberg] nastra commented on issue #6003: Vectorized Read

2022-11-22 Thread GitBox
nastra commented on issue #6003: URL: https://github.com/apache/iceberg/issues/6003#issuecomment-1324034699 @asheeshgarg great that it eventually worked out. Do you maybe want to share how you fixed it so that other users who navigate to this issue know what to do?

[GitHub] [iceberg] asheeshgarg commented on issue #6003: Vectorized Read

2022-11-22 Thread GitBox
asheeshgarg commented on issue #6003: URL: https://github.com/apache/iceberg/issues/6003#issuecomment-1324032039 @nastra thanks for the details, I am able to read the data

[GitHub] [iceberg] ajantha-bhat commented on issue #6196: How to use equality delete in Iceberg v2 table

2022-11-22 Thread GitBox
ajantha-bhat commented on issue #6196: URL: https://github.com/apache/iceberg/issues/6196#issuecomment-1324000845 @singhpk234, @nastra: I think we need to add it to this. Without mentioning what is row-level-deletes, we can't add the note in spark writes. https://github.com/apache

[GitHub] [iceberg] singhpk234 commented on issue #6196: How to use equality delete in Iceberg v2 table

2022-11-22 Thread GitBox
singhpk234 commented on issue #6196: URL: https://github.com/apache/iceberg/issues/6196#issuecomment-1323992792 > @singhpk234 is it worth documenting this so that it's clearer for users? +1, I think we should document this; I have seen this come up quite a number of times in Slack discus

[GitHub] [iceberg] ajantha-bhat opened a new pull request, #6250: Docs: Remove redundant configuration from spark docs

2022-11-22 Thread GitBox
ajantha-bhat opened a new pull request, #6250: URL: https://github.com/apache/iceberg/pull/6250 The section description mentions that we are creating a catalog called `local`, but it also includes redundant configuration for `spark_catalog` of type hive. Also, hive URI is not mentioned for

[GitHub] [iceberg] ajantha-bhat commented on pull request #6250: Docs: Remove redundant configuration from spark docs

2022-11-22 Thread GitBox
ajantha-bhat commented on PR #6250: URL: https://github.com/apache/iceberg/pull/6250#issuecomment-1323991146 cc: @Fokko, @RussellSpitzer

[GitHub] [iceberg] rdblue commented on a diff in pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2022-11-22 Thread GitBox
rdblue commented on code in PR #6169: URL: https://github.com/apache/iceberg/pull/6169#discussion_r1029607140 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -1119,6 +1139,54 @@ public void applyS3ServiceConfigurations(T builder) .build()

[GitHub] [iceberg] nastra commented on issue #6196: How to use equality delete in Iceberg v2 table

2022-11-22 Thread GitBox
nastra commented on issue #6196: URL: https://github.com/apache/iceberg/issues/6196#issuecomment-1323852395 @singhpk234 is it worth documenting this so that it's clearer for users?

[GitHub] [iceberg] nastra commented on issue #6216: write.metadata.metrics.default how to works?

2022-11-22 Thread GitBox
nastra commented on issue #6216: URL: https://github.com/apache/iceberg/issues/6216#issuecomment-1323850881 @chenwyi2 could you please rephrase your question so that I can help you better in answering it?

[GitHub] [iceberg] nastra commented on issue #6218: Set COMMIT_MIN_RETRY_WAIT_MS_DEFAULT to 1000 ms instead of 100 ms to avoid too frequent commit exceptions

2022-11-22 Thread GitBox
nastra commented on issue #6218: URL: https://github.com/apache/iceberg/issues/6218#issuecomment-1323848621 I'm not sure we would want to increase the default as that would cause things to generally take longer in the case of retries. For large tables I think it would be better to selectively
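
To make the trade-off in #6218 concrete, the sketch below computes the waits of a plain exponential backoff for two minimum wait values. It is a simplified model, not Iceberg's retry code. The doubling factor, the 4 retries, and the 60000 ms cap are assumptions chosen for illustration, jitter is left out, and `backoff_schedule` is a hypothetical helper. The setting under discussion is, as far as I know, exposed as the table property `commit.retry.min-wait-ms`.

```python
def backoff_schedule(min_wait_ms: int, max_wait_ms: int, retries: int, scale: float = 2.0) -> list:
    """Return the wait before each retry for a capped exponential backoff.

    Simplified model: wait_n = min(min_wait_ms * scale**n, max_wait_ms), no jitter.
    """
    waits = []
    for attempt in range(retries):
        waits.append(min(int(min_wait_ms * scale ** attempt), max_wait_ms))
    return waits


if __name__ == "__main__":
    for min_wait in (100, 1000):
        schedule = backoff_schedule(min_wait, max_wait_ms=60000, retries=4)
        print(min_wait, schedule, sum(schedule))
```

Under these assumptions a 100 ms minimum gives waits of 100, 200, 400 and 800 ms (1500 ms in total), while a 1000 ms minimum gives 1000, 2000, 4000 and 8000 ms (15000 ms in total). A larger default therefore slows every retried commit, including those on small tables that rarely conflict.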

[GitHub] [iceberg] nastra commented on issue #6236: Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 3072 bytes

2022-11-22 Thread GitBox
nastra commented on issue #6236: URL: https://github.com/apache/iceberg/issues/6236#issuecomment-1323817187 Given that you're using the `JDBCCatalog` with MySQL (`jdbc:mysql://localhost:3306/iceberg`), this is actually a limitation that is imposed by MySQL, not Iceberg itself. You would hav

[GitHub] [iceberg] InvisibleProgrammer opened a new issue, #6249: Create Iceberg Hive documentation

2022-11-22 Thread GitBox
InvisibleProgrammer opened a new issue, #6249: URL: https://github.com/apache/iceberg/issues/6249 ### Feature Request / Improvement https://iceberg.apache.org/docs/latest/hive/ tracks the documentation of Iceberg features supported by Hive. It currently reflects whatever was released

[GitHub] [iceberg] pvary opened a new pull request, #6248: Flink: Fix tests creating catalog after FLINK-29677

2022-11-22 Thread GitBox
pvary opened a new pull request, #6248: URL: https://github.com/apache/iceberg/pull/6248 FLINK-29677 adds a check so that the currently used `Catalog` cannot be dropped. FLINK-29677 will land in Flink 1.16.1 and 1.17.0, and when we start to use it we need to update some of our tests.

[GitHub] [iceberg] ajantha-bhat commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

2022-11-22 Thread GitBox
ajantha-bhat commented on PR #6223: URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1323728710 cc: @Fokko, @rdblue, @RussellSpitzer

[GitHub] [iceberg] grbinho commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

2022-11-22 Thread GitBox
grbinho commented on PR #6223: URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1323723030 @ajantha-bhat @singhpk234 Do you need anything else from me to get this merged?

[GitHub] [iceberg] Fokko merged pull request #6242: API: Restore the type of the identity transform

2022-11-22 Thread GitBox
Fokko merged PR #6242: URL: https://github.com/apache/iceberg/pull/6242

[GitHub] [iceberg] Fokko merged pull request #6240: Nessie: Refactor NessieTableOperations#doCommit

2022-11-22 Thread GitBox
Fokko merged PR #6240: URL: https://github.com/apache/iceberg/pull/6240

[GitHub] [iceberg] ajantha-bhat commented on pull request #6240: Nessie: Refactor NessieTableOperations#doCommit

2022-11-22 Thread GitBox
ajantha-bhat commented on PR #6240: URL: https://github.com/apache/iceberg/pull/6240#issuecomment-1323499398 javadoc task Build failed due to a server timeout error! Will retrigger the CI by rebasing. ``` 2022-11-22T10:36:40.1649273Z Starting a Gradle Daemon (subsequent builds wil

[GitHub] [iceberg] Fokko commented on a diff in pull request #6242: API: Restore the type of the identity transform

2022-11-22 Thread GitBox
Fokko commented on code in PR #6242: URL: https://github.com/apache/iceberg/pull/6242#discussion_r1029155652 ## api/src/main/java/org/apache/iceberg/transforms/Identity.java: ## @@ -105,6 +142,21 @@ public boolean isIdentity() { return true; } + @Override + public bo

[GitHub] [iceberg] nastra commented on a diff in pull request #6242: API: Restore the type of the identity transform

2022-11-22 Thread GitBox
nastra commented on code in PR #6242: URL: https://github.com/apache/iceberg/pull/6242#discussion_r1029152726 ## api/src/main/java/org/apache/iceberg/transforms/Identity.java: ## @@ -105,6 +142,21 @@ public boolean isIdentity() { return true; } + @Override + public b

[GitHub] [iceberg] Fokko commented on a diff in pull request #6242: API: Restore the type of the identity transform

2022-11-22 Thread GitBox
Fokko commented on code in PR #6242: URL: https://github.com/apache/iceberg/pull/6242#discussion_r1029148053 ## api/src/main/java/org/apache/iceberg/transforms/Identity.java: ## @@ -105,6 +142,21 @@ public boolean isIdentity() { return true; } + @Override + public bo

[GitHub] [iceberg] nastra commented on a diff in pull request #6242: API: Restore the type of the identity transform

2022-11-22 Thread GitBox
nastra commented on code in PR #6242: URL: https://github.com/apache/iceberg/pull/6242#discussion_r1029142022 ## api/src/main/java/org/apache/iceberg/transforms/Identity.java: ## @@ -105,6 +142,21 @@ public boolean isIdentity() { return true; } + @Override + public b

[GitHub] [iceberg] nastra commented on pull request #6240: Nessie: Refactor NessieTableOperations#doCommit

2022-11-22 Thread GitBox
nastra commented on PR #6240: URL: https://github.com/apache/iceberg/pull/6240#issuecomment-1323427139 I've verified that this new API change works for https://github.com/trinodb/trino/pull/11701

[GitHub] [iceberg] wxjzlm commented on issue #4550: the snapshot file is lost when write iceberg using flink Failed to open input stream for file File does not exist

2022-11-22 Thread GitBox
wxjzlm commented on issue #4550: URL: https://github.com/apache/iceberg/issues/4550#issuecomment-1323337067 > Flink jobs frequently fail because of this issue.

[GitHub] [iceberg] psnilesh closed issue #6245: Fix for issue #2796 is missing from 0.14.1 and 1.0.x releases

2022-11-22 Thread GitBox
psnilesh closed issue #6245: Fix for issue #2796 is missing from 0.14.1 and 1.0.x releases URL: https://github.com/apache/iceberg/issues/6245

[GitHub] [iceberg] psnilesh commented on issue #6245: Fix for issue #2796 is missing from 0.14.1 and 1.0.x releases

2022-11-22 Thread GitBox
psnilesh commented on issue #6245: URL: https://github.com/apache/iceberg/issues/6245#issuecomment-1323319051 Got it. Closing the issue.