[GitHub] [iceberg] zhangbutao commented on pull request #6482: API: Fix inconsistent TimeTransform Type

2023-01-27 Thread via GitHub
zhangbutao commented on PR #6482: URL: https://github.com/apache/iceberg/pull/6482#issuecomment-1407247517 Sorry for taking so long to come back. Will address the comments as soon as possible :) @hililiwei -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-27 Thread via GitHub
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1089591796 ## api/src/main/java/org/apache/iceberg/actions/MigrateTable.java: ## @@ -50,6 +50,15 @@ default MigrateTable dropBackup() { throw new UnsupportedOperationE

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-27 Thread via GitHub
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1089592709 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/MigrateTableProcedure.java: ## @@ -39,7 +39,8 @@ class MigrateTableProcedure extends BasePr

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6660: Flink: Support writes to branches in FlinkSink

2023-01-27 Thread via GitHub
stevenzwu commented on code in PR #6660: URL: https://github.com/apache/iceberg/pull/6660#discussion_r1089604601 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/sink/TestFlinkIcebergSink.java: ## @@ -85,28 +86,29 @@ public class TestFlinkIcebergSink { private fina

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-27 Thread via GitHub
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1089605163 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/AddFilesProcedure.java: ## @@ -119,8 +120,15 @@ public InternalRow[] call(InternalRow args)

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-27 Thread via GitHub
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1089605123 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java: ## @@ -442,14 +444,51 @@ public static void importSparkTable( "Canno

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6680: Core: DeleteWithFilter fails on HashCode Collision

2023-01-27 Thread via GitHub
stevenzwu commented on code in PR #6680: URL: https://github.com/apache/iceberg/pull/6680#discussion_r1089643840 ## core/src/test/java/org/apache/iceberg/TestDeleteFiles.java: ## @@ -349,6 +355,56 @@ public void testDeleteFilesOnIndependentBranches() { statuses(Status.E

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2023-01-27 Thread via GitHub
jackye1995 commented on code in PR #6169: URL: https://github.com/apache/iceberg/pull/6169#discussion_r1089652523 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -1119,6 +1139,54 @@ public void applyS3ServiceConfigurations(T builder) .bui

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2023-01-27 Thread via GitHub
jackye1995 commented on code in PR #6169: URL: https://github.com/apache/iceberg/pull/6169#discussion_r1089652555 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -0,0 +1,311 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [iceberg] zhangbutao commented on a diff in pull request #6482: API: Fix inconsistent TimeTransform Type

2023-01-28 Thread via GitHub
zhangbutao commented on code in PR #6482: URL: https://github.com/apache/iceberg/pull/6482#discussion_r1089690341 ## core/src/test/java/org/apache/iceberg/TestTableUpdatePartitionSpec.java: ## @@ -187,6 +188,53 @@ public void testRemoveAndAddField() { Assert.assertEquals(10

[GitHub] [iceberg] zhangbutao commented on a diff in pull request #6482: API: Fix inconsistent TimeTransform Type

2023-01-28 Thread via GitHub
zhangbutao commented on code in PR #6482: URL: https://github.com/apache/iceberg/pull/6482#discussion_r1089690531 ## core/src/test/java/org/apache/iceberg/TestTableUpdatePartitionSpec.java: ## @@ -187,6 +188,53 @@ public void testRemoveAndAddField() { Assert.assertEquals(10

[GitHub] [iceberg] zhangbutao commented on a diff in pull request #6482: API: Fix inconsistent TimeTransform Type

2023-01-28 Thread via GitHub
zhangbutao commented on code in PR #6482: URL: https://github.com/apache/iceberg/pull/6482#discussion_r1089692433 ## core/src/test/java/org/apache/iceberg/TestTableUpdatePartitionSpec.java: ## @@ -187,6 +188,53 @@ public void testRemoveAndAddField() { Assert.assertEquals(10

[GitHub] [iceberg] zhangbutao commented on a diff in pull request #6482: API: Fix inconsistent TimeTransform Type

2023-01-28 Thread via GitHub
zhangbutao commented on code in PR #6482: URL: https://github.com/apache/iceberg/pull/6482#discussion_r1089692854 ## core/src/test/java/org/apache/iceberg/TestTableUpdatePartitionSpec.java: ## @@ -187,6 +188,53 @@ public void testRemoveAndAddField() { Assert.assertEquals(10

[GitHub] [iceberg] zhangbutao commented on a diff in pull request #6482: API: Fix inconsistent TimeTransform Type

2023-01-28 Thread via GitHub
zhangbutao commented on code in PR #6482: URL: https://github.com/apache/iceberg/pull/6482#discussion_r1089692889 ## core/src/test/java/org/apache/iceberg/TestTableUpdatePartitionSpec.java: ## @@ -187,6 +188,53 @@ public void testRemoveAndAddField() { Assert.assertEquals(10

[GitHub] [iceberg] zhangbutao commented on pull request #6482: API: Fix inconsistent TimeTransform Type

2023-01-28 Thread via GitHub
zhangbutao commented on PR #6482: URL: https://github.com/apache/iceberg/pull/6482#issuecomment-1407353080 > Note that this PR is essentially reverting #6220, which I don't think we can do atm. /cc @Fokko @rdblue @Fokko @nastra Could you please take a look? I am not sure if this fix

[GitHub] [iceberg] zjffdu opened a new issue, #6684: [DOC] Unable to scroll the right navigation section

2023-01-28 Thread via GitHub
zjffdu opened a new issue, #6684: URL: https://github.com/apache/iceberg/issues/6684 ### Feature Request / Improvement For the Flink doc, the right navigation section is very long, but I am unable to scroll down it. https://user-images.githubusercontent.com/164491/215265149-

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6660: Flink: Support writes to branches in FlinkSink

2023-01-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #6660: URL: https://github.com/apache/iceberg/pull/6660#discussion_r1089737315 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/sink/TestFlinkIcebergSink.java: ## @@ -85,28 +86,29 @@ public class TestFlinkIcebergSink { priva

[GitHub] [iceberg] fireking77 commented on issue #5977: How to write to a bucket-partitioned table using PySpark?

2023-01-28 Thread via GitHub
fireking77 commented on issue #5977: URL: https://github.com/apache/iceberg/issues/5977#issuecomment-1407435198 Hi guys! I am also curious about this: Any way to register the bucket UDF with PySpark? Will the built-in bucket method not work for the same? @robinsinghstud

[GitHub] [iceberg] robinsinghstudios commented on issue #5977: How to write to a bucket-partitioned table using PySpark?

2023-01-28 Thread via GitHub
robinsinghstudios commented on issue #5977: URL: https://github.com/apache/iceberg/issues/5977#issuecomment-1407438732 Hi @fireking77, I was able to create bucketed tables by using a SparkSQL query without any data. It just works for some reason.

[GitHub] [iceberg] fireking77 commented on issue #5977: How to write to a bucket-partitioned table using PySpark?

2023-01-28 Thread via GitHub
fireking77 commented on issue #5977: URL: https://github.com/apache/iceberg/issues/5977#issuecomment-1407454746 @robinsinghstudios that is OK, but were you able to write data into that table with PySpark?

[GitHub] [iceberg] robinsinghstudios commented on issue #5977: How to write to a bucket-partitioned table using PySpark?

2023-01-28 Thread via GitHub
robinsinghstudios commented on issue #5977: URL: https://github.com/apache/iceberg/issues/5977#issuecomment-1407462492 @fireking77 The MERGE statement works with SparkSQL. I had the same issue with the PySpark append method.

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6682: Bulk delete

2023-01-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1089790123 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -85,6 +88,7 @@ private static final Logger LOG = LoggerFacto

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6682: Bulk delete

2023-01-28 Thread via GitHub
amogh-jahagirdar commented on PR #6682: URL: https://github.com/apache/iceberg/pull/6682#issuecomment-1407464402 Thanks a ton for closing the loop on this @RussellSpitzer ! Left some comments

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6682: Bulk delete

2023-01-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1089792632 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java: ## @@ -149,6 +164,45 @@ public void deletePrefix(String prefix) { } } + @Override

[GitHub] [iceberg] fireking77 commented on issue #5977: How to write to a bucket-partitioned table using PySpark?

2023-01-28 Thread via GitHub
fireking77 commented on issue #5977: URL: https://github.com/apache/iceberg/issues/5977#issuecomment-1407466795 @robinsinghstudios Thanks! :D You were right :D I hope someone will fix insert too... ;) Thanks again! Darvi

[GitHub] [iceberg] rdblue commented on issue #6679: Change Default Write Distribution Mode

2023-01-28 Thread via GitHub
rdblue commented on issue #6679: URL: https://github.com/apache/iceberg/issues/6679#issuecomment-1407477437 +1 for range as default.

[GitHub] [iceberg] rdblue commented on a diff in pull request #6680: Core: DeleteWithFilter fails on HashCode Collision

2023-01-28 Thread via GitHub
rdblue commented on code in PR #6680: URL: https://github.com/apache/iceberg/pull/6680#discussion_r1089805842 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -510,16 +510,19 @@ private Pair metricsEvaluator // in other words, ResidualEvaluator r

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6680: Core: DeleteWithFilter fails on HashCode Collision

2023-01-28 Thread via GitHub
RussellSpitzer commented on code in PR #6680: URL: https://github.com/apache/iceberg/pull/6680#discussion_r1089806242 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -510,16 +510,19 @@ private Pair metricsEvaluator // in other words, ResidualEva

[GitHub] [iceberg] RussellSpitzer commented on pull request #6680: Core: DeleteWithFilter fails on HashCode Collision

2023-01-28 Thread via GitHub
RussellSpitzer commented on PR #6680: URL: https://github.com/apache/iceberg/pull/6680#issuecomment-1407483685 > Looks correct to me, though we may want to move to `StructLikeMap` that was introduced later. `metricsEvaluators` is a struct like map which when putting copies the Wrappe

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6680: Core: DeleteWithFilter fails on HashCode Collision

2023-01-28 Thread via GitHub
RussellSpitzer commented on code in PR #6680: URL: https://github.com/apache/iceberg/pull/6680#discussion_r1089810563 ## core/src/test/java/org/apache/iceberg/TestDeleteFiles.java: ## @@ -349,6 +355,56 @@ public void testDeleteFilesOnIndependentBranches() { statuses(Sta

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6680: Core: DeleteWithFilter fails on HashCode Collision

2023-01-28 Thread via GitHub
RussellSpitzer commented on code in PR #6680: URL: https://github.com/apache/iceberg/pull/6680#discussion_r1089811146 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -510,16 +510,19 @@ private Pair metricsEvaluator // in other words, ResidualEva

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6680: Core: DeleteWithFilter fails on HashCode Collision

2023-01-28 Thread via GitHub
RussellSpitzer commented on code in PR #6680: URL: https://github.com/apache/iceberg/pull/6680#discussion_r1089811943 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -510,16 +510,19 @@ private Pair metricsEvaluator // in other words, ResidualEva

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6680: Core: DeleteWithFilter fails on HashCode Collision

2023-01-28 Thread via GitHub
RussellSpitzer commented on code in PR #6680: URL: https://github.com/apache/iceberg/pull/6680#discussion_r1089813726 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -510,16 +510,19 @@ private Pair metricsEvaluator // in other words, ResidualEva

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6582: Add a Spark procedure to collect NDV

2023-01-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #6582: URL: https://github.com/apache/iceberg/pull/6582#discussion_r1082822392 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/DistinctCountProcedure.java: ## @@ -0,0 +1,188 @@ +/* + * Licensed to the Apache Software

[GitHub] [iceberg] RussellSpitzer opened a new issue, #6685: StructCopy does not correctly Copy Fixed Data Type

2023-01-28 Thread via GitHub
RussellSpitzer opened a new issue, #6685: URL: https://github.com/apache/iceberg/issues/6685 ### Apache Iceberg version None ### Query engine None ### Please describe the bug 🐞 Fanout writer, and several other pieces of code, use StructCopy to copy partitio

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6680: Core: DeleteWithFilter fails on HashCode Collision

2023-01-28 Thread via GitHub
RussellSpitzer commented on code in PR #6680: URL: https://github.com/apache/iceberg/pull/6680#discussion_r1089817726 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -510,16 +510,19 @@ private Pair metricsEvaluator // in other words, ResidualEva

[GitHub] [iceberg] youngxinler commented on pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-01-28 Thread via GitHub
youngxinler commented on PR #6571: URL: https://github.com/apache/iceberg/pull/6571#issuecomment-1407538619 @jackye1995 @JonasJ-ap Could I trouble you for a review if you have time?

[GitHub] [iceberg] xuzhiwen1255 commented on pull request #6614: Flink:fix flink streaming query problem [ Cannot get a client from a closed pool]

2023-01-28 Thread via GitHub
xuzhiwen1255 commented on PR #6614: URL: https://github.com/apache/iceberg/pull/6614#issuecomment-1407547988 I'm sorry, I've been spending time with my family recently, so I haven't joined the discussion of this issue. I would like to share my opinion. > I think that if a catalog is

[GitHub] [iceberg] xuzhiwen1255 commented on pull request #6614: Flink:fix flink streaming query problem [ Cannot get a client from a closed pool]

2023-01-28 Thread via GitHub
xuzhiwen1255 commented on PR #6614: URL: https://github.com/apache/iceberg/pull/6614#issuecomment-1407548381 > TableAccessor This is my experimental idea, yet to be tested. There may be a better way, but I haven't thought of it yet.

[GitHub] [iceberg] zjffdu opened a new issue, #6693: [DOC] Incorrect mentions of other engines

2023-01-28 Thread via GitHub
zjffdu opened a new issue, #6693: URL: https://github.com/apache/iceberg/issues/6693 ### Apache Iceberg version 1.1.0 (latest release) ### Query engine None ### Please describe the bug 🐞 There's no engines tab (maybe it existed before). ![image](https://use

[GitHub] [iceberg] hililiwei commented on pull request #6470: Spark: Allow specifying file format in RewriteDataFiles

2023-01-28 Thread via GitHub
hililiwei commented on PR #6470: URL: https://github.com/apache/iceberg/pull/6470#issuecomment-1407585933 > Does the streaming write have the ability to set the file format? Or does that only let you use the table default as well? Yes, streaming writes can set file format. > Ot

[GitHub] [iceberg] hililiwei commented on pull request #6614: Flink:fix flink streaming query problem [ Cannot get a client from a closed pool]

2023-01-29 Thread via GitHub
hililiwei commented on PR #6614: URL: https://github.com/apache/iceberg/pull/6614#issuecomment-1407592885 > That is a good question. It is a bit challenging. The easier model is to share nothing across tasks (e.g. no global static conn pools). Let's say each TM has 8 subtasks, each task needs to

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6584: Flink: support reading as Avro GenericRecord for FLIP-27 IcebergSource

2023-01-29 Thread via GitHub
hililiwei commented on code in PR #6584: URL: https://github.com/apache/iceberg/pull/6584#discussion_r1089903225 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/reader/AvroGenericRecordReaderFunction.java: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Sof

[GitHub] [iceberg] xwmr-max commented on pull request #6440: Flink: Support Look-up Function

2023-01-29 Thread via GitHub
xwmr-max commented on PR #6440: URL: https://github.com/apache/iceberg/pull/6440#issuecomment-1407610685 cc @pvary

[GitHub] [iceberg] kingwind94 opened a new issue, #6694: position delete manifest lower_bounds/upper_bounds not correct

2023-01-29 Thread via GitHub
kingwind94 opened a new issue, #6694: URL: https://github.com/apache/iceberg/issues/6694 ### Query engine flink 1.12 iceberg 0.13.2 ### Question As we know, position delete files should keep the file path and position. In rewriting v2 tables, iceberg would validateNoN

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6657: Python: Allow to pass in a string as filter

2023-01-29 Thread via GitHub
amogh-jahagirdar commented on code in PR #6657: URL: https://github.com/apache/iceberg/pull/6657#discussion_r1090014264 ## python/pyiceberg/table/__init__.py: ## @@ -183,14 +185,17 @@ class TableScan(Generic[S], ABC): def __init__( self, table: Table, -

[GitHub] [iceberg] stevenzwu commented on pull request #6614: Flink:fix flink streaming query problem [ Cannot get a client from a closed pool]

2023-01-29 Thread via GitHub
stevenzwu commented on PR #6614: URL: https://github.com/apache/iceberg/pull/6614#issuecomment-1407740175 > 2. Clone a new table. In Flip-27, Read-only table does not meet our requirements, so we can try to clone a new table in the table loader. It's independent of catalog. We need to manag

[GitHub] [iceberg] stevenzwu commented on pull request #6614: Flink:fix flink streaming query problem [ Cannot get a client from a closed pool]

2023-01-29 Thread via GitHub
stevenzwu commented on PR #6614: URL: https://github.com/apache/iceberg/pull/6614#issuecomment-1407740616 @xuzhiwen1255 regarding your `TableAccessor` proposal, I think it is simpler to just add `TableLoader#clone` so that we maintain the same resource manager model where a `TableLoader` ob

[GitHub] [iceberg] fireking77 commented on issue #5721: Registering BucketUDF on PySpark

2023-01-29 Thread via GitHub
fireking77 commented on issue #5721: URL: https://github.com/apache/iceberg/issues/5721#issuecomment-1407757922 I am also curious about this question!

[GitHub] [iceberg-docs] rdblue merged pull request #197: [doc] Update the doris doc url

2023-01-29 Thread via GitHub
rdblue merged PR #197: URL: https://github.com/apache/iceberg-docs/pull/197

[GitHub] [iceberg-docs] rdblue commented on pull request #197: [doc] Update the doris doc url

2023-01-29 Thread via GitHub
rdblue commented on PR #197: URL: https://github.com/apache/iceberg-docs/pull/197#issuecomment-1407758128 Thanks, @morningman!

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-01-29 Thread via GitHub
singhpk234 commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1088284287 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushe

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-01-29 Thread via GitHub
jackye1995 commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1090118756 ## docs/java-api.md: ## @@ -147,6 +147,59 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData + +The java api can write data in

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-01-29 Thread via GitHub
jackye1995 commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1090118896 ## docs/java-api.md: ## @@ -147,6 +147,59 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData + +The java api can write data in

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-01-29 Thread via GitHub
jackye1995 commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1090118998 ## docs/java-api.md: ## @@ -147,6 +147,59 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData + +The java api can write data in

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-01-29 Thread via GitHub
jackye1995 commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1090123147 ## docs/java-api.md: ## @@ -147,6 +147,59 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData + +The java api can write data in

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-01-29 Thread via GitHub
jackye1995 commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1090124892 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestReplaceBranch.java: ## @@ -0,0 +1,267 @@ +/* + * Licensed to the Apache Software

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-01-29 Thread via GitHub
jackye1995 commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1090126239 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateOrReplaceBranchExec.scala: ## @@ -0,0 +1,89 @@ +/* + * Licensed to

[GitHub] [iceberg] lirui-apache commented on pull request #6175: Hive: Add UGI to the key in CachedClientPool

2023-01-29 Thread via GitHub
lirui-apache commented on PR #6175: URL: https://github.com/apache/iceberg/pull/6175#issuecomment-1407978754 @szehon-ho Sorry for the delayed reply. I'll work on a PR for that.

[GitHub] [iceberg] Fokko merged pull request #6688: Build: Bump pre-commit from 2.21.0 to 3.0.1 in /python

2023-01-29 Thread via GitHub
Fokko merged PR #6688: URL: https://github.com/apache/iceberg/pull/6688

[GitHub] [iceberg] ajantha-bhat opened a new pull request, #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-01-29 Thread via GitHub
ajantha-bhat opened a new pull request, #6695: URL: https://github.com/apache/iceberg/pull/6695 In case of a no-op (when the manifests are already optimized), rewrite manifests can still create one new file from the old file with the same contents, which is a waste of resources. Hence, handli

[GitHub] [iceberg] Fokko merged pull request #6689: Build: Bump fastavro from 1.7.0 to 1.7.1 in /python

2023-01-29 Thread via GitHub
Fokko merged PR #6689: URL: https://github.com/apache/iceberg/pull/6689

[GitHub] [iceberg] ajantha-bhat opened a new pull request, #6696: Build: Bump Arrow from 10.0.1 to 11.0.0

2023-01-29 Thread via GitHub
ajantha-bhat opened a new pull request, #6696: URL: https://github.com/apache/iceberg/pull/6696 release notes https://arrow.apache.org/release/11.0.0.html

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-01-29 Thread via GitHub
ajantha-bhat commented on code in PR #6695: URL: https://github.com/apache/iceberg/pull/6695#discussion_r1090180806 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteManifestsProcedure.java: ## @@ -78,6 +78,28 @@ public void testRewriteL

[GitHub] [iceberg] zhongyujiang commented on issue #6694: position delete manifest lower_bounds/upper_bounds not correct

2023-01-29 Thread via GitHub
zhongyujiang commented on issue #6694: URL: https://github.com/apache/iceberg/issues/6694#issuecomment-1408025601 This is because these metrics were truncated; Iceberg's default metrics mode for column metrics is `truncate(16)`. This should be fixed by #6313. I think it doesn't cause corre

[GitHub] [iceberg] ajantha-bhat commented on pull request #6686: Build: Bump spotless-plugin-gradle from 6.13.0 to 6.14.0

2023-01-29 Thread via GitHub
ajantha-bhat commented on PR #6686: URL: https://github.com/apache/iceberg/pull/6686#issuecomment-1408027848 > > Failed to apply plugin 'com.diffplug.spotless'. > Spotless requires JRE 11 or newer, this was 1.8. You can upgrade your build JRE and still compile for older targets,

[GitHub] [iceberg] Fokko merged pull request #6690: Build: Bump rich from 13.2.0 to 13.3.1 in /python

2023-01-29 Thread via GitHub
Fokko merged PR #6690: URL: https://github.com/apache/iceberg/pull/6690

[GitHub] [iceberg] Fokko merged pull request #6691: Build: Bump coverage from 7.0.5 to 7.1.0 in /python

2023-01-29 Thread via GitHub
Fokko merged PR #6691: URL: https://github.com/apache/iceberg/pull/6691

[GitHub] [iceberg] youngxinler commented on a diff in pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-01-29 Thread via GitHub
youngxinler commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1090226814 ## docs/java-api.md: ## @@ -147,6 +147,59 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData + +The java api can write data i

[GitHub] [iceberg] youngxinler commented on a diff in pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-01-29 Thread via GitHub
youngxinler commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1090226643 ## docs/java-api.md: ## @@ -147,6 +147,59 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData + +The java api can write data i

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-01-29 Thread via GitHub
ajantha-bhat commented on code in PR #6695: URL: https://github.com/apache/iceberg/pull/6695#discussion_r1090237690 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -341,6 +341,10 @@ public void testRewriteImportedManifest

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-01-29 Thread via GitHub
ajantha-bhat commented on code in PR #6695: URL: https://github.com/apache/iceberg/pull/6695#discussion_r1090238158 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -431,6 +435,8 @@ public void testRewriteManifestsWithPred

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-01-29 Thread via GitHub
ajantha-bhat commented on code in PR #6695: URL: https://github.com/apache/iceberg/pull/6695#discussion_r1090238362 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -440,21 +446,24 @@ public void testRewriteManifestsWithPr

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-01-29 Thread via GitHub
ajantha-bhat commented on code in PR #6695: URL: https://github.com/apache/iceberg/pull/6695#discussion_r1090238665 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -464,11 +473,16 @@ public void testRewriteManifestsWithPr

[GitHub] [iceberg] ajantha-bhat commented on pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-01-29 Thread via GitHub
ajantha-bhat commented on PR #6695: URL: https://github.com/apache/iceberg/pull/6695#issuecomment-1408094225 cc: @aokolnychyi, @RussellSpitzer, @szehon-ho -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

[GitHub] [iceberg] lirui-apache opened a new issue, #6697: Support pluggable ClientPool

2023-01-29 Thread via GitHub
lirui-apache opened a new issue, #6697: URL: https://github.com/apache/iceberg/issues/6697 ### Feature Request / Improvement Currently HiveCatalog uses CachedClientPool to manage HMS clients, and CachedClientPool uses HMS URI as the cache key. It can be useful to add more info to the

[GitHub] [iceberg] lirui-apache opened a new pull request, #6698: Core, Hive: Support pluggable ClientPool

2023-01-30 Thread via GitHub
lirui-apache opened a new pull request, #6698: URL: https://github.com/apache/iceberg/pull/6698 To address issue #6697. This PR allows users to specify custom client pools via a catalog property. Then catalogs can check this property and create the client pool accordingly. An `initialize

[GitHub] [iceberg] kingwind94 commented on issue #6694: position delete manifest lower_bounds/upper_bounds not correct

2023-01-30 Thread via GitHub
kingwind94 commented on issue #6694: URL: https://github.com/apache/iceberg/issues/6694#issuecomment-1408162400 > This is because these metrics were truncated; Iceberg's default metrics mode for column metrics is `truncate(16)`. This should be fixed by #6313. I think it doesn't cause correct

[GitHub] [iceberg] Neuw84 commented on issue #2221: Spark: Extend expire_snapshots procedure with an optional arg for snapshot ids

2023-01-30 Thread via GitHub
Neuw84 commented on issue #2221: URL: https://github.com/apache/iceberg/issues/2221#issuecomment-1408188556 Hi! I have a use case similar to the ones described (CDC). If we are using the latest version of Iceberg (1.1.0) it is safe then to delete specific snapshots using the Table

[GitHub] [iceberg] snazy opened a new pull request, #6699: OpenAPI responses should reference schemas

2023-01-30 Thread via GitHub
snazy opened a new pull request, #6699: URL: https://github.com/apache/iceberg/pull/6699 The common _OpenAPI Tools_ generators do not properly recognize non-200 responses and generate the necessary objects for the non-200 response types. This change moves the schema definitions from `respon

[GitHub] [iceberg] xuzhiwen1255 commented on pull request #6614: Flink:fix flink streaming query problem [ Cannot get a client from a closed pool]

2023-01-30 Thread via GitHub
xuzhiwen1255 commented on PR #6614: URL: https://github.com/apache/iceberg/pull/6614#issuecomment-1408298187 @stevenzwu @hililiwei I tried to change it; please review it for me. ``For the sink, it does not operate on the table, so I did not change its logic and only modified the logic of the sou

[GitHub] [iceberg] lirui-apache commented on pull request #6698: Core, Hive: Support pluggable ClientPool

2023-01-30 Thread via GitHub
lirui-apache commented on PR #6698: URL: https://github.com/apache/iceberg/pull/6698#issuecomment-1408391175 @szehon-ho @pvary @nastra @flyrain Could you please have a look? Thanks~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

[GitHub] [iceberg] zhongyujiang commented on issue #6694: position delete manifest lower_bounds/upper_bounds not correct

2023-01-30 Thread via GitHub
zhongyujiang commented on issue #6694: URL: https://github.com/apache/iceberg/issues/6694#issuecomment-1408449585 > But Flink's newly added position deletes should only apply to the newly added data files, not historical (rewritten) data files, so this should not hinder the rewrite operation if it

[GitHub] [iceberg] hililiwei commented on pull request #6660: Flink: Support writes to branches in FlinkSink

2023-01-30 Thread via GitHub
hililiwei commented on PR #6660: URL: https://github.com/apache/iceberg/pull/6660#issuecomment-1408484697 > > Thanks for the PR @amogh-jahagirdar! Left one small question. > > Also, do we have this feature in FlinkSource? > > Thanks for the review @pvary ! Not yet, but I'm working o

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-01-30 Thread via GitHub
hililiwei commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1090530508 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateOrReplaceBranchExec.scala: ## @@ -0,0 +1,89 @@ +/* + * Licensed to t

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-01-30 Thread via GitHub
hililiwei commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1090533942 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCreateBranch.java: ## @@ -129,6 +129,24 @@ public void testCreateBranchUseCustomM

[GitHub] [iceberg] Fokko merged pull request #6692: Build: Bump pyarrow from 10.0.1 to 11.0.0 in /python

2023-01-30 Thread via GitHub
Fokko merged PR #6692: URL: https://github.com/apache/iceberg/pull/6692 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] Fokko closed issue #6675: [Nessie] Nessie should store his table in the namespace location if he can

2023-01-30 Thread via GitHub
Fokko closed issue #6675: [Nessie] Nessie should store his table in the namespace location if he can URL: https://github.com/apache/iceberg/issues/6675 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [iceberg] Fokko merged pull request #6676: Nessie : use default namespace location if exists

2023-01-30 Thread via GitHub
Fokko merged PR #6676: URL: https://github.com/apache/iceberg/pull/6676 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] Fokko merged pull request #6678: Nessie: Update the outdated javadoc

2023-01-30 Thread via GitHub
Fokko merged PR #6678: URL: https://github.com/apache/iceberg/pull/6678 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] Fokko merged pull request #6656: Nessie: Avoid usage of deprecated APIs in test

2023-01-30 Thread via GitHub
Fokko merged PR #6656: URL: https://github.com/apache/iceberg/pull/6656 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] Fokko opened a new pull request, #6702: Python: Let PyTest know that it shouldn't collect TestType

2023-01-30 Thread via GitHub
Fokko opened a new pull request, #6702: URL: https://github.com/apache/iceberg/pull/6702 Because TestType contains test in the name, pytest tries to collect it and run tests on it. Setting `__test__ = False` tells PyTest to ignore it -- This is an automated message from the Apache Git Ser
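
The `__test__ = False` approach described in #6702 can be illustrated with a minimal sketch. This is not the code from the PR: the class body, the `name` attribute, and the `test_helper_is_usable` function are hypothetical and only show how a helper class whose name starts with `Test` can opt out of pytest collection.

```python
# Hypothetical helper class: its name starts with "Test", so pytest would
# normally try to collect it as a test class.
class TestType:
    # Opt out of collection; pytest skips objects whose __test__ is False.
    __test__ = False

    def __init__(self, name: str) -> None:
        # Illustrative attribute, not taken from the Iceberg code base.
        self.name = name


def test_helper_is_usable() -> None:
    # An ordinary test function that still uses the helper class.
    assert TestType("int").name == "int"
```

Without the attribute, pytest typically reports a collection warning for a `Test*` class that defines `__init__`; with it, the class is skipped quietly while the test function is still collected.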

[GitHub] [iceberg] Fokko opened a new pull request, #6703: Python: Fix warnings from PyLint

2023-01-30 Thread via GitHub
Fokko opened a new pull request, #6703: URL: https://github.com/apache/iceberg/pull/6703 Because TestType contains test in the name, pytest tries to collect it and run tests on it. Setting `__test__ = False` tells PyTest to ignore it Also, s3 emits an error because we clean up buckets

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6703: Python: Fix warnings from PyLint

2023-01-30 Thread via GitHub
ajantha-bhat commented on code in PR #6703: URL: https://github.com/apache/iceberg/pull/6703#discussion_r1090772461 ## python/pyproject.toml: ## @@ -109,6 +109,10 @@ markers = [ "s3: marks a test as requiring access to s3 compliant storage (use with --aws-access-key-id, --

[GitHub] [iceberg] Fokko commented on a diff in pull request #6703: Python: Fix warnings from PyLint

2023-01-30 Thread via GitHub
Fokko commented on code in PR #6703: URL: https://github.com/apache/iceberg/pull/6703#discussion_r1090784169 ## python/pyproject.toml: ## @@ -109,6 +109,10 @@ markers = [ "s3: marks a test as requiring access to s3 compliant storage (use with --aws-access-key-id, --aws-sec
