[GitHub] [iceberg] dramaticlly commented on a diff in pull request #6372: [Python] Fix incorrect description when set a property

2022-12-07 Thread GitBox
dramaticlly commented on code in PR #6372: URL: https://github.com/apache/iceberg/pull/6372#discussion_r1041878402 ## python/pyiceberg/cli/console.py: ## @@ -286,7 +286,7 @@ def get_table(ctx: Context, identifier: str, property_name: str): @properties.group() def set(): -

[GitHub] [iceberg] 245831311 commented on issue #4550: the snapshot file is lost when write iceberg using flink Failed to open input stream for file File does not exist

2022-12-07 Thread GitBox
245831311 commented on issue #4550: URL: https://github.com/apache/iceberg/issues/4550#issuecomment-1340578855 > I have solved this problem. Thank you. My problem mainly occurs when InMemoryLockManager releases the heartbeat of the lock and reports a NullPointerException; I rewrote InMemory

[GitHub] [iceberg] rajarshisarkar opened a new pull request, #6374: Docs: Remove backticks from Spark procedure headings

2022-12-07 Thread GitBox
rajarshisarkar opened a new pull request, #6374: URL: https://github.com/apache/iceberg/pull/6374 This PR removes backticks from the Spark procedure headings. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abo

[GitHub] [iceberg] JMin824 closed issue #6373: Question about usage of RewriteFile with Zorder Strategy

2022-12-07 Thread GitBox
JMin824 closed issue #6373: Question about usage of RewriteFile with Zorder Strategy URL: https://github.com/apache/iceberg/issues/6373 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

[GitHub] [iceberg] JMin824 commented on issue #6373: Question about usage of RewriteFile with Zorder Strategy

2022-12-07 Thread GitBox
JMin824 commented on issue #6373: URL: https://github.com/apache/iceberg/issues/6373#issuecomment-1340607468 > Rewrite all rewrites all files, this means reading all data of the files, ordering them, then writing out new ordered files. If no predicates are selected this would co

[GitHub] [iceberg] ggershinsky commented on a diff in pull request #3231: GCM encryption stream

2022-12-07 Thread GitBox
ggershinsky commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1041939078 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputFile.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

[GitHub] [iceberg] ggershinsky commented on a diff in pull request #3231: GCM encryption stream

2022-12-07 Thread GitBox
ggershinsky commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1041943081 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [iceberg] ggershinsky commented on a diff in pull request #3231: GCM encryption stream

2022-12-07 Thread GitBox
ggershinsky commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1041943937 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [iceberg] ggershinsky commented on a diff in pull request #3231: GCM encryption stream

2022-12-07 Thread GitBox
ggershinsky commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1041945463 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [iceberg] ggershinsky commented on a diff in pull request #3231: GCM encryption stream

2022-12-07 Thread GitBox
ggershinsky commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1041949962 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [iceberg] zstraw commented on issue #4550: the snapshot file is lost when write iceberg using flink Failed to open input stream for file File does not exist

2022-12-07 Thread GitBox
zstraw commented on issue #4550: URL: https://github.com/apache/iceberg/issues/4550#issuecomment-1340655807 > I have solved this problem. Thank you. My problem mainly occurs when InMemoryLockManager releases the heartbeat of the lock and reports a NullPointerException; I rewrote InMemoryLoc

[GitHub] [iceberg] pavibhai commented on pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on PR #6293: URL: https://github.com/apache/iceberg/pull/6293#issuecomment-1340672765 > I'm not a big fan of the fake filesystem approach here, mostly because i'm afraid of mocking an object like that when we don't have the full filesystem state. I feel like this patch wo

[GitHub] [iceberg] gaborkaszab commented on pull request #6369: Increase Partition Start Id to 10000

2022-12-07 Thread GitBox
gaborkaszab commented on PR #6369: URL: https://github.com/apache/iceberg/pull/6369#issuecomment-1340673537 This seems a reasonable change for me. Just a question for my better understanding: The tables that we have already written will still have their partition field IDs from 1000, right?

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1041988137 ## orc/src/main/java/org/apache/iceberg/orc/ORC.java: ## @@ -789,7 +808,210 @@ static Reader newFileReader(InputFile file, Configuration config) { ReaderOptions

[GitHub] [iceberg] ayushtkn commented on pull request #6369: Increase Partition Start Id to 10000

2022-12-07 Thread GitBox
ayushtkn commented on PR #6369: URL: https://github.com/apache/iceberg/pull/6369#issuecomment-1340677494 >written prior to this change will still have the collision with the partition field IDs and will only be fixed if they are, or at least their metadata is rewritten, right? yep -

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1041995292 ## orc/src/main/java/org/apache/iceberg/orc/ORC.java: ## @@ -789,7 +808,210 @@ static Reader newFileReader(InputFile file, Configuration config) { ReaderOptions

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1041996933 ## orc/src/main/java/org/apache/iceberg/orc/ORC.java: ## @@ -789,7 +808,210 @@ static Reader newFileReader(InputFile file, Configuration config) { ReaderOptions

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1042005551 ## orc/src/main/java/org/apache/iceberg/orc/ORC.java: ## @@ -789,7 +808,210 @@ static Reader newFileReader(InputFile file, Configuration config) { ReaderOptions

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1042007374 ## orc/src/main/java/org/apache/iceberg/orc/ORC.java: ## @@ -789,7 +808,210 @@ static Reader newFileReader(InputFile file, Configuration config) { ReaderOptions

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1042007715 ## orc/src/main/java/org/apache/iceberg/orc/ORC.java: ## @@ -789,7 +808,210 @@ static Reader newFileReader(InputFile file, Configuration config) { ReaderOptions

[GitHub] [iceberg] gaborkaszab commented on pull request #6369: Increase Partition Start Id to 10000

2022-12-07 Thread GitBox
gaborkaszab commented on PR #6369: URL: https://github.com/apache/iceberg/pull/6369#issuecomment-1340696570 Thanks for the answer, @ayushtkn! I wonder if it would make sense to make the already written tables work as expected even with more than 1000 cols. E.g. when reading their metadata

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-07 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1042013519 ## core/src/main/java/org/apache/iceberg/BaseFilesTable.java: ## @@ -223,34 +225,28 @@ ManifestFile manifest() { static class ContentFileStructWithMetrics implemen

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-07 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1042019390 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/FlinkSchemaUtil.java: ## @@ -104,11 +105,38 @@ public static Schema convert(Schema baseSchema, TableSchem

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1042021455 ## orc/src/main/java/org/apache/iceberg/orc/ORC.java: ## @@ -789,7 +808,210 @@ static Reader newFileReader(InputFile file, Configuration config) { ReaderOptions

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-07 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1042027502 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalog.java: ## @@ -140,15 +142,25 @@ public Catalog catalog() { return icebergCatalog; }

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1042038382 ## orc/src/main/java/org/apache/iceberg/orc/ORC.java: ## @@ -789,7 +808,210 @@ static Reader newFileReader(InputFile file, Configuration config) { ReaderOptions

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1042041584 ## orc/src/main/java/org/apache/iceberg/orc/OrcFileAppender.java: ## @@ -88,8 +86,7 @@ options.fileSystem(((HadoopOutputFile) file).getFileSystem()); }

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1042042743 ## orc/src/test/java/org/apache/iceberg/orc/TestOrcDataWriter.java: ## @@ -126,4 +135,116 @@ public void testDataWriter() throws IOException { Assert.assertEqua

[GitHub] [iceberg] pavibhai commented on a diff in pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on code in PR #6293: URL: https://github.com/apache/iceberg/pull/6293#discussion_r1042049334 ## orc/src/test/java/org/apache/iceberg/orc/TestOrcDataWriter.java: ## @@ -126,4 +135,116 @@ public void testDataWriter() throws IOException { Assert.assertEqua

[GitHub] [iceberg] pavibhai commented on pull request #6293: Added FileIO Support for ORC Reader and Writers

2022-12-07 Thread GitBox
pavibhai commented on PR #6293: URL: https://github.com/apache/iceberg/pull/6293#issuecomment-1340790968 > Looks mostly good overall. Thanks for getting this working @pavibhai! Thanks @rdblue for your comments. I have addressed the comments, there are a few comments where I gave addit

[GitHub] [iceberg] nastra commented on issue #6366: Spark Sql update data failure

2022-12-07 Thread GitBox
nastra commented on issue #6366: URL: https://github.com/apache/iceberg/issues/6366#issuecomment-1340870927 @gnikgnaw did you have a chance looking at my last comment? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

[GitHub] [iceberg] gnikgnaw commented on issue #6366: Spark Sql update data failure

2022-12-07 Thread GitBox
gnikgnaw commented on issue #6366: URL: https://github.com/apache/iceberg/issues/6366#issuecomment-1340886112 > @gnikgnaw did you have a chance looking at my last comment? hi @nastra Thank you very much for your help, after modifying the spark version, my problem is solved -- This

[GitHub] [iceberg] gnikgnaw closed issue #6366: Spark Sql update data failure

2022-12-07 Thread GitBox
gnikgnaw closed issue #6366: Spark Sql update data failure URL: https://github.com/apache/iceberg/issues/6366 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-

[GitHub] [iceberg] ajantha-bhat opened a new issue, #6375: Consider delete manifests for rewrite manifests

2022-12-07 Thread GitBox
ajantha-bhat opened a new issue, #6375: URL: https://github.com/apache/iceberg/issues/6375 ### Feature Request / Improvement As per the code it looks like we are just considering data manifests for the rewrite. Should we also, support delete manifests to be rewritten into a bigger de

[GitHub] [iceberg] rajarshisarkar opened a new pull request, #6376: Docs: Add register table Spark procedure documentation

2022-12-07 Thread GitBox
rajarshisarkar opened a new pull request, #6376: URL: https://github.com/apache/iceberg/pull/6376 Add documentation for https://github.com/apache/iceberg/pull/4810 --- cc: @RussellSpitzer -- This is an automated message from the Apache Git Service. To respond to the message, plea

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6376: Docs: Add register table Spark procedure documentation

2022-12-07 Thread GitBox
ajantha-bhat commented on code in PR #6376: URL: https://github.com/apache/iceberg/pull/6376#discussion_r1042165921 ## docs/spark-procedures.md: ## @@ -493,6 +493,35 @@ CALL spark_catalog.system.add_files( ) ``` +### `register_table` + +Creates a catalog entry for a metadata

[GitHub] [iceberg] rajarshisarkar commented on a diff in pull request #6376: Docs: Add register table Spark procedure documentation

2022-12-07 Thread GitBox
rajarshisarkar commented on code in PR #6376: URL: https://github.com/apache/iceberg/pull/6376#discussion_r1042170977 ## docs/spark-procedures.md: ## @@ -493,6 +493,35 @@ CALL spark_catalog.system.add_files( ) ``` +### `register_table` + +Creates a catalog entry for a metada

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6376: Docs: Add register table Spark procedure documentation

2022-12-07 Thread GitBox
ajantha-bhat commented on code in PR #6376: URL: https://github.com/apache/iceberg/pull/6376#discussion_r1042177406 ## docs/spark-procedures.md: ## @@ -493,6 +493,37 @@ CALL spark_catalog.system.add_files( ) ``` +### `register_table` + +Creates a catalog entry for a metadata

[GitHub] [iceberg] pvary commented on issue #6370: What is the purpose of Hive Lock ?

2022-12-07 Thread GitBox
pvary commented on issue #6370: URL: https://github.com/apache/iceberg/issues/6370#issuecomment-1341067753 @dmgcodevil: The purpose of the Hive Lock is to make sure that there are no concurrent changes to the table. Specifically that there is no concurrent Iceberg commit. In theory t

[GitHub] [iceberg] RussellSpitzer merged pull request #6360: Docs: Update Zorder spark support versions.

2022-12-07 Thread GitBox
RussellSpitzer merged PR #6360: URL: https://github.com/apache/iceberg/pull/6360 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceb

[GitHub] [iceberg] RussellSpitzer commented on pull request #6360: Docs: Update Zorder spark support versions.

2022-12-07 Thread GitBox
RussellSpitzer commented on PR #6360: URL: https://github.com/apache/iceberg/pull/6360#issuecomment-134581 Thanks @ajantha-bhat , looking much clearer now! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6376: Docs: Add register table Spark procedure documentation

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6376: URL: https://github.com/apache/iceberg/pull/6376#discussion_r1042333590 ## docs/spark-procedures.md: ## @@ -493,6 +493,37 @@ CALL spark_catalog.system.add_files( ) ``` +### `register_table` + +Creates a catalog entry for a metada

[GitHub] [iceberg] RussellSpitzer merged pull request #6374: Docs: Remove backticks from Spark procedure headings

2022-12-07 Thread GitBox
RussellSpitzer merged PR #6374: URL: https://github.com/apache/iceberg/pull/6374 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceb

[GitHub] [iceberg] RussellSpitzer commented on pull request #6374: Docs: Remove backticks from Spark procedure headings

2022-12-07 Thread GitBox
RussellSpitzer commented on PR #6374: URL: https://github.com/apache/iceberg/pull/6374#issuecomment-1341121146 Looks good to me, Thanks for the cleanup! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042346944 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java: ## @@ -255,74 +256,90 @@ public static org.apache.iceberg.Table toIcebergTable(Table

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6376: Docs: Add register table Spark procedure documentation

2022-12-07 Thread GitBox
ajantha-bhat commented on code in PR #6376: URL: https://github.com/apache/iceberg/pull/6376#discussion_r1042354666 ## docs/spark-procedures.md: ## @@ -493,6 +493,37 @@ CALL spark_catalog.system.add_files( ) ``` +### `register_table` + +Creates a catalog entry for a metadata

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042357442 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -42,4 +42,9 @@ private SparkSQLProperties() {} // Controls whether t

[GitHub] [iceberg] RussellSpitzer commented on pull request #6369: Increase Partition Start Id to 10000

2022-12-07 Thread GitBox
RussellSpitzer commented on PR #6369: URL: https://github.com/apache/iceberg/pull/6369#issuecomment-1341155158 @gaborkaszab I would probably just recommend dropping and recreating the table (via metadata) or having a separate utility for modifying existing tables. I really don't think man

[GitHub] [iceberg] Fokko commented on a diff in pull request #6372: [Python] Fix incorrect description when set a property

2022-12-07 Thread GitBox
Fokko commented on code in PR #6372: URL: https://github.com/apache/iceberg/pull/6372#discussion_r1042388262 ## python/pyiceberg/cli/console.py: ## @@ -103,7 +103,7 @@ def list(ctx: Context, parent: Optional[str]): # pylint: disable=redefined-buil @click.pass_context @catch_

[GitHub] [iceberg] Fokko merged pull request #6372: [Python] Fix incorrect description when set a property

2022-12-07 Thread GitBox
Fokko merged PR #6372: URL: https://github.com/apache/iceberg/pull/6372 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042394528 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkPartitioningAwareScan.java: ## @@ -0,0 +1,244 @@ +/* + * Licensed to the Apache Software F

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042402581 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042413822 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042415619 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042433535 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042444111 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -215,11 +225,12 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) {

[GitHub] [iceberg] islamismailov commented on pull request #6268: Allow dropping a column used by an old but not current partition spec

2022-12-07 Thread GitBox
islamismailov commented on PR #6268: URL: https://github.com/apache/iceberg/pull/6268#issuecomment-1341261730 (i did address Ryan's feedback) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

[GitHub] [iceberg] Fokko merged pull request #6268: Allow dropping a column used by an old but not current partition spec

2022-12-07 Thread GitBox
Fokko merged PR #6268: URL: https://github.com/apache/iceberg/pull/6268 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042453631 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -42,4 +42,9 @@ private SparkSQLProperties() {} // Controls whether t

[GitHub] [iceberg] islamismailov commented on a diff in pull request #6353: Make sure S3 stream opened by ReadConf ctor is closed

2022-12-07 Thread GitBox
islamismailov commented on code in PR #6353: URL: https://github.com/apache/iceberg/pull/6353#discussion_r1042460459 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetReader.java: ## @@ -79,9 +83,11 @@ private ReadConf init() { nameMapping,

[GitHub] [iceberg] pvary commented on issue #2301: Lock remains in HMS if HiveTableOperations gets killed (direct process shutdown - no signals) after lock is acquired

2022-12-07 Thread GitBox
pvary commented on issue #2301: URL: https://github.com/apache/iceberg/issues/2301#issuecomment-1341362767 > We faced similar issues: > > 1. We are using Flink and some processes attempt to modify a table concurrently, e.g.: ingestion process and data compaction process. If one of th

[GitHub] [iceberg] autumnust commented on a diff in pull request #6327: ORC: Fix error when projecting nested identity partition column

2022-12-07 Thread GitBox
autumnust commented on code in PR #6327: URL: https://github.com/apache/iceberg/pull/6327#discussion_r1042526245 ## orc/src/main/java/org/apache/iceberg/orc/ORCSchemaUtil.java: ## @@ -442,4 +445,23 @@ static TypeDescription applyNameMapping(TypeDescription orcSchema, NameMappin

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6296: Spark-3.3: Use table sort order with sort strategy when user has not specified

2022-12-07 Thread GitBox
ajantha-bhat commented on code in PR #6296: URL: https://github.com/apache/iceberg/pull/6296#discussion_r1042526654 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java: ## @@ -356,6 +356,16 @@ public void testRewrit

[GitHub] [iceberg] pvary commented on issue #6370: What is the purpose of Hive Lock ?

2022-12-07 Thread GitBox
pvary commented on issue #6370: URL: https://github.com/apache/iceberg/issues/6370#issuecomment-1341372808 @InvisibleProgrammer: What do the Hive guys think about this? Would they be interested in adding this feature to HMS? -- This is an automated message from the Apache Git Service. To

[GitHub] [iceberg] tomtongue commented on pull request #6352: AWS: Fix inconsistent behavior of naming S3 location between read and write operations by allowing only s3 bucket name

2022-12-07 Thread GitBox
tomtongue commented on PR #6352: URL: https://github.com/apache/iceberg/pull/6352#issuecomment-1341372839 Thanks for reviewing this PR, Amogh! (Sorry for delaying my response) I'll check your comments and get back tomorrow. -- This is an automated message from the Apache Git Service. To

[GitHub] [iceberg] TuroczyX commented on issue #6370: What is the purpose of Hive Lock ?

2022-12-07 Thread GitBox
TuroczyX commented on issue #6370: URL: https://github.com/apache/iceberg/issues/6370#issuecomment-1341382512 It is definitely something that we need to consider. We will talk about it on our next meeting. -- This is an automated message from the Apache Git Service. To respond to the mes

[GitHub] [iceberg] TuroczyX commented on issue #6368: Delete/Update fails for tables with more than 1000 columns

2022-12-07 Thread GitBox
TuroczyX commented on issue #6368: URL: https://github.com/apache/iceberg/issues/6368#issuecomment-1341384585 @ayushtkn This is settable from hive? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [iceberg] TuroczyX commented on issue #6347: [Docs]: improve ChangeLog

2022-12-07 Thread GitBox
TuroczyX commented on issue #6347: URL: https://github.com/apache/iceberg/issues/6347#issuecomment-1341391351 @code-magician323 Thanks for your feedback. @InvisibleProgrammer Could you please take care on it for the next? -- This is an automated message from the Apache Git Service. To re

[GitHub] [iceberg] TuroczyX commented on issue #6249: Update Iceberg Hive documentation

2022-12-07 Thread GitBox
TuroczyX commented on issue #6249: URL: https://github.com/apache/iceberg/issues/6249#issuecomment-1341395246 Nice :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [iceberg] TuroczyX commented on issue #6347: [Docs]: improve ChangeLog

2022-12-07 Thread GitBox
TuroczyX commented on issue #6347: URL: https://github.com/apache/iceberg/issues/6347#issuecomment-1341395880 @code-magician323 Something like this? https://github.com/apache/iceberg/issues/6249 -- This is an automated message from the Apache Git Service. To respond to the message, pleas

[GitHub] [iceberg] sunchao commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
sunchao commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042549493 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkPartitioningAwareScan.java: ## @@ -0,0 +1,244 @@ +/* + * Licensed to the Apache Software Foundati

[GitHub] [iceberg] InvisibleProgrammer commented on issue #6347: [Docs]: improve ChangeLog

2022-12-07 Thread GitBox
InvisibleProgrammer commented on issue #6347: URL: https://github.com/apache/iceberg/issues/6347#issuecomment-1341398027 Yes, I can. @Fokko , what do you think about grouping the changelog? And maybe, it would be worth creating some kind of template to the changelogs to create a

[GitHub] [iceberg] sunchao commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
sunchao commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042551459 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [iceberg] sunchao commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
sunchao commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042555728 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2022-12-07 Thread GitBox
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1042557855 ## core/src/main/java/org/apache/iceberg/BaseMetadataTable.java: ## @@ -64,9 +64,12 @@ protected BaseMetadataTable(TableOperations ops, Table table, String name) {

[GitHub] [iceberg] stevenzwu opened a new pull request, #6377: Flink: add util class to generate test data with extensive coverage d…

2022-12-07 Thread GitBox
stevenzwu opened a new pull request, #6377: URL: https://github.com/apache/iceberg/pull/6377 …ifferent field types: from primitives to complex nested types -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2022-12-07 Thread GitBox
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1042560130 ## core/src/main/java/org/apache/iceberg/SerializableTable.java: ## @@ -361,6 +363,27 @@ private String errorMsg(String operation) { return String.format("Operat

[GitHub] [iceberg] stevenzwu commented on pull request #6377: Flink: add util class to generate test data with extensive coverage d…

2022-12-07 Thread GitBox
stevenzwu commented on PR #6377: URL: https://github.com/apache/iceberg/pull/6377#issuecomment-1341416995 We had this util class internally for testing the Avro GenericRecord to Flink RowData converter. It can be useful for writing unit tests for the `StructRowData` class from PR #6222. -- Th

[GitHub] [iceberg] stevenzwu commented on pull request #6377: Flink: add util class to generate test data with extensive coverage d…

2022-12-07 Thread GitBox
stevenzwu commented on PR #6377: URL: https://github.com/apache/iceberg/pull/6377#issuecomment-1341418873 Example usage ``` public class TestRowDataToAvroGenericRecordConverter { protected void testConverter(DataGenerator dataGenerator) throws Exception { RowDataTo

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-07 Thread GitBox
szehon-ho commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1042568978 ## core/src/main/java/org/apache/iceberg/BaseFilesTable.java: ## @@ -223,34 +225,28 @@ ManifestFile manifest() { static class ContentFileStructWithMetrics implemen

[GitHub] [iceberg] RussellSpitzer opened a new pull request, #6378: Spark: Extend Timeout During Partial Progress Rewrites

2022-12-07 Thread GitBox
RussellSpitzer opened a new pull request, #6378: URL: https://github.com/apache/iceberg/pull/6378 In order to avoid timing out when writing large manifest files, we increase the timeout allowed for the commit phase of partial progress based on the number of commits left to perform. T

[GitHub] [iceberg] RussellSpitzer commented on issue #6367: Partial Progress Compaction can Timeout on Very Large Manfiest Commits

2022-12-07 Thread GitBox
RussellSpitzer commented on issue #6367: URL: https://github.com/apache/iceberg/issues/6367#issuecomment-1341500255 Filed a quick PR to just extend the timeout -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6378: Spark: Extend Timeout During Partial Progress Rewrites

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6378: URL: https://github.com/apache/iceberg/pull/6378#discussion_r1042612662 ## core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java: ## @@ -225,25 +225,40 @@ public void close() { LOG.info("Closing comm

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6378: Spark: Extend Timeout During Partial Progress Rewrites

2022-12-07 Thread GitBox
RussellSpitzer commented on code in PR #6378: URL: https://github.com/apache/iceberg/pull/6378#discussion_r1042613071 ## core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java: ## @@ -225,25 +225,40 @@ public void close() { LOG.info("Closing comm

[GitHub] [iceberg] Fokko commented on a diff in pull request #6348: Python: Update license-checker

2022-12-07 Thread GitBox
Fokko commented on code in PR #6348: URL: https://github.com/apache/iceberg/pull/6348#discussion_r1042617131 ## python/dev/.rat-excludes: ## @@ -0,0 +1,2 @@ +.rat-excludes Review Comment: I'd rather keep the two projects isolated so we have the possibility to split pyiceber

[GitHub] [iceberg] gaborkaszab commented on issue #6368: Delete/Update fails for tables with more than 1000 columns

2022-12-07 Thread GitBox
gaborkaszab commented on issue #6368: URL: https://github.com/apache/iceberg/issues/6368#issuecomment-1341513400 @TuroczyX The agreement here is that there is no need to make this configurable and hardcoding to 10k is enough. See PR: https://github.com/apache/iceberg/pull/6369 -- Thi

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
aokolnychyi commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042622671 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -42,4 +42,9 @@ private SparkSQLProperties() {} // Controls whether to c

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
aokolnychyi commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042627234 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java: ## @@ -255,74 +256,90 @@ public static org.apache.iceberg.Table toIcebergTable(Table ta

[GitHub] [iceberg] shardulm94 commented on a diff in pull request #6327: ORC: Fix error when projecting nested indentity partition column

2022-12-07 Thread GitBox
shardulm94 commented on code in PR #6327: URL: https://github.com/apache/iceberg/pull/6327#discussion_r1042628482 ## orc/src/main/java/org/apache/iceberg/orc/ORCSchemaUtil.java: ## @@ -442,4 +445,23 @@ static TypeDescription applyNameMapping(TypeDescription orcSchema, NameMappi

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
aokolnychyi commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042631790 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -42,4 +42,9 @@ private SparkSQLProperties() {} // Controls whether to c

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
aokolnychyi commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042632953 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkPartitioningAwareScan.java: ## @@ -0,0 +1,244 @@ +/* + * Licensed to the Apache Software Foun

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
aokolnychyi commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042635649 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
aokolnychyi commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042643130 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
aokolnychyi commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042644217 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -215,11 +225,12 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { * t

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
aokolnychyi commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042644545 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

2022-12-07 Thread GitBox
aokolnychyi commented on code in PR #6371: URL: https://github.com/apache/iceberg/pull/6371#discussion_r1042644906 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache Software Founda
