Re: [PR] Spark: Ensure that partition stats files are considered for GC procedures [iceberg]

2024-01-02 Thread via GitHub
ajantha-bhat commented on PR #9284: URL: https://github.com/apache/iceberg/pull/9284#issuecomment-1874869959 ping @aokolnychyi -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [PR] Core: Suppress exceptions in case of dropTableData [iceberg]

2024-01-02 Thread via GitHub
nk1506 commented on PR #9184: URL: https://github.com/apache/iceberg/pull/9184#issuecomment-1874868435 @Fokko, a gentle reminder for the same.

Re: [PR] Spark: Request distribution and ordering for writes [iceberg]

2024-01-02 Thread via GitHub
maytasm commented on PR #3461: URL: https://github.com/apache/iceberg/pull/3461#issuecomment-1874846377 If in Spark 3.3, users no longer have to explicitly sort their data before INSERT into a partitioned table, then should we insert global sort into the plan rather than a local sort? http

[PR] Spark 3.5: Set log level to WARN for rewrite task failure with partial progress [iceberg]

2024-01-02 Thread via GitHub
manuzhang opened a new pull request, #9400: URL: https://github.com/apache/iceberg/pull/9400 (no comment)

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2024-01-02 Thread via GitHub
szehon-ho commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1434014371 ## data/src/main/java/org/apache/iceberg/data/BaseDeleteLoader.java: ## @@ -0,0 +1,260 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] Core, Spark: Correct the delete record count for PartitionTable [iceberg]

2024-01-02 Thread via GitHub
ConeyLiu commented on PR #9389: URL: https://github.com/apache/iceberg/pull/9389#issuecomment-1874773959 Thanks @singhpk234 @RussellSpitzer @dramaticlly

Re: [I] how to integrations object storage ceph ? [iceberg]

2024-01-02 Thread via GitHub
hchautrung commented on issue #7158: URL: https://github.com/apache/iceberg/issues/7158#issuecomment-1874771632 Hi @jkl0898, I am looking for a solution to use Ceph with Iceberg. Currently I use MinIO, but we are looking for an alternative solution to replace MinIO. Could you share tech

Re: [PR] Spark 3.5: Support filtering with buckets in RewriteDataFilesProcedure [iceberg]

2024-01-02 Thread via GitHub
manuzhang commented on PR #9396: URL: https://github.com/apache/iceberg/pull/9396#issuecomment-1874770664 @RussellSpitzer could you please give an example of passing a partition (esp. a bucket) via a filter expression?

Re: [PR] AWS: Add S3 Access Grants Integration [iceberg]

2024-01-02 Thread via GitHub
adnanhemani commented on code in PR #9385: URL: https://github.com/apache/iceberg/pull/9385#discussion_r1440008488 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -749,4 +795,23 @@ public void applyEndpointConfigurations(T builder) { builde

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2024-01-02 Thread via GitHub
szehon-ho commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1433435467 ## core/src/main/java/org/apache/iceberg/deletes/PositionDeleteIndex.java: ## @@ -44,4 +44,14 @@ public interface PositionDeleteIndex { /** Returns true if this

Re: [PR] Core: Close the MetricsReporter when the Catalog is closed. [iceberg]

2024-01-02 Thread via GitHub
huyuanfeng2018 commented on code in PR #9353: URL: https://github.com/apache/iceberg/pull/9353#discussion_r1440003882 ## aws/src/main/java/org/apache/iceberg/aws/dynamodb/DynamoDbCatalog.java: ## @@ -487,6 +486,7 @@ public Configuration getConf() { @Override public void

Re: [PR] Deliver key metadata for encryption of data files [iceberg]

2024-01-02 Thread via GitHub
rdblue commented on PR #9359: URL: https://github.com/apache/iceberg/pull/9359#issuecomment-187478 @ggershinsky, I'm still working through this review, so don't feel like you need to address or respond to my comments yet! Also, here are a few notes for myself when I pick this up tomorro

Re: [PR] Core: Close the MetricsReporter when the Catalog is closed. [iceberg]

2024-01-02 Thread via GitHub
dramaticlly commented on code in PR #9353: URL: https://github.com/apache/iceberg/pull/9353#discussion_r1439989287 ## aws/src/main/java/org/apache/iceberg/aws/dynamodb/DynamoDbCatalog.java: ## @@ -487,6 +486,7 @@ public Configuration getConf() { @Override public void clo

Re: [PR] Deliver key metadata for encryption of data files [iceberg]

2024-01-02 Thread via GitHub
rdblue commented on code in PR #9359: URL: https://github.com/apache/iceberg/pull/9359#discussion_r1439988090 ## core/src/main/java/org/apache/iceberg/encryption/StandardEncryptionManager.java: ## @@ -67,7 +72,15 @@ public InputFile decrypt(EncryptedInputFile encrypted) { @Ov

Re: [PR] Deliver key metadata for encryption of data files [iceberg]

2024-01-02 Thread via GitHub
rdblue commented on code in PR #9359: URL: https://github.com/apache/iceberg/pull/9359#discussion_r1439979355 ## core/src/main/java/org/apache/iceberg/encryption/StandardEncryptionManager.java: ## @@ -41,7 +42,10 @@ public class StandardEncryptionManager implements EncryptionMa

Re: [PR] AWS: Add S3 Access Grants Integration [iceberg]

2024-01-02 Thread via GitHub
jackye1995 commented on code in PR #9385: URL: https://github.com/apache/iceberg/pull/9385#discussion_r1439983680 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -749,4 +795,23 @@ public void applyEndpointConfigurations(T builder) { builder

Re: [PR] Deliver key metadata for encryption of data files [iceberg]

2024-01-02 Thread via GitHub
rdblue commented on code in PR #9359: URL: https://github.com/apache/iceberg/pull/9359#discussion_r1439977630 ## core/src/main/java/org/apache/iceberg/encryption/StandardEncryptionManager.java: ## @@ -67,7 +72,15 @@ public InputFile decrypt(EncryptedInputFile encrypted) { @Ov

Re: [PR] Deliver key metadata for encryption of data files [iceberg]

2024-01-02 Thread via GitHub
rdblue commented on code in PR #9359: URL: https://github.com/apache/iceberg/pull/9359#discussion_r1439975347 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SparkAppenderFactory.java: ## @@ -161,7 +162,12 @@ private StructType lazyPosDeleteSparkType() { }

Re: [I] Renaming a table may conflict with the new table with old table name [iceberg]

2024-01-02 Thread via GitHub
github-actions[bot] closed issue #6890: Renaming a table may conflict with the new table with old table name URL: https://github.com/apache/iceberg/issues/6890

Re: [I] Rewrite manifest action can only split large manifest file into two manifests,instead of expected target size [iceberg]

2024-01-02 Thread via GitHub
github-actions[bot] commented on issue #6891: URL: https://github.com/apache/iceberg/issues/6891#issuecomment-1874709553 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Renaming a table may conflict with the new table with old table name [iceberg]

2024-01-02 Thread via GitHub
github-actions[bot] commented on issue #6890: URL: https://github.com/apache/iceberg/issues/6890#issuecomment-1874709577 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Rewrite manifest action can only split large manifest file into two manifests,instead of expected target size [iceberg]

2024-01-02 Thread via GitHub
github-actions[bot] closed issue #6891: Rewrite manifest action can only split large manifest file into two manifests,instead of expected target size URL: https://github.com/apache/iceberg/issues/6891

Re: [I] Files are being overwritten on subsequent runs of Spark Structured Streaming [iceberg]

2024-01-02 Thread via GitHub
amogh-jahagirdar commented on issue #8609: URL: https://github.com/apache/iceberg/issues/8609#issuecomment-1874704535 This issue should be resolved in https://github.com/apache/iceberg/pull/9255 and https://github.com/apache/iceberg/pull/9399 (backports to Spark 3.3 and Spark 3.4). T

Re: [I] Files are being overwritten on subsequent runs of Spark Structured Streaming [iceberg]

2024-01-02 Thread via GitHub
amogh-jahagirdar closed issue #8609: Files are being overwritten on subsequent runs of Spark Structured Streaming URL: https://github.com/apache/iceberg/issues/8609

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on PR #9384: URL: https://github.com/apache/iceberg/pull/9384#issuecomment-187452 Thanks for reviewing, @jerqi @zinking @zhongyujiang @rdblue @RussellSpitzer!

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi merged PR #9384: URL: https://github.com/apache/iceberg/pull/9384

Re: [PR] Spark 3.3, 3.4: Backport #9255 - Fix clobbering of files across epochs [iceberg]

2024-01-02 Thread via GitHub
rdblue merged PR #9399: URL: https://github.com/apache/iceberg/pull/9399

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439799553 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -334,6 +335,9 @@ private TableProperties() {} public static final String MAX_REF_AGE_MS = "h

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439797790 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -334,6 +335,9 @@ private TableProperties() {} public static final String MAX_REF_AGE_MS = "h

Re: [PR] AWS: Add S3 Access Grants Integration [iceberg]

2024-01-02 Thread via GitHub
jackye1995 commented on code in PR #9385: URL: https://github.com/apache/iceberg/pull/9385#discussion_r1439793304 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -749,4 +795,23 @@ public void applyEndpointConfigurations(T builder) { builder

Re: [PR] AWS: Add S3 Access Grants Integration [iceberg]

2024-01-02 Thread via GitHub
jackye1995 commented on code in PR #9385: URL: https://github.com/apache/iceberg/pull/9385#discussion_r1437796852 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -749,4 +795,23 @@ public void applyEndpointConfigurations(T builder) { builder

Re: [PR] Build: Bump pytest from 7.4.3 to 7.4.4 [iceberg-python]

2024-01-02 Thread via GitHub
Fokko merged PR #248: URL: https://github.com/apache/iceberg-python/pull/248

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
rdblue commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439722314 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -334,6 +335,9 @@ private TableProperties() {} public static final String MAX_REF_AGE_MS = "histor

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1439702880 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkExecutorCache.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1439698943 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkExecutorCache.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Core: Add ManifestWrite benchmark [iceberg]

2024-01-02 Thread via GitHub
dramaticlly commented on code in PR #8637: URL: https://github.com/apache/iceberg/pull/8637#discussion_r1439696239 ## core/src/jmh/java/org/apache/iceberg/ManifestWriteBenchmark.java: ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439685145 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -334,6 +335,9 @@ private TableProperties() {} public static final String MAX_REF_AGE_MS = "h

Re: [PR] API: Fix day partition transform result type [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #9345: URL: https://github.com/apache/iceberg/pull/9345#discussion_r1439682666 ## format/spec.md: ## @@ -318,7 +318,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | **`truncate[W]`** | Value truncate

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439679853 ## core/src/main/java/org/apache/iceberg/deletes/DeleteGranularity.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Bump Nessie to 0.76.0 [iceberg]

2024-01-02 Thread via GitHub
snazy commented on PR #9398: URL: https://github.com/apache/iceberg/pull/9398#issuecomment-1874350376 The build-issue (fixed by the 2nd commit) is caused by a Java 21 class file, properly placed in `META-INF/versions/21/com/fasterxml/jackson/...`. There is yet no shadow-plugin version that

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439679682 ## core/src/main/java/org/apache/iceberg/deletes/DeleteGranularity.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439679439 ## core/src/main/java/org/apache/iceberg/deletes/DeleteGranularity.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on PR #9384: URL: https://github.com/apache/iceberg/pull/9384#issuecomment-1874349288 > One question: > Iceberg has the rewritePositionDeletesAction. Will this pr influence this action? @jerqi, yes, it will. There is a new test in `TestRewritePositionDeleteFi

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439678391 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -334,6 +335,9 @@ private TableProperties() {} public static final String MAX_REF_AGE_MS = "h

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439678214 ## core/src/main/java/org/apache/iceberg/deletes/TargetedPositionDeleteWriter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Data: Allow classes of different packages to implement DeleteFilter [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on PR #9352: URL: https://github.com/apache/iceberg/pull/9352#issuecomment-1874346597 This seems like two different issues being solved in one PR. Please separate them. I think you also need to offer a much more detailed explanation of why you want to make

Re: [PR] Data: Allow classes of different packages to implement DeleteFilter [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #9352: URL: https://github.com/apache/iceberg/pull/9352#discussion_r1439676343 ## data/src/main/java/org/apache/iceberg/data/DeleteFilter.java: ## @@ -138,7 +138,7 @@ public void incrementDeleteCount() { counter.increment(); } -

Re: [PR] Data: Allow classes of different packages to implement DeleteFilter [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #9352: URL: https://github.com/apache/iceberg/pull/9352#discussion_r1439676088 ## core/src/main/java/org/apache/iceberg/util/DecimalUtil.java: ## @@ -31,7 +31,7 @@ private DecimalUtil() {} public static byte[] toReusedFixLengthBytes(

Re: [PR] Replace black by Ruff Formatter [iceberg-python]

2024-01-02 Thread via GitHub
hussein-awala commented on PR #127: URL: https://github.com/apache/iceberg-python/pull/127#issuecomment-1874338838 I just merged main and fixed the conflicts, it should be ready to merge.

Re: [PR] Replace black by Ruff Formatter [iceberg-python]

2024-01-02 Thread via GitHub
rdblue commented on PR #127: URL: https://github.com/apache/iceberg-python/pull/127#issuecomment-1874330644 Looks good to me now. Thanks, @hussein-awala!

Re: [PR] Bump Nessie to 0.76.0 [iceberg]

2024-01-02 Thread via GitHub
jbonofre commented on PR #9398: URL: https://github.com/apache/iceberg/pull/9398#issuecomment-1874301447 It may be worth having a corresponding issue to populate the changelog/release notes. I will do it.

Re: [PR] Bump Nessie to 0.76.0 [iceberg]

2024-01-02 Thread via GitHub
jbonofre commented on PR #9398: URL: https://github.com/apache/iceberg/pull/9398#issuecomment-1874300845 LGTM. @rdblue @nastra I would like to include this in Iceberg 1.5.0. Thoughts?

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439642290 ## core/src/main/java/org/apache/iceberg/deletes/TargetedPositionDeleteWriter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439633731 ## core/src/main/java/org/apache/iceberg/deletes/DeleteGranularity.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439633400 ## core/src/main/java/org/apache/iceberg/deletes/DeleteGranularity.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439631070 ## core/src/main/java/org/apache/iceberg/deletes/DeleteGranularity.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [I] Duplicate file name in Iceberg's metadata [iceberg]

2024-01-02 Thread via GitHub
rdblue closed issue #8953: Duplicate file name in Iceberg's metadata URL: https://github.com/apache/iceberg/issues/8953

Re: [I] Parquet file overwritten by spark streaming job in subsequent execution with same spark streaming checkpoint location [iceberg]

2024-01-02 Thread via GitHub
rdblue closed issue #9172: Parquet file overwritten by spark streaming job in subsequent execution with same spark streaming checkpoint location URL: https://github.com/apache/iceberg/issues/9172

Re: [PR] Spark Streaming: Fix clobbering of files across streaming epochs [iceberg]

2024-01-02 Thread via GitHub
rdblue merged PR #9255: URL: https://github.com/apache/iceberg/pull/9255

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439622150 ## core/src/main/java/org/apache/iceberg/deletes/DeleteGranularity.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Core, Spark: Correct the delete record count for PartitionTable [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on PR #9389: URL: https://github.com/apache/iceberg/pull/9389#issuecomment-1874247332 LGTM! Thanks @ConeyLiu for the PR and @singhpk234 for the review

Re: [PR] Core, Spark: Correct the delete record count for PartitionTable [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer merged PR #9389: URL: https://github.com/apache/iceberg/pull/9389

Re: [PR] Spark 3.5: Support filtering with buckets in RewriteDataFilesProcedure [iceberg]

2024-01-02 Thread via GitHub
RussellSpitzer commented on PR #9396: URL: https://github.com/apache/iceberg/pull/9396#issuecomment-1874242699 Why does this need a new api?

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439605409 ## core/src/main/java/org/apache/iceberg/deletes/TargetedPositionDeleteWriter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439592489 ## core/src/main/java/org/apache/iceberg/deletes/SortingPositionOnlyDeleteWriter.java: ## @@ -60,7 +72,7 @@ public void write(PositionDelete positionDelete) {

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439583443 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -334,6 +335,9 @@ private TableProperties() {} public static final String MAX_REF_AGE_MS = "h

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439581834 ## core/src/main/java/org/apache/iceberg/deletes/TargetedPositionDeleteWriter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Core, Data, Spark 3.5: Support file and partition delete granularity [iceberg]

2024-01-02 Thread via GitHub
aokolnychyi commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1439572133 ## core/src/main/java/org/apache/iceberg/io/ClusteredPositionDeleteWriter.java: ## @@ -46,17 +49,39 @@ public ClusteredPositionDeleteWriter( OutputFileFactor

Re: [I] Schema IDs Re-Order? [iceberg-python]

2024-01-02 Thread via GitHub
sebpretzer commented on issue #229: URL: https://github.com/apache/iceberg-python/issues/229#issuecomment-1874187979 @Fokko To follow up, we have fixed this internally ourselves with something similar to below: ```python from pydantic import BaseModel, field_validator from pyiceberg

Re: [I] Schema IDs Re-Order? [iceberg-python]

2024-01-02 Thread via GitHub
sebpretzer closed issue #229: Schema IDs Re-Order? URL: https://github.com/apache/iceberg-python/issues/229

Re: [I] Create JUnit5-version of FlinkCatalogTestBase [iceberg]

2024-01-02 Thread via GitHub
vinitpatni commented on issue #9079: URL: https://github.com/apache/iceberg/issues/9079#issuecomment-1874038839 The active PR for this issue is https://github.com/apache/iceberg/pull/9381

Re: [PR] Adding Junit 5 conversion and AssertJ style for TestFlinkCatalogTable… [iceberg]

2024-01-02 Thread via GitHub
vinitpatni commented on PR #9381: URL: https://github.com/apache/iceberg/pull/9381#issuecomment-1874010083 @nastra TestStreamScanSql is the only subclass remaining for conversion to JUnit 5, but it has a dependency on the GenericAppenderHelper class, which is part of the iceberg-data module. Let me kno

Re: [PR] Adding Junit 5 conversion and AssertJ style for TestFlinkCatalogTable… [iceberg]

2024-01-02 Thread via GitHub
vinitpatni commented on PR #9381: URL: https://github.com/apache/iceberg/pull/9381#issuecomment-1874003109 - Adding Junit 5 conversion and AssertJ style for TestFlinkUpsert and TestRewriteDataFilesAction

Re: [I] Apache Flink not committing new snapshots to Iceberg Table [iceberg]

2024-01-02 Thread via GitHub
FranMorilloAWS commented on issue #9089: URL: https://github.com/apache/iceberg/issues/9089#issuecomment-1873872091 Any thoughts on why this could be happening?

Re: [PR] Spark: Added merge schema as spark configuration [iceberg]

2024-01-02 Thread via GitHub
Aleena-M-Georgy closed pull request #9397: Spark: Added merge schema as spark configuration URL: https://github.com/apache/iceberg/pull/9397