Re: [PR] Spark 3.5: Preserve content offset and size during manifest rewrites [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi merged PR #11469: URL: https://github.com/apache/iceberg/pull/11469 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@icebe

Re: [PR] Spark 3.5: Preserve content offset and size during manifest rewrites [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on PR #11469: URL: https://github.com/apache/iceberg/pull/11469#issuecomment-2456462069 Thank you, @nastra!

Re: [I] Including Iceberg Version in metadata json file for better traceability of PendingUpdate [iceberg]

2024-11-04 Thread via GitHub
nastra commented on issue #11471: URL: https://github.com/apache/iceberg/issues/11471#issuecomment-2456455938 @rice668 the iceberg version should already be included in the summary of a particular snapshot.

Re: [PR] Docs: Fix verifying release candidate with Spark and Flink [iceberg]

2024-11-04 Thread via GitHub
nastra commented on code in PR #11461: URL: https://github.com/apache/iceberg/pull/11461#discussion_r1828870345 ## site/docs/how-to-release.md: ## @@ -422,7 +422,7 @@ spark-runtime jar for the Spark installation): ```bash spark-shell \ --conf spark.jars.repositories=${MAV

Re: [PR] Core: Adapt commit, scan, and snapshot stats for DVs [iceberg]

2024-11-04 Thread via GitHub
nastra commented on code in PR #11464: URL: https://github.com/apache/iceberg/pull/11464#discussion_r1828862857 ## core/src/main/java/org/apache/iceberg/util/ContentFileUtil.java: ## @@ -84,4 +85,8 @@ public static String referencedDataFileLocation(DeleteFile deleteFile) {

Re: [PR] Core: Adapt commit, scan, and snapshot stats for DVs [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11464: URL: https://github.com/apache/iceberg/pull/11464#discussion_r1828862437 ## core/src/test/java/org/apache/iceberg/TestSnapshotSummary.java: ## @@ -358,4 +358,66 @@ public void rewriteWithDeletesAndDuplicates() { .containsEntry

Re: [PR] Core: Adapt commit, scan, and snapshot stats for DVs [iceberg]

2024-11-04 Thread via GitHub
nastra commented on code in PR #11464: URL: https://github.com/apache/iceberg/pull/11464#discussion_r1828862345 ## core/src/main/java/org/apache/iceberg/metrics/ScanMetricsUtil.java: ## @@ -31,7 +32,11 @@ public static void indexedDeleteFile(ScanMetrics metrics, DeleteFile dele

Re: [PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11463: URL: https://github.com/apache/iceberg/pull/11463#discussion_r1828860774 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -92,4 +107,113 @@ public Collection deleteFiles() { public long cardinal

Re: [PR] Core: Adapt commit, scan, and snapshot stats for DVs [iceberg]

2024-11-04 Thread via GitHub
nastra commented on code in PR #11464: URL: https://github.com/apache/iceberg/pull/11464#discussion_r1828855927 ## core/src/main/java/org/apache/iceberg/SnapshotSummary.java: ## @@ -283,8 +292,13 @@ void addedFile(ContentFile file) { this.addedRecords += file.recordCo

Re: [PR] Spark 3.5: Fix flaky test due to temp directory not empty during delete [iceberg]

2024-11-04 Thread via GitHub
nastra commented on PR #11470: URL: https://github.com/apache/iceberg/pull/11470#issuecomment-2456419827 I'd rather try and fix it slightly differently: ``` + @TempDir private File location; + private static SparkSession spark = null; private static JavaSparkContext

[I] Change the doc ( `list_tables` method only return Iceberg Tables ) [iceberg-python]

2024-11-04 Thread via GitHub
omkenge opened a new issue, #1291: URL: https://github.com/apache/iceberg-python/issues/1291 ### Feature Request / Improvement https://github.com/user-attachments/assets/e120891f-0139-4422-952f-85530adb8447

[I] Improve documentation on Configuration page [iceberg-python]

2024-11-04 Thread via GitHub
Samreay opened a new issue, #1290: URL: https://github.com/apache/iceberg-python/issues/1290 ### Feature Request / Improvement Hi team! We're currently adopting Iceberg format but struggling to configure everything. For example, we'd like to change the parquet compression code from i

Re: [PR] Spark: support rewrite on specified target branch [iceberg]

2024-11-04 Thread via GitHub
zinking commented on PR #8797: URL: https://github.com/apache/iceberg/pull/8797#issuecomment-2456291421 @amitgilad3 I have some issues doing the rebase on my current computer, so you can probably just copy the branch and do it on your own repo.

Re: [I] Serialization of the org.apache.iceberg.io.WriteResult class. [iceberg]

2024-11-04 Thread via GitHub
pvary commented on issue #10710: URL: https://github.com/apache/iceberg/issues/10710#issuecomment-2456289631 > > Another question (I'm not familiar with this type of serialization) - how this handles inheritance? > > Do you mean how to write an instance of `TypeInfoFactory` if the targ

[PR] Spark 3.5: Fix flaky test due to temp directory not empty during delete [iceberg]

2024-11-04 Thread via GitHub
manuzhang opened a new pull request, #11470: URL: https://github.com/apache/iceberg/pull/11470 Follow-up of #10811 to ignore other types of `FileSystemException` like `DirectoryNotEmptyException` as well ``` TestDataFrameWrites > testFaultToleranceOnWrite() > format = parquet FAILE

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
amogh-jahagirdar commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828695685 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java: ## @@ -169,7 +174,18 @@ public DeltaWriterFactory createBatc

[PR] Bump mkdocs-material from 9.5.42 to 9.5.43 [iceberg-python]

2024-11-04 Thread via GitHub
dependabot[bot] opened a new pull request, #1288: URL: https://github.com/apache/iceberg-python/pull/1288 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.42 to 9.5.43. Release notes Sourced from https://github.com/squidfunk/mkdocs-material/releases

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
amogh-jahagirdar commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828678650 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java: ## @@ -437,20 +474,51 @@ protected PartitioningWriter newData

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
amogh-jahagirdar commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828666174 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java: ## @@ -437,20 +474,51 @@ protected PartitioningWriter newData

Re: [I] Spec inconsistency: partition_spec_id column in ManifestList vs. partition_specs in metadata.json [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9739: Spec inconsistency: partition_spec_id column in ManifestList vs. partition_specs in metadata.json URL: https://github.com/apache/iceberg/issues/9739

Re: [I] [Docs, Flink] Iceberg Flink docs do not include support for enhanced DDL support added in #7628 [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9755: [Docs, Flink] Iceberg Flink docs do not include support for enhanced DDL support added in #7628 URL: https://github.com/apache/iceberg/issues/9755

Re: [I] [Docs, Flink] Iceberg Flink docs do not include support for enhanced DDL support added in #7628 [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9755: URL: https://github.com/apache/iceberg/issues/9755#issuecomment-2455962654 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Bug: Flink data loss after failed to refresh table [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9753: Bug: Flink data loss after failed to refresh table URL: https://github.com/apache/iceberg/issues/9753

Re: [I] Inconsistency in deleting manifest and data files [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9792: Inconsistency in deleting manifest and data files URL: https://github.com/apache/iceberg/issues/9792

Re: [I] branch schema affected by main table schema [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9737: branch schema affected by main table schema URL: https://github.com/apache/iceberg/issues/9737

Re: [I] start-timestamp not utilized in create_changelog_view [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9791: start-timestamp not utilized in create_changelog_view URL: https://github.com/apache/iceberg/issues/9791

Re: [I] column type change failed : timestamp without timezone change to timestamp with timezone [iceberg]

2024-11-04 Thread via GitHub
amabilee commented on issue #10660: URL: https://github.com/apache/iceberg/issues/10660#issuecomment-2455972992 Iceberg treats `timestamp without timezone` and `timestamp with timezone` as distinct types, and there isn't a built-in mechanism to convert between them directly. This is

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9679: rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client URL: https://github.com/apache/iceberg/issues/9679

Re: [I] Confusion about latest_schema_id in metadata_log_entries [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9758: URL: https://github.com/apache/iceberg/issues/9758#issuecomment-2455962683 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] branch schema affected by main table schema [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9737: URL: https://github.com/apache/iceberg/issues/9737#issuecomment-2455962528 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] The truncate partition transform is underspecified [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9768: The truncate partition transform is underspecified URL: https://github.com/apache/iceberg/issues/9768

Re: [I] Iceberg Materialized Views [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #10043: URL: https://github.com/apache/iceberg/issues/10043#issuecomment-2455963256 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Spark 3.5.0 `MERGE INTO` breaks [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9827: Spark 3.5.0 `MERGE INTO` breaks URL: https://github.com/apache/iceberg/issues/9827

Re: [I] Inconsistency in deleting manifest and data files [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9792: URL: https://github.com/apache/iceberg/issues/9792#issuecomment-2455962915 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] REST Catalog Spec: Snapshot Summary Class [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9837: URL: https://github.com/apache/iceberg/issues/9837#issuecomment-2455963175 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] truncate partitioning underflows, leads to wrong results [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9767: URL: https://github.com/apache/iceberg/issues/9767#issuecomment-2455962746 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Can we load iceberg table using external volume instead of external stage ? [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9828: Can we load iceberg table using external volume instead of external stage ? URL: https://github.com/apache/iceberg/issues/9828

Re: [I] Can we load iceberg table using external volume instead of external stage ? [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9828: URL: https://github.com/apache/iceberg/issues/9828#issuecomment-2455963107 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Spark job application finished with failed status when trying to read iceberg hive tables from remote jupyter notebook pod [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9824: URL: https://github.com/apache/iceberg/issues/9824#issuecomment-2455963034 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Bug: Flink data loss after failed to refresh table [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9753: URL: https://github.com/apache/iceberg/issues/9753#issuecomment-2455962618 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Spark job application finished with failed status when trying to read iceberg hive tables from remote jupyter notebook pod [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9824: Spark job application finished with failed status when trying to read iceberg hive tables from remote jupyter notebook pod URL: https://github.com/apache/iceberg/issues/9824

Re: [I] Writing Equality Deletes using Iceberg Java API [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9808: Writing Equality Deletes using Iceberg Java API URL: https://github.com/apache/iceberg/issues/9808

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-2455962507 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] [AWS] S3FileIO - Add Cross-Region Bucket Access [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on PR #9804: URL: https://github.com/apache/iceberg/pull/9804#issuecomment-2455962958 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Cannot find constructor for interface org.apache.parquet.column.page.PageWriteStore? [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9802: Cannot find constructor for interface org.apache.parquet.column.page.PageWriteStore? URL: https://github.com/apache/iceberg/issues/9802

Re: [I] Calling `rewrite_position_delete_files` rewrites into same amount of files [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9833: Calling `rewrite_position_delete_files` rewrites into same amount of files URL: https://github.com/apache/iceberg/issues/9833

Re: [I] Spec inconsistency: partition_spec_id column in ManifestList vs. partition_specs in metadata.json [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9739: URL: https://github.com/apache/iceberg/issues/9739#issuecomment-2455962556 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Spec is ambiguous w.r.t. optional fields in field_summary [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9740: Spec is ambiguous w.r.t. optional fields in field_summary URL: https://github.com/apache/iceberg/issues/9740

Re: [I] Confusion about latest_schema_id in metadata_log_entries [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9758: Confusion about latest_schema_id in metadata_log_entries URL: https://github.com/apache/iceberg/issues/9758

Re: [I] Spec is ambiguous w.r.t. optional fields in field_summary [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9740: URL: https://github.com/apache/iceberg/issues/9740#issuecomment-2455962581 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] start-timestamp not utilized in create_changelog_view [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9791: URL: https://github.com/apache/iceberg/issues/9791#issuecomment-2455962881 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] REST Catalog Spec: Snapshot Summary Class [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9837: REST Catalog Spec: Snapshot Summary Class URL: https://github.com/apache/iceberg/issues/9837

Re: [I] Is it possible to add a set of existing partitioned parquet files to the Iceberg table via the Java Standalone API [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9763: URL: https://github.com/apache/iceberg/issues/9763#issuecomment-2455962721 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Calling `rewrite_position_delete_files` rewrites into same amount of files [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9833: URL: https://github.com/apache/iceberg/issues/9833#issuecomment-2455963143 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Cannot find constructor for interface org.apache.parquet.column.page.PageWriteStore? [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9802: URL: https://github.com/apache/iceberg/issues/9802#issuecomment-2455962939 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Writing Equality Deletes using Iceberg Java API [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9808: URL: https://github.com/apache/iceberg/issues/9808#issuecomment-2455963005 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] truncate partitioning underflows, leads to wrong results [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9767: truncate partitioning underflows, leads to wrong results URL: https://github.com/apache/iceberg/issues/9767

Re: [I] S3FileIO does not support Iceberg Cross-Region API Calls to Amazon S3 buckets [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9785: S3FileIO does not support Iceberg Cross-Region API Calls to Amazon S3 buckets URL: https://github.com/apache/iceberg/issues/9785

Re: [I] Is it possible to add a set of existing partitioned parquet files to the Iceberg table via the Java Standalone API [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9763: Is it possible to add a set of existing partitioned parquet files to the Iceberg table via the Java Standalone API URL: https://github.com/apache/iceberg/issues/9763

Re: [I] Spark read failed when migrate hive orc table with `timestamp` column [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9784: URL: https://github.com/apache/iceberg/issues/9784#issuecomment-2455962819 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] S3FileIO does not support Iceberg Cross-Region API Calls to Amazon S3 buckets [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] commented on issue #9785: URL: https://github.com/apache/iceberg/issues/9785#issuecomment-2455962855 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Spark read failed when migrate hive orc table with `timestamp` column [iceberg]

2024-11-04 Thread via GitHub
github-actions[bot] closed issue #9784: Spark read failed when migrate hive orc table with `timestamp` column URL: https://github.com/apache/iceberg/issues/9784

Re: [PR] Spark: add property to disable client-side purging in spark [iceberg]

2024-11-04 Thread via GitHub
RussellSpitzer commented on code in PR #11317: URL: https://github.com/apache/iceberg/pull/11317#discussion_r1828530333 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/sql/TestRestDropPurgeTable.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11463: URL: https://github.com/apache/iceberg/pull/11463#discussion_r1828503271 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -92,4 +107,113 @@ public Collection deleteFiles() { public long cardinal

Re: [PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11463: URL: https://github.com/apache/iceberg/pull/11463#discussion_r1828500703 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -92,4 +107,113 @@ public Collection deleteFiles() { public long cardinal

Re: [PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
amogh-jahagirdar commented on code in PR #11463: URL: https://github.com/apache/iceberg/pull/11463#discussion_r1828450352 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -92,4 +107,113 @@ public Collection deleteFiles() { public long car

[PR] Bump griffe from 1.3.1 to 1.5.1 [iceberg-python]

2024-11-04 Thread via GitHub
dependabot[bot] opened a new pull request, #1289: URL: https://github.com/apache/iceberg-python/pull/1289 Bumps [griffe](https://github.com/mkdocstrings/griffe) from 1.3.1 to 1.5.1. Release notes Sourced from griffe's releases (https://github.com/mkdocstrings/griffe/releases).

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
amogh-jahagirdar commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828432675 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMergeOnReadDelete.java: ## @@ -99,6 +101,83 @@ public void testDeletePar

[I] kafka connect iceberg connect: option to fail connector on N number of failed commit cycles [iceberg]

2024-11-04 Thread via GitHub
zschwein opened a new issue, #11468: URL: https://github.com/apache/iceberg/issues/11468 ### Proposed Change Currently the kafka connect iceberg sink will retry failed commits to a table forever, using the control topic offsets stored in table iceberg snapshot properties. This

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
amogh-jahagirdar commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828426913 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMergeOnReadDelete.java: ## @@ -99,6 +101,83 @@ public void testDeletePar

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828421067 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java: ## @@ -437,20 +474,51 @@ protected PartitioningWriter newDataWrite

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828418948 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java: ## @@ -437,20 +474,51 @@ protected PartitioningWriter newDataWrite

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828415621 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java: ## @@ -361,7 +394,6 @@ private static class PositionDeltaWriteFact

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828413699 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMergeOnReadUpdate.java: ## @@ -91,4 +155,19 @@ private void checkUpdateFileGr

Re: [I] Support dynamic overwrite [iceberg-python]

2024-11-04 Thread via GitHub
koenvo commented on issue #1287: URL: https://github.com/apache/iceberg-python/issues/1287#issuecomment-2455736587 Ah the PR contains quite some similar functionality indeed. It seems that the PR does a delete+append. If I understand correctly, this could lead to reading incomplete d

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828410250 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java: ## @@ -169,7 +174,18 @@ public DeltaWriterFactory createBatchWrit

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828401147 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMergeOnReadDelete.java: ## @@ -99,6 +101,83 @@ public void testDeletePartitio

Re: [PR] Spark: Merge new position deletes with old deletes during writing [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11273: URL: https://github.com/apache/iceberg/pull/11273#discussion_r1828399254 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMergeOnReadDelete.java: ## @@ -99,6 +101,83 @@ public void testDeletePartitio

Re: [PR] Spark: support rewrite on specified target branch [iceberg]

2024-11-04 Thread via GitHub
jackye1995 commented on PR #8797: URL: https://github.com/apache/iceberg/pull/8797#issuecomment-2455701262 I think you might want to rebase the commit against latest main branch

Re: [PR] Core: Adapt commit, scan, and snapshot stats for DVs [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11464: URL: https://github.com/apache/iceberg/pull/11464#discussion_r1828388869 ## core/src/main/java/org/apache/iceberg/SnapshotSummary.java: ## @@ -283,8 +292,13 @@ void addedFile(ContentFile file) { this.addedRecords += file.rec

Re: [PR] Core: Adapt commit, scan, and snapshot stats for DVs [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11464: URL: https://github.com/apache/iceberg/pull/11464#discussion_r1828388869 ## core/src/main/java/org/apache/iceberg/SnapshotSummary.java: ## @@ -283,8 +292,13 @@ void addedFile(ContentFile file) { this.addedRecords += file.rec

[PR] Core: Support DVs in DeleteFileIndex [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi opened a new pull request, #11467: URL: https://github.com/apache/iceberg/pull/11467 This PR adds support for DVs in `DeleteFileIndex`. This work is part of #11122. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [I] Serialization of the org.apache.iceberg.io.WriteResult class. [iceberg]

2024-11-04 Thread via GitHub
simonykq commented on issue #10710: URL: https://github.com/apache/iceberg/issues/10710#issuecomment-2455694298 > Another question (I'm not familiar with this type of serialization) - how this handles inheritance? Do you mean how to write an instance of `TypeInfoFactory` if a class inh

Re: [PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11463: URL: https://github.com/apache/iceberg/pull/11463#discussion_r1828350898 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -43,6 +53,11 @@ class BitmapPositionDeleteIndex implements PositionDeleteI

Re: [PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11463: URL: https://github.com/apache/iceberg/pull/11463#discussion_r1828350898 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -43,6 +53,11 @@ class BitmapPositionDeleteIndex implements PositionDeleteI

Re: [PR] 1.7.0-RC1 Cherry-picks [iceberg]

2024-11-04 Thread via GitHub
RussellSpitzer merged PR #11466: URL: https://github.com/apache/iceberg/pull/11466 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ic

Re: [PR] Spark 3.5: Preserve data file reference during manifest rewrites [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on PR #11457: URL: https://github.com/apache/iceberg/pull/11457#issuecomment-2455629166 Thanks, @nastra! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [PR] Spark 3.5: Preserve data file reference during manifest rewrites [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi merged PR #11457: URL: https://github.com/apache/iceberg/pull/11457 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@icebe

Re: [PR] Build: Revert Parquet to 1.13.1 [iceberg]

2024-11-04 Thread via GitHub
RussellSpitzer commented on PR #11462: URL: https://github.com/apache/iceberg/pull/11462#issuecomment-245554 Please review the 1.7.0 RC1 CherryPicks https://github.com/apache/iceberg/pull/11466 On Mon, Nov 4, 2024 at 12:18 PM Fokko Driesprong ***@***.***> wrote: > @Rus

Re: [PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
danielcweeks commented on code in PR #11463: URL: https://github.com/apache/iceberg/pull/11463#discussion_r1828272218 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -43,6 +53,11 @@ class BitmapPositionDeleteIndex implements PositionDelete

Re: [PR] Build: Revert Parquet to 1.13.1 [iceberg]

2024-11-04 Thread via GitHub
RussellSpitzer merged PR #11462: URL: https://github.com/apache/iceberg/pull/11462 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ic

[I] Flink: TestMetadataTableReadableMetrics relies on Hardcoded File Sizes [iceberg]

2024-11-04 Thread via GitHub
RussellSpitzer opened a new issue, #11465: URL: https://github.com/apache/iceberg/issues/11465 ### Feature Request / Improvement TestMetadataTableReadableMetrics currently hardcodes the expected size into the metrics rows rather than actually checking the sizes from the underlying

[PR] Core: Adapt commit, scan, and snapshot stats for DVs [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi opened a new pull request, #11464: URL: https://github.com/apache/iceberg/pull/11464 This PR adapts commit, scan, and snapshot stats for DVs. This work is part of #11122. -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] REST: AuthManager API [iceberg]

2024-11-04 Thread via GitHub
adutra commented on code in PR #10753: URL: https://github.com/apache/iceberg/pull/10753#discussion_r1828246578 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -378,7 +289,8 @@ public List listTables(SessionContext context, Namespace ns) {

Re: [PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11463: URL: https://github.com/apache/iceberg/pull/11463#discussion_r1828223745 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -92,4 +107,113 @@ public Collection deleteFiles() { public long cardinal

Re: [PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on code in PR #11463: URL: https://github.com/apache/iceberg/pull/11463#discussion_r1828223745 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -92,4 +107,113 @@ public Collection deleteFiles() { public long cardinal

Re: [PR] API, Core: Add content offset and size to DeleteFile [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi merged PR #11446: URL: https://github.com/apache/iceberg/pull/11446 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@icebe

Re: [PR] API, Core: Add content offset and size to DeleteFile [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi commented on PR #11446: URL: https://github.com/apache/iceberg/pull/11446#issuecomment-2455433971 Thanks for reviewing, @nastra @rdblue! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[PR] Core: Make PositionDeleteIndex serializable [iceberg]

2024-11-04 Thread via GitHub
aokolnychyi opened a new pull request, #11463: URL: https://github.com/apache/iceberg/pull/11463 This PR makes `PositionDeleteIndex` serializable as per the V3 spec. This work is part of #11122. -- This is an automated message from the Apache Git Service. To respond to the message, please log on
