Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-14 Thread via GitHub
RussellSpitzer commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1799986667 ## format/spec.md: ## @@ -598,6 +702,14 @@ Notes: 1. Lower and upper bounds are serialized to bytes using the single-object serialization in Appendix D. The

Re: [PR] Spec: Support geo type [iceberg]

2024-10-14 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1799985651 ## format/spec.md: ## @@ -1286,6 +1291,7 @@ This serialization scheme is for storing single values as individual binary valu | **`struct`** | Not

Re: [PR] Spec: Support geo type [iceberg]

2024-10-14 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1799986355 ## format/spec.md: ## @@ -483,6 +485,8 @@ Notes: 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the IEEE 754 `totalOrder` predicate. NaN

Re: [PR] feat(catalog/glue): add support for list namespaces [iceberg-go]

2024-10-14 Thread via GitHub
zeroshade commented on code in PR #169: URL: https://github.com/apache/iceberg-go/pull/169#discussion_r1800098570 ## catalog/glue.go: ## @@ -150,8 +151,33 @@ func (c *GlueCatalog) UpdateNamespaceProperties(ctx context.Context, namespace t return PropertiesUpdateSummary{

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-14 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1800101451 ## open-api/src/testFixtures/java/org/apache/iceberg/rest/RESTCatalogServer.java: ## @@ -64,7 +65,9 @@ public Map configuration() { private CatalogContext i

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-14 Thread via GitHub
rodmeneses commented on code in PR #11305: URL: https://github.com/apache/iceberg/pull/11305#discussion_r1800112079 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSinkBuilder.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-14 Thread via GitHub
rodmeneses commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2412301033 @arkadius please take a look as the CI is broken -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abo

Re: [PR] Spec: Fix table of content generation [iceberg]

2024-10-14 Thread via GitHub
rdblue commented on code in PR #11067: URL: https://github.com/apache/iceberg/pull/11067#discussion_r1800199082 ## format/spec.md: ## @@ -121,9 +121,9 @@ Tables do not require random-access writes. Once written, data and metadata file Tables do not require rename, except for t

Re: [PR] Core: Rename DeleteFileHolder to PendingDeleteFile / Optimize duplicate data/delete file detection [iceberg]

2024-10-14 Thread via GitHub
aokolnychyi commented on code in PR #11254: URL: https://github.com/apache/iceberg/pull/11254#discussion_r1800023409 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -82,11 +82,9 @@ abstract class MergingSnapshotProducer extends SnapshotProducer {

Re: [PR] Spec: Fix table of content generation [iceberg]

2024-10-14 Thread via GitHub
rdblue commented on code in PR #11067: URL: https://github.com/apache/iceberg/pull/11067#discussion_r1800199969 ## format/spec.md: ## @@ -158,27 +158,27 @@ Readers should be more permissive because v1 metadata files are allowed in v2 ta Readers may be more strict for metadat

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-14 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1799766917 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1800273374 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -78,9 +78,11 @@ public String partition() { private boolean failMissingDeletePa

Re: [I] PyIceberg Near-Term Roadmap [iceberg-python]

2024-10-14 Thread via GitHub
jaehyeon-kim commented on issue #736: URL: https://github.com/apache/iceberg-python/issues/736#issuecomment-2412464577 It looks like BigLake metastore is going to be replaced by BigQuery metastore. Is the version 0.8.0 still relevant? https://github.com/trinodb/trino/issues/20031#issuecomme

Re: [PR] Hive: Use EnvironmentContext instead of Hive Locks to provide transactional commits after HIVE-26882 [iceberg]

2024-10-14 Thread via GitHub
pvary commented on PR #6570: URL: https://github.com/apache/iceberg/pull/6570#issuecomment-2412467877 @chenwyi2: If you backport the changes to Hive 1, then you can use the feature. I suggest creating your own release for Iceberg as well.

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-14 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1800167581 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +70,47 @@ protected static Object[][] parameters() { }

Re: [PR] Spec: Fix table of content generation [iceberg]

2024-10-14 Thread via GitHub
danielcweeks commented on code in PR #11067: URL: https://github.com/apache/iceberg/pull/11067#discussion_r1800321978 ## format/spec.md: ## @@ -158,27 +158,27 @@ Readers should be more permissive because v1 metadata files are allowed in v2 ta Readers may be more strict for m

[PR] Task: Simulating OOM error during merge equality deletes [iceberg]

2024-10-14 Thread via GitHub
nicole-martinez opened a new pull request, #11320: URL: https://github.com/apache/iceberg/pull/11320 - This PR adds a new test (`testMergeEqualityDeletesOOM`) to handle out-of-memory scenarios during merge equality deletes. - The test generates a large dataset and attempts to merge equal

Re: [PR] Spark: add property to disable client-side purging in spark [iceberg]

2024-10-14 Thread via GitHub
RussellSpitzer commented on PR #11317: URL: https://github.com/apache/iceberg/pull/11317#issuecomment-2412029752 The problem is that table properties will only be respected by clients which know how to use it, so although you may set this property, you have no guarantee clients will follow

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-14 Thread via GitHub
RussellSpitzer commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1799982379 ## format/spec.md: ## @@ -298,16 +298,101 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns i

Re: [PR] feat(catalog/glue): add support for list namespaces [iceberg-go]

2024-10-14 Thread via GitHub
oguzerdogmus commented on code in PR #169: URL: https://github.com/apache/iceberg-go/pull/169#discussion_r1800236271 ## catalog/glue.go: ## @@ -150,8 +151,33 @@ func (c *GlueCatalog) UpdateNamespaceProperties(ctx context.Context, namespace t return PropertiesUpdateSumma

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-14 Thread via GitHub
advancedxy commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1800356102 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct val

Re: [PR] Core: Switch usage to DataFileSet / DeleteFileSet [iceberg]

2024-10-14 Thread via GitHub
nastra commented on code in PR #11158: URL: https://github.com/apache/iceberg/pull/11158#discussion_r1800532477 ## hive-metastore/src/test/java/org/apache/iceberg/hive/HiveTableTest.java: ## @@ -213,7 +213,7 @@ public void testDropTable() throws IOException { table.newAppen

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-14 Thread via GitHub
arkadius commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2413009425 > > > @arkadius please take a look as the CI is broken > > > > > > Do you have an option to retry this build stage? It is rather impossible that extraction of an interface co

Re: [PR] Core: Rename DeleteFileHolder to PendingDeleteFile / Optimize duplicate data/delete file detection [iceberg]

2024-10-14 Thread via GitHub
nastra commented on code in PR #11254: URL: https://github.com/apache/iceberg/pull/11254#discussion_r1800550234 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -772,17 +773,139 @@ protected static class DeleteFileHolder { * * @param deleteFile d

Re: [PR] Core: Rename DeleteFileHolder to PendingDeleteFile / Optimize duplicate data/delete file detection [iceberg]

2024-10-14 Thread via GitHub
nastra commented on code in PR #11254: URL: https://github.com/apache/iceberg/pull/11254#discussion_r1800552495 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -974,7 +970,8 @@ private List newDataFilesAsManifests() { newDataFilesBySpec.forEac

Re: [PR] Core: Rename DeleteFileHolder to PendingDeleteFile / Optimize duplicate data/delete file detection [iceberg]

2024-10-14 Thread via GitHub
nastra commented on code in PR #11254: URL: https://github.com/apache/iceberg/pull/11254#discussion_r1800552749 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -1005,7 +1002,8 @@ private List newDeleteFilesAsManifests() { newDeleteFilesBySpec.

Re: [PR] Core: Rename DeleteFileHolder to PendingDeleteFile / Optimize duplicate data/delete file detection [iceberg]

2024-10-14 Thread via GitHub
nastra commented on code in PR #11254: URL: https://github.com/apache/iceberg/pull/11254#discussion_r1800558085 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -82,11 +82,9 @@ abstract class MergingSnapshotProducer extends SnapshotProducer { priv

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-10-14 Thread via GitHub
danielcweeks commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1799977130 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,22 @@ components: uuid: type: string +StorageCredential: + type: objec

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-14 Thread via GitHub
arkadius commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2412364720 > @arkadius please take a look as the CI is broken Do you have an option to retry this build stage? It is rather impossible that extraction of an interface could cause a test to

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-14 Thread via GitHub
rodmeneses commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2412366243 > > @arkadius please take a look as the CI is broken > > Do you have an option to retry this build stage? It is rather impossible that extraction of an interface could cause a

Re: [PR] Spec: Support geo type [iceberg]

2024-10-14 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1799988864 ## format/spec.md: ## @@ -483,6 +485,8 @@ Notes: 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the IEEE 754 `totalOrder` predicate. NaN

[I] Manifest List/Entry Creation [iceberg-go]

2024-10-14 Thread via GitHub
dwilson1988 opened a new issue, #172: URL: https://github.com/apache/iceberg-go/issues/172 ### Feature Request / Improvement Hello, I'm working on a use case where I need to be my own catalog and need to be able to create my own Iceberg tables purely in Go. I understand that table cr

Re: [PR] Spec: Support geo type [iceberg]

2024-10-14 Thread via GitHub
paleolimbot commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1800414577 ## format/spec.md: ## @@ -483,6 +485,8 @@ Notes: 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the IEEE 754 `totalOrder` predicate. N

Re: [PR] chore: Fix build after merge [iceberg-rust]

2024-10-14 Thread via GitHub
Xuanwo closed pull request #670: chore: Fix build after merge URL: https://github.com/apache/iceberg-rust/pull/670

[PR] Revert "feat: Add equality delete writer (#372)" [iceberg-rust]

2024-10-14 Thread via GitHub
Xuanwo opened a new pull request, #672: URL: https://github.com/apache/iceberg-rust/pull/672 This reverts commit ad89eac02712ceac2c3cff6bf0fe5d1b6e289a26. I have to revert PR #372 since it can't pass the unit tests and I didn't find a quick way to fix it.

Re: [PR] chore: Fix build after merge [iceberg-rust]

2024-10-14 Thread via GitHub
Xuanwo commented on PR #670: URL: https://github.com/apache/iceberg-rust/pull/670#issuecomment-2412887881 Replaced by https://github.com/apache/iceberg-rust/pull/672

Re: [PR] feat: Add equality delete writer [iceberg-rust]

2024-10-14 Thread via GitHub
Xuanwo commented on PR #372: URL: https://github.com/apache/iceberg-rust/pull/372#issuecomment-2412889476 Hi, I'm sorry, but I need to revert this PR. @Dysprosium0626, could you reopen and rebase your original PR and test it again?

Re: [PR] Revert "feat: Add equality delete writer (#372)" [iceberg-rust]

2024-10-14 Thread via GitHub
Xuanwo commented on PR #672: URL: https://github.com/apache/iceberg-rust/pull/672#issuecomment-2412892346 Hi, @kevinjqliu, could you take a look? Thanks!

Re: [PR] Remove unnecessary copying of FileScanTask [iceberg]

2024-10-14 Thread via GitHub
huaxingao commented on PR #11319: URL: https://github.com/apache/iceberg/pull/11319#issuecomment-2412123590 cc @szehon-ho

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-14 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1800188393 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -89,13 +138,37 @@ public static void dropWarehouse() throws IOExceptio

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-14 Thread via GitHub
szehon-ho commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1800307665 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level

Re: [PR] Deprecate ContentCache.invalidateAll [iceberg]

2024-10-14 Thread via GitHub
findepi commented on PR #10494: URL: https://github.com/apache/iceberg/pull/10494#issuecomment-2412095546 Thank you @RussellSpitzer for your review. I think this one could deserve a fix: https://github.com/apache/iceberg/issues/10493

Re: [PR] Core: Deprecate ContentCache.invalidateAll [iceberg]

2024-10-14 Thread via GitHub
findepi merged PR #10494: URL: https://github.com/apache/iceberg/pull/10494

Re: [PR] Bump moto from 5.0.14 to 5.0.16 [iceberg-python]

2024-10-14 Thread via GitHub
dependabot[bot] commented on PR #1212: URL: https://github.com/apache/iceberg-python/pull/1212#issuecomment-2412443326 Superseded by #1230.

Re: [PR] Bump moto from 5.0.14 to 5.0.16 [iceberg-python]

2024-10-14 Thread via GitHub
dependabot[bot] closed pull request #1212: Bump moto from 5.0.14 to 5.0.16 URL: https://github.com/apache/iceberg-python/pull/1212

[PR] Bump moto from 5.0.14 to 5.0.17 [iceberg-python]

2024-10-14 Thread via GitHub
dependabot[bot] opened a new pull request, #1230: URL: https://github.com/apache/iceberg-python/pull/1230 Bumps [moto](https://github.com/getmoto/moto) from 5.0.14 to 5.0.17. Changelog: sourced from [moto's changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md).

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-14 Thread via GitHub
szehon-ho commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1800282377 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the fo

Re: [PR] Spec: Fix table of content generation [iceberg]

2024-10-14 Thread via GitHub
ajantha-bhat commented on code in PR #11067: URL: https://github.com/apache/iceberg/pull/11067#discussion_r1800310393 ## format/spec.md: ## @@ -121,9 +121,9 @@ Tables do not require random-access writes. Once written, data and metadata file Tables do not require rename, except

Re: [PR] Spec: Fix table of content generation [iceberg]

2024-10-14 Thread via GitHub
ajantha-bhat commented on PR #11067: URL: https://github.com/apache/iceberg/pull/11067#issuecomment-2412652792 New TOC with this change: https://github.com/user-attachments/assets/e469de32-a608-4277-8dda-63c40b0fe0e9

Re: [PR] Add Snowflake catalog [iceberg-python]

2024-10-14 Thread via GitHub
prabodh1194 commented on PR #687: URL: https://github.com/apache/iceberg-python/pull/687#issuecomment-2412914581 Closing, as I won't be able to continue with this PR now.

Re: [PR] Add Snowflake catalog [iceberg-python]

2024-10-14 Thread via GitHub
prabodh1194 closed pull request #687: Add Snowflake catalog URL: https://github.com/apache/iceberg-python/pull/687

[PR] (AWS) Docs: List all AWS S3 properties from all language impl. [iceberg]

2024-10-14 Thread via GitHub
hsiang-c opened a new pull request, #11321: URL: https://github.com/apache/iceberg/pull/11321 ### Note to reviewers - Closes https://github.com/apache/iceberg/issues/10674 - I moved S3 properties to its own doc (`aws-s3-fileio-properties.md`) and link to it from the original `aws.md

Re: [PR] (AWS) Docs: List all AWS S3 properties from all language impl. [iceberg]

2024-10-14 Thread via GitHub
hsiang-c commented on PR #11321: URL: https://github.com/apache/iceberg/pull/11321#issuecomment-2412945006 cc @Fokko @Xuanwo for reviews, thanks.

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-14 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1800277755 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-14 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1799796613 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1800279697 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -325,7 +341,15 @@ private ManifestFile filterManifest(Schema tableSchema, Manifes

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1800283158 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -325,7 +341,15 @@ private ManifestFile filterManifest(Schema tableSchema, Manifes

Re: [I] Implement remaining operations for Glue catalog [iceberg-go]

2024-10-14 Thread via GitHub
vivekkoya commented on issue #64: URL: https://github.com/apache/iceberg-go/issues/64#issuecomment-2412610788 Hello, I can take this task. How can I get started? Can you please direct me to the relevant files and directories? Thanks for the help

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1800275372 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -370,14 +407,7 @@ private boolean canContainDeletedFiles(ManifestFile manifest) {

Re: [I] PyIceberg Production Use case survey [iceberg-python]

2024-10-14 Thread via GitHub
mariotaddeucci commented on issue #1202: URL: https://github.com/apache/iceberg-python/issues/1202#issuecomment-2412604690 Hey, I'm actually using it in production for small datasets in combination with duckdb, especially to avoid small files with web scraping. For ingestion, reading many

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-14 Thread via GitHub
RussellSpitzer commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1800174701 ## format/spec.md: ## @@ -684,34 +796,38 @@ The atomic operation used to commit metadata depends on how tables are tracked a Table metadata consists of the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-14 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1800243303 ## format/puffin-spec.md: ## @@ -123,6 +123,44 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-14 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1800243797 ## format/puffin-spec.md: ## @@ -123,6 +123,49 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-14 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1800245093 ## core/src/main/java/org/apache/iceberg/rest/RESTFileScanTaskParser.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [I] EPIC: Rust Based Compaction [iceberg-rust]

2024-10-14 Thread via GitHub
camuel commented on issue #624: URL: https://github.com/apache/iceberg-rust/issues/624#issuecomment-2412544556 Does anyone have any insight into how computation-heavy the compaction workload really is? For example, on a beefy machine, what compaction rate would be possible? 1GB/sec? 10GB/sec? A ba

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-14 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1800243915 ## core/src/main/java/org/apache/iceberg/RESTPlanningMode.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more con

Re: [I] flink:FlinkSink support dynamically changed schema [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on issue #4190: URL: https://github.com/apache/iceberg/issues/4190#issuecomment-2412570266 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Long overflow when Iceberg reading INT96 timestamp column from Spark parquet table [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on issue #8949: URL: https://github.com/apache/iceberg/issues/8949#issuecomment-2412570370 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Does the Java API support primary keys for creating tables [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on issue #8950: URL: https://github.com/apache/iceberg/issues/8950#issuecomment-2412570398 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Question on BaseMetastoreViewCatalog#buildView [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on issue #8967: URL: https://github.com/apache/iceberg/issues/8967#issuecomment-2412570449 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Flink: Decouple the iceberg integration work from hadoop libraries [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on issue #3117: URL: https://github.com/apache/iceberg/issues/3117#issuecomment-2412570245 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] equality delete files can be removed immediately after rewrite? [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed issue #8933: equality delete files can be removed immediately after rewrite? URL: https://github.com/apache/iceberg/issues/8933

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed pull request #9233: Spark SystemFunctions are not pushed down during JOIN URL: https://github.com/apache/iceberg/pull/9233

Re: [I] Does the Java API support primary keys for creating tables [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed issue #8950: Does the Java API support primary keys for creating tables URL: https://github.com/apache/iceberg/issues/8950

Re: [PR] Rest Catalog: Add RESTful data operations [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed pull request #9237: Rest Catalog: Add RESTful data operations URL: https://github.com/apache/iceberg/pull/9237

Re: [PR] Rest Catalog: Add RESTful data operations [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on PR #9237: URL: https://github.com/apache/iceberg/pull/9237#issuecomment-2412570932 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [I] Why are updateSchema and UpdatePartitionSpec commit not retried? [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on issue #8964: URL: https://github.com/apache/iceberg/issues/8964#issuecomment-2412570423 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Question on BaseMetastoreViewCatalog#buildView [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed issue #8967: Question on BaseMetastoreViewCatalog#buildView URL: https://github.com/apache/iceberg/issues/8967

Re: [I] Why are updateSchema and UpdatePartitionSpec commit not retried? [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed issue #8964: Why are updateSchema and UpdatePartitionSpec commit not retried? URL: https://github.com/apache/iceberg/issues/8964 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] flink1.13.2+iceberg0.13.0+hive-metastore3.0.0+minio(S3) Forbidden (Service: Amazon S3; Status Code: 403 [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed issue #8968: flink1.13.2+iceberg0.13.0+hive-metastore3.0.0+minio(S3) Forbidden (Service: Amazon S3; Status Code: 403 URL: https://github.com/apache/iceberg/issues/8968 -- This is an automated message from the Apache Git Service. To respond to the message, please lo

Re: [PR] fix when equalityFieldColumns is not null and upsert is false, position delete in write function will lead to unstable result if flink checkpoint interval is not same [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on PR #9300: URL: https://github.com/apache/iceberg/pull/9300#issuecomment-2412571038 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [I] flink1.13.2+iceberg0.13.0+hive-metastore3.0.0+minio(S3) Forbidden (Service: Amazon S3; Status Code: 403 [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on issue #8968: URL: https://github.com/apache/iceberg/issues/8968#issuecomment-2412570467 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] Remove redundant error propagation check. [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed pull request #9143: Remove redundant error propagation check. URL: https://github.com/apache/iceberg/pull/9143 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Remove redundant error propagation check. [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on PR #9143: URL: https://github.com/apache/iceberg/pull/9143#issuecomment-2412570759 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Parquet: Add a table property to control the Parquet row-group size of position delete files [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed pull request #9177: Parquet: Add a table property to control the Parquet row-group size of position delete files URL: https://github.com/apache/iceberg/pull/9177 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] Core, Hive, Nessie: Use ResolvingFileIO as default instead of HadoopFileIO [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on PR #8272: URL: https://github.com/apache/iceberg/pull/8272#issuecomment-2412570302 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: Suppress exceptions in case of dropTableData [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed pull request #9184: Core: Suppress exceptions in case of dropTableData URL: https://github.com/apache/iceberg/pull/9184 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] Spark: IN clause on system function is not pushed down [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on PR #9192: URL: https://github.com/apache/iceberg/pull/9192#issuecomment-2412570842 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [I] equality delete files can be removed immediately after rewrite? [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on issue #8933: URL: https://github.com/apache/iceberg/issues/8933#issuecomment-2412570345 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] Spark: Use Awaitility instead of Thread.sleep [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed pull request #9224: Spark: Use Awaitility instead of Thread.sleep URL: https://github.com/apache/iceberg/pull/9224 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

Re: [I] Long overflow when Iceberg reading INT96 timestamp column from Spark parquet table [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed issue #8949: Long overflow when Iceberg reading INT96 timestamp column from Spark parquet table URL: https://github.com/apache/iceberg/issues/8949 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] Spark: Use Awaitility instead of Thread.sleep [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on PR #9224: URL: https://github.com/apache/iceberg/pull/9224#issuecomment-2412570880 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-2412570907 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [I] Support MOR CDC view [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed issue #8975: Support MOR CDC view URL: https://github.com/apache/iceberg/issues/8975 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

Re: [I] org.apache.iceberg.spark.source.SerializableTableWithSize cannot be cast to org.apache.iceberg.Table [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed issue #8978: org.apache.iceberg.spark.source.SerializableTableWithSize cannot be cast to org.apache.iceberg.Table URL: https://github.com/apache/iceberg/issues/8978 -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [I] Data duplicate after the partition is modified [iceberg]

2024-10-14 Thread via GitHub
github-actions[bot] closed issue #8979: Data duplicate after the partition is modified URL: https://github.com/apache/iceberg/issues/8979 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific
