Re: [I] Remove orphan files without creating listed files dataframe in spark ? [iceberg]

2025-02-04 Thread via GitHub
irshadcc closed issue #12158: Remove orphan files without creating listed files dataframe in spark ? URL: https://github.com/apache/iceberg/issues/12158 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Implementation of version metadata table for view [iceberg]

2025-02-04 Thread via GitHub
nastra commented on code in PR #12014: URL: https://github.com/apache/iceberg/pull/12014#discussion_r1942374605 ## core/src/main/java/org/apache/iceberg/BaseViewMetadataTable.java: ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [I] Snowflake managed Open Catalog and Azure ADLS2 [iceberg-python]

2025-02-04 Thread via GitHub
HonahX commented on issue #1606: URL: https://github.com/apache/iceberg-python/issues/1606#issuecomment-2635904022 Thanks for reporting this! This seems related to: https://github.com/apache/iceberg/issues/10127 and https://github.com/apache/iceberg/pull/11830. The solution is to su

Re: [PR] Add "clean" NOTICE/LICENSE in jar files [iceberg]

2025-02-04 Thread via GitHub
jbonofre commented on PR #12127: URL: https://github.com/apache/iceberg/pull/12127#issuecomment-2635835774 > > Imho, code copied from another project documented in `LICENSE` would make sense for source jar, not for binary jar. For binary jar, we have to mention other binary bundled in the j

Re: [PR] Fix NOTICE and LICENSE in the spark-runtime jar [iceberg]

2025-02-04 Thread via GitHub
jbonofre commented on code in PR #12160: URL: https://github.com/apache/iceberg/pull/12160#discussion_r1942337481 ## spark/v3.5/spark-runtime/LICENSE: ## @@ -379,14 +334,6 @@ License: http://www.apache.org/licenses/LICENSE-2.0

Re: [PR] Fix NOTICE and LICENSE in the flink-runtime jar [iceberg]

2025-02-04 Thread via GitHub
jbonofre commented on code in PR #12145: URL: https://github.com/apache/iceberg/pull/12145#discussion_r1942332400 ## flink/v1.20/flink-runtime/LICENSE: ## @@ -508,3 +466,63 @@ This binary artifact contains failsafe. Copyright: Jonathan Halterman and friends Home page: https://

Re: [PR] fix: gurantee the deserialize order of struct is same as the struct type [iceberg-rust]

2025-02-04 Thread via GitHub
ZENOTME commented on PR #795: URL: https://github.com/apache/iceberg-rust/pull/795#issuecomment-2635824232 > Hi, I'm a bit confused about why we need to care about this. [De]serialization is a very format-specific task, and it's really challenging to ensure our implementations meet all form

Re: [PR] backport c++23 std::expected [iceberg-cpp]

2025-02-04 Thread via GitHub
wgtmac commented on PR #40: URL: https://github.com/apache/iceberg-cpp/pull/40#issuecomment-2635820910 Is it possible to check the macro [__cpp_lib_expected](https://en.cppreference.com/w/cpp/feature_test#cpp_lib_expected) and directly use `std::expected` if available?

Re: [PR] #12081: "Add deleteFileThreshold parameter to SizeBasedDataRewriter, update logic, and include tests" [iceberg]

2025-02-04 Thread via GitHub
jangalasriramd7 closed pull request #12133: #12081: "Add deleteFileThreshold parameter to SizeBasedDataRewriter, update logic, and include tests" URL: https://github.com/apache/iceberg/pull/12133

Re: [PR] feat: `SnapshotSummary` outline [iceberg-rust]

2025-02-04 Thread via GitHub
jonathanc-n commented on PR #939: URL: https://github.com/apache/iceberg-rust/pull/939#issuecomment-2635735838 cc @Fokko or others. I'm trying to debug a deserialization issue, and I suspect the server might be encountering a schema mismatch when handling the SnapshotSummary. However, since

Re: [PR] Fix NOTICE and LICENSE in the flink-runtime jar [iceberg]

2025-02-04 Thread via GitHub
jbonofre commented on PR #12145: URL: https://github.com/apache/iceberg/pull/12145#issuecomment-2635717141 > Are we sure that we capture everthing? Unzipping the JAR shows more dependencies, such as datasketches: > > > > ``` > > unzip iceberg-flink-runtime-1.20-1.8.0-SN

Re: [PR] Fix NOTICE and LICENSE in the flink-runtime jar [iceberg]

2025-02-04 Thread via GitHub
jbonofre commented on PR #12145: URL: https://github.com/apache/iceberg/pull/12145#issuecomment-2635715189 > Are we sure that we capture everthing? Unzipping the JAR shows more dependencies, such as datasketches: > > > > ``` > > unzip iceberg-flink-runtime-1.20-1.8.0-SN

Re: [I] Add files to add existing Parquet files to a table [iceberg-rust]

2025-02-04 Thread via GitHub
ZENOTME commented on issue #932: URL: https://github.com/apache/iceberg-rust/issues/932#issuecomment-2635708236 > I would like to try working on this. Thanks @jonathanc-n! Feel free to send the PR for this.

Re: [I] Add V3 type `unknown` [iceberg-python]

2025-02-04 Thread via GitHub
kaushiksrini commented on issue #1553: URL: https://github.com/apache/iceberg-python/issues/1553#issuecomment-2635667189 hey @Fokko, I'd like to work on this if it's available

Re: [PR] Spec: Update partition stats for V3 [iceberg]

2025-02-04 Thread via GitHub
stevenzwu commented on code in PR #12098: URL: https://github.com/apache/iceberg/pull/12098#discussion_r1942216004 ## format/spec.md: ## @@ -927,20 +927,21 @@ These rows must be sorted (in ascending manner with NULL FIRST) by `partition` f The schema of the partition statist

Re: [PR] spec: Remove `source-ids` for `V{1,2}` tables [iceberg]

2025-02-04 Thread via GitHub
advancedxy commented on PR #12161: URL: https://github.com/apache/iceberg/pull/12161#issuecomment-2635591315 @Fokko Thanks for bringing this up. I support allowing multi-arg transforms only from V3 onwards. When the multi-arg transform was first added to the spec, V3 was far from completi

Re: [PR] Spec: Support geo type [iceberg]

2025-02-04 Thread via GitHub
wgtmac commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1942170948 ## format/spec.md: ## @@ -205,15 +205,40 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Add relevant NOTICE portions from ALv2 bundled dependencies [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on PR #12095: URL: https://github.com/apache/iceberg/pull/12095#issuecomment-2635466295 I see the other PRs that replaced this one referenced, but the Spark one is missing: https://github.com/apache/iceberg/pull/12160

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942103932 ## data/src/test/java/org/apache/iceberg/data/avro/TestGenericData.java: ## @@ -76,4 +86,29 @@ protected void writeAndValidate(Schema writeSchema, Schema expectedSche

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942102793 ## core/src/test/java/org/apache/iceberg/avro/TestSchemaConversions.java: ## @@ -370,4 +370,17 @@ public void testFieldDocsArePreserved() { Lists.newArrayList

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942101398 ## core/src/test/java/org/apache/iceberg/avro/TestBuildAvroProjection.java: ## @@ -401,4 +402,31 @@ public void projectMapWithLessFieldInValueSchema() { .as("

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942096627 ## core/src/test/java/org/apache/iceberg/TestSchemaUnionByFieldName.java: ## @@ -75,7 +75,7 @@ private static NestedField[] primitiveFields( optional(

Re: [PR] Add support for `write.data.path` [iceberg-python]

2025-02-04 Thread via GitHub
smaheshwar-pltr commented on code in PR #1611: URL: https://github.com/apache/iceberg-python/pull/1611#discussion_r1942091778 ## tests/table/test_locations.py: ## @@ -133,3 +133,13 @@ def test_hash_injection(data_file_name: str, expected_hash: str) -> None: provider = load

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942093906 ## api/src/main/java/org/apache/iceberg/types/Types.java: ## @@ -61,6 +61,14 @@ private Types() {} private static final Pattern DECIMAL = Pattern.compile("de

Re: [PR] Add support for `write.data.path` [iceberg-python]

2025-02-04 Thread via GitHub
smaheshwar-pltr commented on code in PR #1611: URL: https://github.com/apache/iceberg-python/pull/1611#discussion_r1942086611 ## mkdocs/docs/configuration.md: ## @@ -54,18 +54,19 @@ Iceberg tables support table properties to configure table behavior. ### Write options -| K

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942092261 ## core/src/main/java/org/apache/iceberg/avro/BuildAvroProjection.java: ## @@ -56,6 +56,10 @@ class BuildAvroProjection extends AvroCustomOrderSchemaVisitor names, I

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942090932 ## core/src/main/java/org/apache/iceberg/SchemaParser.java: ## @@ -145,6 +145,8 @@ static void toJson(Type.PrimitiveType primitive, JsonGenerator generator) throws

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942088807 ## api/src/main/java/org/apache/iceberg/types/Types.java: ## @@ -81,6 +82,16 @@ public static PrimitiveType fromPrimitiveString(String typeString) { throw new Il

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942081865 ## core/src/main/java/org/apache/iceberg/SchemaParser.java: ## @@ -42,6 +42,7 @@ private SchemaParser() {} private static final String STRUCT = "struct"; private

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942078866 ## api/src/test/java/org/apache/iceberg/types/TestTypeUtil.java: ## @@ -645,4 +647,37 @@ public void testReassignOrRefreshIdsCaseInsensitive() { requi

[PR] Support Distinct Counts in Manifest [iceberg-python]

2025-02-04 Thread via GitHub
jpugliesi opened a new pull request, #1613: URL: https://github.com/apache/iceberg-python/pull/1613 (no comment)

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942080873 ## api/src/main/java/org/apache/iceberg/types/TypeUtil.java: ## @@ -709,6 +709,10 @@ public T map(Types.MapType map, Supplier keyResult, Supplier valueResult)

Re: [I] On droping table with shared location the data got deleted which should not be case after version 0.14.0 of Apache iceberg [iceberg]

2025-02-04 Thread via GitHub
github-actions[bot] commented on issue #10779: URL: https://github.com/apache/iceberg/issues/10779#issuecomment-2635421871 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] On droping table with shared location the data got deleted which should not be case after version 0.14.0 of Apache iceberg [iceberg]

2025-02-04 Thread via GitHub
github-actions[bot] closed issue #10779: On droping table with shared location the data got deleted which should not be case after version 0.14.0 of Apache iceberg URL: https://github.com/apache/iceberg/issues/10779

Re: [I] Formal verification discovers potential consistency issue [iceberg]

2025-02-04 Thread via GitHub
github-actions[bot] commented on issue #10720: URL: https://github.com/apache/iceberg/issues/10720#issuecomment-2635421818 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Formal verification discovers potential consistency issue [iceberg]

2025-02-04 Thread via GitHub
github-actions[bot] closed issue #10720: Formal verification discovers potential consistency issue URL: https://github.com/apache/iceberg/issues/10720

Re: [I] Do not override finalize [iceberg]

2025-02-04 Thread via GitHub
github-actions[bot] commented on issue #10901: URL: https://github.com/apache/iceberg/issues/10901#issuecomment-2635421963 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Core: add variant type support [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1942077666 ## api/src/test/java/org/apache/iceberg/types/TestTypeUtil.java: ## @@ -645,4 +647,37 @@ public void testReassignOrRefreshIdsCaseInsensitive() { requi

Re: [PR] Fix NOTICE and LICENSE in the spark-runtime jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on PR #12160: URL: https://github.com/apache/iceberg/pull/12160#issuecomment-2635404145 The description for this issue has: > Remove delta (not found in the jar) I don't see the change anymore, but in case I'm missing it: this was included because source code is pa

Re: [PR] Auth Manager API part 4: RESTClient, HTTPClient [iceberg]

2025-02-04 Thread via GitHub
danielcweeks commented on PR #11992: URL: https://github.com/apache/iceberg/pull/11992#issuecomment-2635399088 I feel like there may have been a small misunderstanding with how we handle the initial authSession. We don't want to default it to empty otherwise if someone adds a request like `cl

Re: [PR] Auth Manager API part 4: RESTClient, HTTPClient [iceberg]

2025-02-04 Thread via GitHub
danielcweeks commented on code in PR #11992: URL: https://github.com/apache/iceberg/pull/11992#discussion_r1942068160 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -242,7 +242,10 @@ public void initialize(String name, Map unresolved) { }

Re: [PR] Auth Manager API part 4: RESTClient, HTTPClient [iceberg]

2025-02-04 Thread via GitHub
danielcweeks commented on code in PR #11992: URL: https://github.com/apache/iceberg/pull/11992#discussion_r1942062420 ## aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java: ## @@ -74,7 +75,11 @@ private RESTClient httpClient() { if (null == client) {

Re: [PR] Auth Manager API part 4: RESTClient, HTTPClient [iceberg]

2025-02-04 Thread via GitHub
danielcweeks commented on code in PR #11992: URL: https://github.com/apache/iceberg/pull/11992#discussion_r1942060653 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -192,6 +192,7 @@ private RESTClient httpClient() { HTTPClie

Re: [PR] Fix NOTICE and LICENSE in the spark-runtime jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12160: URL: https://github.com/apache/iceberg/pull/12160#discussion_r1942061139 ## spark/v3.5/spark-runtime/NOTICE: ## @@ -500,9 +363,52 @@ file: This binary artifact includes Project Nessie with the following in its NOTICE file: +| NOTICE app

Re: [PR] Fix NOTICE and LICENSE in the spark-runtime jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12160: URL: https://github.com/apache/iceberg/pull/12160#discussion_r1942059143 ## spark/v3.5/spark-runtime/LICENSE: ## @@ -519,30 +419,6 @@ License: http://www.apache.org/licenses/LICENSE-2.0 --

Re: [PR] Fix NOTICE and LICENSE in the spark-runtime jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12160: URL: https://github.com/apache/iceberg/pull/12160#discussion_r1942058747 ## spark/v3.5/spark-runtime/LICENSE: ## @@ -403,22 +350,6 @@ License: http://www.apache.org/licenses/LICENSE-2.0 --

Re: [PR] Fix NOTICE and LICENSE in the spark-runtime jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12160: URL: https://github.com/apache/iceberg/pull/12160#discussion_r1942057928 ## spark/v3.5/spark-runtime/LICENSE: ## @@ -403,22 +350,6 @@ License: http://www.apache.org/licenses/LICENSE-2.0 --

Re: [PR] Fix NOTICE and LICENSE in the spark-runtime jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12160: URL: https://github.com/apache/iceberg/pull/12160#discussion_r1942057187 ## spark/v3.5/spark-runtime/LICENSE: ## @@ -379,14 +334,6 @@ License: http://www.apache.org/licenses/LICENSE-2.0 --

Re: [PR] Core: Refactor variants to enable moving interfaces to API module [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on PR #12167: URL: https://github.com/apache/iceberg/pull/12167#issuecomment-2635332665 Thanks for reviewing, @amogh-jahagirdar!

Re: [PR] Core: Refactor variants to enable moving interfaces to API module [iceberg]

2025-02-04 Thread via GitHub
amogh-jahagirdar merged PR #12167: URL: https://github.com/apache/iceberg/pull/12167

Re: [PR] Core: Add KLL Datasketch and Hive ColumnStatisticsObj as standard blo… [iceberg]

2025-02-04 Thread via GitHub
danielcweeks commented on PR #8202: URL: https://github.com/apache/iceberg/pull/8202#issuecomment-2635302317 I would agree with @rdblue's comment that there's a lot of stats overlap with what exists elsewhere and I'm not convinced this serialized format is standardized well enough to be use

[PR] Build: Bump mkdocstrings-python from 1.13.0 to 1.14.4 [iceberg-python]

2025-02-04 Thread via GitHub
dependabot[bot] opened a new pull request, #1612: URL: https://github.com/apache/iceberg-python/pull/1612 Bumps [mkdocstrings-python](https://github.com/mkdocstrings/python) from 1.13.0 to 1.14.4. Release notes Sourced from https://github.com/mkdocstrings/python/releases mkdocstr

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1941962764 ## pyiceberg/table/__init__.py: ## @@ -1064,6 +1065,110 @@ def name_mapping(self) -> Optional[NameMapping]: """Return the table's field-id NameMapping."""

Re: [PR] Fix NOTICE and LICENSE in the gcp-bundle jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12144: URL: https://github.com/apache/iceberg/pull/12144#discussion_r1941966254 ## gcp-bundle/LICENSE: ## @@ -325,24 +315,30 @@ License: The Apache Software License, Version 2.0 - http://www.apache.org/licens -

Re: [PR] Implement column projection [iceberg-python]

2025-02-04 Thread via GitHub
gabeiglio commented on code in PR #1443: URL: https://github.com/apache/iceberg-python/pull/1443#discussion_r1941979863 ## pyiceberg/io/pyarrow.py: ## @@ -1216,6 +1218,45 @@ def _field_id(self, field: pa.Field) -> int: return -1 +def _get_column_projection_values( +

Re: [PR] Fix NOTICE and LICENSE in the flink-runtime jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12145: URL: https://github.com/apache/iceberg/pull/12145#discussion_r1941973579 ## flink/v1.20/flink-runtime/LICENSE: ## @@ -464,47 +385,76 @@ License text: -Th

Re: [PR] Fix NOTICE and LICENSE in the flink-runtime jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12145: URL: https://github.com/apache/iceberg/pull/12145#discussion_r1941972173 ## flink/v1.20/flink-runtime/NOTICE: ## @@ -63,29 +63,277 @@ NOTICE file: -This

Re: [PR] Core: Add KLL Datasketch and Hive ColumnStatisticsObj as standard blo… [iceberg]

2025-02-04 Thread via GitHub
deniskuzZ commented on code in PR #8202: URL: https://github.com/apache/iceberg/pull/8202#discussion_r1941767628 ## format/puffin-spec.md: ## @@ -181,6 +181,23 @@ for Puffin v1. [roaring-bitmap-portable-serialization]: https://github.com/RoaringBitmap/RoaringFormatSpec?tab=rea

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2635183500 Thanks for your patience here @mattmartin14 🙌 It is getting late here, and I'll do another round of review to make sure that we haven't missed anything (this is a pretty important fe

Re: [PR] Fix NOTICE and LICENSE in the gcp-bundle jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12144: URL: https://github.com/apache/iceberg/pull/12144#discussion_r1941965760 ## gcp-bundle/LICENSE: ## @@ -220,100 +220,90 @@ License: Apache 2.0 - http://www.apache.org/licenses/LICENSE-2.0

Re: [PR] Fix NOTICE and LICENSE in the gcp-bundle jar [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12144: URL: https://github.com/apache/iceberg/pull/12144#discussion_r1941964907 ## gcp-bundle/LICENSE: ## @@ -325,24 +315,24 @@ License: The Apache Software License, Version 2.0 - http://www.apache.org/licens Review Comment: The license of

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2635180106 > this might be a silly question. does the function you listed in your suedo code above "expression_to_arrow" exist? or is that just a name you made up? So sorry, it is called

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1941961551 ## pyiceberg/table/__init__.py: ## @@ -1064,6 +1064,119 @@ def name_mapping(self) -> Optional[NameMapping]: """Return the table's field-id NameMapping."""

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1941960339 ## pyiceberg/table/__init__.py: ## @@ -1064,6 +1064,119 @@ def name_mapping(self) -> Optional[NameMapping]: """Return the table's field-id NameMapping."""

Re: [PR] [infra] add testpypi nightly build [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on code in PR #1601: URL: https://github.com/apache/iceberg-python/pull/1601#discussion_r1941937605 ## .github/workflows/nightly-pypi-build.yml: ## @@ -0,0 +1,89 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license ag

Re: [PR] [infra] add testpypi nightly build [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on PR #1601: URL: https://github.com/apache/iceberg-python/pull/1601#issuecomment-2635149393 @kevinjqliu This looks great! Thanks for working on it. Very nice how you re-use the build workflow. I left some small comments on how we can maybe simplify manual dispatch, apart fr

Re: [PR] [infra] add testpypi nightly build [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on code in PR #1601: URL: https://github.com/apache/iceberg-python/pull/1601#discussion_r1941939426 ## .github/workflows/nightly-pypi-build.yml: ## @@ -0,0 +1,89 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license ag

Re: [PR] [infra] add testpypi nightly build [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on code in PR #1601: URL: https://github.com/apache/iceberg-python/pull/1601#discussion_r1941938065 ## .github/workflows/pypi-build-artifacts.yml: ## @@ -35,7 +35,7 @@ jobs: runs-on: ${{ matrix.os }} strategy: matrix: -os: [ ubuntu-22.04,

Re: [PR] [infra] add testpypi nightly build [iceberg-python]

2025-02-04 Thread via GitHub
Fokko commented on code in PR #1601: URL: https://github.com/apache/iceberg-python/pull/1601#discussion_r1941936984 ## .github/workflows/nightly-pypi-build.yml: ## @@ -0,0 +1,89 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license ag

Re: [PR] Core: Fix RewriteTablePath Incremental Replication [iceberg]

2025-02-04 Thread via GitHub
barronfuentes commented on PR #12172: URL: https://github.com/apache/iceberg/pull/12172#issuecomment-2635117705 @szehon-ho - Would love to get your feedback on these changes.

[I] Discussion: support append files [iceberg-go]

2025-02-04 Thread via GitHub
laskoviymishka opened a new issue, #287: URL: https://github.com/apache/iceberg-go/issues/287 ### Feature Request / Improvement ### **Feature Request: Implement Append Operation for Iceberg Tables** **Description** Currently, the `iceberg-go` library lacks support fo

Re: [PR] Core: Refactor variants to enable moving interfaces to API module [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12167: URL: https://github.com/apache/iceberg/pull/12167#discussion_r1941900883 ## core/src/main/java/org/apache/iceberg/variants/VariantMetadata.java: ## @@ -34,4 +35,12 @@ public interface VariantMetadata extends Variants.Serialized { /**

Re: [PR] Parquet: Implement Variant readers [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12139: URL: https://github.com/apache/iceberg/pull/12139#discussion_r1941890941 ## api/src/main/java/org/apache/iceberg/types/GetProjectedIds.java: ## @@ -47,7 +47,9 @@ public Set struct(Types.StructType struct, List> fieldResu @Override

[PR] Bump the versions of `site/mkdocs.yml` [iceberg]

2025-02-04 Thread via GitHub
Fokko opened a new pull request, #12175: URL: https://github.com/apache/iceberg/pull/12175 This brings them in line with 1.7.1: https://github.com/apache/iceberg/blob/apache-iceberg-1.7.1/gradle/libs.versions.toml

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
mattmartin14 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2635079402 Alright @Fokko @tscottcoombes1 , some good news. I just pushed an update that removes the dependency of datafusion on the main pyiceberg merge_rows function. My test file sti

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1941894395 ## tests/table/test_merge_rows.py: ## @@ -0,0 +1,380 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] Docs: Fix latest and nightly link on javadoc (according to site README.md) [iceberg]

2025-02-04 Thread via GitHub
jbonofre commented on PR #12023: URL: https://github.com/apache/iceberg/pull/12023#issuecomment-2635069592 @Fokko thanks and no problem for the time it took 😀 we are all very busy and we do our best 😄

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
tscottcoombes1 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1941888231 ## tests/table/test_merge_rows.py: ## @@ -0,0 +1,380 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agree

Re: [PR] Variants: Implement toString [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #12138: URL: https://github.com/apache/iceberg/pull/12138#discussion_r1941887589 ## core/src/main/java/org/apache/iceberg/variants/VariantMetadata.java: ## @@ -34,4 +34,20 @@ public interface VariantMetadata extends Variants.Serialized { /**

Re: [PR] Variants: Implement toString [iceberg]

2025-02-04 Thread via GitHub
rdblue merged PR #12138: URL: https://github.com/apache/iceberg/pull/12138

Re: [PR] Variants: Implement toString [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on PR #12138: URL: https://github.com/apache/iceberg/pull/12138#issuecomment-2635061053 Thanks for the reviews, everyone!

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
mattmartin14 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2635043790 > @mattmartin14 more useful: > > ``` > df_1 = pd.DataFrame({'name': ["tom", "matt"]}) > tbl_1 = pa.Table.from_pandas(df_1) > > df_2 = pd.DataFrame({'name': [

Re: [PR] Core/RewriteFiles: Duplicate Data Bug - Fixed dropping delete files that are still required [iceberg]

2025-02-04 Thread via GitHub
rdblue commented on code in PR #10962: URL: https://github.com/apache/iceberg/pull/10962#discussion_r1941874030 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -363,6 +363,10 @@ private ManifestFile filterManifest( } private boolean canContainD

Re: [PR] Docs: Fix latest and nightly link on javadoc (according to site README.md) [iceberg]

2025-02-04 Thread via GitHub
Fokko merged PR #12023: URL: https://github.com/apache/iceberg/pull/12023 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
tscottcoombes1 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2634966793 @mattmartin14 more useful: ``` df_1 = pd.DataFrame({'name': ["tom", "matt"]}) tbl_1 = pa.Table.from_pandas(df_1) df_2 = pd.DataFrame({'name': ["tom", "harr

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
tscottcoombes1 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2634935042 @mattmartin14 super simple example for pa.filter() ``` import pyarrow as pa import pandas as pd df = pd.DataFrame({"name": ["tom", "matt"]}) tbl = pa.Ta

[PR] Add support for `write.data.path` [iceberg-python]

2025-02-04 Thread via GitHub
Fokko opened a new pull request, #1611: URL: https://github.com/apache/iceberg-python/pull/1611 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

Re: [PR] Core: Add KLL Datasketch and Hive ColumnStatisticsObj as standard blo… [iceberg]

2025-02-04 Thread via GitHub
deniskuzZ commented on code in PR #8202: URL: https://github.com/apache/iceberg/pull/8202#discussion_r1941767628 ## format/puffin-spec.md: ## @@ -181,6 +181,23 @@ for Puffin v1. [roaring-bitmap-portable-serialization]: https://github.com/RoaringBitmap/RoaringFormatSpec?tab=rea

Re: [PR] Implementation of version metadata table for view [iceberg]

2025-02-04 Thread via GitHub
huan233usc commented on PR #12014: URL: https://github.com/apache/iceberg/pull/12014#issuecomment-2634887332 merge conflicts resolved -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

Re: [PR] Core: Add KLL Datasketch and Hive ColumnStatisticsObj as standard blo… [iceberg]

2025-02-04 Thread via GitHub
deniskuzZ commented on code in PR #8202: URL: https://github.com/apache/iceberg/pull/8202#discussion_r1941770061 ## format/puffin-spec.md: ## @@ -181,6 +181,23 @@ for Puffin v1. [roaring-bitmap-portable-serialization]: https://github.com/RoaringBitmap/RoaringFormatSpec?tab=rea

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-04 Thread via GitHub
mattmartin14 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2634871294 > > On generating my test sets of data though, can I use duckdb for that (which appears you already support as a dependency)? > > For tests you can use whatever you like
