Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-08 Thread via GitHub
ZENOTME commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1906731431 ## crates/integrations/datafusion/src/statistics.rs: ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

Re: [PR] Spark 3.5: Avoid deprecated method [iceberg]

2025-01-08 Thread via GitHub
ebyhr commented on code in PR #11874: URL: https://github.com/apache/iceberg/pull/11874#discussion_r1906838956 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -59,106 +59,101 @@ public static T visit(DataType sType, Type

Re: [PR] Core, Spark: Rewrite data files with high delete ratio [iceberg]

2025-01-08 Thread via GitHub
nastra commented on code in PR #11825: URL: https://github.com/apache/iceberg/pull/11825#discussion_r1906840998 ## core/src/main/java/org/apache/iceberg/actions/SizeBasedDataRewriter.java: ## @@ -84,13 +86,30 @@ private boolean shouldRewrite(List group) { return enoughInput

Re: [PR] Flink 1.20: Support default values in Parquet reader [iceberg]

2025-01-08 Thread via GitHub
jbonofre commented on PR #11839: URL: https://github.com/apache/iceberg/pull/11839#issuecomment-2577341761 @rdblue @pvary @RussellSpitzer I'm resuming the work on this PR (about the tests). -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-08 Thread via GitHub
xxchan commented on code in PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#discussion_r1907230588 ## crates/iceberg/src/inspect/snapshots.rs: ## @@ -130,59 +130,14 @@ mod tests { Field { name: "manifest_list", data_type: Utf8, nullable: false, di

Re: [PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2025-01-08 Thread via GitHub
ismailsimsek commented on code in PR #11906: URL: https://github.com/apache/iceberg/pull/11906#discussion_r1907274886 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -589,21 +620,42 @@ private FileURI toFileURI(I input)

[PR] Infra: Add manuzhang to collaborators [iceberg]

2025-01-08 Thread via GitHub
manuzhang opened a new pull request, #11927: URL: https://github.com/apache/iceberg/pull/11927 @samredai has [agreed to be swapped out of the collaborator list](https://github.com/apache/iceberg/pull/11859#discussion_r1907239801). Thanks, Sam! -- This is an automated message from the Apa

Re: [PR] Core: Fix loading a table in CachingCatalog with metadata table name [iceberg]

2025-01-08 Thread via GitHub
nastra commented on PR #11738: URL: https://github.com/apache/iceberg/pull/11738#issuecomment-2577996633 I'll take a look at this in the next few days -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] feat(catalog): Add Catalog Registry [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade commented on code in PR #244: URL: https://github.com/apache/iceberg-go/pull/244#discussion_r1907519396 ## catalog/rest_test.go: ## @@ -114,6 +114,39 @@ func (r *RestCatalogSuite) TestToken200() { r.Equal(r.configVals.Get("warehouse"), "s3://some-bucket") }

Re: [PR] Implemented Remaining Catalog operations for REST catalog [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade commented on PR #240: URL: https://github.com/apache/iceberg-go/pull/240#issuecomment-2578191104 @chil-pavn that would be fantastic thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Implemented Remaining Catalog operations for REST catalog [iceberg-go]

2025-01-08 Thread via GitHub
chil-pavn commented on PR #240: URL: https://github.com/apache/iceberg-go/pull/240#issuecomment-2578201718 @zeroshade Sure, I will take that up. Also, it would be very helpful if we link PRs to the respective issues. -- This is an automated message from the Apache Git Service. To respond

Re: [I] PyIceberg Production Use case survey [iceberg-python]

2025-01-08 Thread via GitHub
vikramsg commented on issue #1202: URL: https://github.com/apache/iceberg-python/issues/1202#issuecomment-2577900498 > These apps used to write messages to Kafka and then Flink would stream them to our data lake, but pyiceberg is simpler, costs less and supports schema evolution better.

Re: [PR] feat(datafusion): support metadata tables for Datafusion [iceberg-rust]

2025-01-08 Thread via GitHub
xxchan commented on code in PR #879: URL: https://github.com/apache/iceberg-rust/pull/879#discussion_r1907342718 ## crates/integrations/datafusion/src/table/metadata_table.rs: ## @@ -0,0 +1,95 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

Re: [PR] Infra: Add manuzhang to collaborators [iceberg]

2025-01-08 Thread via GitHub
samredai commented on code in PR #11859: URL: https://github.com/apache/iceberg/pull/11859#discussion_r1907239801 ## .asf.yaml: ## @@ -54,7 +54,7 @@ github: - SreeramGarlapati - samredai Review Comment: Not a problem! Please feel free to swap my name out instead. I

Re: [PR] Spark 3.5: Implement RewriteTablePath [iceberg]

2025-01-08 Thread via GitHub
szehon-ho merged PR #11555: URL: https://github.com/apache/iceberg/pull/11555 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] Spark 3.5: Implement RewriteTablePath [iceberg]

2025-01-08 Thread via GitHub
szehon-ho commented on PR #11555: URL: https://github.com/apache/iceberg/pull/11555#issuecomment-2577872032 Thanks a lot @flyrain and @dramaticlly for review, we can continue improving this in follow up prs -- This is an automated message from the Apache Git Service. To respond to the mes

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
ajantha-bhat commented on PR #11919: URL: https://github.com/apache/iceberg/pull/11919#issuecomment-2577847304 @rdblue: Thanks a lot for the review. I have addressed all the comments. PR is ready. -- This is an automated message from the Apache Git Service. To respond to the message, pl

[I] Iceberg API is unable to connect to Hive Metastore > 4.0.0-beta-1 [iceberg]

2025-01-08 Thread via GitHub
mderoy opened a new issue, #11928: URL: https://github.com/apache/iceberg/issues/11928 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine None ### Please describe the bug 🐞 between hive metastore 4.0.0-beta-1 and 4.0.0, the hive metastore co

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907642990 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907648464 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907650559 ## format/spec.md: ## @@ -449,7 +454,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | Transform name| Description

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907654445 ## format/spec.md: ## @@ -603,8 +608,9 @@ Notes: 4. Position delete metadata can use `referenced_data_file` when all deletes tracked by the entry are in a single dat

Re: [PR] Call For Proposals Banner.html [iceberg]

2025-01-08 Thread via GitHub
RussellSpitzer commented on code in PR #11924: URL: https://github.com/apache/iceberg/pull/11924#discussion_r1907646692 ## site/overrides/home.html: ## @@ -36,6 +36,15 @@ Apache Iceberg™ The open table format for analytic datasets. +

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907651448 ## format/spec.md: ## @@ -603,8 +608,9 @@ Notes: 4. Position delete metadata can use `referenced_data_file` when all deletes tracked by the entry are in a single dat

Re: [PR] Call For Proposals Banner.html [iceberg]

2025-01-08 Thread via GitHub
RussellSpitzer commented on code in PR #11924: URL: https://github.com/apache/iceberg/pull/11924#discussion_r1907653274 ## site/overrides/home.html: ## @@ -36,6 +36,15 @@ Apache Iceberg™ The open table format for analytic datasets. +

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907662104 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907663335 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907666748 ## format/spec.md: ## @@ -940,9 +946,7 @@ Note that partition data tuple's schema is based on the partition spec output us The unified partition type is a struct con

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907667162 ## format/spec.md: ## @@ -1154,6 +1158,8 @@ Maps with non-string keys must use an array representation with the `map` logica |**`struct`**|`record`|| |**`list`**|`a

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907666113 ## format/spec.md: ## @@ -603,8 +608,9 @@ Notes: 4. Position delete metadata can use `referenced_data_file` when all deletes tracked by the entry are in a single dat

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907668388 ## format/spec.md: ## @@ -1239,6 +1247,9 @@ When reading an `unknown` column, any corresponding column must be ignored and r | **`struct`** | `struct`

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907672639 ## format/spec.md: ## @@ -1480,6 +1494,9 @@ This serialization scheme is for storing single values as individual binary valu | **`struct`** | Not sup

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907674111 ## format/spec.md: ## @@ -1506,6 +1523,8 @@ This serialization scheme is for storing single values as individual binary valu | **`struct`** | **`JSON object by

Re: [I] Forbidden Exception creating Polaris Rest catalog with Flink 1.20 [iceberg]

2025-01-08 Thread via GitHub
shantanu-dahiya commented on issue #11836: URL: https://github.com/apache/iceberg/issues/11836#issuecomment-2578591980 I believe the root cause of this issue is the envoy proxy on Istio sidecars not supporting the `Upgrade: TLS/1.2` header, causing client requests with this header to be [re

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
ajantha-bhat commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1908005218 ## core/src/test/java/org/apache/iceberg/avro/AvroTestHelpers.java: ## @@ -126,9 +139,18 @@ private static void assertEquals(Type type, Object expected, Object

Re: [I] [feature] UpdateSchema.add_column supports both parent and child in the same transaction [iceberg-python]

2025-01-08 Thread via GitHub
kevinjqliu commented on issue #1493: URL: https://github.com/apache/iceberg-python/issues/1493#issuecomment-2579119883 sure @jiakai-li assigned to you! let me know if you have any questions -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [PR] feat(catalog): Add Catalog Registry [iceberg-go]

2025-01-08 Thread via GitHub
kevinjqliu commented on code in PR #244: URL: https://github.com/apache/iceberg-go/pull/244#discussion_r1908133151 ## catalog/rest_test.go: ## @@ -114,6 +114,39 @@ func (r *RestCatalogSuite) TestToken200() { r.Equal(r.configVals.Get("warehouse"), "s3://some-bucket") }

Re: [PR] API: Support removeUnusedSpecs in ExpireSnapshots [iceberg]

2025-01-08 Thread via GitHub
advancedxy commented on PR #10755: URL: https://github.com/apache/iceberg/pull/10755#issuecomment-2579123258 Thanks all for reviewing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [I] Iceberg View Support [iceberg-rust]

2025-01-08 Thread via GitHub
c-thiel commented on issue #55: URL: https://github.com/apache/iceberg-rust/issues/55#issuecomment-2579350785 @liurenjie1024, @Xuanwo, @Fokko, @ZENOTME is any of you aware of someone currently working on the `ViewMetadataBuilder`? Otherwise I would start working on it next week :) -- Thi

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-08 Thread via GitHub
gruuya commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1908289944 ## crates/integrations/datafusion/src/statistics.rs: ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

[PR] Avro: Add variant type support [iceberg]

2025-01-08 Thread via GitHub
XBaith opened a new pull request, #11934: URL: https://github.com/apache/iceberg/pull/11934 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-m

Re: [PR] Spark 3.5: Procedure to rewrite table path [iceberg]

2025-01-08 Thread via GitHub
dramaticlly closed pull request #11931: Spark 3.5: Procedure to rewrite table path URL: https://github.com/apache/iceberg/pull/11931 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [PR] feat: support S3 Table Buckets with S3TablesCatalog [iceberg-python]

2025-01-08 Thread via GitHub
felixscherz commented on code in PR #1429: URL: https://github.com/apache/iceberg-python/pull/1429#discussion_r1908308879 ## pyiceberg/catalog/s3tables.py: ## @@ -0,0 +1,324 @@ +import re +from typing import TYPE_CHECKING, List, Optional, Set, Tuple, Union + +import boto3 + +fro

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-08 Thread via GitHub
liurenjie1024 commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1908170767 ## crates/integrations/datafusion/src/table/mod.rs: ## @@ -41,16 +42,21 @@ pub struct IcebergTableProvider { table: Table, /// Table snapshot id that

Re: [I] feat: Expose Iceberg table statistics in DataFusion interface(s) [iceberg-rust]

2025-01-08 Thread via GitHub
liurenjie1024 commented on issue #869: URL: https://github.com/apache/iceberg-rust/issues/869#issuecomment-2579193342 Thanks @gruuya for doing this, let's continue the discussion in pr. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [I] Manifests table scan should return iceberg schema rather arrow schema [iceberg-rust]

2025-01-08 Thread via GitHub
liurenjie1024 commented on issue #868: URL: https://github.com/apache/iceberg-rust/issues/868#issuecomment-2579192057 The reason I suggest returning iceberg schema is that metadata table is a concept in iceberg library, not only in datafusion integration. The difference is that, iceberg li

Re: [I] Validation Error in ConfigResponse Model with RestCatalog in PyIceberg using Nessie REST API [iceberg]

2025-01-08 Thread via GitHub
heman026 commented on issue #11255: URL: https://github.com/apache/iceberg/issues/11255#issuecomment-2579231141 Hi, I am getting the same error - ValidationError: 'defaults' and 'overrides' fields are missing in the ConfigResponse model. Did you resolve it? -- This is an automated

[PR] Spark 3.5: Refactor delete logic in batch reading [iceberg]

2025-01-08 Thread via GitHub
huaxingao opened a new pull request, #11933: URL: https://github.com/apache/iceberg/pull/11933 Address the comments in https://github.com/apache/iceberg/pull/9841#discussion_r1906083743 -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

Re: [PR] Spark 3.5: Refactor delete logic in batch reading [iceberg]

2025-01-08 Thread via GitHub
huaxingao commented on code in PR #11933: URL: https://github.com/apache/iceberg/pull/11933#discussion_r1908213655 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnVectorBuilder.java: ## @@ -26,13 +26,6 @@ class ColumnVectorBuilder { private

Re: [PR] Parquet: Add readers and writers for the internal object model [iceberg]

2025-01-08 Thread via GitHub
ajantha-bhat commented on code in PR #11904: URL: https://github.com/apache/iceberg/pull/11904#discussion_r1908235843 ## parquet/src/main/java/org/apache/iceberg/data/parquet/InternalWriter.java: ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Parquet: Add readers and writers for the internal object model [iceberg]

2025-01-08 Thread via GitHub
ajantha-bhat commented on code in PR #11904: URL: https://github.com/apache/iceberg/pull/11904#discussion_r1908250543 ## parquet/src/main/java/org/apache/iceberg/data/parquet/InternalReader.java: ## @@ -0,0 +1,207 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[I] A casting error occurs when Sanitizing the expression value in a specific case. [iceberg]

2025-01-08 Thread via GitHub
dmgkeke opened a new issue, #11932: URL: https://github.com/apache/iceberg/issues/11932 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine Flink ### Please describe the bug 🐞 I found a code suspected of being a bug while running rewrite data

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-08 Thread via GitHub
liurenjie1024 commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1908168716 ## crates/integrations/datafusion/src/statistics.rs: ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

Re: [PR] feat(catalog): Add Catalog Registry [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade commented on code in PR #244: URL: https://github.com/apache/iceberg-go/pull/244#discussion_r1907510409 ## catalog/glue.go: ## @@ -54,6 +57,50 @@ var ( _ Catalog = (*GlueCatalog)(nil) ) +func init() { + Register("glue", RegistrarFunc(func(_ string, pro

Re: [PR] Infra: Add manuzhang to collaborators [iceberg]

2025-01-08 Thread via GitHub
amogh-jahagirdar merged PR #11927: URL: https://github.com/apache/iceberg/pull/11927 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@

Re: [PR] Implemented Remaining Catalog operations for REST catalog [iceberg-go]

2025-01-08 Thread via GitHub
chil-pavn commented on PR #240: URL: https://github.com/apache/iceberg-go/pull/240#issuecomment-2578183988 Hey @zeroshade, it appears there was already PR #146 for the same table operations, which also got merged. Should I work on the unit tests, as I could see that was not included in the

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2025-01-08 Thread via GitHub
smaheshwar-pltr commented on PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#issuecomment-2578213037 Sounds good, thanks Kevin for your excellent review here! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2025-01-08 Thread via GitHub
smaheshwar-pltr commented on code in PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#discussion_r1907541402 ## tests/integration/test_partitioning_key.py: ## @@ -721,6 +753,27 @@ VALUES (CAST('2023-01-01 11:55:59.99' AS TIMESTAMP),

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-08 Thread via GitHub
xxchan commented on code in PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#discussion_r1907428200 ## crates/iceberg/src/inspect/entries.rs: ## @@ -0,0 +1,671 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreemen

Re: [PR] Nit fixes to URL-encoding of partition field names [iceberg-python]

2025-01-08 Thread via GitHub
smaheshwar-pltr commented on code in PR #1499: URL: https://github.com/apache/iceberg-python/pull/1499#discussion_r1907586459 ## tests/integration/test_partitioning_key.py: ## @@ -18,15 +18,15 @@ import uuid from datetime import date, datetime, timedelta, timezone from decima

[PR] Nit fixes to URL-encoding of partition field names [iceberg-python]

2025-01-08 Thread via GitHub
smaheshwar-pltr opened a new pull request, #1499: URL: https://github.com/apache/iceberg-python/pull/1499 Follow-up to #1457 that addresses nits on that PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1907584770 ## core/src/test/java/org/apache/iceberg/avro/TestInternalAvro.java: ## @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Nit fixes to URL-encoding of partition field names [iceberg-python]

2025-01-08 Thread via GitHub
smaheshwar-pltr commented on code in PR #1499: URL: https://github.com/apache/iceberg-python/pull/1499#discussion_r1907586150 ## pyiceberg/partitioning.py: ## @@ -237,8 +237,7 @@ def partition_to_path(self, data: Record, schema: Schema) -> str: value_str = quote_pl

Re: [PR] Implemented Remaining Catalog operations for REST catalog [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade commented on PR #240: URL: https://github.com/apache/iceberg-go/pull/240#issuecomment-2578303311 Ideally we should definitely be doing that, I'll definitely admit to my own mistakes in not doing so lately. :( -- This is an automated message from the Apache Git Service. To respon

Re: [I] Implement remaining operations for Glue catalog [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade closed issue #64: Implement remaining operations for Glue catalog URL: https://github.com/apache/iceberg-go/issues/64 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2025-01-08 Thread via GitHub
smaheshwar-pltr commented on code in PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#discussion_r1907587480 ## tests/integration/test_partitioning_key.py: ## @@ -721,6 +753,27 @@ VALUES (CAST('2023-01-01 11:55:59.99' AS TIMESTAMP),

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2025-01-08 Thread via GitHub
smaheshwar-pltr commented on PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#issuecomment-2578300672 > Going to merge this PR as is and we can deal with nit comment as a followup I've put up #1499 for the nits. -- This is an automated message from the Apache Git Se

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1907587135 ## core/src/test/java/org/apache/iceberg/avro/AvroTestHelpers.java: ## @@ -78,6 +79,18 @@ static void assertEquals(Types.StructType struct, Record expected, Record ac

Re: [PR] Modified exception objects being thrown when converting Pyarrow tables [iceberg-python]

2025-01-08 Thread via GitHub
kevinjqliu commented on code in PR #1498: URL: https://github.com/apache/iceberg-python/pull/1498#discussion_r1907568373 ## pyiceberg/io/pyarrow.py: ## @@ -1140,6 +1147,12 @@ def map(self, map_type: pa.MapType, key_result: IcebergType, value_result: Icebe return MapTyp

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1907589600 ## core/src/test/java/org/apache/iceberg/avro/AvroTestHelpers.java: ## @@ -126,9 +139,18 @@ private static void assertEquals(Type type, Object expected, Object actual

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1907595448 ## core/src/test/java/org/apache/iceberg/avro/AvroTestHelpers.java: ## @@ -126,9 +139,18 @@ private static void assertEquals(Type type, Object expected, Object actual

Re: [I] [feature] Add support for `write.data.path` and `write.metadata.path` [iceberg-python]

2025-01-08 Thread via GitHub
smaheshwar-pltr commented on issue #1492: URL: https://github.com/apache/iceberg-python/issues/1492#issuecomment-2578312852 Thanks for volunteering @jiakai-li! Happy to review the `LocationProvider`-related changes for `write.data.path` if it'd help 😄 -- This is an automated message fro

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2025-01-08 Thread via GitHub
stevenzwu commented on PR #11781: URL: https://github.com/apache/iceberg/pull/11781#issuecomment-2578314363 BTW, I like the new direction that @RussellSpitzer outlined. using byte size (instead of number of elements) is more intuitive and easier to calculate a good default to cap memory foo

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-08 Thread via GitHub
rshkv commented on PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#issuecomment-2577942180 I've rebased on #870 and #872 to address the following: * The entries table now lives in a separate `entries.rs` file. * Batches for manifest files are now computed asynchronously.

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-08 Thread via GitHub
xxchan commented on code in PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#discussion_r1907240996 ## crates/iceberg/src/inspect/entries.rs: ## @@ -0,0 +1,671 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreemen

Re: [PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2025-01-08 Thread via GitHub
ismailsimsek commented on code in PR #11906: URL: https://github.com/apache/iceberg/pull/11906#discussion_r1907270693 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRemoveOrphanFilesAction.java: ## @@ -854,12 +867,14 @@ public void testCompareToFileList()

Re: [I] PyIceberg Production Use case survey [iceberg-python]

2025-01-08 Thread via GitHub
nickdelnano commented on issue #1202: URL: https://github.com/apache/iceberg-python/issues/1202#issuecomment-2577929467 @vikramsg I updated my comment a bit, but yes -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-08 Thread via GitHub
rshkv commented on code in PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#discussion_r1907187465 ## crates/iceberg/src/inspect/entries.rs: ## @@ -0,0 +1,671 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-08 Thread via GitHub
rshkv commented on code in PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#discussion_r1907182301 ## crates/iceberg/src/inspect/snapshots.rs: ## @@ -130,59 +130,14 @@ mod tests { Field { name: "manifest_list", data_type: Utf8, nullable: false, dic

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-08 Thread via GitHub
rshkv commented on code in PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#discussion_r1907315872 ## crates/iceberg/src/inspect/entries.rs: ## @@ -0,0 +1,671 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] Impl rest catalog + table updates & requirements [iceberg-go]

2025-01-08 Thread via GitHub
chil-pavn commented on PR #146: URL: https://github.com/apache/iceberg-go/pull/146#issuecomment-2578234507 Part of #63 . @jwtryg let me know if you are already working on the unit tests so that we both don't end up doing the same. -- This is an automated message from the Apache Git Servic

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2025-01-08 Thread via GitHub
kevinjqliu commented on code in PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#discussion_r1907551758 ## tests/integration/test_partitioning_key.py: ## @@ -721,6 +753,27 @@ VALUES (CAST('2023-01-01 11:55:59.99' AS TIMESTAMP), CAS

Re: [PR] Call For Proposals Banner.html [iceberg]

2025-01-08 Thread via GitHub
Nhyi-streamlit commented on code in PR #11924: URL: https://github.com/apache/iceberg/pull/11924#discussion_r1907557513 ## site/overrides/home.html: ## @@ -36,6 +36,15 @@ Apache Iceberg™ The open table format for analytic datasets. +

Re: [PR] Call For Proposals Banner.html [iceberg]

2025-01-08 Thread via GitHub
RussellSpitzer commented on PR #11924: URL: https://github.com/apache/iceberg/pull/11924#issuecomment-2578125032 @Nhyi-streamlit could you paste a preview of what the site looks like with the banner in this pr? -- This is an automated message from the Apache Git Service. To respond to th

Re: [PR] Call For Proposals Banner.html [iceberg]

2025-01-08 Thread via GitHub
RussellSpitzer commented on PR #11924: URL: https://github.com/apache/iceberg/pull/11924#issuecomment-2578135352 ``` File "/Users/rspitzer/repos/iceberg/site/overrides/home.html", line 326, in template {% endblock %} jinja2.exceptions.TemplateSyntaxError: Unexpected end of templat

Re: [PR] Call For Proposals Banner.html [iceberg]

2025-01-08 Thread via GitHub
Nhyi-streamlit commented on code in PR #11924: URL: https://github.com/apache/iceberg/pull/11924#discussion_r1907566551 ## site/overrides/home.html: ## @@ -36,6 +36,15 @@ Apache Iceberg™ The open table format for analytic datasets. +

Re: [PR] Core, Spark: Rewrite data files with high delete ratio [iceberg]

2025-01-08 Thread via GitHub
nastra closed pull request #11825: Core, Spark: Rewrite data files with high delete ratio URL: https://github.com/apache/iceberg/pull/11825 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[PR] Core, Spark: Rewrite data files with high delete ratio [iceberg]

2025-01-08 Thread via GitHub
nastra opened a new pull request, #11825: URL: https://github.com/apache/iceberg/pull/11825 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-m

Re: [I] Implement Other Filesystems Using Go CDK [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade closed issue #92: Implement Other Filesystems Using Go CDK URL: https://github.com/apache/iceberg-go/issues/92 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsu

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1907583554 ## core/src/test/java/org/apache/iceberg/avro/RandomAvroData.java: ## @@ -51,6 +53,66 @@ public static List generate(Schema schema, int numRecords, long seed) {

Re: [I] Implement remaining operations for Glue catalog [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade commented on issue #64: URL: https://github.com/apache/iceberg-go/issues/64#issuecomment-2578293725 Yup! They've been implemented! I'll close this -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [I] Potential improvements to the release/verify rc scripts [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade closed issue #204: Potential improvements to the release/verify rc scripts URL: https://github.com/apache/iceberg-go/issues/204 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2025-01-08 Thread via GitHub
ismailsimsek commented on code in PR #11906: URL: https://github.com/apache/iceberg/pull/11906#discussion_r1907274886 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -589,21 +620,42 @@ private FileURI toFileURI(I input)

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-08 Thread via GitHub
rshkv commented on code in PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#discussion_r1907317767 ## crates/iceberg/src/metadata_scan.rs: ## @@ -128,6 +140,84 @@ impl<'a> SnapshotsTable<'a> { } } +/// Entries table containing the manifest file's entries. R

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907319937 ## format/spec.md: ## @@ -584,8 +589,8 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _optional_
