Re: [PR] feat: support S3 Table Buckets with S3TablesCatalog [iceberg-python]

2025-01-08 Thread via GitHub
felixscherz commented on code in PR #1429: URL: https://github.com/apache/iceberg-python/pull/1429#discussion_r1908308879 ## pyiceberg/catalog/s3tables.py: ## @@ -0,0 +1,324 @@ +import re +from typing import TYPE_CHECKING, List, Optional, Set, Tuple, Union + +import boto3 + +fro

Re: [PR] Spark 3.5: Procedure to rewrite table path [iceberg]

2025-01-08 Thread via GitHub
dramaticlly closed pull request #11931: Spark 3.5: Procedure to rewrite table path URL: https://github.com/apache/iceberg/pull/11931 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[PR] Avro: Add variant type support [iceberg]

2025-01-08 Thread via GitHub
XBaith opened a new pull request, #11934: URL: https://github.com/apache/iceberg/pull/11934 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-m

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-08 Thread via GitHub
gruuya commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1908289944 ## crates/integrations/datafusion/src/statistics.rs: ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [I] Iceberg View Support [iceberg-rust]

2025-01-08 Thread via GitHub
c-thiel commented on issue #55: URL: https://github.com/apache/iceberg-rust/issues/55#issuecomment-2579350785 @liurenjie1024, @Xuanwo, @Fokko, @ZENOTME is any of you aware of someone currently working on the `ViewMetadataBuilder`? Otherwise I would start working on it next week :) -- Thi

Re: [PR] Parquet: Add readers and writers for the internal object model [iceberg]

2025-01-08 Thread via GitHub
ajantha-bhat commented on code in PR #11904: URL: https://github.com/apache/iceberg/pull/11904#discussion_r1908250543 ## parquet/src/main/java/org/apache/iceberg/data/parquet/InternalReader.java: ## @@ -0,0 +1,207 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Parquet: Add readers and writers for the internal object model [iceberg]

2025-01-08 Thread via GitHub
ajantha-bhat commented on code in PR #11904: URL: https://github.com/apache/iceberg/pull/11904#discussion_r1908235843 ## parquet/src/main/java/org/apache/iceberg/data/parquet/InternalWriter.java: ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Spark 3.5: Refactor delete logic in batch reading [iceberg]

2025-01-08 Thread via GitHub
huaxingao commented on code in PR #11933: URL: https://github.com/apache/iceberg/pull/11933#discussion_r1908213655 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnVectorBuilder.java: ## @@ -26,13 +26,6 @@ class ColumnVectorBuilder { private

[PR] Spark 3.5: Refactor delete logic in batch reading [iceberg]

2025-01-08 Thread via GitHub
huaxingao opened a new pull request, #11933: URL: https://github.com/apache/iceberg/pull/11933 Address the comments in https://github.com/apache/iceberg/pull/9841#discussion_r1906083743 -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

Re: [I] Validation Error in ConfigResponse Model with RestCatalog in PyIceberg using Nessie REST API [iceberg]

2025-01-08 Thread via GitHub
heman026 commented on issue #11255: URL: https://github.com/apache/iceberg/issues/11255#issuecomment-2579231141 Hi I am getting the same error - ValidationError: 'defaults' and 'overrides' fields are missing in the ConfigResponse model. Did you resolve it. -- This is an automated

[I] A casting error occurs when Sanitizing the expression value in a specific case. [iceberg]

2025-01-08 Thread via GitHub
dmgkeke opened a new issue, #11932: URL: https://github.com/apache/iceberg/issues/11932 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine Flink ### Please describe the bug 🐞 I found a code suspected of being a bug while running rewrite data

Re: [I] Manifests table scan should return iceberg schema rather arrow schema [iceberg-rust]

2025-01-08 Thread via GitHub
liurenjie1024 commented on issue #868: URL: https://github.com/apache/iceberg-rust/issues/868#issuecomment-2579192057 The reason I suggest returning iceberg schema is that metadata table is a concept in iceberg library, not only in datafusion integration. The difference is that, iceberg li

Re: [I] feat: Expose Iceberg table statistics in DataFusion interface(s) [iceberg-rust]

2025-01-08 Thread via GitHub
liurenjie1024 commented on issue #869: URL: https://github.com/apache/iceberg-rust/issues/869#issuecomment-2579193342 Thanks @gruuya for doing this, let's continue the discussion in pr. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-08 Thread via GitHub
liurenjie1024 commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1908170767 ## crates/integrations/datafusion/src/table/mod.rs: ## @@ -41,16 +42,21 @@ pub struct IcebergTableProvider { table: Table, /// Table snapshot id that

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-08 Thread via GitHub
liurenjie1024 commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1908168716 ## crates/integrations/datafusion/src/statistics.rs: ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

Re: [PR] feat(catalog): Add Catalog Registry [iceberg-go]

2025-01-08 Thread via GitHub
kevinjqliu commented on code in PR #244: URL: https://github.com/apache/iceberg-go/pull/244#discussion_r1908133151 ## catalog/rest_test.go: ## @@ -114,6 +114,39 @@ func (r *RestCatalogSuite) TestToken200() { r.Equal(r.configVals.Get("warehouse"), "s3://some-bucket") }

Re: [PR] API: Support removeUnusedSpecs in ExpireSnapshots [iceberg]

2025-01-08 Thread via GitHub
advancedxy commented on PR #10755: URL: https://github.com/apache/iceberg/pull/10755#issuecomment-2579123258 Thanks all for reviewing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [I] [feature] UpdateSchema.add_column supports both parent and child in the same transaction [iceberg-python]

2025-01-08 Thread via GitHub
kevinjqliu commented on issue #1493: URL: https://github.com/apache/iceberg-python/issues/1493#issuecomment-2579119883 sure @jiakai-li assigned to you! let me know if you have any questions -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [I] Iceberg API is unable to connect to Hive Metastore > 4.0.0-beta-1 [iceberg]

2025-01-08 Thread via GitHub
manuzhang commented on issue #11928: URL: https://github.com/apache/iceberg/issues/11928#issuecomment-2579111744 We haven't supported connecting to metastore with Hive 4.0 yet. There's ongoing [PR](https://github.com/apache/iceberg/pull/11750) and [discussion](https://lists.apache.org/threa

Re: [PR] Spark 3.5: Procedure to rewrite table path [iceberg]

2025-01-08 Thread via GitHub
dramaticlly commented on PR #11931: URL: https://github.com/apache/iceberg/pull/11931#issuecomment-2579105024 FYI @szehon-ho @flyrain @karuppayya @anuragmantri if you want to take a look -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

[PR] Spark 3.5: Procedure to rewrite table path [iceberg]

2025-01-08 Thread via GitHub
dramaticlly opened a new pull request, #11931: URL: https://github.com/apache/iceberg/pull/11931 Add spark procedure for rewrite table path -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] Support Location Providers [iceberg-python]

2025-01-08 Thread via GitHub
jiakai-li commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1908109298 ## pyiceberg/table/locations.py: ## @@ -0,0 +1,82 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [I] [feature] UpdateSchema.add_column supports both parent and child in the same transaction [iceberg-python]

2025-01-08 Thread via GitHub
jiakai-li commented on issue #1493: URL: https://github.com/apache/iceberg-python/issues/1493#issuecomment-2579086561 I'm happy to pick this up if it's not assigned yet. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [I] [feature] Add support for `write.data.path` and `write.metadata.path` [iceberg-python]

2025-01-08 Thread via GitHub
jiakai-li commented on issue #1492: URL: https://github.com/apache/iceberg-python/issues/1492#issuecomment-2579084317 Hey @smaheshwar-pltr, thanks for the offer! I hadn’t realized this feature is closely tied to LocationProvider. After looking into it, I think it could fit well with the c

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2025-01-08 Thread via GitHub
mun1r0b0t commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2578955279 @bryanck Any further thoughts on this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] Core: Relocate parquet to core [iceberg]

2025-01-08 Thread via GitHub
ajantha-bhat commented on PR #11716: URL: https://github.com/apache/iceberg/pull/11716#issuecomment-2578938991 Currently introducing internal writers for parquet: https://github.com/apache/iceberg/pull/11904 Once it is merged, I will revive/rework the original https://github.com/apa

Re: [PR] Impl rest catalog + table updates & requirements [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade commented on PR #146: URL: https://github.com/apache/iceberg-go/pull/146#issuecomment-2578935791 Btw, since both of you are being active on here, I'd love it if either of you would be able and willing to give a look at the PRs I filed recently (#244 and #245) for more catalog functio

Re: [I] Bucket name getting appended to minIO service name [iceberg-python]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #908: URL: https://github.com/apache/iceberg-python/issues/908#issuecomment-2578932372 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity oc

Re: [PR] Core: Relocate parquet to core [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on PR #11716: URL: https://github.com/apache/iceberg/pull/11716#issuecomment-2578930131 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Flink: Replace use of deprecated methods [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on PR #11658: URL: https://github.com/apache/iceberg/pull/11658#issuecomment-2578930092 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [I] Potential bug in `o.a.i.mapping.MappingUtil.UpdateMapping.addNewFields()` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10596: URL: https://github.com/apache/iceberg/issues/10596#issuecomment-2578929955 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Potential NPE in `o.a.i.orc.OrcValueReaders.StructReader#readInternal` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10594: URL: https://github.com/apache/iceberg/issues/10594#issuecomment-2578929935 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] `o.a.i.util.Tasks.Builder.runSingleThreaded` possbily broken [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10597: `o.a.i.util.Tasks.Builder.runSingleThreaded` possbily broken URL: https://github.com/apache/iceberg/issues/10597 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Flink: Replace use of deprecated methods [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed pull request #11658: Flink: Replace use of deprecated methods URL: https://github.com/apache/iceberg/pull/11658 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on PR #11497: URL: https://github.com/apache/iceberg/pull/11497#issuecomment-2578930024 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Potential NPE in `o.a.i.orc.OrcValueReaders.StructReader#readInternal` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10594: Potential NPE in `o.a.i.orc.OrcValueReaders.StructReader#readInternal` URL: https://github.com/apache/iceberg/issues/10594 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2578930057 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] `o.a.i.util.Tasks.Builder.runSingleThreaded` possbily broken [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10597: URL: https://github.com/apache/iceberg/issues/10597#issuecomment-2578929974 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Java visibility issue for `DataStatisticsSerializer` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10588: Java visibility issue for `DataStatisticsSerializer` URL: https://github.com/apache/iceberg/issues/10588 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [I] Java visibility issue for `org.apache.iceberg.dell.ecs.EcsURI` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10587: Java visibility issue for `org.apache.iceberg.dell.ecs.EcsURI` URL: https://github.com/apache/iceberg/issues/10587 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] Java visibility issue for `org.apache.iceberg.dell.ecs.EcsURI` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10587: URL: https://github.com/apache/iceberg/issues/10587#issuecomment-2578929846 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Potential bug in `o.a.i.mapping.MappingUtil.UpdateMapping.addNewFields()` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10596: Potential bug in `o.a.i.mapping.MappingUtil.UpdateMapping.addNewFields()` URL: https://github.com/apache/iceberg/issues/10596 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

Re: [I] Java visibility issues in `o.a.i.flink.sink.shuffle.AggregatedStatisticsSerializer` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10590: Java visibility issues in `o.a.i.flink.sink.shuffle.AggregatedStatisticsSerializer` URL: https://github.com/apache/iceberg/issues/10590 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [I] Java visibility issue for `SnowflakeTableMetadata` and `SnowflakeIdentifier` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10586: URL: https://github.com/apache/iceberg/issues/10586#issuecomment-2578929830 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Java visibility issues in `o.a.i.flink.sink.shuffle.AggregatedStatisticsSerializer` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10590: URL: https://github.com/apache/iceberg/issues/10590#issuecomment-2578929915 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Java visibility issue for `TableScanContext` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10589: URL: https://github.com/apache/iceberg/issues/10589#issuecomment-2578929889 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Java visibility issue for `DataStatisticsSerializer` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10588: URL: https://github.com/apache/iceberg/issues/10588#issuecomment-2578929873 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Java visibility issue for `TableScanContext` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10589: Java visibility issue for `TableScanContext` URL: https://github.com/apache/iceberg/issues/10589 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [I] Java visibility issue for `SnowflakeTableMetadata` and `SnowflakeIdentifier` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10586: Java visibility issue for `SnowflakeTableMetadata` and `SnowflakeIdentifier` URL: https://github.com/apache/iceberg/issues/10586 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [I] Java visibility issue for `org.apache.iceberg.ManifestEntry` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10585: Java visibility issue for `org.apache.iceberg.ManifestEntry` URL: https://github.com/apache/iceberg/issues/10585 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [I] Java visibility issue for `org.apache.iceberg.ManifestEntry` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10585: URL: https://github.com/apache/iceberg/issues/10585#issuecomment-2578929815 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Java visibility issue for `org.apache.iceberg.expressions.BoundAggregate.Aggregator` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] commented on issue #10584: URL: https://github.com/apache/iceberg/issues/10584#issuecomment-2578929798 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Java visibility issue for `org.apache.iceberg.expressions.BoundAggregate.Aggregator` [iceberg]

2025-01-08 Thread via GitHub
github-actions[bot] closed issue #10584: Java visibility issue for `org.apache.iceberg.expressions.BoundAggregate.Aggregator` URL: https://github.com/apache/iceberg/issues/10584 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
ajantha-bhat commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1908005218 ## core/src/test/java/org/apache/iceberg/avro/AvroTestHelpers.java: ## @@ -126,9 +139,18 @@ private static void assertEquals(Type type, Object expected, Object

[PR] feat(catalog): Initial implementation of sql catalog [iceberg-go]

2025-01-08 Thread via GitHub
zeroshade opened a new pull request, #246: URL: https://github.com/apache/iceberg-go/pull/246 Building on #244 and #245 (both of which would need to be merged before this), this creates an initial implementation of a SQL catalog. We're utilizing the https://bun.uptrace.dev/ ORM to han

Re: [PR] Spark: support statistics files in RewriteTablePath [iceberg]

2025-01-08 Thread via GitHub
dramaticlly commented on code in PR #11929: URL: https://github.com/apache/iceberg/pull/11929#discussion_r1907943810 ## core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java: ## @@ -126,8 +126,7 @@ public static TableMetadata replacePaths( metadata.snapshotLog

Re: [PR] Core: Unimplement Map from CharSequenceMap to obey contract [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on PR #11704: URL: https://github.com/apache/iceberg/pull/11704#issuecomment-2578783272 I'm in favor of documenting these differences and moving on. I think the utility of the Map interface is worth it, and I don't think that the minor issues, like not being able to modify

Re: [PR] Spark: support statistics files in RewriteTablePath [iceberg]

2025-01-08 Thread via GitHub
flyrain commented on code in PR #11929: URL: https://github.com/apache/iceberg/pull/11929#discussion_r1907927060 ## core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java: ## @@ -126,8 +126,7 @@ public static TableMetadata replacePaths( metadata.snapshotLog(),

[PR] API, CORE: Adds Row Lineage Fields [iceberg]

2025-01-08 Thread via GitHub
RussellSpitzer opened a new pull request, #11930: URL: https://github.com/apache/iceberg/pull/11930 https://docs.google.com/document/d/146YuAnU17prnIhyuvbCtCtVSavyd5N7hKryyVRaFDTE/edit?tab=t.0#heading=h.f2e8ffw3fu7n -- This is an automated message from the Apache Git Service. To respond t

Re: [I] Forbidden Exception creating Polaris Rest catalog with Flink 1.20 [iceberg]

2025-01-08 Thread via GitHub
shantanu-dahiya commented on issue #11836: URL: https://github.com/apache/iceberg/issues/11836#issuecomment-2578764964 Client logs for the same error when running Trino with the Iceberg connector: ``` 2025-01-08T20:39:24.124Z ERROR dispatcher-query-22 io.trino.execution.Query

[PR] Spark: support statistics files in RewriteTablePath [iceberg]

2025-01-08 Thread via GitHub
dramaticlly opened a new pull request, #11929: URL: https://github.com/apache/iceberg/pull/11929 Statistics files are helpful to determine the NDV for each column in a table and can be collected via engines like [trino](https://trino.io/docs/current/connector/iceberg.html#updating-table-st

Re: [PR] Impl rest catalog + table updates & requirements [iceberg-go]

2025-01-08 Thread via GitHub
jwtryg commented on PR #146: URL: https://github.com/apache/iceberg-go/pull/146#issuecomment-2578710739 @zeroshade that's super :) @chil-pavn I have only just started - so feel free to go ahead. Will you only be working on the unit tests for the rest catalog operations, however? --

Re: [I] Kafka Connect: Add SMTs for Debezium and AWS DMS [iceberg]

2025-01-08 Thread via GitHub
fuzing commented on issue #10844: URL: https://github.com/apache/iceberg/issues/10844#issuecomment-2578699172 Also looking for the JSONToMap transform from the original Tabular kafka-connect version. I'm wondering why these weren't made part of the contribution to Apache, as many folks are

[PR] [WIP] Feat: replace sort order [iceberg-python]

2025-01-08 Thread via GitHub
JasperHG90 opened a new pull request, #1500: URL: https://github.com/apache/iceberg-python/pull/1500 This PR adds functionality to replace a table's sort order. Closes #1245 Some basic tests are implemented but need to be expanded. Currently, a new sort order ID is assigned. Th

Re: [I] Official iceberg kafka-connect is missing SMTs from original Databricks/Tabular repository [iceberg]

2025-01-08 Thread via GitHub
ismailsimsek commented on issue #11914: URL: https://github.com/apache/iceberg/issues/11914#issuecomment-2578694285 duplicate of https://github.com/apache/iceberg/issues/10844 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-08 Thread via GitHub
gruuya commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1907874949 ## crates/integrations/datafusion/src/table/mod.rs: ## @@ -41,16 +42,21 @@ pub struct IcebergTableProvider { table: Table, /// Table snapshot id that will b

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-08 Thread via GitHub
gruuya commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1907869791 ## crates/integrations/datafusion/src/statistics.rs: ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [PR] Add table statistics [iceberg-python]

2025-01-08 Thread via GitHub
ndrluis commented on PR #1285: URL: https://github.com/apache/iceberg-python/pull/1285#issuecomment-2578671151 @kevinjqliu Done! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

Re: [PR] Modified exception objects being thrown when converting Pyarrow tables [iceberg-python]

2025-01-08 Thread via GitHub
DevChrisCross commented on PR #1498: URL: https://github.com/apache/iceberg-python/pull/1498#issuecomment-2578661029 @kevinjqliu Ah yes I've noticed that part as well, I've initially placed on the `primitive` because based on my understanding, it traverses through the `schema` until it reac

Re: [PR] Add `all_manifests` metadata table with tests [iceberg-python]

2025-01-08 Thread via GitHub
soumya-ghosh commented on PR #1241: URL: https://github.com/apache/iceberg-python/pull/1241#issuecomment-2578625757 @kevinjqliu conflict is resolved. As the PR has been approved but not merged for over a month now, merge conflicts happen occasionally. -- This is an automated messa

Re: [I] Forbidden Exception creating Polaris Rest catalog with Flink 1.20 [iceberg]

2025-01-08 Thread via GitHub
shantanu-dahiya commented on issue #11836: URL: https://github.com/apache/iceberg/issues/11836#issuecomment-2578591980 I believe the root cause of this issue is the envoy proxy on Istio sidecars not supporting the `Upgrade: TLS/1.2` header, causing client requests with this header to be [re

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
RussellSpitzer commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907808596 ## format/spec.md: ## @@ -1480,6 +1494,9 @@ This serialization scheme is for storing single values as individual binary valu | **`struct`** |

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
paleolimbot commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907689826 ## format/spec.md: ## @@ -1633,3 +1652,27 @@ might indicate different snapshot IDs for a specific timestamp. The discrepancie When processing point in time qu

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907683453 ## format/spec.md: ## @@ -1633,3 +1652,27 @@ might indicate different snapshot IDs for a specific timestamp. The discrepancie When processing point in time queries

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907682053 ## format/spec.md: ## @@ -1633,3 +1652,27 @@ might indicate different snapshot IDs for a specific timestamp. The discrepancie When processing point in time queries

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907681619 ## format/spec.md: ## @@ -1633,3 +1652,27 @@ might indicate different snapshot IDs for a specific timestamp. The discrepancie When processing point in time queries

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907680822 ## format/spec.md: ## @@ -1633,3 +1652,27 @@ might indicate different snapshot IDs for a specific timestamp. The discrepancie When processing point in time queries

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907680512 ## format/spec.md: ## @@ -1633,3 +1652,27 @@ might indicate different snapshot IDs for a specific timestamp. The discrepancie When processing point in time queries

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907674111 ## format/spec.md: ## @@ -1506,6 +1523,8 @@ This serialization scheme is for storing single values as individual binary valu | **`struct`** | **`JSON object by

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907672639 ## format/spec.md: ## @@ -1480,6 +1494,9 @@ This serialization scheme is for storing single values as individual binary valu | **`struct`** | Not sup

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907668388 ## format/spec.md: ## @@ -1239,6 +1247,9 @@ When reading an `unknown` column, any corresponding column must be ignored and r | **`struct`** | `struct`

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907667162 ## format/spec.md: ## @@ -1154,6 +1158,8 @@ Maps with non-string keys must use an array representation with the `map` logica |**`struct`**|`record`|| |**`list`**|`a

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907666113 ## format/spec.md: ## @@ -603,8 +608,9 @@ Notes: 4. Position delete metadata can use `referenced_data_file` when all deletes tracked by the entry are in a single dat

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907666748 ## format/spec.md: ## @@ -940,9 +946,7 @@ Note that partition data tuple's schema is based on the partition spec output us The unified partition type is a struct con

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907663335 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907662104 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907648464 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907654445 ## format/spec.md: ## @@ -603,8 +608,9 @@ Notes: 4. Position delete metadata can use `referenced_data_file` when all deletes tracked by the entry are in a single dat

Re: [PR] Call For Proposals Banner.html [iceberg]

2025-01-08 Thread via GitHub
RussellSpitzer commented on code in PR #11924: URL: https://github.com/apache/iceberg/pull/11924#discussion_r1907653274 ## site/overrides/home.html: ## @@ -36,6 +36,15 @@ Apache Iceberg™ The open table format for analytic datasets. +

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907651448 ## format/spec.md: ## @@ -603,8 +608,9 @@ Notes: 4. Position delete metadata can use `referenced_data_file` when all deletes tracked by the entry are in a single dat

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907650559 ## format/spec.md: ## @@ -449,7 +454,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | Transform name| Description

Re: [PR] Call For Proposals Banner.html [iceberg]

2025-01-08 Thread via GitHub
RussellSpitzer commented on code in PR #11924: URL: https://github.com/apache/iceberg/pull/11924#discussion_r1907646692 ## site/overrides/home.html: ## @@ -36,6 +36,15 @@ Apache Iceberg™ The open table format for analytic datasets. +

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907648464 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Spec: Support geo type [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1907642990 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`** |

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2025-01-08 Thread via GitHub
stevenzwu commented on PR #11781: URL: https://github.com/apache/iceberg/pull/11781#issuecomment-2578314363 BTW, I like the new direction that @RussellSpitzer outlined. using byte size (instead of number of elements) is more intuitive and easier to calculate a good default to cap memory foo

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1907595448 ## core/src/test/java/org/apache/iceberg/avro/AvroTestHelpers.java: ## @@ -126,9 +139,18 @@ private static void assertEquals(Type type, Object expected, Object actual

Re: [I] [feature] Add support for `write.data.path` and `write.metadata.path` [iceberg-python]

2025-01-08 Thread via GitHub
smaheshwar-pltr commented on issue #1492: URL: https://github.com/apache/iceberg-python/issues/1492#issuecomment-2578312852 Thanks for volunteering @jiakai-li! Happy to review the `LocationProvider`-related changes for `write.data.path` if it'd help 😄 -- This is an automated message fro

Re: [PR] Modified exception objects being thrown when converting Pyarrow tables [iceberg-python]

2025-01-08 Thread via GitHub
kevinjqliu commented on code in PR #1498: URL: https://github.com/apache/iceberg-python/pull/1498#discussion_r1907568373 ## pyiceberg/io/pyarrow.py: ## @@ -1140,6 +1147,12 @@ def map(self, map_type: pa.MapType, key_result: IcebergType, value_result: Icebe return MapTyp

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-08 Thread via GitHub
rdblue commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1907589600 ## core/src/test/java/org/apache/iceberg/avro/AvroTestHelpers.java: ## @@ -126,9 +139,18 @@ private static void assertEquals(Type type, Object expected, Object actual
