Re: [I] gc.enabled property is set to false by default for Apache Iceberg table created in Nessie Catalog [iceberg]

2024-12-12 Thread via GitHub
yunlou11 commented on issue #9562: URL: https://github.com/apache/iceberg/issues/9562#issuecomment-2540814281 > Yes, Nessie GC tool will clean up the expired or unreferenced data files as well along with Iceberg metadata files. Thank you. ``` select * from nessie.robot.ods_robot_dat

Re: [PR] Bump mypy-boto3-glue from 1.35.74 to 1.35.80 [iceberg-python]

2024-12-12 Thread via GitHub
Fokko merged PR #1428: URL: https://github.com/apache/iceberg-python/pull/1428 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceber

Re: [PR] Move puffin crate contents inside iceberg crate [iceberg-rust]

2024-12-12 Thread via GitHub
Fokko commented on PR #789: URL: https://github.com/apache/iceberg-rust/pull/789#issuecomment-2540805212 I can see value in both. It could be that folks just want to have logic to read the Puffin files, and then they could just use the crate (for example PyIceberg) without having to pull in

[I] REST catalog doesn't return old history if we execute `CREATE OR REPLACE TABLE` statement [iceberg]

2024-12-12 Thread via GitHub
ebyhr opened a new issue, #11777: URL: https://github.com/apache/iceberg/issues/11777 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine Trino ### Please describe the bug 🐞 https://github.com/trinodb/trino/pull/24312 is trying to use `org.ap

[I] Allow reuse of FileIO object in GlueCatalog for manifest caching to work [iceberg]

2024-12-12 Thread via GitHub
mothukur opened a new issue, #11776: URL: https://github.com/apache/iceberg/issues/11776 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine None ### Please describe the bug 🐞 The current GlueCatalog implementation does not allow for the reuse

Re: [PR] fix: return type for year and month transform should be int [iceberg-rust]

2024-12-12 Thread via GitHub
Fokko commented on PR #776: URL: https://github.com/apache/iceberg-rust/pull/776#issuecomment-2540707195 Let's get this in, thanks @xxchan for fixing this, and thanks @sdd for the review 🙌 -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [PR] fix: return type for year and month transform should be int [iceberg-rust]

2024-12-12 Thread via GitHub
Fokko merged PR #776: URL: https://github.com/apache/iceberg-rust/pull/776 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-12 Thread via GitHub
ajantha-bhat commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1883384970 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of

Re: [PR] Spark3.4,3.5: In describe extended view command: fix wrong view catal… [iceberg]

2024-12-12 Thread via GitHub
Ppei-Wang commented on code in PR #11751: URL: https://github.com/apache/iceberg/pull/11751#discussion_r1882185303 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestViews.java: ## @@ -1414,7 +1414,42 @@ public void describeExtendedView() {

[PR] Core: Fix numeric overflow of timestamp nano literal [iceberg]

2024-12-12 Thread via GitHub
ebyhr opened a new pull request, #11775: URL: https://github.com/apache/iceberg/pull/11775 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-ma

Re: [PR] Feat: support aliyun oss backend. [iceberg-go]

2024-12-12 Thread via GitHub
divinerapier commented on PR #216: URL: https://github.com/apache/iceberg-go/pull/216#issuecomment-2540608798 @zeroshade Fixed, PTAL -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

Re: [I] Data loss bug in MergeIntoCommand [iceberg]

2024-12-12 Thread via GitHub
BsoBird commented on issue #11765: URL: https://github.com/apache/iceberg/issues/11765#issuecomment-2540547547 > Shouldn't be any problem i'm aware of. If there are incorrectly applied merges in that situation it would be the validations at commit time are failing. Sir, currently it a

Re: [PR] feat(puffin): Add Puffin crate and CompressionCodec [iceberg-rust]

2024-12-12 Thread via GitHub
fqaiser94 commented on code in PR #745: URL: https://github.com/apache/iceberg-rust/pull/745#discussion_r1883224527 ## crates/puffin/src/compression.rs: ## Review Comment: > I'm thinking how will we deal with PuffinReader/PuffindWriter, which will depend on FileIO?

[PR] Move puffin crate contents inside iceberg crate [iceberg-rust]

2024-12-12 Thread via GitHub
fqaiser94 opened a new pull request, #789: URL: https://github.com/apache/iceberg-rust/pull/789 Part of https://github.com/apache/iceberg-rust/issues/744 # Summary - Move contents of the puffin crate over to the existing iceberg crate - Delete the puffin crate # Context

Re: [PR] refactor: avoid async_trait macro for IcebergWriter and provide extra dyn trait for object safety [iceberg-rust]

2024-12-12 Thread via GitHub
Xuanwo commented on PR #760: URL: https://github.com/apache/iceberg-rust/pull/760#issuecomment-2540446608 Hi, thank you @wenym1 for your work on this, and thanks to @ZENOTME and @liurenjie1024 for their reviews. I'm a bit concerned about the complexity this PR introduces. > One

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-12 Thread via GitHub
zeroshade commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1883188454 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of th

Re: [PR] Spark3.4,3.5: In describe extended view command: fix wrong view catal… [iceberg]

2024-12-12 Thread via GitHub
ebyhr commented on code in PR #11751: URL: https://github.com/apache/iceberg/pull/11751#discussion_r1883190569 ## spark/v3.4/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DescribeV2ViewExec.scala: ## @@ -55,18 +55,20 @@ case class DescribeV2ViewEx

[PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-12 Thread via GitHub
liurenjie1024 opened a new pull request, #11772: URL: https://github.com/apache/iceberg/pull/11772 Add status page for different implemantations. Thread: https://lists.apache.org/thread/ny59d0o1128k9lf7p5hz2z7jshgny8qg Design doc: https://docs.google.com/document/d/1sRsTatGQJJ

Re: [PR] infra: Dismiss stale reviews [iceberg-rust]

2024-12-12 Thread via GitHub
Xuanwo merged PR #779: URL: https://github.com/apache/iceberg-rust/pull/779 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [I] Data loss bug in MergeIntoCommand [iceberg]

2024-12-12 Thread via GitHub
sfc-gh-rspitzer commented on issue #11765: URL: https://github.com/apache/iceberg/issues/11765#issuecomment-2540419261 Shouldn't be any problem i'm aware of. If there are incorrectly applied merges in that situation it would be the validations at commit time are failing. -- This is an a

Re: [PR] API: add hashcode cache in StructType [iceberg]

2024-12-12 Thread via GitHub
wzx140 commented on PR #11764: URL: https://github.com/apache/iceberg/pull/11764#issuecomment-2540416245 > Q: does it completely mitigate the flatness observed ? can you please attach the flame graph now ? Interesting find @wzx140 **Performance Comparison After Adding Cache** |

Re: [I] Performance Regression Caused by Schema Hash in Spark PartitionPruning with Wide Tables [iceberg]

2024-12-12 Thread via GitHub
wzx140 commented on issue #11763: URL: https://github.com/apache/iceberg/issues/11763#issuecomment-2540415360 **Performance Comparison After Adding Cache** | Metric | Before Adding Cache | After Adding Cache | |-|-

Re: [I] Data loss bug in MergeIntoCommand [iceberg]

2024-12-12 Thread via GitHub
BsoBird commented on issue #11765: URL: https://github.com/apache/iceberg/issues/11765#issuecomment-2540372837 > We don't have auto merge schema so I don't think we have the same issue as in the Delta issue (at least not yet). Do you have any more details about the data loss? Sir, if

Re: [PR] Core: Allow adding files to multiple partition specs in FastAppend [iceberg]

2024-12-12 Thread via GitHub
anuragmantri commented on PR #11771: URL: https://github.com/apache/iceberg/pull/11771#issuecomment-2540277299 @aokolnychyi @stevenzwu - Could you please take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

[PR] Core: Allow adding files to multiple partition specs in FastAppend [iceberg]

2024-12-12 Thread via GitHub
anuragmantri opened a new pull request, #11771: URL: https://github.com/apache/iceberg/pull/11771 This PR adds ability to add files to multiple partition specs in FastAppend. Inspired by MergeAppend code. -- This is an automated message from the Apache Git Service. To respond to the messa

Re: [PR] Support partitioning spec during data file rewrites in Spark. [iceberg]

2024-12-12 Thread via GitHub
github-actions[bot] commented on PR #11368: URL: https://github.com/apache/iceberg/pull/11368#issuecomment-2540264936 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Support partitioning spec during data file rewrites in Spark. [iceberg]

2024-12-12 Thread via GitHub
github-actions[bot] closed pull request #11368: Support partitioning spec during data file rewrites in Spark. URL: https://github.com/apache/iceberg/pull/11368 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [I] SnapshotTableProcedure to migrate iceberg tables from one namespace to another [iceberg]

2024-12-12 Thread via GitHub
github-actions[bot] commented on issue #10262: URL: https://github.com/apache/iceberg/issues/10262#issuecomment-2540264645 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Docs: Change to Flink directory for instructions [iceberg]

2024-12-12 Thread via GitHub
github-actions[bot] commented on PR #11031: URL: https://github.com/apache/iceberg/pull/11031#issuecomment-2540264846 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Spark: Dropping partition column from old partition table corrupts entire table [iceberg]

2024-12-12 Thread via GitHub
github-actions[bot] commented on issue #10234: URL: https://github.com/apache/iceberg/issues/10234#issuecomment-2540264612 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] AWS: Add AWS crt client support [iceberg]

2024-12-12 Thread via GitHub
github-actions[bot] closed pull request #10217: AWS: Add AWS crt client support URL: https://github.com/apache/iceberg/pull/10217 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] AWS: Add AWS crt client support [iceberg]

2024-12-12 Thread via GitHub
github-actions[bot] commented on PR #10217: URL: https://github.com/apache/iceberg/pull/10217#issuecomment-2540264566 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-12 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2540225753 Then I suspect it might be the `s3ForcePathStyle` option referred to [here](https://github.com/google/go-cloud/issues/3472). It affected Minio in particular once they moved to s3 V2

[PR] Retry object store reads on temporary errors. [iceberg-rust]

2024-12-12 Thread via GitHub
ryzhyk opened a new pull request, #788: URL: https://github.com/apache/iceberg-rust/pull/788 I noticed that, when reading many parquet files from S3, one of the reads fails occasionally with a temporary error such as "connection closed before message completed". I think such transient failu

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-12 Thread via GitHub
zeroshade commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2540196113 I've been able to replicate and debug the issue myself locally. Aside from needing to make a bunch of changes to fix the prefix, bucket and key strings, I was still unable to get goclo

[PR] Bump mypy-boto3-glue from 1.35.74 to 1.35.80 [iceberg-python]

2024-12-12 Thread via GitHub
dependabot[bot] opened a new pull request, #1428: URL: https://github.com/apache/iceberg-python/pull/1428 Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 1.35.74 to 1.35.80. Commits See full diff in https://github.com/youtype/mypy_boto3_builder/commi

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-12 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2540021664 I did some debugging by copying some of the test scenarios into a regular Go program (if anyone can tell me how to run Delve in VsCode on a test that uses testify please let me know)

Re: [I] Wrong name for parquet page row count min and max stats [iceberg]

2024-12-12 Thread via GitHub
namrathamyske commented on issue #11770: URL: https://github.com/apache/iceberg/issues/11770#issuecomment-2539987558 cc: @stevenzwu @rdblue @nastra -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

Re: [PR] Spec: Support geo type [iceberg]

2024-12-12 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1882776271 ## format/spec.md: ## @@ -1480,6 +1497,9 @@ This serialization scheme is for storing single values as individual binary valu | **`struct`** | Not

Re: [PR] Kafka Connect: Commit coordination [iceberg]

2024-12-12 Thread via GitHub
eshishki commented on code in PR #10351: URL: https://github.com/apache/iceberg/pull/10351#discussion_r1882755111 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/channel/Channel.java: ## @@ -0,0 +1,167 @@ +/* + * Licensed to the Apache Software Foundation

Re: [I] Decouple building and serialization [iceberg-rust]

2024-12-12 Thread via GitHub
Sl1mb0 commented on issue #778: URL: https://github.com/apache/iceberg-rust/issues/778#issuecomment-2539813154 > If I understand correctly if you could provide your own implementation of FileIO, would you be able to make it work? This would avoid the copy. Hmm - this may work, but it'

[I] Wrong name for parquet page row count min and max stats [iceberg]

2024-12-12 Thread via GitHub
namrathamyske opened a new issue, #11770: URL: https://github.com/apache/iceberg/issues/11770 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine None ### Please describe the bug 🐞 In TableProperties, the properties PARQUET_ROW_GROUP_CHECK_MAX

Re: [PR] Impl rest catalog + table updates & requirements [iceberg-go]

2024-12-12 Thread via GitHub
zeroshade commented on code in PR #146: URL: https://github.com/apache/iceberg-go/pull/146#discussion_r1882697400 ## table/metadata.go: ## @@ -80,20 +92,544 @@ type Metadata interface { SnapshotByName(name string) *Snapshot // CurrentSnapshot returns the table's

Re: [PR] REST: AuthManager API [iceberg]

2024-12-12 Thread via GitHub
adutra closed pull request #10753: REST: AuthManager API URL: https://github.com/apache/iceberg/pull/10753 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mai

Re: [PR] Impl rest catalog + table updates & requirements [iceberg-go]

2024-12-12 Thread via GitHub
zeroshade commented on PR #146: URL: https://github.com/apache/iceberg-go/pull/146#issuecomment-2539733353 My apologies for the long delay here, could you resolve the conflicts? I should be able to give this a new pass of review in the next day or so. -- This is an automated message from

Re: [PR] Feat: support aliyun oss backend. [iceberg-go]

2024-12-12 Thread via GitHub
zeroshade commented on code in PR #216: URL: https://github.com/apache/iceberg-go/pull/216#discussion_r1882673169 ## catalog/catalog.go: ## @@ -32,6 +32,12 @@ type CatalogType string type AwsProperties map[string]string +type OSSConfig struct { Review Comment: Can we ad

Re: [PR] Auth Manager API part 1: HTTPRequest, HTTPHeader [iceberg]

2024-12-12 Thread via GitHub
adutra commented on code in PR #11769: URL: https://github.com/apache/iceberg/pull/11769#discussion_r1882683361 ## core/src/main/java/org/apache/iceberg/rest/HTTPHeaders.java: ## @@ -0,0 +1,168 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more con

Re: [PR] Auth Manager API part 1: HTTPRequest, HTTPHeader [iceberg]

2024-12-12 Thread via GitHub
adutra commented on code in PR #11769: URL: https://github.com/apache/iceberg/pull/11769#discussion_r1882682585 ## core/src/main/java/org/apache/iceberg/rest/HTTPHeaders.java: ## @@ -0,0 +1,168 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more con

Re: [PR] REST: AuthManager API [iceberg]

2024-12-12 Thread via GitHub
adutra commented on PR #10753: URL: https://github.com/apache/iceberg/pull/10753#issuecomment-2539723438 As requested, this PR will be split in many ones. The first one is #11769. I'm going to close this one now. -- This is an automated message from the Apache Git Service. To respond to t

Re: [PR] Auth Manager API part 1: HTTPRequest, HTTPHeader [iceberg]

2024-12-12 Thread via GitHub
adutra commented on code in PR #11769: URL: https://github.com/apache/iceberg/pull/11769#discussion_r1882679648 ## core/src/main/java/org/apache/iceberg/rest/HTTPRequest.java: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more cont

[PR] Auth Manager API part 1: HTTPRequest, HTTPHeader [iceberg]

2024-12-12 Thread via GitHub
adutra opened a new pull request, #11769: URL: https://github.com/apache/iceberg/pull/11769 As requested, I'm splitting #10753 in many PRs. This one is the first one. It introduces `HTTPRequest` which is a prerequisite for the `AuthManager` API. -- This is an automated message from the Ap

Re: [PR] API: add hashcode cache in StructType [iceberg]

2024-12-12 Thread via GitHub
singhpk234 commented on PR #11764: URL: https://github.com/apache/iceberg/pull/11764#issuecomment-2539708719 Q: does it completely mitigate the flatness observed ? can you please attach the flame graph now ? Interesting find @wzx140 -- This is an automated message from the Apache Git

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-12 Thread via GitHub
dramaticlly commented on PR #11597: URL: https://github.com/apache/iceberg/pull/11597#issuecomment-2539699512 Thank you @danielcweeks @szehon-ho @pvary @kevinjqliu @gaborkaszab @haizhou-zhao for the review! I will look into similar change for hive view existence check -- This is an autom

Re: [PR] build(deps): bump github.com/stretchr/testify from 1.9.0 to 1.10.0 [iceberg-go]

2024-12-12 Thread via GitHub
zeroshade merged PR #218: URL: https://github.com/apache/iceberg-go/pull/218 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] Fix `release_rc.sh`, use the right artifact file name [iceberg-go]

2024-12-12 Thread via GitHub
zeroshade merged PR #203: URL: https://github.com/apache/iceberg-go/pull/203 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] build(deps): bump golang.org/x/sync from 0.9.0 to 0.10.0 [iceberg-go]

2024-12-12 Thread via GitHub
zeroshade merged PR #223: URL: https://github.com/apache/iceberg-go/pull/223 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] build(deps): bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.67.1 to 1.71.0 [iceberg-go]

2024-12-12 Thread via GitHub
zeroshade merged PR #225: URL: https://github.com/apache/iceberg-go/pull/225 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-12 Thread via GitHub
zeroshade commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2539692409 @loicalleyne looks like the integration tests are failing, unable to read the manifest files from the minio instance. -- This is an automated message from the Apache Git Service. To

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-12 Thread via GitHub
danielcweeks commented on PR #11597: URL: https://github.com/apache/iceberg/pull/11597#issuecomment-2539677480 Thanks @dramaticlly !! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-12 Thread via GitHub
danielcweeks merged PR #11597: URL: https://github.com/apache/iceberg/pull/11597 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceb

Re: [I] ParallelIterable is deadlocking and is generally really complicated [iceberg]

2024-12-12 Thread via GitHub
sopel39 commented on issue #11768: URL: https://github.com/apache/iceberg/issues/11768#issuecomment-2539614928 @alexjo2144 had a fix that tries to workaround this bug https://github.com/trinodb/trino/pull/23321, but it's only mitigates effects rather than fixing core issue -- This is an

[I] ParallelIterable is deadlocking and is generally really complicated [iceberg]

2024-12-12 Thread via GitHub
sopel39 opened a new issue, #11768: URL: https://github.com/apache/iceberg/issues/11768 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine Trino ### Please describe the bug 🐞 `ParallelIterable` implementation is really complicated and has sub

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-12 Thread via GitHub
nastra commented on code in PR #11756: URL: https://github.com/apache/iceberg/pull/11756#discussion_r1882572295 ## core/src/main/java/org/apache/iceberg/rest/Endpoint.java: ## @@ -46,6 +46,8 @@ public class Endpoint { Endpoint.create("POST", ResourcePaths.V1_NAMESPACE_PRO

Re: [PR] Core: Use HEAD request to check if namespace exists [iceberg]

2024-12-12 Thread via GitHub
nastra merged PR #11761: URL: https://github.com/apache/iceberg/pull/11761 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Core: Use HEAD request to check if view exists [iceberg]

2024-12-12 Thread via GitHub
nastra merged PR #11760: URL: https://github.com/apache/iceberg/pull/11760 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Spec: add variant type [iceberg]

2024-12-12 Thread via GitHub
emkornfield commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1882513910 ## format/spec.md: ## @@ -182,6 +182,21 @@ A **`list`** is a collection of values with some element type. The element field A **`map`** is a collection of key

[I] Misleading use of LoadTableResponse in RESTTableOperations.commit() [iceberg]

2024-12-12 Thread via GitHub
creechy opened a new issue, #11767: URL: https://github.com/apache/iceberg/issues/11767 ### Apache Iceberg version None ### Query engine None ### Please describe the bug 🐞 This is a little nit-picky, but the Iceberg REST spec defines the response of the Upd

Re: [PR] Hive: Add Hive 4 support and remove Hive 3 [iceberg]

2024-12-12 Thread via GitHub
nastra commented on code in PR #11750: URL: https://github.com/apache/iceberg/pull/11750#discussion_r1882459582 ## gradle/libs.versions.toml: ## @@ -139,10 +139,10 @@ hive2-exec = { module = "org.apache.hive:hive-exec", version.ref = "hive2" } hive2-metastore = { module = "org

Re: [PR] Core: Change Delete granularity to file for new tables [iceberg]

2024-12-12 Thread via GitHub
amogh-jahagirdar commented on code in PR #11478: URL: https://github.com/apache/iceberg/pull/11478#discussion_r1882456464 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMerge.java: ## @@ -231,7 +233,6 @@ public void testMergeWithVectorizedRe

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-12 Thread via GitHub
dwilson1988 commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2539362190 @loicalleyne - you should just be able to manually remove conflicts in `go.mod`, delete `go.sum` and run a `go mod tidy`. Probably best to do this after syncing your fork and rebasin

Re: [PR] Core: Use HEAD request to check if view exists [iceberg]

2024-12-12 Thread via GitHub
nastra commented on PR #11760: URL: https://github.com/apache/iceberg/pull/11760#issuecomment-2539361082 @amogh-jahagirdar good point, I've added a test to `TestRESTViewCatalog` as that's the more appropriate place to check this (instead of `TestRESTCatalog`) -- This is an automated messa

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-12 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2539346597 @zeroshade > In the meantime can you resolve the conflict in the go.mod? Thanks! I tried updating the `go.mod` version and toolchain versions to match `main` and ran `go mod ti

Re: [I] Data loss bug in MergeIntoCommand [iceberg]

2024-12-12 Thread via GitHub
RussellSpitzer commented on issue #11765: URL: https://github.com/apache/iceberg/issues/11765#issuecomment-2539320431 We don't have auto merge schema so I don't think we have the same issue as in the Delta issue (at least not yet). Do you have any more details about the data loss? -- Thi

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-12 Thread via GitHub
ajreid21 commented on code in PR #11756: URL: https://github.com/apache/iceberg/pull/11756#discussion_r1882395425 ## core/src/main/java/org/apache/iceberg/rest/Endpoint.java: ## @@ -61,6 +63,19 @@ public class Endpoint { Endpoint.create("POST", ResourcePaths.V1_TABLE_REGI

Re: [PR] Spec: Support geo type [iceberg]

2024-12-12 Thread via GitHub
paleolimbot commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1882353225 ## format/spec.md: ## @@ -603,6 +608,10 @@ Notes: 4. Position delete metadata can use `referenced_data_file` when all deletes tracked by the entry are in a sing

Re: [PR] Spec: Support geo type [iceberg]

2024-12-12 Thread via GitHub
paleolimbot commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1882346427 ## format/spec.md: ## @@ -1480,6 +1497,9 @@ This serialization scheme is for storing single values as individual binary valu | **`struct`** | No

Re: [PR] Spec: Support geo type [iceberg]

2024-12-12 Thread via GitHub
paleolimbot commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1882340902 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`**

Re: [PR] feat: eagerly project the arrow schema to scope out non-selected fields [iceberg-rust]

2024-12-12 Thread via GitHub
gruuya commented on PR #785: URL: https://github.com/apache/iceberg-rust/pull/785#issuecomment-2539092128 > let's see if others have any concerns. Thanks! I've revised the test (with a slightly contrived example) seeing that Int8 example is now support with #787 (which addresses my im

Re: [I] Discussion: Support conversion of Arrow `Int8` and `Int16` to `PrimitiveType::Int` [iceberg-rust]

2024-12-12 Thread via GitHub
Fokko closed issue #783: Discussion: Support conversion of Arrow `Int8` and `Int16` to `PrimitiveType::Int` URL: https://github.com/apache/iceberg-rust/issues/783 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

Re: [PR] Suport conversion of Arrow Int8 and Int16 to Iceberg Int [iceberg-rust]

2024-12-12 Thread via GitHub
Fokko merged PR #787: URL: https://github.com/apache/iceberg-rust/pull/787 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [I] Discussion: Support conversion of Arrow `Int8` and `Int16` to `PrimitiveType::Int` [iceberg-rust]

2024-12-12 Thread via GitHub
gruuya commented on issue #783: URL: https://github.com/apache/iceberg-rust/issues/783#issuecomment-2539028722 Nice, thanks! I've opened a corresponding PR (also took the liberty of adding support for the new Utf8View as well). -- This is an automated message from the Apache Git Se

Re: [I] Eagerly project arrow schema when calculating the parquet `ProjectionMask` [iceberg-rust]

2024-12-12 Thread via GitHub
Fokko commented on issue #784: URL: https://github.com/apache/iceberg-rust/issues/784#issuecomment-2539010624 Thanks @gruuya That would be a great addition. I think we should only project the needed fields anyway. I was doing some testing along the same line, and also noticed that we

Re: [PR] Add more integration tests [iceberg-rust]

2024-12-12 Thread via GitHub
Fokko commented on code in PR #786: URL: https://github.com/apache/iceberg-rust/pull/786#discussion_r1882162789 ## crates/integration_tests/tests/read_evolved_schema.rs: ## @@ -0,0 +1,80 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

[I] Freshness aware table loading in REST catalog [iceberg]

2024-12-12 Thread via GitHub
gaborkaszab opened a new issue, #11766: URL: https://github.com/apache/iceberg/issues/11766 ### Proposed Change There are clients of the Iceberg table format (e.g. query engines) that cache table metadata. In order to keep the cache up-to-date they implement different mechanisms like

Re: [I] Discussion: Support conversion of Arrow `Int8` and `Int16` to `PrimitiveType::Int` [iceberg-rust]

2024-12-12 Thread via GitHub
Fokko commented on issue #783: URL: https://github.com/apache/iceberg-rust/issues/783#issuecomment-2538992416 @gruuya I think that would be fine to cast those into an int 👍 In PyIceberg we do the same: https://github.com/apache/iceberg-python/blob/547d881948dfe17c92bdde9e5b63a94d095a1

[PR] Add more integration tests [iceberg-rust]

2024-12-12 Thread via GitHub
Fokko opened a new pull request, #786: URL: https://github.com/apache/iceberg-rust/pull/786 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-m

[PR] Eagerly project the arrow schema to scope out non-selected fields [iceberg-rust]

2024-12-12 Thread via GitHub
gruuya opened a new pull request, #785: URL: https://github.com/apache/iceberg-rust/pull/785 Closes #784. Make use of the projected fields to scope down the arrow schema, and thus potentially avoid some conversions which are not supported yet. -- This is an automated message from t

Re: [PR] feat: Expose disable_config_load opendal S3 option [iceberg-rust]

2024-12-12 Thread via GitHub
Xuanwo merged PR #782: URL: https://github.com/apache/iceberg-rust/pull/782 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [I] Expose `disable_config_load` opendal S3config option [iceberg-rust]

2024-12-12 Thread via GitHub
Xuanwo closed issue #781: Expose `disable_config_load` opendal S3config option URL: https://github.com/apache/iceberg-rust/issues/781 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[I] Data loss bug in MergeIntoCommand [iceberg]

2024-12-12 Thread via GitHub
BsoBird opened a new issue, #11765: URL: https://github.com/apache/iceberg/issues/11765 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine Spark ### Please describe the bug 🐞 Recently, I've noticed that when using the merge into statement in

[I] Performance Regression Caused by Schema Hash in Spark PartitionPruning with Wide Tables [iceberg]

2024-12-12 Thread via GitHub
wzx140 opened a new issue, #11763: URL: https://github.com/apache/iceberg/issues/11763 ### Apache Iceberg version 1.5.0 ### Query engine Spark ### Please describe the bug 🐞 **Description**: In Spark’s optimization rule *PartitionPruning*, the method `Spa

[I] Discussion: Support conversion of Arrow `Int8` and `Int16` to `PrimitiveType::Int` [iceberg-rust]

2024-12-12 Thread via GitHub
gruuya opened a new issue, #783: URL: https://github.com/apache/iceberg-rust/issues/783 Presently only `Int32` is converted into the corresponding Iceberg type `Int` https://github.com/apache/iceberg-rust/blob/42aff04658a00b390122260dbbeaf512d11af61f/crates/iceberg/src/arrow/schema.rs#L370

[I] Rename the partition field and add a field with the same name as the old partition field GOT ERROR [iceberg]

2024-12-12 Thread via GitHub
madeirak opened a new issue, #11762: URL: https://github.com/apache/iceberg/issues/11762 ### Apache Iceberg version 1.4.3 ### Query engine Spark ### Please describe the bug 🐞 ``` CREATE TABLE db03.test_123 ( id INT COMMENT '11', name STRING COM

Re: [PR] Add clang format [iceberg-cpp]

2024-12-12 Thread via GitHub
Fokko commented on PR #4: URL: https://github.com/apache/iceberg-cpp/pull/4#issuecomment-2538644674 Let's move this forward, thanks everyone for chiming in here! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] Add clang format [iceberg-cpp]

2024-12-12 Thread via GitHub
Fokko merged PR #4: URL: https://github.com/apache/iceberg-cpp/pull/4 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-12 Thread via GitHub
nastra commented on code in PR #11756: URL: https://github.com/apache/iceberg/pull/11756#discussion_r1881884126 ## core/src/main/java/org/apache/iceberg/rest/Endpoint.java: ## @@ -46,6 +46,8 @@ public class Endpoint { Endpoint.create("POST", ResourcePaths.V1_NAMESPACE_PRO

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-12 Thread via GitHub
nastra commented on code in PR #11756: URL: https://github.com/apache/iceberg/pull/11756#discussion_r1881888246 ## core/src/main/java/org/apache/iceberg/rest/Endpoint.java: ## @@ -61,6 +63,19 @@ public class Endpoint { Endpoint.create("POST", ResourcePaths.V1_TABLE_REGIST
