Re: [PR] Core: Allow servers to express supported endpoints via endpoint field in ConfigResponse [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #10929: URL: https://github.com/apache/iceberg/pull/10929#discussion_r1758309153 ## core/src/main/java/org/apache/iceberg/rest/Endpoint.java: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contrib

Re: [PR] Core: Allow servers to express supported endpoints via endpoint field in ConfigResponse [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #10929: URL: https://github.com/apache/iceberg/pull/10929#discussion_r1757340312 ## core/src/main/java/org/apache/iceberg/rest/RESTTableOperations.java: ## @@ -97,12 +102,14 @@ public TableMetadata current() { @Override public TableMetadata

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758340280 ## core/src/main/java/org/apache/iceberg/CatalogUtil.java: ## @@ -137,6 +138,18 @@ public static void dropTableData(FileIO io, TableMetadata metadata) { deleteFile

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758346413 ## core/src/main/java/org/apache/iceberg/CatalogUtil.java: ## @@ -137,6 +138,18 @@ public static void dropTableData(FileIO io, TableMetadata metadata) { deleteFile

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758353596 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -229,12 +328,18 @@ private List listIcebergTables( .collect(Collectors.toList())

Re: [I] Does iceberg have a plan to support Multi-Statement and Multi-Table Transactions ? [iceberg]

2024-09-13 Thread via GitHub
HemantMarve commented on issue #1074: URL: https://github.com/apache/iceberg/issues/1074#issuecomment-2348269381 Hi @rdblue, has any engine started supporting multi-table transactions? -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] feat (datafusion integration): convert datafusion expr filters to Iceberg Predicate [iceberg-rust]

2024-09-13 Thread via GitHub
a-agmon commented on PR #588: URL: https://github.com/apache/iceberg-rust/pull/588#issuecomment-2348277213 Hi @liurenjie1024 , just fixed a few conflicts due to https://github.com/apache/iceberg-rust/pull/594 (@FANNG1) Should be good now

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nk1506 commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758368163 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -229,12 +328,18 @@ private List listIcebergTables( .collect(Collectors.toList())

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758427065 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -229,12 +328,18 @@ private List listIcebergTables( .collect(Collectors.toList())

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758453980 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758455344 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758455791 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758457669 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [I] Iceberg data file Not Found but have an entry in table.files catalog [iceberg]

2024-09-13 Thread via GitHub
dorsegal commented on issue #8338: URL: https://github.com/apache/iceberg/issues/8338#issuecomment-2348401004 > Is there a particular reason that you're using Hadoop's `S3AFileSystem`? You could switch to using `S3FileIO` when using Minio. Is it possible that `s3a://XX/data/event_ts

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758464245 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758467635 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758468636 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758471587 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758473323 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [I] Failing to create a table using pyiceberg [iceberg-python]

2024-09-13 Thread via GitHub
ArijitSinghEDA commented on issue #692: URL: https://github.com/apache/iceberg-python/issues/692#issuecomment-2348421783 Hello all, I am facing the same issue when trying to use SQL Catalog with PostgreSQL. ``` catalog = load_catalog( "default", **{

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758492168 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758507077 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758509477 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1758534281 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1758556272 ## open-api/rest-catalog-open-api.yaml: ## @@ -3129,6 +3204,11 @@ components: - `s3.secret-access-key`: secret for credentials that provide access to data i

[PR] Core: Fix caching table with metadata table names [iceberg]

2024-09-13 Thread via GitHub
manuzhang opened a new pull request, #11123: URL: https://github.com/apache/iceberg/pull/11123 (no comment)

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nk1506 commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758592399 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[I] what does value of partition mean in table dbxxx.tbxxx.partitions? [iceberg]

2024-09-13 Thread via GitHub
madeirak opened a new issue, #11125: URL: https://github.com/apache/iceberg/issues/11125 ### Query engine spark sql iceberg=1.4.3 ### Question From the below picture, it can be seen that the value of `hour(time)` is displayed as 473356 in the dbxx.tbxx.sections table.

[I] AWS: Glue ETL Job fails to create a table using lakeformation [iceberg]

2024-09-13 Thread via GitHub
Rizxcviii opened a new issue, #11126: URL: https://github.com/apache/iceberg/issues/11126 ### Apache Iceberg version 1.6.0 ### Query engine Spark ### Please describe the bug 🐞 I have a similar issue to #10226, but given the issue title, I wasn't sure whethe

Re: [PR] fix: SIGSEGV when describe empty table [iceberg-go]

2024-09-13 Thread via GitHub
alex-kar commented on PR #145: URL: https://github.com/apache/iceberg-go/pull/145#issuecomment-2348837415 @zeroshade Same result ``` func ExampleDescribeTable() { meta, _ := table.ParseMetadataBytes([]byte(ExampleTableMetadataV1)) table := table.New([]string{"t"}, meta,

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758806275 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,472 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758809463 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,472 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758812696 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,472 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758816947 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,472 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1758817493 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCommits.java: ## @@ -0,0 +1,472 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nk1506 commented on PR #9852: URL: https://github.com/apache/iceberg/pull/9852#issuecomment-2349077935 Build is failing because of flaky issue https://github.com/apache/iceberg/issues/11046

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra closed pull request #9852: Hive: Add View support for HIVE catalog URL: https://github.com/apache/iceberg/pull/9852

[PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nk1506 opened a new pull request, #9852: URL: https://github.com/apache/iceberg/pull/9852 These changes include: 1. Introduction of a common metadata interface (BaseMetadata) for tables and views. 2. Refactoring of HiveTableOperation to share common code for table and view commits. Ref

Re: [PR] Core: Allow servers to express supported endpoints via endpoint field in ConfigResponse [iceberg]

2024-09-13 Thread via GitHub
danielcweeks commented on code in PR #10929: URL: https://github.com/apache/iceberg/pull/10929#discussion_r1759056952 ## core/src/main/java/org/apache/iceberg/rest/RESTTableOperations.java: ## @@ -97,12 +102,14 @@ public TableMetadata current() { @Override public TableMe

Re: [PR] Add new connection pool interface for JdbcCatalog [iceberg]

2024-09-13 Thread via GitHub
alessandro-nori closed pull request #11127: Add new connection pool interface for JdbcCatalog URL: https://github.com/apache/iceberg/pull/11127

Re: [PR] Spec: Add v3 types and type promotion [iceberg]

2024-09-13 Thread via GitHub
aihuaxu commented on code in PR #10955: URL: https://github.com/apache/iceberg/pull/10955#discussion_r1759094008 ## format/spec.md: ## @@ -950,6 +977,7 @@ Maps with non-string keys must use an array representation with the `map` logica |**`uuid`**|`{ "type": "fixed",`  `"size"

Re: [PR] Core: Allow servers to express supported endpoints via endpoint field in ConfigResponse [iceberg]

2024-09-13 Thread via GitHub
nastra commented on PR #10929: URL: https://github.com/apache/iceberg/pull/10929#issuecomment-2349293620 I'll go ahead and merge this, thanks everyone for the feedback/reviews

Re: [PR] Core: Allow servers to express supported endpoints via endpoint field in ConfigResponse [iceberg]

2024-09-13 Thread via GitHub
nastra merged PR #10929: URL: https://github.com/apache/iceberg/pull/10929

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra commented on PR #9852: URL: https://github.com/apache/iceberg/pull/9852#issuecomment-2349315113 Thanks @nk1506 for your patience here. Also thanks to everyone who helped out with reviews.

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nastra merged PR #9852: URL: https://github.com/apache/iceberg/pull/9852

Re: [PR] [bug] [REST] Dont remove identifier root [iceberg-python]

2024-09-13 Thread via GitHub
kevinjqliu merged PR #1172: URL: https://github.com/apache/iceberg-python/pull/1172

Re: [PR] [CI] Run on different platforms (ubuntu/windows/mac/mac m1) [iceberg-python]

2024-09-13 Thread via GitHub
kevinjqliu closed pull request #1173: [CI] Run on different platforms (ubuntu/windows/mac/mac m1) URL: https://github.com/apache/iceberg-python/pull/1173

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-09-13 Thread via GitHub
nk1506 commented on PR #9852: URL: https://github.com/apache/iceberg/pull/9852#issuecomment-2349377256 Thanks @nastra and @danielcweeks for your reviews and feedback.

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-09-13 Thread via GitHub
flyrain commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1759207112 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

Re: [PR] fix: SIGSEGV when describe empty table [iceberg-go]

2024-09-13 Thread via GitHub
zeroshade commented on PR #145: URL: https://github.com/apache/iceberg-go/pull/145#issuecomment-2349659281 Okay, in that case I think it's fine to merge this as is, without the tests for now, until we can figure out how to write tests using `pterm`. Thanks @alex-kar and feel free to cre

Re: [PR] Spark 3.4: Action to compute table stats [iceberg]

2024-09-13 Thread via GitHub
szehon-ho merged PR #11106: URL: https://github.com/apache/iceberg/pull/11106

Re: [PR] Spark 3.4: Action to compute table stats [iceberg]

2024-09-13 Thread via GitHub
szehon-ho commented on PR #11106: URL: https://github.com/apache/iceberg/pull/11106#issuecomment-2349803095 Merged, thanks @karuppayya and all for additional review.

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-09-13 Thread via GitHub
ookumuso commented on code in PR #2: URL: https://github.com/apache/iceberg/pull/2#discussion_r1759322782 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3LocationProvider.java: ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-09-13 Thread via GitHub
ookumuso commented on code in PR #2: URL: https://github.com/apache/iceberg/pull/2#discussion_r1759323116 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3LocationProvider.java: ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-09-13 Thread via GitHub
ookumuso commented on PR #2: URL: https://github.com/apache/iceberg/pull/2#issuecomment-2349883643 > Left some comments but a had a possibly naive question to just check my understanding: > > In the past for object storage provider, we've used a wider character set in the has

Re: [PR] Spark 3.4: Add utility to load table state reliably [iceberg]

2024-09-13 Thread via GitHub
dramaticlly commented on PR #5: URL: https://github.com/apache/iceberg/pull/5#issuecomment-2350004226 Rebased on #11106, as it is now the clean backport of #10984. @szehon-ho I think this is worth backporting to Spark 3.4 as well.

[I] Minimum required pyarrow version [iceberg-python]

2024-09-13 Thread via GitHub
gli-chris-hao opened a new issue, #1174: URL: https://github.com/apache/iceberg-python/issues/1174 ### Apache Iceberg version 0.7.1 (latest release) ### Please describe the bug 🐞 1. pyiceberg calls pyarrow `concat_tables` with `promote_options` here: https://github.com/a

Re: [I] [feat] add missing metadata tables [iceberg-python]

2024-09-13 Thread via GitHub
soumya-ghosh commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2350097049 From [spark docs](https://iceberg.apache.org/docs/latest/spark-queries/#all-metadata-tables), > These tables are unions of the metadata tables specific to the cu

Re: [I] [feat] add missing metadata tables [iceberg-python]

2024-09-13 Thread via GitHub
kevinjqliu commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2350320703 I see. So if I have a new table and append to it 5 times, I expect 5 snapshots and 5 manifest list files. I think each manifest list file will repeatedly refer to the same

Re: [PR] Flink: Custom partitioner for bucket partitions [iceberg]

2024-09-13 Thread via GitHub
stevenzwu commented on PR #7161: URL: https://github.com/apache/iceberg/pull/7161#issuecomment-2350416841 @binshuohu Currently, there is no plan to reapply this change to the main branch. We have a more general range distribution available now (guided by statistics collection): https://ice

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-09-13 Thread via GitHub
danielcweeks commented on code in PR #2: URL: https://github.com/apache/iceberg/pull/2#discussion_r1759559783 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3LocationProvider.java: ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-09-13 Thread via GitHub
danielcweeks commented on code in PR #2: URL: https://github.com/apache/iceberg/pull/2#discussion_r1759561633 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3LocationProvider.java: ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-09-13 Thread via GitHub
danielcweeks commented on code in PR #2: URL: https://github.com/apache/iceberg/pull/2#discussion_r1759567105 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3LocationProvider.java: ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-09-13 Thread via GitHub
wypoon commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1759573665 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -63,33 +60,43 @@ protected CloseableIterable doPlanFiles( return CloseableIter

Re: [PR] Spec: Add v3 types and type promotion [iceberg]

2024-09-13 Thread via GitHub
rdblue commented on code in PR #10955: URL: https://github.com/apache/iceberg/pull/10955#discussion_r1759573632 ## format/spec.md: ## @@ -950,6 +977,7 @@ Maps with non-string keys must use an array representation with the `map` logica |**`uuid`**|`{ "type": "fixed",`  `"size":

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-09-13 Thread via GitHub
RussellSpitzer commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1759577755 ## format/spec.md: ## @@ -298,16 +298,137 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns i

Re: [I] Inconsistent row count across versions [iceberg-python]

2024-09-13 Thread via GitHub
sungwy commented on issue #1132: URL: https://github.com/apache/iceberg-python/issues/1132#issuecomment-2350527753 Hi @daturkel and @dev-goyal I was finally able to find the root cause and put up a fix for this issue on this PR: https://github.com/apache/iceberg-python/pull/1141. Would you

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-09-13 Thread via GitHub
RussellSpitzer commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1759578432 ## format/spec.md: ## @@ -298,16 +298,137 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns i

Re: [I] what does value of partition mean in table dbxxx.tbxxx.partitions? [iceberg]

2024-09-13 Thread via GitHub
amogh-jahagirdar commented on issue #11125: URL: https://github.com/apache/iceberg/issues/11125#issuecomment-2350541820 Hey @madeirak you can check out the partition transform portion of the spec to determine how these values are obtained, https://iceberg.apache.org/spec/#partition-transfor
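
As an illustration of the reply above: the partition transforms section of the Iceberg spec defines the result of the `hour` transform as the number of hours from 1970-01-01 00:00:00. The sketch below decodes the value 473356 quoted in the question, assuming the timestamp is interpreted in UTC; the helper names are hypothetical and not part of any Iceberg API.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch in UTC; the `hour` transform counts whole hours from this instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def hour_ordinal_to_datetime(hours_since_epoch: int) -> datetime:
    """Hypothetical helper: map an `hour` partition value to the start of that hour."""
    return EPOCH + timedelta(hours=hours_since_epoch)


def datetime_to_hour_ordinal(ts: datetime) -> int:
    """Hypothetical helper: whole hours elapsed between the epoch and `ts`."""
    return (ts - EPOCH) // timedelta(hours=1)


print(hour_ordinal_to_datetime(473356).isoformat())
# prints 2024-01-01T04:00:00+00:00
```

The `day`, `month`, and `year` transforms can be read the same way, as counts of days, months, or years from the same epoch.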

Re: [I] what does value of partition mean in table dbxxx.tbxxx.partitions? [iceberg]

2024-09-13 Thread via GitHub
amogh-jahagirdar closed issue #11125: what does value of partition mean in table dbxxx.tbxxx.partitions? URL: https://github.com/apache/iceberg/issues/11125

Re: [I] what does value of partition mean in table dbxxx.tbxxx.partitions? [iceberg]

2024-09-13 Thread via GitHub
amogh-jahagirdar commented on issue #11125: URL: https://github.com/apache/iceberg/issues/11125#issuecomment-2350547094 I'm going to go ahead and close, feel free to reopen if you have any more questions!

[PR] Bump pypa/cibuildwheel from 2.20.0 to 2.21.0 [iceberg-python]

2024-09-13 Thread via GitHub
dependabot[bot] opened a new pull request, #1175: URL: https://github.com/apache/iceberg-python/pull/1175 Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.20.0 to 2.21.0. Release notes Sourced from https://github.com/pypa/cibuildwheel/releases pypa/cibuildwh

Re: [PR] Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large [iceberg-python]

2024-09-13 Thread via GitHub
kevinjqliu commented on PR #1141: URL: https://github.com/apache/iceberg-python/pull/1141#issuecomment-2350674380 > In cases where a single RecordBatch -> a Table -> multiple RecordBatches because of how Arrow automatically chunks a Table into multiple RecordBatches, we would lose the remai

Re: [PR] Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large [iceberg-python]

2024-09-13 Thread via GitHub
kevinjqliu commented on code in PR #1141: URL: https://github.com/apache/iceberg-python/pull/1141#discussion_r1759604254 ## pyiceberg/io/pyarrow.py: ## @@ -1251,10 +1253,17 @@ def _task_to_record_batches( arrow_table = arrow_table.filter(pyarrow_filter)

Re: [PR] Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large [iceberg-python]

2024-09-13 Thread via GitHub
sungwy commented on code in PR #1141: URL: https://github.com/apache/iceberg-python/pull/1141#discussion_r1759607957 ## pyiceberg/io/pyarrow.py: ## @@ -1238,10 +1238,12 @@ def _task_to_record_batches( for batch in batches: next_index = next_index + len(batc

Re: [PR] Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large [iceberg-python]

2024-09-13 Thread via GitHub
sungwy commented on code in PR #1141: URL: https://github.com/apache/iceberg-python/pull/1141#discussion_r1759608422 ## pyiceberg/io/pyarrow.py: ## @@ -1251,10 +1253,17 @@ def _task_to_record_batches( arrow_table = arrow_table.filter(pyarrow_filter)

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-09-13 Thread via GitHub
amogh-jahagirdar commented on PR #11131: URL: https://github.com/apache/iceberg/pull/11131#issuecomment-2350698759 Publishing a draft so I can test against entire CI. I need to think more about a good way to benchmark this and if there's even more reasonable optimizations that I can do here

Re: [PR] Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large [iceberg-python]

2024-09-13 Thread via GitHub
sungwy commented on code in PR #1141: URL: https://github.com/apache/iceberg-python/pull/1141#discussion_r1759608913 ## pyiceberg/io/pyarrow.py: ## @@ -1251,10 +1253,17 @@ def _task_to_record_batches( arrow_table = arrow_table.filter(pyarrow_filter)

Re: [PR] Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large [iceberg-python]

2024-09-13 Thread via GitHub
sungwy commented on PR #1141: URL: https://github.com/apache/iceberg-python/pull/1141#issuecomment-2350699585 > > In cases where a single RecordBatch -> a Table -> multiple RecordBatches because of how Arrow automatically chunks a Table into multiple RecordBatches, we would lose the remaini

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-09-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1759609664 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -304,6 +309,11 @@ private ManifestFile filterManifest(Schema tableSchema, Manifes

Re: [PR] Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large [iceberg-python]

2024-09-13 Thread via GitHub
kevinjqliu commented on code in PR #1141: URL: https://github.com/apache/iceberg-python/pull/1141#discussion_r1759614689 ## pyiceberg/io/pyarrow.py: ## @@ -1238,10 +1238,12 @@ def _task_to_record_batches( for batch in batches: next_index = next_index + len(

Re: [PR] Bug Fix: Position Deletes + row_filter yields less data when the DataFile is large [iceberg-python]

2024-09-13 Thread via GitHub
kevinjqliu commented on code in PR #1141: URL: https://github.com/apache/iceberg-python/pull/1141#discussion_r1759615542 ## pyiceberg/io/pyarrow.py: ## @@ -1251,10 +1253,17 @@ def _task_to_record_batches( arrow_table = arrow_table.filter(pyarrow_filter)

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-09-13 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1759616249 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -304,6 +309,11 @@ private ManifestFile filterManifest(Schema tableSchema, Manifes

Re: [PR] Spec: Support geo type [iceberg]

2024-09-13 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1759616678 ## format/spec.md: ## @@ -198,6 +199,9 @@ Notes: - Timestamp values _with time zone_ represent a point in time: values are stored as UTC and do not retain a s

Re: [PR] Core: Add internal Avro reader [iceberg]

2024-09-13 Thread via GitHub
rdblue commented on PR #11108: URL: https://github.com/apache/iceberg/pull/11108#issuecomment-2350721960 I've separated out the first commit as https://github.com/apache/iceberg/pull/11132. This still includes it but should merge cleanly if merged after #11132. -- This is an automated me

Re: [I] Cannot set a custom location for path based tables [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on issue #8377: URL: https://github.com/apache/iceberg/issues/8377#issuecomment-2350730557 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] SPJ joins in the outer join component of MERGE queries [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on issue #8387: URL: https://github.com/apache/iceberg/issues/8387#issuecomment-2350730608 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Move field into place when adding during schema evolution [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on PR #8409: URL: https://github.com/apache/iceberg/pull/8409#issuecomment-2350730649 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Flink Iceberg [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on issue #8417: URL: https://github.com/apache/iceberg/issues/8417#issuecomment-2350730702 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] BaseSparkAction should not override `spark.jobGroup.id` property [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on issue #8422: URL: https://github.com/apache/iceberg/issues/8422#issuecomment-2350730735 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Spark 3.4: Only override the JobGroupInfo property if it does not exist [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on PR #8423: URL: https://github.com/apache/iceberg/pull/8423#issuecomment-2350730756 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Apache Iceberg - Update one record in the table doubles the number of files in the whole table [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on issue #8378: URL: https://github.com/apache/iceberg/issues/8378#issuecomment-2350730570 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Spark 3.4: Incremental scan specify branch [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on PR #8384: URL: https://github.com/apache/iceberg/pull/8384#issuecomment-2350730590 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Spec: deprecate distinct_counts in data_file [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on PR #8395: URL: https://github.com/apache/iceberg/pull/8395#issuecomment-2350730619 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] HDFS BlockMissingException when trying to read Hive data using Iceberg jars with erasure coding enabled [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on issue #8399: URL: https://github.com/apache/iceberg/issues/8399#issuecomment-2350730636 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Does expireSnapshotId delete older snapshots data files? [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on issue #8410: URL: https://github.com/apache/iceberg/issues/8410#issuecomment-2350730658 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Fix Iceberg to handle literal short and byte [iceberg]

2024-09-13 Thread via GitHub
github-actions[bot] commented on PR #8412: URL: https://github.com/apache/iceberg/pull/8412#issuecomment-2350730670 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull
