Re: [I] Cannot create table if location is s3 with "secure" Minio server [iceberg-python]

2024-03-21 Thread via GitHub
thinkORo commented on issue #540: URL: https://github.com/apache/iceberg-python/issues/540#issuecomment-2014459327 I double-checked my configuration with a second tool, DuckDB. With DuckDB I can access the data in MinIO with the same credentials without any problems. These are my corr

Re: [I] Implement Glue Catalog [iceberg-rust]

2024-03-21 Thread via GitHub
marvinlanhenke commented on issue #249: URL: https://github.com/apache/iceberg-rust/issues/249#issuecomment-2014383639 @liurenjie1024 ... I started prototyping on this, here is what I'm trying to do: ## Tasks: - [ ] Setup basic structure + test infra - [ ] Implement namespace ope

Re: [I] `system.add_files` utility does not support updated Partition Spec [iceberg]

2024-03-21 Thread via GitHub
amogh-jahagirdar commented on issue #10008: URL: https://github.com/apache/iceberg/issues/10008#issuecomment-2014209251 I looked into this a bit and I think I know the problem. Here's a sample test that can be added to `TestAddFilesProcedure` to repro ``` @TestTemplate publi

Re: [I] Inheriting the sequence number from manifest list when load manifest [iceberg-rust]

2024-03-21 Thread via GitHub
ZENOTME commented on issue #286: URL: https://github.com/apache/iceberg-rust/issues/286#issuecomment-2014191692 Seems there isn't an explicit test case for this. I can add it later. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

Re: [I] Inheriting the sequence number from manifest list when load manifest [iceberg-rust]

2024-03-21 Thread via GitHub
ZENOTME commented on issue #286: URL: https://github.com/apache/iceberg-rust/issues/286#issuecomment-2014189442 Yes! Sorry, I didn't notice that before.

Re: [I] Implement transforms projection [iceberg-rust]

2024-03-21 Thread via GitHub
liurenjie1024 commented on issue #289: URL: https://github.com/apache/iceberg-rust/issues/289#issuecomment-2014179697 > > And we need to port the project() method from the transforms > > For this one, it looks like the same as #264. Exactly, #264 is blocked by #283 , and @ZENOTM

Re: [I] Inheriting the sequence number from manifest list when load manifest [iceberg-rust]

2024-03-21 Thread via GitHub
viirya commented on issue #286: URL: https://github.com/apache/iceberg-rust/issues/286#issuecomment-2014180148 Hmm, I think we already make it inherited? https://github.com/apache/iceberg-rust/blob/39aafdd2ea69968213e94534b5864fd595a6034a/crates/iceberg/src/spec/manifest_list.rs#L650

[PR] Rename function name to `add_manifests` [iceberg-rust]

2024-03-21 Thread via GitHub
viirya opened a new pull request, #293: URL: https://github.com/apache/iceberg-rust/pull/293 While I'm looking into #286, I found there is one function `add_manifest_entries` which is confusing to me. In fact, it writes manifest instead of manifest entries. As I think manifest entrie

Re: [I] Implement transforms projection [iceberg-rust]

2024-03-21 Thread via GitHub
viirya commented on issue #289: URL: https://github.com/apache/iceberg-rust/issues/289#issuecomment-2014152579 > And we need to port the project() method from the transforms For this one, it looks like the same as #264.

Re: [PR] feat: init iceberg writer [iceberg-rust]

2024-03-21 Thread via GitHub
liurenjie1024 commented on code in PR #275: URL: https://github.com/apache/iceberg-rust/pull/275#discussion_r1534923644 ## crates/iceberg/src/writer/base_writer/data_file_writer.rs: ## @@ -0,0 +1,310 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more c

Re: [I] Implement transforms projection [iceberg-rust]

2024-03-21 Thread via GitHub
liurenjie1024 commented on issue #289: URL: https://github.com/apache/iceberg-rust/issues/289#issuecomment-2014134792 I think this issue includes several things, which are tracked in https://github.com/apache/iceberg-rust/issues/153.

Re: [I] Convert row filter to arrow filter [iceberg-rust]

2024-03-21 Thread via GitHub
liurenjie1024 commented on issue #265: URL: https://github.com/apache/iceberg-rust/issues/265#issuecomment-2014127366 I think this depends on the selectivity, and also the implementation. To achieve best performance, the scan reader needs to perform vectorized execution to convert filter to

Re: [I] Implement transforms projection [iceberg-rust]

2024-03-21 Thread via GitHub
viirya commented on issue #289: URL: https://github.com/apache/iceberg-rust/issues/289#issuecomment-2014126242 If not, I will work on this.

Re: [I] Implement transforms projection [iceberg-rust]

2024-03-21 Thread via GitHub
viirya commented on issue #289: URL: https://github.com/apache/iceberg-rust/issues/289#issuecomment-2014125805 Is this a duplicate of #264?

Re: [I] Can snapshot has an optional name? [iceberg]

2024-03-21 Thread via GitHub
github-actions[bot] commented on issue #2231: URL: https://github.com/apache/iceberg/issues/2231#issuecomment-2014085602 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] flink 1.12.0 cannot run iceberg batch mode [iceberg]

2024-03-21 Thread via GitHub
github-actions[bot] commented on issue #2225: URL: https://github.com/apache/iceberg/issues/2225#issuecomment-2014085539 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Core: Prevent duplicate data/delete files [iceberg]

2024-03-21 Thread via GitHub
danielcweeks commented on code in PR #10007: URL: https://github.com/apache/iceberg/pull/10007#discussion_r1534831074 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -43,6 +44,7 @@ class FastAppend extends SnapshotProducer implements AppendFiles { private fin

Re: [I] Support customized header in Rest catalog client [iceberg-rust]

2024-03-21 Thread via GitHub
flyrain commented on issue #292: URL: https://github.com/apache/iceberg-rust/issues/292#issuecomment-2013884314 Assigned to you. Thanks for taking a look!

Re: [I] Make OAuth token server configurable [iceberg-rust]

2024-03-21 Thread via GitHub
flyrain commented on issue #291: URL: https://github.com/apache/iceberg-rust/issues/291#issuecomment-2013878492 Great! Assigned this to you. Thanks!

Re: [I] Support customized header in Rest catalog client [iceberg-rust]

2024-03-21 Thread via GitHub
whynick1 commented on issue #292: URL: https://github.com/apache/iceberg-rust/issues/292#issuecomment-2013878115 @flyrain if nobody is already looking at this, would love to try and take a stab at this?

Re: [I] Make OAuth token server configurable [iceberg-rust]

2024-03-21 Thread via GitHub
whynick1 commented on issue #291: URL: https://github.com/apache/iceberg-rust/issues/291#issuecomment-2013872046 @flyrain if nobody is already looking at this, would love to try and take a stab at this?

Re: [I] Convert row filter to arrow filter [iceberg-rust]

2024-03-21 Thread via GitHub
a-agmon commented on issue #265: URL: https://github.com/apache/iceberg-rust/issues/265#issuecomment-2013869724 Perhaps I am missing something, but I was running [this simple test](https://gist.github.com/a-agmon/65fe8e6f065404f039937befbbfa401e) on a small parquet file (65MB) and a simple

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-21 Thread via GitHub
abmo-x commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1534723838 ## core/src/test/java/org/apache/iceberg/hadoop/HadoopStreamsTest.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-21 Thread via GitHub
stevenzwu commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1534715733 ## core/src/test/java/org/apache/iceberg/hadoop/HadoopStreamsTest.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-21 Thread via GitHub
stevenzwu commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1534711755 ## core/src/test/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] Kafka Connect: Record converters [iceberg]

2024-03-21 Thread via GitHub
bryanck merged PR #9641: URL: https://github.com/apache/iceberg/pull/9641

Re: [I] Convert a StringLiteral into a DecimalLiteral [iceberg-python]

2024-03-21 Thread via GitHub
Fokko commented on issue #538: URL: https://github.com/apache/iceberg-python/issues/538#issuecomment-2013703991 This indeed works:
```python
>>> from pyiceberg.expressions.literals import literal
>>> from pyiceberg.types import DecimalType
>>> literal("100.00").to(DecimalType(10,2))
DecimalLiteral(Decimal('100.00'))
```
I ma

Re: [I] Convert a StringLiteral into a DecimalLiteral [iceberg-python]

2024-03-21 Thread via GitHub
Fokko closed issue #538: Convert a StringLiteral into a DecimalLiteral URL: https://github.com/apache/iceberg-python/issues/538

Re: [I] Convert a StringLiteral into a DecimalLiteral [iceberg-python]

2024-03-21 Thread via GitHub
syun64 commented on issue #538: URL: https://github.com/apache/iceberg-python/issues/538#issuecomment-2013684565 I'm making a similar observation as @Dysprosium0626 as well ``` from pyiceberg.expressions.literals import StringLiteral from pyiceberg.types import DecimalType

Re: [PR] `add_files` support partitioned tables [iceberg-python]

2024-03-21 Thread via GitHub
syun64 commented on PR #531: URL: https://github.com/apache/iceberg-python/pull/531#issuecomment-2013668533 > This looks good, thanks again for the work @syun64 Thank you! As always! @Fokko

Re: [PR] `add_files` support partitioned tables [iceberg-python]

2024-03-21 Thread via GitHub
Fokko merged PR #531: URL: https://github.com/apache/iceberg-python/pull/531

Re: [PR] Hive: Use base table metadata to create HiveLock [iceberg]

2024-03-21 Thread via GitHub
pvary commented on PR #10016: URL: https://github.com/apache/iceberg/pull/10016#issuecomment-2013630292 @lirui-apache: please add a new test too, to make sure that this behavior does not change in the future. Thx, Peter

Re: [PR] Add Snapshots table metadata [iceberg-python]

2024-03-21 Thread via GitHub
Fokko merged PR #524: URL: https://github.com/apache/iceberg-python/pull/524

Re: [PR] Add Snapshots table metadata [iceberg-python]

2024-03-21 Thread via GitHub
Fokko commented on PR #524: URL: https://github.com/apache/iceberg-python/pull/524#issuecomment-2013620848 > Just have one question: I was thinking if later we need those metadata table classes, StaticTableScan, and StaticDataTask like what Java did. These may become useful when other engin

[PR] Modify `Bind` calls so that they don't consume `self` and instead return a new struct, leaving the original unmoved [iceberg-rust]

2024-03-21 Thread via GitHub
sdd opened a new pull request, #290: URL: https://github.com/apache/iceberg-rust/pull/290 This is a prerequisite to https://github.com/apache/iceberg-rust/pull/241 and was a part of that PR but has been pulled into its own PR after discussions with @liurenjie1024. The existing Pred

Re: [I] Convert row filter to arrow filter [iceberg-rust]

2024-03-21 Thread via GitHub
viirya commented on issue #265: URL: https://github.com/apache/iceberg-rust/issues/265#issuecomment-2013554756 Hmm, I wonder if the filtering costs too much time on so-called common values? Is the predicate filter very complicated? Normally I think filtering on scan can boost performan

Re: [I] select distinct on table scan [iceberg-python]

2024-03-21 Thread via GitHub
Fokko closed issue #403: select distinct on table scan URL: https://github.com/apache/iceberg-python/issues/403

Re: [I] Delete/Update fails for tables with more than 1000 columns [iceberg]

2024-03-21 Thread via GitHub
xiaoxuandev commented on issue #6368: URL: https://github.com/apache/iceberg/issues/6368#issuecomment-2013502300 Getting a similar error for UPDATE in Iceberg 1.4.3 release, stack trace below: ``` java.lang.AssertionError: Expecting code not to raise a throwable but caught "ja

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-21 Thread via GitHub
abmo-x commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1534534023 ## core/src/test/java/org/apache/iceberg/hadoop/HadoopStreamsTest.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

[I] Cannot create table if location is s3 with "secure" Minio server [iceberg-python]

2024-03-21 Thread via GitHub
thinkORo opened a new issue, #540: URL: https://github.com/apache/iceberg-python/issues/540 ### Apache Iceberg version 0.6.0 (latest release) ### Please describe the bug 🐞 I've created a .pyiceberg.yaml file with the following content: ``` catalog: default:

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-21 Thread via GitHub
abmo-x commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1534520860 ## core/src/test/java/org/apache/iceberg/hadoop/HadoopStreamsTest.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-21 Thread via GitHub
abmo-x commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1534518373 ## core/src/test/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-21 Thread via GitHub
abmo-x commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1534517549 ## core/src/test/java/org/apache/iceberg/hadoop/HadoopStreamsTest.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-21 Thread via GitHub
abmo-x commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1534516571 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopStreams.java: ## @@ -185,8 +185,21 @@ public void flush() throws IOException { @Override public void cl

Re: [I] Implement transforms projection [iceberg-rust]

2024-03-21 Thread via GitHub
marvinlanhenke commented on issue #289: URL: https://github.com/apache/iceberg-rust/issues/289#issuecomment-2013198395 #264 as ref for implementing `project()`

Re: [I] Integrate with datafusion [iceberg-rust]

2024-03-21 Thread via GitHub
marvinlanhenke commented on issue #242: URL: https://github.com/apache/iceberg-rust/issues/242#issuecomment-2013174861 > The datafusion provides the following trait to manage the table: > > * CatalogProviderList > * CatalogProvider > * SchemaProvider > * TableProvider T

Re: [PR] Add Snapshots table metadata [iceberg-python]

2024-03-21 Thread via GitHub
Gowthami03B commented on PR #524: URL: https://github.com/apache/iceberg-python/pull/524#issuecomment-2013119342 @Fokko Can we merge this? I am almost done with "Files" table, so I can rebase my code before creating a PR.

Re: [I] `system.add_files` utility does not support updated Partition Spec [iceberg]

2024-03-21 Thread via GitHub
nastra commented on issue #10008: URL: https://github.com/apache/iceberg/issues/10008#issuecomment-2013093976 Sorry I must have missed step 2 when reading the description. I'll take a closer look and will update the issue once I know more.

Re: [I] `system.add_files` utility does not support updated Partition Spec [iceberg]

2024-03-21 Thread via GitHub
sfc-gh-asudhakar commented on issue #10008: URL: https://github.com/apache/iceberg/issues/10008#issuecomment-2013045412 > @sfc-gh-asudhakar I believe you need to update the schema of the Iceberg table yourself. The [docs](https://iceberg.apache.org/docs/latest/spark-procedures/#add_files) o

Re: [PR] Add local nightly build to test current docs changes [iceberg]

2024-03-21 Thread via GitHub
rdblue commented on code in PR #9943: URL: https://github.com/apache/iceberg/pull/9943#discussion_r1534304249 ## site/nav.yml: ## @@ -21,6 +21,7 @@ nav: - Spark: spark-quickstart.md - Hive: hive-quickstart.md - Docs: +- nightly: '!include docs/docs/nightly/mkdo

Re: [PR] Migrate Scan, Schema and remaining Partition files in Core to JUnit5 [iceberg]

2024-03-21 Thread via GitHub
tomtongue commented on code in PR #10014: URL: https://github.com/apache/iceberg/pull/10014#discussion_r1534284283 ## core/src/test/java/org/apache/iceberg/TestSchemaUpdate.java: ## @@ -1733,22 +1706,19 @@ public void testRemoveIdentifierFields() { .setIdentifierFie

Re: [PR] Migrate Scan, Schema and remaining Partition files in Core to JUnit5 [iceberg]

2024-03-21 Thread via GitHub
nastra merged PR #10014: URL: https://github.com/apache/iceberg/pull/10014

Re: [PR] Kafka Connect: Record converters [iceberg]

2024-03-21 Thread via GitHub
bryanck commented on PR #9641: URL: https://github.com/apache/iceberg/pull/9641#issuecomment-2012955949 I was planning on merging this, unless someone wants to give more feedback, cc @fqaiser94 @danielcweeks

Re: [PR] Migrate Scan, Schema and remaining Partition files in Core to JUnit5 [iceberg]

2024-03-21 Thread via GitHub
tomtongue commented on code in PR #10014: URL: https://github.com/apache/iceberg/pull/10014#discussion_r1534269319 ## core/src/test/java/org/apache/iceberg/TestSchemaUpdate.java: ## @@ -1733,22 +1706,19 @@ public void testRemoveIdentifierFields() { .setIdentifierFie

Re: [PR] Migrate Scan, Schema and remaining Partition files in Core to JUnit5 [iceberg]

2024-03-21 Thread via GitHub
nastra commented on code in PR #10014: URL: https://github.com/apache/iceberg/pull/10014#discussion_r1534263933 ## core/src/test/java/org/apache/iceberg/TestSchemaUpdate.java: ## @@ -1733,22 +1706,19 @@ public void testRemoveIdentifierFields() { .setIdentifierFields

Re: [PR] Migrate Scan, Schema and remaining Partition files in Core to JUnit5 [iceberg]

2024-03-21 Thread via GitHub
tomtongue commented on PR #10014: URL: https://github.com/apache/iceberg/pull/10014#issuecomment-2012876402 @nastra Could you review this PR when you have time? (one more PR should be needed to complete the migration of core files in `org/apache/iceberg` to JUnit5)

Re: [PR] feat: Implement the conversion from Arrow Schema to Iceberg Schema [iceberg-rust]

2024-03-21 Thread via GitHub
viirya commented on PR #258: URL: https://github.com/apache/iceberg-rust/pull/258#issuecomment-2012837449 Thanks @liurenjie1024 @ZENOTME @waynexia @Fokko

Re: [PR] OpenAPI: Express server capabilities via /config endpoint [iceberg]

2024-03-21 Thread via GitHub
snazy commented on code in PR #9940: URL: https://github.com/apache/iceberg/pull/9940#discussion_r1534219425 ## open-api/rest-catalog-open-api.yaml: ## @@ -1559,6 +1578,22 @@ components: type: string description: Properties that should be use

[I] Implement transforms projection [iceberg-rust]

2024-03-21 Thread via GitHub
Fokko opened a new issue, #289: URL: https://github.com/apache/iceberg-rust/issues/289 For evaluating the hidden partition filters, we need to have column projections. For example, this will translate `dt >= 2024-02-01 and dt < 2024-03-01` to the partition filter `month(dt) = 2024-02`.
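
To make the projection idea concrete, here is a minimal Python sketch. It assumes the intended predicate is the half-open range `dt >= 2024-02-01 and dt < 2024-03-01`, and it models the month transform as "months since 1970-01". The helper names (`month_ordinal`, `project_range_to_months`, `format_month`) are hypothetical and are not part of iceberg-rust or any other Iceberg library; this only illustrates how a row-level date range could be widened into a partition-level month range.

```python
from datetime import date, timedelta
from typing import Tuple


def month_ordinal(d: date) -> int:
    """Months elapsed since 1970-01 (assumed model of a month transform)."""
    return (d.year - 1970) * 12 + (d.month - 1)


def project_range_to_months(lower_inclusive: date, upper_exclusive: date) -> Tuple[int, int]:
    """Project `lower <= dt < upper` onto an inclusive range of month ordinals.

    The exclusive upper bound is first turned into the last matching day,
    so that `dt < 2024-03-01` does not pull in the March partition.
    """
    if upper_exclusive <= lower_inclusive:
        raise ValueError("empty date range")
    last_day = upper_exclusive - timedelta(days=1)
    return month_ordinal(lower_inclusive), month_ordinal(last_day)


def format_month(ordinal: int) -> str:
    """Render a month ordinal as YYYY-MM for readability."""
    years, month0 = divmod(ordinal, 12)
    return f"{1970 + years:04d}-{month0 + 1:02d}"


first, last = project_range_to_months(date(2024, 2, 1), date(2024, 3, 1))
print(format_month(first), format_month(last))
```

When both ends of the projected range land on the same month, the partition filter collapses to a single equality, which is the `month(dt) = 2024-02` case from the issue text.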

Re: [PR] Spark 3.2: Support arbitrary scans in SparkBatchQueryScan [iceberg]

2024-03-21 Thread via GitHub
nastra closed pull request #10011: Spark 3.2: Support arbitrary scans in SparkBatchQueryScan URL: https://github.com/apache/iceberg/pull/10011

Re: [PR] Spark 3.2: Add RewritePositionDeleteFilesSparkAction [iceberg]

2024-03-21 Thread via GitHub
nastra closed pull request #10009: Spark 3.2: Add RewritePositionDeleteFilesSparkAction URL: https://github.com/apache/iceberg/pull/10009

Re: [I] Convert row filter to arrow filter [iceberg-rust]

2024-03-21 Thread via GitHub
a-agmon commented on issue #265: URL: https://github.com/apache/iceberg-rust/issues/265#issuecomment-2012751222 Hi @viirya Perhaps a bit off-topic but wondering what you think. I have been testing this a bit, and while I have always seen performance improvements in using `ParquetReco

Re: [I] How to insert overwrite with a single commit [iceberg]

2024-03-21 Thread via GitHub
difin commented on issue #9720: URL: https://github.com/apache/iceberg/issues/9720#issuecomment-2012606418 CC: @gaborkaszab

[PR] Hive: Use base table metadata to create HiveLock [iceberg]

2024-03-21 Thread via GitHub
lirui-apache opened a new pull request, #10016: URL: https://github.com/apache/iceberg/pull/10016 Use base (instead of new) table metadata to create the lock object, so that concurrent commits use the same lock mechanism. Fixes #10006

Re: [PR] Core: Prevent duplicate data/delete files [iceberg]

2024-03-21 Thread via GitHub
danielcweeks commented on code in PR #10007: URL: https://github.com/apache/iceberg/pull/10007#discussion_r1534110571 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -43,6 +44,7 @@ class FastAppend extends SnapshotProducer implements AppendFiles { private fin

Re: [PR] Spark 3.5: Spark action to compute the partition stats [iceberg]

2024-03-21 Thread via GitHub
ajantha-bhat commented on PR #9437: URL: https://github.com/apache/iceberg/pull/9437#issuecomment-2012543899 ping @aokolnychyi

Re: [I] Integrate with datafusion [iceberg-rust]

2024-03-21 Thread via GitHub
ZENOTME commented on issue #242: URL: https://github.com/apache/iceberg-rust/issues/242#issuecomment-2012508866 Thanks for raising this discussion @marvinlanhenke! The basic idea for the integration is to provide the wrap struct using type in iceberg-rust so that users can use them to conne

Re: [PR] feat: init iceberg writer [iceberg-rust]

2024-03-21 Thread via GitHub
ZENOTME commented on code in PR #275: URL: https://github.com/apache/iceberg-rust/pull/275#discussion_r1534012662 ## crates/iceberg/src/writer/base_writer/data_file_writer.rs: ## @@ -0,0 +1,310 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

Re: [PR] feat: add builder to TableMetadata interface [iceberg-rust]

2024-03-21 Thread via GitHub
liurenjie1024 commented on PR #62: URL: https://github.com/apache/iceberg-rust/pull/62#issuecomment-2012426148 cc @y0psolo Should we close this now? I think it's resolved by #262

Re: [PR] Spark 3.2: Add RewritePositionDeleteFilesSparkAction [iceberg]

2024-03-21 Thread via GitHub
nastra commented on PR #10009: URL: https://github.com/apache/iceberg/pull/10009#issuecomment-2012416324 If there won't be a patch release for that particular version, then I don't think it makes sense to port this to Iceberg's 1.3.x branch

Re: [I] iceberg reports an error after upgrading to 1.4.2 [iceberg]

2024-03-21 Thread via GitHub
zachdisc commented on issue #9018: URL: https://github.com/apache/iceberg/issues/9018#issuecomment-2012391871 That appears to be it. I didn't switch spark versions knowingly - I observed this when upgrading from EMR 6.14 (Spark 3.4.1, Iceberg 1.3.1-amzn-0) to EMR 6.15+ (Spark 3.4.1, Iceberg

Re: [PR] Spark 3.2: Add RewritePositionDeleteFilesSparkAction [iceberg]

2024-03-21 Thread via GitHub
puchengy commented on PR #10009: URL: https://github.com/apache/iceberg/pull/10009#issuecomment-2012358990 @nastra we internally still maintain Spark 3.2 so we want to port this to internal. Having this available in upstream first can have pairs of eyes to make sure the change is right, and

Re: [PR] feat: implement prune column for schema [iceberg-rust]

2024-03-21 Thread via GitHub
liurenjie1024 commented on code in PR #261: URL: https://github.com/apache/iceberg-rust/pull/261#discussion_r1533943697 ## crates/iceberg/src/spec/schema.rs: ## @@ -642,6 +644,199 @@ impl SchemaVisitor for IndexByName { } } +struct PruneColumn { +selected: HashSet, +

Re: [PR] feat: implement prune column for schema [iceberg-rust]

2024-03-21 Thread via GitHub
liurenjie1024 commented on code in PR #261: URL: https://github.com/apache/iceberg-rust/pull/261#discussion_r1533942476 ## crates/iceberg/src/spec/schema.rs: ## @@ -1338,4 +1533,430 @@ table { ); } } +#[test] +fn test_schema_prune_columns_strin

[PR] Add Strict projection [iceberg-python]

2024-03-21 Thread via GitHub
Fokko opened a new pull request, #539: URL: https://github.com/apache/iceberg-python/pull/539 (no comment)

Re: [I] Integrate with datafusion [iceberg-rust]

2024-03-21 Thread via GitHub
marvinlanhenke commented on issue #242: URL: https://github.com/apache/iceberg-rust/issues/242#issuecomment-2012273359 @ZENOTME I'm interested in your approach, perhaps you can outline what you are going to do (high-level). I'm just curious and want to understand / research where those

Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-21 Thread via GitHub
nk1506 commented on code in PR #10001: URL: https://github.com/apache/iceberg/pull/10001#discussion_r1533819440 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -282,7 +281,12 @@ protected void doCommit(TableMetadata base, TableMetadata me

Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-21 Thread via GitHub
nk1506 commented on code in PR #10001: URL: https://github.com/apache/iceberg/pull/10001#discussion_r1533818010 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -309,65 +304,39 @@ protected enum CommitStatus { * @return Commit Status of Succe

Re: [I] Convert a StringLiteral into a DecimalLiteral [iceberg-python]

2024-03-21 Thread via GitHub
Dysprosium0626 commented on issue #538: URL: https://github.com/apache/iceberg-python/issues/538#issuecomment-2012144995 Hi @Fokko I'd like to give it a try, but I do not know where to put this code. It seems that we already have https://github.com/apache/iceberg-python/blob/bbc7e7c8d095b4afea

Re: [I] Structured streaming writes to partitioned table fails when spark.sql.extensions is set to IcebergSparkSessionExtensions [iceberg]

2024-03-21 Thread via GitHub
greg-roberts-bbc commented on issue #7226: URL: https://github.com/apache/iceberg/issues/7226#issuecomment-2012098543 We've found a workaround in our use case. (Iceberg 1.4.3, Spark 3.3.0 on Glue 4.0). Our previous flow was: ``` # set up readStream read_stream = spark.rea

Re: [I] Convert a StringLiteral into a DecimalLiteral [iceberg-rust]

2024-03-21 Thread via GitHub
Fokko commented on issue #288: URL: https://github.com/apache/iceberg-rust/issues/288#issuecomment-2012074380 Wrong repo, sorry!

Re: [I] Convert a StringLiteral into a DecimalLiteral [iceberg-rust]

2024-03-21 Thread via GitHub
Fokko closed issue #288: Convert a StringLiteral into a DecimalLiteral URL: https://github.com/apache/iceberg-rust/issues/288

Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-21 Thread via GitHub
nk1506 commented on code in PR #10001: URL: https://github.com/apache/iceberg/pull/10001#discussion_r1533741453 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -304,30 +305,16 @@ protected void doCommit(TableMetadata base, TableMetadata m

Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-21 Thread via GitHub
nk1506 commented on code in PR #10001: URL: https://github.com/apache/iceberg/pull/10001#discussion_r1533738195 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -240,9 +239,9 @@ protected void doCommit(TableMetadata base, TableMetadata met

Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-21 Thread via GitHub
nastra commented on code in PR #10001: URL: https://github.com/apache/iceberg/pull/10001#discussion_r1533723838 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -309,65 +304,39 @@ protected enum CommitStatus { * @return Commit Status of Succe

Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-21 Thread via GitHub
nk1506 commented on code in PR #10001: URL: https://github.com/apache/iceberg/pull/10001#discussion_r1533719987 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -222,9 +211,29 @@ public boolean dropTable(TableIdentifier identifier, boolean purge)

Re: [PR] feat: implement prune column for schema [iceberg-rust]

2024-03-21 Thread via GitHub
Dysprosium0626 commented on code in PR #261: URL: https://github.com/apache/iceberg-rust/pull/261#discussion_r1533718813 ## crates/iceberg/src/spec/schema.rs: ## @@ -642,6 +644,199 @@ impl SchemaVisitor for IndexByName { } } +struct PruneColumn { +selected: HashSet,

Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-21 Thread via GitHub
nk1506 commented on code in PR #10001: URL: https://github.com/apache/iceberg/pull/10001#discussion_r1533712833 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -309,65 +304,39 @@ protected enum CommitStatus { * @return Commit Status of Succe

Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-21 Thread via GitHub
nk1506 commented on code in PR #10001: URL: https://github.com/apache/iceberg/pull/10001#discussion_r1533702231 ## core/src/main/java/org/apache/iceberg/CatalogUtil.java: ## @@ -136,6 +138,18 @@ public static void dropTableData(FileIO io, TableMetadata metadata) { deleteFi

Re: [PR] feat: implement prune column for schema [iceberg-rust]

2024-03-21 Thread via GitHub
Dysprosium0626 commented on code in PR #261: URL: https://github.com/apache/iceberg-rust/pull/261#discussion_r1533685816 ## crates/iceberg/src/spec/schema.rs: ## @@ -1338,4 +1533,430 @@ table { ); } } +#[test] +fn test_schema_prune_columns_stri

Re: [PR] Core: Prevent duplicate data files [iceberg]

2024-03-21 Thread via GitHub
nastra commented on code in PR #10007: URL: https://github.com/apache/iceberg/pull/10007#discussion_r1533675374 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -83,9 +85,13 @@ protected Map summary() { @Override public FastAppend appendFile(DataFile file)

Re: [I] [Flink] CTAS data isn't returned in Flink query [iceberg]

2024-03-21 Thread via GitHub
rmoff closed issue #9947: [Flink] CTAS data isn't returned in Flink query URL: https://github.com/apache/iceberg/issues/9947

Re: [I] [Flink] CTAS data isn't returned in Flink query [iceberg]

2024-03-21 Thread via GitHub
rmoff commented on issue #9947: URL: https://github.com/apache/iceberg/issues/9947#issuecomment-2011958487 Thanks @pvary, this was 💯 the cause :)

Re: [I] iceberg reports an error after upgrading to 1.4.2 [iceberg]

2024-03-21 Thread via GitHub
nastra commented on issue #9018: URL: https://github.com/apache/iceberg/issues/9018#issuecomment-2011941518 This might be related to https://issues.apache.org/jira/browse/SPARK-46847. When switching the Iceberg version, did you also switch the Spark version? Because that Spark issue started to h

Re: [PR] Core: Prevent duplicate data files [iceberg]

2024-03-21 Thread via GitHub
nastra commented on code in PR #10007: URL: https://github.com/apache/iceberg/pull/10007#discussion_r1533635845 ## core/src/test/java/org/apache/iceberg/TestBaseIncrementalAppendScan.java: ## @@ -67,13 +67,13 @@ public void fromSnapshotInclusiveWithTag() { table.manageSnaps

Re: [PR] Core: Prevent duplicate data files [iceberg]

2024-03-21 Thread via GitHub
nastra commented on code in PR #10007: URL: https://github.com/apache/iceberg/pull/10007#discussion_r1533634773 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -80,6 +80,8 @@ abstract class MergingSnapshotProducer extends SnapshotProducer { //

Re: [PR] Core: Prevent duplicate data files [iceberg]

2024-03-21 Thread via GitHub
nastra commented on code in PR #10007: URL: https://github.com/apache/iceberg/pull/10007#discussion_r1533633016 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -43,6 +44,7 @@ class FastAppend extends SnapshotProducer implements AppendFiles { private final Par

[PR] [WIP] Migrate Scan, Schema and remaining Partition files in Core to JUnit5 [iceberg]

2024-03-21 Thread via GitHub
tomtongue opened a new pull request, #10014: URL: https://github.com/apache/iceberg/pull/10014 Migrate the following test classes in iceberg-core to JUnit 5 and AssertJ style for https://github.com/apache/iceberg/issues/9085. ## Current Progress Scan - [x] `TestScanDataFile

Re: [PR] feat: implement prune column for schema [iceberg-rust]

2024-03-21 Thread via GitHub
liurenjie1024 commented on code in PR #261: URL: https://github.com/apache/iceberg-rust/pull/261#discussion_r1531709362 ## crates/iceberg/src/spec/schema.rs: ## @@ -1338,4 +1533,430 @@ table { ); } } +#[test] +fn test_schema_prune_columns_strin

Re: [I] Calling `rewrite_position_delete_files` fails on tables with more than 1k columns [iceberg]

2024-03-21 Thread via GitHub
bk-mz commented on issue #9923: URL: https://github.com/apache/iceberg/issues/9923#issuecomment-2011826944 @xiaoxuandev it's because nameToId is inverted to the result: ```nameToId.forEach((key, value) -> builder.put(value, key));``` you take key, value and remap it to value, ke
