Re: [PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2024-10-17 Thread via GitHub
yunlou11 commented on PR #7914: URL: https://github.com/apache/iceberg/pull/7914#issuecomment-2421517117 ```sql CALL nessie.system.remove_orphan_files(table => 'nessie.robot_dev.robot_data') ``` ```text Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileS

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-17 Thread via GitHub
arkadius commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2421498966 > @arkadius: The flaky test will be handled. In the meantime, could you please retrigger the tests, so we can have a green run? It looks like I have no right to do it - this butt

Re: [PR] Flink Support for TIMESTAMP_NANOS data type for PARQUET [iceberg]

2024-10-17 Thread via GitHub
pvary commented on PR #11348: URL: https://github.com/apache/iceberg/pull/11348#issuecomment-2421446880 Do the schema conversions handle the nano ts correctly? Do we get a `Timestamp(9)` when we convert to Flink schema, or a nano timestamp when we convert from the Iceberg schema? -- This

Re: [PR] Flink: disable the flaky range distribution bucketing tests for now [iceberg]

2024-10-17 Thread via GitHub
pvary commented on PR #11347: URL: https://github.com/apache/iceberg/pull/11347#issuecomment-2421432093 Nit: could we add a comment why this was ignored? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] Flink: make FLIP-27 default in SQL and mark the old FlinkSource as deprecated [iceberg]

2024-10-17 Thread via GitHub
pvary commented on code in PR #11345: URL: https://github.com/apache/iceberg/pull/11345#discussion_r1805883034 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java: ## @@ -86,7 +85,6 @@ import org.slf4j.Logger; import org.slf4j.LoggerFactory;

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-17 Thread via GitHub
pvary commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2421434091 @arkadius: The flaky test will be handled. In the meantime, could you please retrigger the tests, so we can have a green run?

Re: [I] flink:FlinkSink support dynamically changed schema [iceberg]

2024-10-17 Thread via GitHub
pvary commented on issue #4190: URL: https://github.com/apache/iceberg/issues/4190#issuecomment-2421426631 The current tradeoff is more like doubled CPU time (we need caching and an extra serialization/deserialization step, which is on an already well optimized hot path). We are still looki

Re: [PR] Core: delete temp metadata file when version already exists [iceberg]

2024-10-17 Thread via GitHub
leesf commented on PR #11350: URL: https://github.com/apache/iceberg/pull/11350#issuecomment-2421372412 @rdblue please help to review this PR, thanks.

Re: [PR] Flink: Add RowConverter for Iceberg Source [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on code in PR #11301: URL: https://github.com/apache/iceberg/pull/11301#discussion_r1805836604 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceBoundedRow.java: ## @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] Flink: Add RowConverter for Iceberg Source [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on code in PR #11301: URL: https://github.com/apache/iceberg/pull/11301#discussion_r1805835532 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/source/reader/RowConverter.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Spec: Support geo type [iceberg]

2024-10-17 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1805831055 ## format/spec.md: ## @@ -483,6 +485,8 @@ Notes: 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the IEEE 754 `totalOrder` predicate. NaN

Re: [I] Support Vended Credentials for Azure Data Lake Store [iceberg-python]

2024-10-17 Thread via GitHub
sfc-gh-tbenroeck commented on issue #1146: URL: https://github.com/apache/iceberg-python/issues/1146#issuecomment-2421306564 @sungwy and @ndrluis [#961](https://github.com/apache/iceberg-python/pull/961) isn't a fix for this issue. There are two aspects, first the prefix `adls.sas-token` v

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805821651 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

[PR] Core: delete temp metadata file when version already exists [iceberg]

2024-10-17 Thread via GitHub
leesf opened a new pull request, #11350: URL: https://github.com/apache/iceberg/pull/11350 (no comment)

[I] feat: abstract the MetricsEvaluator [iceberg-rust]

2024-10-17 Thread via GitHub
sundy-li opened a new issue, #674: URL: https://github.com/apache/iceberg-rust/issues/674 Every query engine has its own expression framework to prune files. The current `Predicate` in `iceberg-rust` is very simple and not yet powerful. For example, it does not support Cast Expre

Re: [PR] Spark 3.5: Display write metrics on SQL UI [iceberg]

2024-10-17 Thread via GitHub
manuzhang commented on PR #11340: URL: https://github.com/apache/iceberg/pull/11340#issuecomment-2421230380 Add `number of total data files` to write command `AppendData` ![CleanShot 2024-10-18 at 11 31 55@2x](https://github.com/user-attachments/assets/e3cbb4b1-9649-4fe5-b2fe-624abb6a

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2024-10-17 Thread via GitHub
namrathamyske commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-2421151917 Would love to have this

Re: [PR] Glue and Hive catalog return only Iceberg tables [iceberg-python]

2024-10-17 Thread via GitHub
sungwy commented on code in PR #1145: URL: https://github.com/apache/iceberg-python/pull/1145#discussion_r1805681185 ## pyiceberg/catalog/dynamodb.py: ## @@ -393,7 +393,7 @@ def drop_namespace(self, namespace: Union[str, Identifier]) -> None: raise NoSuchNamespaceE

Re: [PR] Add flag to allow disabling creation of catalog tables [iceberg-python]

2024-10-17 Thread via GitHub
sungwy merged PR #1155: URL: https://github.com/apache/iceberg-python/pull/1155

Re: [PR] Add flag to allow disabling creation of catalog tables [iceberg-python]

2024-10-17 Thread via GitHub
sungwy commented on PR #1155: URL: https://github.com/apache/iceberg-python/pull/1155#issuecomment-2420987590 Thank you again for working on this PR @isc-patrick!

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-17 Thread via GitHub
szehon-ho commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1805653101 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct valu

Re: [I] Suppress duplicate OAuth token fetching in rest catalog client [iceberg-python]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #587: URL: https://github.com/apache/iceberg-python/issues/587#issuecomment-2420913318 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Suppress duplicate OAuth token fetching in rest catalog client [iceberg-python]

2024-10-17 Thread via GitHub
github-actions[bot] closed issue #587: Suppress duplicate OAuth token fetching in rest catalog client URL: https://github.com/apache/iceberg-python/issues/587

Re: [I] Improve read times and reduce size of metadata.json by storing schemas in external files [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9734: URL: https://github.com/apache/iceberg/issues/9734#issuecomment-2420907068 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] hive iceberg [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9094: URL: https://github.com/apache/iceberg/issues/9094#issuecomment-2420904428 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] hive iceberg [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] closed issue #9094: hive iceberg URL: https://github.com/apache/iceberg/issues/9094

Re: [I] Metrics for Manifest file caching [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] closed issue #9093: Metrics for Manifest file caching URL: https://github.com/apache/iceberg/issues/9093

Re: [I] Metrics for Manifest file caching [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9093: URL: https://github.com/apache/iceberg/issues/9093#issuecomment-2420904412 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] It sometimes throws exception java.lang.AssertionError: assertion failed after upgrade to Iceberg 1.3.1 + Spark 3.4.1 [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9092: URL: https://github.com/apache/iceberg/issues/9092#issuecomment-2420904393 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Dynamically support Spark native engine in Iceberg [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on PR #9721: URL: https://github.com/apache/iceberg/pull/9721#issuecomment-2420906960 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] How to insert overwrite with a single commit [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9720: URL: https://github.com/apache/iceberg/issues/9720#issuecomment-2420906919 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Core: checkpoint validation in BaseOverwriteFiles [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] commented on issue #9718: URL: https://github.com/apache/iceberg/issues/9718#issuecomment-2420906840 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] It sometimes throws exception java.lang.AssertionError: assertion failed after upgrade to Iceberg 1.3.1 + Spark 3.4.1 [iceberg]

2024-10-17 Thread via GitHub
github-actions[bot] closed issue #9092: It sometimes throws exception java.lang.AssertionError: assertion failed after upgrade to Iceberg 1.3.1 + Spark 3.4.1 URL: https://github.com/apache/iceberg/issues/9092

Re: [I] Document Custom FileIO [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on issue #1233: URL: https://github.com/apache/iceberg-python/issues/1233#issuecomment-2420894695 Assigned to you. I think we can add it under [the `FileIO` section](https://py.iceberg.apache.org/configuration/#fileio) as something like "Custom FileIO Implementations"

Re: [PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on PR #1234: URL: https://github.com/apache/iceberg-python/pull/1234#issuecomment-2420893458 hey @omkenge, thanks for the PR. What use case is this for?

Re: [PR] Revert "Support wasb[s] paths in ADLSFileIO" [iceberg]

2024-10-17 Thread via GitHub
danielcweeks commented on PR #11344: URL: https://github.com/apache/iceberg/pull/11344#issuecomment-2420833542 > Can we have some examples though to guard against these changes in the future? Strings that won't parse correctly? There are a lot of subtle issues like hashCode and equali

Re: [PR] Arrow: Fix indexing in Parquet dictionary encoded values readers [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11247: URL: https://github.com/apache/iceberg/pull/11247#discussion_r1805583085 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/parquet/vectorized/TestParquetDictionaryEncodedVectorizedReads.java: ## @@ -93,4 +125,64 @@ p

Re: [PR] Arrow: Fix indexing in Parquet dictionary encoded values readers [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11247: URL: https://github.com/apache/iceberg/pull/11247#discussion_r1805582734 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/parquet/vectorized/TestParquetDictionaryEncodedVectorizedReads.java: ## @@ -93,4 +125,64 @@ p

[PR] Bump mypy-boto3-glue from 1.35.23 to 1.35.25 [iceberg-python]

2024-10-17 Thread via GitHub
dependabot[bot] opened a new pull request, #1236: URL: https://github.com/apache/iceberg-python/pull/1236 Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 1.35.23 to 1.35.25. Commits See full diff in https://github.com/youtype/mypy_boto3_builder/commi

[PR] Bump mkdocstrings from 0.26.1 to 0.26.2 [iceberg-python]

2024-10-17 Thread via GitHub
dependabot[bot] opened a new pull request, #1235: URL: https://github.com/apache/iceberg-python/pull/1235 Bumps [mkdocstrings](https://github.com/mkdocstrings/mkdocstrings) from 0.26.1 to 0.26.2. Release notes Sourced from https://github.com/mkdocstrings/mkdocstrings/releases

Re: [I] Support writing to a branch [iceberg-python]

2024-10-17 Thread via GitHub
vinjai commented on issue #306: URL: https://github.com/apache/iceberg-python/issues/306#issuecomment-2420730890 PR is ready for review

Re: [PR] Flink: make FLIP-27 default in SQL and mark the old FlinkSource as deprecated [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on code in PR #11345: URL: https://github.com/apache/iceberg/pull/11345#discussion_r1805519628 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/FlinkConfigOptions.java: ## @@ -88,7 +88,7 @@ private FlinkConfigOptions() {} public static final Con

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2420723401 @pvary I have created a PR to disable the flaky test for now. https://github.com/apache/iceberg/pull/11347

Re: [PR] Bump moto from 5.0.14 to 5.0.17 [iceberg-python]

2024-10-17 Thread via GitHub
sungwy merged PR #1230: URL: https://github.com/apache/iceberg-python/pull/1230

Re: [PR] Bump getdaft from 0.3.2 to 0.3.8 [iceberg-python]

2024-10-17 Thread via GitHub
sungwy merged PR #1228: URL: https://github.com/apache/iceberg-python/pull/1228

[PR] [KafkaConnect] Fix RecordConverter [iceberg]

2024-10-17 Thread via GitHub
singhpk234 opened a new pull request, #11346: URL: https://github.com/apache/iceberg/pull/11346 ## About the change The UUID type in the Parquet writer expects a ByteBuffer rather than a UUID; otherwise the writer fails with: ``` class java.util.UUID cannot be cast to class [B (jav

Re: [PR] Bump pypa/cibuildwheel from 2.21.1 to 2.21.3 [iceberg-python]

2024-10-17 Thread via GitHub
sungwy merged PR #1224: URL: https://github.com/apache/iceberg-python/pull/1224

Re: [PR] Revert "Support wasb[s] paths in ADLSFileIO" [iceberg]

2024-10-17 Thread via GitHub
mrcnc commented on PR #11344: URL: https://github.com/apache/iceberg/pull/11344#issuecomment-2420687278 I'll follow up with another PR that doesn't use java.net.URI for parsing. I'm happy to have more 👀 on this

[PR] Flink: make FLIP-27 default in SQL and mark the old FlinkSource as deprecated [iceberg]

2024-10-17 Thread via GitHub
stevenzwu opened a new pull request, #11345: URL: https://github.com/apache/iceberg/pull/11345 SQL config default change is only applied to 1.20. See the dev ML discussion [here](https://lists.apache.org/api/plain?thread=2r34z5drgkn1fqbvktwfzhr0fj39p3th).

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-17 Thread via GitHub
mrcnc commented on PR #11294: URL: https://github.com/apache/iceberg/pull/11294#issuecomment-2420675717 > Hi @mrcnc, @RussellSpitzer . To confirm this PR solution to 10127: any existing (or new) iceberg tables stored in azure with wasbs + .blob. will be interpreted interchangeably with abfs

Re: [PR] Revert "Support wasb[s] paths in ADLSFileIO" [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer commented on PR #11344: URL: https://github.com/apache/iceberg/pull/11344#issuecomment-2420666877 @mrcnc can chime in on that, but I think that's fine. Can we have some examples though to guard against these changes in the future? Strings that won't parse correctly?

Re: [PR] Revert "Support wasb[s] paths in ADLSFileIO" [iceberg]

2024-10-17 Thread via GitHub
danielcweeks commented on PR #11344: URL: https://github.com/apache/iceberg/pull/11344#issuecomment-2420655723 I'm not actually opposed to the WASB path support, just concerned about the introduction of the URI class for parsing locations. Is it possible to just revert to the old parsing o

Re: [PR] Task: Simulating OOM error during merge equality deletes [iceberg]

2024-10-17 Thread via GitHub
nicole-martinez closed pull request #11320: Task: Simulating OOM error during merge equality deletes URL: https://github.com/apache/iceberg/pull/11320

Re: [PR] Core: fix NPE with HadoopFileIO because FileIOParser doesn't serialize Hadoop configuration [iceberg]

2024-10-17 Thread via GitHub
stevenzwu commented on PR #10926: URL: https://github.com/apache/iceberg/pull/10926#issuecomment-2420612439 thanks @nastra @pvary @Fokko @rdblue @ashvina for the review

Re: [PR] Core: fix NPE with HadoopFileIO because FileIOParser doesn't serialize Hadoop configuration [iceberg]

2024-10-17 Thread via GitHub
stevenzwu merged PR #10926: URL: https://github.com/apache/iceberg/pull/10926

Re: [PR] Spark: Add RewriteTablePath action interface [iceberg]

2024-10-17 Thread via GitHub
szehon-ho commented on code in PR #10920: URL: https://github.com/apache/iceberg/pull/10920#discussion_r1805455886 ## api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java: ## @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [I] flink:FlinkSink support dynamically changed schema [iceberg]

2024-10-17 Thread via GitHub
ottomata commented on issue #4190: URL: https://github.com/apache/iceberg/issues/4190#issuecomment-2420588912 Ah, thanks! FWIW, I think schema evolution support is worth the tradeoff of extra bytes per record :)

Re: [PR] Spark: Add RewriteTablePath action interface [iceberg]

2024-10-17 Thread via GitHub
flyrain commented on code in PR #10920: URL: https://github.com/apache/iceberg/pull/10920#discussion_r1805439452 ## api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java: ## @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-10-17 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2420559048 @dwilson1988 Sounds good, I've made the changes, please take a look.

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-10-17 Thread via GitHub
pvary commented on PR #10935: URL: https://github.com/apache/iceberg/pull/10935#issuecomment-2420515789 > First of all, we need to discuss the expected behavior: > > * Do we want to resolve equality deletes and map them into data files? Or should we add a new task and output the conte

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
gene-db commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805396309 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
gene-db commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805378000 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [I] flink:FlinkSink support dynamically changed schema [iceberg]

2024-10-17 Thread via GitHub
pvary commented on issue #4190: URL: https://github.com/apache/iceberg/issues/4190#issuecomment-2420417129 It sends the schema along with every record. I'm playing around with a somewhat similar, but more performant solution, where we send only the schemaId instead of the full schema. The t

Re: [PR] Small fix to TestSerializableTypes.java [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer merged PR #11342: URL: https://github.com/apache/iceberg/pull/11342

[PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-17 Thread via GitHub
omkenge opened a new pull request, #1234: URL: https://github.com/apache/iceberg-python/pull/1234 ### Support for Additional Date Format Summary This PR extends the date/time handling functions by adding support for one additional format: - `YYYYMMDD` (e.g., `20241018`)
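
For readers unfamiliar with the compact form, the following is a minimal sketch of parsing a `YYYYMMDD` string (the format named in the PR title) with the Python standard library. The helper name `parse_yyyymmdd` is hypothetical and is not taken from the PR or from PyIceberg; the PR's actual implementation may differ.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: str) -> date:
    """Parse a compact date such as '20241018' into a datetime.date.

    Hypothetical helper for illustration only; not PyIceberg API.
    """
    # Reject anything that is not exactly eight digits before parsing,
    # so inputs like '2024-10-18' fail with a clear message.
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"expected 8 digits in YYYYMMDD form, got {value!r}")
    return datetime.strptime(value, "%Y%m%d").date()


print(parse_yyyymmdd("20241018").isoformat())
```

The explicit length and digit check is a design choice of this sketch: it keeps the accepted input strict rather than relying on `strptime` alone.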

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420292280 > https://github.com/apache/iceberg-python/actions/runs/11389073844/job/31690659534?pr=1232 Yup, linting is in place now. Thanks for the reminder!

Re: [PR] Spec: add variant type [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer commented on PR #10831: URL: https://github.com/apache/iceberg/pull/10831#issuecomment-2420252802 And an entry https://github.com/apache/iceberg/blob/main/format/spec.md#parquet

Re: [PR] Spec: add variant type [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer commented on PR #10831: URL: https://github.com/apache/iceberg/pull/10831#issuecomment-2420248887 This needs some notes in `Partition Transforms`; I think we should explicitly disallow identity. For Appendix B, we should define something or state explicitly we don't

Re: [I] Document Custom FileIO [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on issue #1233: URL: https://github.com/apache/iceberg-python/issues/1233#issuecomment-2420246567 > ### Feature Request / Improvement > Add documentation for custom FileIO, similar to [custom catalog](https://py.iceberg.apache.org/configuration/#custom-catalog-implemen

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420244677 https://github.com/apache/iceberg-python/actions/runs/11389073844/job/31690659534?pr=1232

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420232911 > @sikehish can you fix the CI lint issue? `make lint` should work > > There are other "good first issue"s, please take a look https://github.com/apache/iceberg-python/issue

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
kevinjqliu commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420218362 @sikehish can you fix the CI lint issue? `make lint` should work There are other "good first issue"s, please take a look https://github.com/apache/iceberg-python/issues?q=

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805209323 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more co

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420140541 > LGTM! Thanks for working on this Thank you for the opportunity! Do let me know if you would want me to work on any other issue :))

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1805172828 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -50,27 +59,23 @@ class ADLSLocation { Preconditions.checkArgument(location

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-17 Thread via GitHub
danielcweeks commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1805168268 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -50,27 +59,23 @@ class ADLSLocation { Preconditions.checkArgument(location !

Re: [PR] Spark: Add RewriteTablePath action interface [iceberg]

2024-10-17 Thread via GitHub
szehon-ho commented on code in PR #10920: URL: https://github.com/apache/iceberg/pull/10920#discussion_r1804149603 ## api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java: ## @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1805150599 ## core/src/test/java/org/apache/iceberg/rest/requests/TestPlanTableScanRequest.java: ## @@ -0,0 +1,153 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Small fix to TestSerializableTypes.java [iceberg]

2024-10-17 Thread via GitHub
RussellSpitzer commented on PR #11342: URL: https://github.com/apache/iceberg/pull/11342#issuecomment-2420103045 Thanks @aihuaxu, good typo fix!

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-17 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1805144514 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the follo

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1805121855 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -185,6 +200,13 @@ List filterManifests(Schema tableSchema, List manife ret

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1805124784 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -185,6 +200,13 @@ List filterManifests(Schema tableSchema, List manife ret

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-17 Thread via GitHub
rodmeneses commented on PR #11305: URL: https://github.com/apache/iceberg/pull/11305#issuecomment-2420050251 Hi @stevenzwu @pvary, what is needed so we can merge this change? I understand there's a flaky test, but that's already in main. Could we move forward with this PR? I need it for my

Re: [PR] Small fix to TestSerializableTypes.java [iceberg]

2024-10-17 Thread via GitHub
aihuaxu commented on PR #11342: URL: https://github.com/apache/iceberg/pull/11342#issuecomment-2420046226 cc @RussellSpitzer. Small fix. :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

[PR] Small fix to TestSerializableTypes.java [iceberg]

2024-10-17 Thread via GitHub
aihuaxu opened a new pull request, #11342: URL: https://github.com/apache/iceberg/pull/11342 Trivial fix for the test. Discovered by working on #11178 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-17 Thread via GitHub
bryanck commented on code in PR #11279: URL: https://github.com/apache/iceberg/pull/11279#discussion_r1805111909 ## build.gradle: ## @@ -1006,6 +1009,37 @@ project(':iceberg-open-api') { recommend.set(true) } check.dependsOn('validateRESTCatalogSpec') + + // Create a

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805107315 ## api/src/test/java/org/apache/iceberg/types/TestSerializableTypes.java: ## @@ -112,13 +113,13 @@ public void testMaps() throws Exception { @Test public void

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-17 Thread via GitHub
ajantha-bhat commented on code in PR #11279: URL: https://github.com/apache/iceberg/pull/11279#discussion_r1805104824 ## build.gradle: ## @@ -1006,6 +1009,37 @@ project(':iceberg-open-api') { recommend.set(true) } check.dependsOn('validateRESTCatalogSpec') + + // Cre

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805107008 ## api/src/test/java/org/apache/iceberg/util/RandomUtil.java: ## @@ -225,4 +229,8 @@ private static BigInteger randomUnscaled(int precision, Random random) {

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-17 Thread via GitHub
bryanck commented on code in PR #11279: URL: https://github.com/apache/iceberg/pull/11279#discussion_r1805103870 ## build.gradle: ## @@ -1006,6 +1009,37 @@ project(':iceberg-open-api') { recommend.set(true) } check.dependsOn('validateRESTCatalogSpec') + + // Create a

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-17 Thread via GitHub
bryanck commented on code in PR #11279: URL: https://github.com/apache/iceberg/pull/11279#discussion_r1805101850 ## build.gradle: ## @@ -1006,6 +1009,37 @@ project(':iceberg-open-api') { recommend.set(true) } check.dependsOn('validateRESTCatalogSpec') + + // Create a

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-17 Thread via GitHub
ajantha-bhat commented on code in PR #11279: URL: https://github.com/apache/iceberg/pull/11279#discussion_r1805100167 ## build.gradle: ## @@ -1006,6 +1009,37 @@ project(':iceberg-open-api') { recommend.set(true) } check.dependsOn('validateRESTCatalogSpec') + + // Cre

Re: [PR] API: Add Variant data type [iceberg]

2024-10-17 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1805098603 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420009547 @kevinjqliu Hi, I've made the changes. Do let me know if any other changes are to be made. -- This is an automated message from the Apache Git Service. To respond to the mes

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2420008561 > Also curious if you have suggestion to prevent documentation drift in the future I believe we could utilize automated documentation generation tools or enforce strict doc

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-17 Thread via GitHub
danielcweeks commented on code in PR #11279: URL: https://github.com/apache/iceberg/pull/11279#discussion_r1805082199 ## build.gradle: ## @@ -1006,6 +1009,37 @@ project(':iceberg-open-api') { recommend.set(true) } check.dependsOn('validateRESTCatalogSpec') + + // Cre

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1805071134 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -111,6 +115,10 @@ protected void failMissingDeletePaths() { this.failMissingD

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-17 Thread via GitHub
sikehish commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2419979292 > I noticed these 3 options are missing > > https://github.com/apache/iceberg-python/blob/7cf0c225c3cdb32ac5e390de06b7b0e4fe7de92e/pyiceberg/table/__init__.py#L197-L204
