Re: [PR] Deprecate ADLFS prefix in favor of ADLS [iceberg-python]

2024-07-26 Thread via GitHub
HonahX commented on code in PR #961: URL: https://github.com/apache/iceberg-python/pull/961#discussion_r1693579181 ## pyiceberg/io/fsspec.py: ## @@ -176,14 +186,50 @@ def _gs(properties: Properties) -> AbstractFileSystem: def _adlfs(properties: Properties) -> AbstractFileSystem

Re: [PR] API: Add RemoveUnusedSpecs in Table [iceberg]

2024-07-26 Thread via GitHub
advancedxy commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1693879617 ## core/src/main/java/org/apache/iceberg/BaseRemoveUnusedSpecs.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [I] Investigation about tracing, logging, and metrics support. [iceberg-rust]

2024-07-26 Thread via GitHub
Xuanwo commented on issue #482: URL: https://github.com/apache/iceberg-rust/issues/482#issuecomment-2253782960 > Adoption of tracing has supplanted that of logging over the past few years. Users who rely on `tracing` can still integrate with our `log`, as tracing has native integratio

Re: [PR] UpdatePartitionSpec: Added ability to not set the new partition spec as default [iceberg]

2024-07-26 Thread via GitHub
shanielh commented on PR #10736: URL: https://github.com/apache/iceberg/pull/10736#issuecomment-225377 > Ah just saw/remembered [#10736 (comment)](https://github.com/apache/iceberg/pull/10736#issuecomment-2244231366), the write APIs Iceberg exposes allow passing in a given spec ID. So y

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-07-26 Thread via GitHub
amogh-jahagirdar commented on PR #9008: URL: https://github.com/apache/iceberg/pull/9008#issuecomment-2253718238 Yes, I'll take another pass tomorrow! Thanks for your patience @jacobmarble -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [PR] UpdatePartitionSpec: Added ability to not set the new partition spec as default [iceberg]

2024-07-26 Thread via GitHub
amogh-jahagirdar commented on code in PR #10736: URL: https://github.com/apache/iceberg/pull/10736#discussion_r1693722945 ## api/src/main/java/org/apache/iceberg/UpdatePartitionSpec.java: ## @@ -122,4 +122,12 @@ public interface UpdatePartitionSpec extends PendingUpdate { *

Re: [PR] Core: Add estimateRowCount for Files and Entries Metadata Tables [iceberg]

2024-07-26 Thread via GitHub
lurnagao-dahua commented on PR #10759: URL: https://github.com/apache/iceberg/pull/10759#issuecomment-2253706391 > This is for a metadata table and not data table. Hence for manifest-read task, each manifest's row count is the number of files it references (added + deleted + existing)

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-26 Thread via GitHub
amogh-jahagirdar commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1693715757 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -422,26 +422,23 @@ public void commit() { throw e; } + // at thi

Re: [PR] [ICEBERG-FLINK]support read hive configuration from HIVE_HOME&HIVE_CONF_DIR env [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] closed pull request #3034: [ICEBERG-FLINK]support read hive configuration from HIVE_HOME&HIVE_CONF_DIR env URL: https://github.com/apache/iceberg/pull/3034 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

Re: [PR] #2468 fix the catalog interface cast exception. [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] closed pull request #3032: #2468 fix the catalog interface cast exception. URL: https://github.com/apache/iceberg/pull/3032 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] Spark: Add Spark extension for table encryption key [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] closed pull request #3013: Spark: Add Spark extension for table encryption key URL: https://github.com/apache/iceberg/pull/3013 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [PR] Flink: use deleteKey method when write delete data to only write the parimary key to the eqDeleteFile [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] closed pull request #3012: Flink: use deleteKey method when write delete data to only write the parimary key to the eqDeleteFile URL: https://github.com/apache/iceberg/pull/3012 -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] Flink: use deleteKey method when write delete data to only write the parimary key to the eqDeleteFile [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #3012: URL: https://github.com/apache/iceberg/pull/3012#issuecomment-2253667711 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Handle OVERWRITE snapshot on spark streaming for table v1 [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2944: URL: https://github.com/apache/iceberg/pull/2944#issuecomment-2253667649 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark: remove object storage data path in destination table for snapshot table action [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] closed pull request #2966: Spark: remove object storage data path in destination table for snapshot table action URL: https://github.com/apache/iceberg/pull/2966 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [WIP] Core: Add Index File Metadata into Iceberg [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2290: URL: https://github.com/apache/iceberg/pull/2290#issuecomment-2253667534 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] RowDelta validate data files to recent rewrite. [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2381: URL: https://github.com/apache/iceberg/pull/2381#issuecomment-2253667565 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] [SPARK] Spark parquet read timestamp without timezone [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2282: URL: https://github.com/apache/iceberg/pull/2282#issuecomment-2253667510 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink: store watermark as iceberg table's property [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2109: URL: https://github.com/apache/iceberg/pull/2109#issuecomment-2253667406 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Support for PartitionStatsFile in each snapshot [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2182: URL: https://github.com/apache/iceberg/pull/2182#issuecomment-2253667414 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Partition Stats file schema for Iceberg tables [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1985: URL: https://github.com/apache/iceberg/pull/1985#issuecomment-2253667388 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Avro metrics support: track metrics in Avro value writers [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1963: URL: https://github.com/apache/iceberg/pull/1963#issuecomment-2253667372 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] [ICEBERG-FLINK]support read hive configuration from HIVE_HOME&HIVE_CONF_DIR env [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #3034: URL: https://github.com/apache/iceberg/pull/3034#issuecomment-2253667786 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Arrow: FIXED type support [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] closed pull request #3029: Arrow: FIXED type support URL: https://github.com/apache/iceberg/pull/3029 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

Re: [PR] Arrow: FIXED type support [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #3029: URL: https://github.com/apache/iceberg/pull/3029#issuecomment-2253667761 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] #2468 fix the catalog interface cast exception. [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #3032: URL: https://github.com/apache/iceberg/pull/3032#issuecomment-2253667770 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark: Add Spark extension for table encryption key [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #3013: URL: https://github.com/apache/iceberg/pull/3013#issuecomment-2253667729 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Handle OVERWRITE snapshot on spark streaming for table v1 [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] closed pull request #2944: Handle OVERWRITE snapshot on spark streaming for table v1 URL: https://github.com/apache/iceberg/pull/2944 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Spark: remove object storage data path in destination table for snapshot table action [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2966: URL: https://github.com/apache/iceberg/pull/2966#issuecomment-2253667675 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Hive: Fix predicate pushdown for Timestamp.withZone() [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2278: URL: https://github.com/apache/iceberg/pull/2278#issuecomment-2253667488 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink : migrate hive table to iceberg table [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2217: URL: https://github.com/apache/iceberg/pull/2217#issuecomment-2253667441 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink : add computed column support for flink [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2265: URL: https://github.com/apache/iceberg/pull/2265#issuecomment-2253667453 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Support structured streaming read for Iceberg [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2272: URL: https://github.com/apache/iceberg/pull/2272#issuecomment-2253667468 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Flink: IcebergTableSink to write data into multiple iceberg tables [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on issue #2208: URL: https://github.com/apache/iceberg/issues/2208#issuecomment-2253667427 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Flink: add show partitions with specified partitions [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #2082: URL: https://github.com/apache/iceberg/pull/2082#issuecomment-2253667396 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] WIP: SymmetricKeyEncryptionManager [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1918: URL: https://github.com/apache/iceberg/pull/1918#issuecomment-2253667345 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Avro metrics support [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1935: URL: https://github.com/apache/iceberg/pull/1935#issuecomment-2253667360 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Remove useless table directories when use HiveCatalog.dropTable and purge is true [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1839: URL: https://github.com/apache/iceberg/pull/1839#issuecomment-2253667306 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink: Support sink when disable flink checkpoint disable [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1515: URL: https://github.com/apache/iceberg/pull/1515#issuecomment-2253667270 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] AppenderMetrics [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1060: URL: https://github.com/apache/iceberg/pull/1060#issuecomment-2253667235 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Spark DataFrame write fails if input dataframe has columns in different order than iceberg schema [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] closed issue #741: Spark DataFrame write fails if input dataframe has columns in different order than iceberg schema URL: https://github.com/apache/iceberg/issues/741 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] [WIP]Use relative path for manifest_path and file_path [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #491: URL: https://github.com/apache/iceberg/pull/491#issuecomment-2253667180 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull re

Re: [PR] Add Atomic Write support for Hadoop Tables [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1871: URL: https://github.com/apache/iceberg/pull/1871#issuecomment-2253667327 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] WIP add rewrite file operator after iceberg committer [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1669: URL: https://github.com/apache/iceberg/pull/1669#issuecomment-2253667294 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Parquet: Support Page Skipping in Iceberg Parquet Reader [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1566: URL: https://github.com/apache/iceberg/pull/1566#issuecomment-2253667284 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] RemoveOrphanFiles: consider only path to compare and delete, avoid authority and scheme [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1471: URL: https://github.com/apache/iceberg/pull/1471#issuecomment-2253667260 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Add docker demo for Iceberg [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on PR #1100: URL: https://github.com/apache/iceberg/pull/1100#issuecomment-2253667244 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Spark DataFrame write fails if input dataframe has columns in different order than iceberg schema [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on issue #741: URL: https://github.com/apache/iceberg/issues/741#issuecomment-2253667226 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Git

Re: [I] Vectorize read of complex/nested data types [iceberg]

2024-07-26 Thread via GitHub
github-actions[bot] commented on issue #521: URL: https://github.com/apache/iceberg/issues/521#issuecomment-2253667205 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-26 Thread via GitHub
huaxingao commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693697534 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java: ## @@ -130,6 +142,98 @@ public void testEstimatedRowCount() throws NoSuchTabl

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-26 Thread via GitHub
huaxingao commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693697463 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -175,7 +185,46 @@ public Statistics estimateStatistics() { protected Stat

[I] Use Min, Max, and NumOfNulls from Manifest Files for Spark Column Stats [iceberg]

2024-07-26 Thread via GitHub
huaxingao opened a new issue, #10791: URL: https://github.com/apache/iceberg/issues/10791 ### Feature Request / Improvement I am adding Spark Column Stats in this [pull request](https://github.com/apache/iceberg/pull/10659). Currently, in this PR, I only populate the NDV values. I pl

Re: [PR] Spark Action to Analyze table [iceberg]

2024-07-26 Thread via GitHub
aokolnychyi commented on code in PR #10288: URL: https://github.com/apache/iceberg/pull/10288#discussion_r1693682403 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/ComputeTableStatsSparkAction.java: ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software

Re: [PR] Spark Action to Analyze table [iceberg]

2024-07-26 Thread via GitHub
aokolnychyi commented on code in PR #10288: URL: https://github.com/apache/iceberg/pull/10288#discussion_r1693679423 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/ComputeTableStatsSparkAction.java: ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software

Re: [PR] Add `IcebergAnalysisException` in iceberg-spark module [iceberg]

2024-07-26 Thread via GitHub
huaxingao commented on PR #10766: URL: https://github.com/apache/iceberg/pull/10766#issuecomment-2253612812 > We haven't written this down (I think) but we generally have similar rules that we should probably publish. I will check to see if we need a doc PR for this -- This is an a

Re: [PR] Add `IcebergAnalysisException` in iceberg-spark module [iceberg]

2024-07-26 Thread via GitHub
huaxingao commented on PR #10766: URL: https://github.com/apache/iceberg/pull/10766#issuecomment-2253607788 > Do we have to do this in 3.5 or is it only required for 4.0? @RussellSpitzer Thanks for reviewing the PR! We don't have to do this in 3.5. I was just trying to separate a few

Re: [PR] DOC: Strawman proposal for PR merging [iceberg]

2024-07-26 Thread via GitHub
emkornfield commented on code in PR #10780: URL: https://github.com/apache/iceberg/pull/10780#discussion_r1692309633 ## site/docs/contribute.md: ## @@ -45,6 +45,17 @@ The Iceberg community prefers to receive contributions as [Github pull requests] * If a PR is related to an is

Re: [PR] DOC: Strawman proposal for PR merging [iceberg]

2024-07-26 Thread via GitHub
emkornfield commented on PR #10780: URL: https://github.com/apache/iceberg/pull/10780#issuecomment-2253589056 > As a side note, should this also be included in the subprojects' documentation, such as iceberg-python and iceberg-rust? IIUC, this should get published to the central websi

Re: [PR] Spark Action to Analyze table [iceberg]

2024-07-26 Thread via GitHub
karuppayya commented on code in PR #10288: URL: https://github.com/apache/iceberg/pull/10288#discussion_r1693434170 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/NDVSketchGenerator.java: ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Add `IcebergAnalysisException` in iceberg-spark module [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on PR #10766: URL: https://github.com/apache/iceberg/pull/10766#issuecomment-2253554397 My only worry here is that RevAPI is missing this as a public api method change because it's in Scala but we are changing the named exception being thrown and thus changing a met

Re: [PR] Add `IcebergAnalysisException` in iceberg-spark module [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on PR #10766: URL: https://github.com/apache/iceberg/pull/10766#issuecomment-2253550137 @huaxingao Do we have to do this in 3.5 or is it only required for 4.0? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] Add `IcebergAnalysisException` in iceberg-spark module [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on PR #10766: URL: https://github.com/apache/iceberg/pull/10766#issuecomment-2253547485 > This makes sense, let me take a deeper look tomorrow. I think in a subsequent pr we can make a pass in the error messages to try to follow: https://spark.apache.org/error-messa

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693627295 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java: ## @@ -130,6 +142,98 @@ public void testEstimatedRowCount() throws NoSuc

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693627142 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java: ## @@ -130,6 +142,98 @@ public void testEstimatedRowCount() throws NoSuc

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693625673 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -175,7 +185,46 @@ public Statistics estimateStatistics() { protected

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-26 Thread via GitHub
huaxingao commented on PR #10659: URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2253538504 @jeesou @saitharun15 Thanks for testing this out! The column stats are not yet accurate because I still need to retrieve the numOfNulls, min, and max from the manifest files. I wi

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-26 Thread via GitHub
huaxingao commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693621270 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java: ## @@ -130,6 +142,61 @@ public void testEstimatedRowCount() throws NoSuchTabl

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-26 Thread via GitHub
huaxingao commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693621129 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -175,7 +184,37 @@ public Statistics estimateStatistics() { protected Stat

Re: [PR] Flink: support limit pushdown in FLIP-27 source [iceberg]

2024-07-26 Thread via GitHub
stevenzwu commented on code in PR #10748: URL: https://github.com/apache/iceberg/pull/10748#discussion_r1693308647 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkSourceConfig.java: ## @@ -49,11 +48,11 @@ public void testFlinkHintConfig() { @Test

[PR] Podman support [iceberg-rust]

2024-07-26 Thread via GitHub
alexyin1 opened a new pull request, #489: URL: https://github.com/apache/iceberg-rust/pull/489 Close [#474](https://github.com/apache/iceberg-rust/issues/474) This PR adds podman support by updating the get "Os/Arch" code in docker.rs and some configuration tricks. In addition

Re: [I] How to query a specified partition data file? [iceberg]

2024-07-26 Thread via GitHub
dramaticlly commented on issue #10725: URL: https://github.com/apache/iceberg/issues/10725#issuecomment-2253481450 looks like iceberg data files table scan is not able to evaluate expression completely, so client side filtering also happens. Speak of which, have you tried to scan on table d

Re: [PR] add missing integration test marker [iceberg-python]

2024-07-26 Thread via GitHub
sungwy merged PR #969: URL: https://github.com/apache/iceberg-python/pull/969 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] Deprecate ADLFS prefix in favor of ADLS [iceberg-python]

2024-07-26 Thread via GitHub
HonahX commented on code in PR #961: URL: https://github.com/apache/iceberg-python/pull/961#discussion_r1693562585 ## pyiceberg/io/__init__.py: ## @@ -62,13 +70,13 @@ HDFS_PORT = "hdfs.port" HDFS_USER = "hdfs.user" HDFS_KERB_TICKET = "hdfs.kerberos_ticket" -ADLFS_CONNECTION_S

Re: [I] Geospatial Support [iceberg]

2024-07-26 Thread via GitHub
jiayuasu commented on issue #10260: URL: https://github.com/apache/iceberg/issues/10260#issuecomment-2253467257 @desruisseaux Thanks for the great suggestion. The status of this PR is to wait until the Parquet format accepts the geometry type (mostly by absorbing GeoParquet into the P

Re: [PR] API: Define RepairManifests action interface [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10784: URL: https://github.com/apache/iceberg/pull/10784#discussion_r1693578847 ## api/src/main/java/org/apache/iceberg/actions/RepairManifests.java: ## @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] API: Define RepairManifests action interface [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10784: URL: https://github.com/apache/iceberg/pull/10784#discussion_r1693576095 ## api/src/main/java/org/apache/iceberg/actions/RepairManifests.java: ## @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] API: Define RepairManifests action interface [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10784: URL: https://github.com/apache/iceberg/pull/10784#discussion_r1693574845 ## api/src/main/java/org/apache/iceberg/actions/RepairManifests.java: ## @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] API: Define RepairManifests action interface [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10784: URL: https://github.com/apache/iceberg/pull/10784#discussion_r1693574574 ## api/src/main/java/org/apache/iceberg/actions/RepairManifests.java: ## @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] DOC: Strawman proposal for PR merging [iceberg]

2024-07-26 Thread via GitHub
emkornfield commented on code in PR #10780: URL: https://github.com/apache/iceberg/pull/10780#discussion_r1693573999 ## site/docs/contribute.md: ## @@ -45,6 +45,16 @@ The Iceberg community prefers to receive contributions as [Github pull requests] * If a PR is related to an is

[PR] add missing integration test marker [iceberg-python]

2024-07-26 Thread via GitHub
sungwy opened a new pull request, #969: URL: https://github.com/apache/iceberg-python/pull/969 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

Re: [PR] DOC: Strawman proposal for PR merging [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10780: URL: https://github.com/apache/iceberg/pull/10780#discussion_r1693570087 ## site/docs/contribute.md: ## @@ -45,6 +45,16 @@ The Iceberg community prefers to receive contributions as [Github pull requests] * If a PR is related to an

Re: [PR] Deprecate rest.authorization-url in favor of oauth2-server-uri [iceberg-python]

2024-07-26 Thread via GitHub
HonahX commented on code in PR #962: URL: https://github.com/apache/iceberg-python/pull/962#discussion_r1693532560 ## pyiceberg/catalog/rest.py: ## @@ -290,11 +293,38 @@ def url(self, endpoint: str, prefixed: bool = True, **kwargs: Any) -> str: @property def auth_ur

Re: [PR] API: Add RemoveUnusedSpecs in Table [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1693525020 ## core/src/test/java/org/apache/iceberg/TestRemoveUnusedSpecs.java: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] DOC: Strawman proposal for PR merging [iceberg]

2024-07-26 Thread via GitHub
jackye1995 commented on code in PR #10780: URL: https://github.com/apache/iceberg/pull/10780#discussion_r1693523064 ## site/docs/contribute.md: ## @@ -45,6 +45,16 @@ The Iceberg community prefers to receive contributions as [Github pull requests] * If a PR is related to an iss

Re: [PR] API: Add RemoveUnusedSpecs in Table [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1693522333 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -1102,6 +1121,22 @@ public Builder setDefaultPartitionSpec(int specId) { return this;

Re: [PR] DOC: Strawman proposal for PR merging [iceberg]

2024-07-26 Thread via GitHub
jackye1995 commented on code in PR #10780: URL: https://github.com/apache/iceberg/pull/10780#discussion_r1693521805 ## site/docs/contribute.md: ## @@ -45,6 +45,16 @@ The Iceberg community prefers to receive contributions as [Github pull requests] * If a PR is related to an iss

Re: [PR] API: Add RemoveUnusedSpecs in Table [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1693521937 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -1102,6 +1121,22 @@ public Builder setDefaultPartitionSpec(int specId) { return this;

Re: [PR] API: Add RemoveUnusedSpecs in Table [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1693521273 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -597,6 +597,24 @@ public TableMetadata replaceProperties(Map rawProperties) { .bui

Re: [I] Configure root path in Catalog or FileIO? [iceberg-rust]

2024-07-26 Thread via GitHub
fqaiser94 commented on issue #488: URL: https://github.com/apache/iceberg-rust/issues/488#issuecomment-2253361945 cc: @liurenjie1024 @Xuanwo let me know what you think, I've shared my thoughts above.

Re: [PR] Deprecate rest.authorization-url in favor of oauth2-server-uri [iceberg-python]

2024-07-26 Thread via GitHub
ndrluis commented on code in PR #962: URL: https://github.com/apache/iceberg-python/pull/962#discussion_r1693511401 ## mkdocs/docs/how-to-release.md: ## @@ -38,6 +38,8 @@ For example, the API with the following deprecation tag should be removed when p ) ``` +We also have th

[I] Catalog [iceberg-rust]

2024-07-26 Thread via GitHub
fqaiser94 opened a new issue, #488: URL: https://github.com/apache/iceberg-rust/issues/488 There was an unresolved [discussion](https://github.com/apache/iceberg-rust/pull/475#discussion_r1690861564) about whether we should: - Allow users to configure a path as a root of the catalog and

Re: [PR] API: Add RemoveUnusedSpecs in Table [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1693509902 ## core/src/main/java/org/apache/iceberg/BaseRemoveUnusedSpecs.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

2024-07-26 Thread via GitHub
sl255051 commented on PR #10678: URL: https://github.com/apache/iceberg/pull/10678#issuecomment-2253353743 I've updated the PR with a case-sensitivity flag pattern I've found in other parts of the Iceberg codebase. I hope this change is more palatable to the Iceberg community.

Re: [PR] API: Add RemoveUnusedSpecs in Table [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1693504833 ## core/src/main/java/org/apache/iceberg/BaseRemoveUnusedSpecs.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] API: Add RemoveUnusedSpecs in Table [iceberg]

2024-07-26 Thread via GitHub
RussellSpitzer commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1693505221 ## core/src/main/java/org/apache/iceberg/BaseRemoveUnusedSpecs.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Deprecate rest.authorization-url in favor of oauth2-server-uri [iceberg-python]

2024-07-26 Thread via GitHub
ndrluis commented on code in PR #962: URL: https://github.com/apache/iceberg-python/pull/962#discussion_r1693503897 ## pyiceberg/catalog/rest.py: ## @@ -290,11 +293,38 @@ def url(self, endpoint: str, prefixed: bool = True, **kwargs: Any) -> str: @property def auth_u

Re: [PR] DOC: Strawman proposal for PR merging [iceberg]

2024-07-26 Thread via GitHub
jackye1995 commented on code in PR #10780: URL: https://github.com/apache/iceberg/pull/10780#discussion_r1693340621 ## site/docs/contribute.md: ## @@ -45,6 +45,16 @@ The Iceberg community prefers to receive contributions as [Github pull requests] * If a PR is related to an iss

Re: [PR] Deprecate rest.authorization-url in favor of oauth2-server-uri [iceberg-python]

2024-07-26 Thread via GitHub
HonahX commented on code in PR #962: URL: https://github.com/apache/iceberg-python/pull/962#discussion_r1693491040 ## pyiceberg/catalog/rest.py: ## @@ -290,11 +293,38 @@ def url(self, endpoint: str, prefixed: bool = True, **kwargs: Any) -> str: @property def auth_ur

Re: [PR] Deprecate rest.authorization-url in favor of oauth2-server-uri [iceberg-python]

2024-07-26 Thread via GitHub
HonahX commented on code in PR #962: URL: https://github.com/apache/iceberg-python/pull/962#discussion_r1693476700 ## mkdocs/docs/how-to-release.md: ## @@ -38,6 +38,8 @@ For example, the API with the following deprecation tag should be removed when p ) ``` +We also have the
