Re: [PR] Add Files metadata table [iceberg-python]

2024-09-25 Thread via GitHub
DieHertz commented on PR #614: URL: https://github.com/apache/iceberg-python/pull/614#issuecomment-2375318027 Will do -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

Re: [PR] Build: Bump Spark 3.5 to 3.5.3 [iceberg]

2024-09-25 Thread via GitHub
RussellSpitzer commented on PR #11160: URL: https://github.com/apache/iceberg/pull/11160#issuecomment-2375318216 Theoretical patch for changing to DelegatingCatalogExtension - Note this breaks a bunch of stuff (staging is broken and init has to be skipped so configuration is broken)

Re: [PR] DO NOT MERGE WILL BREAK [iceberg]

2024-09-25 Thread via GitHub
RussellSpitzer commented on code in PR #11210: URL: https://github.com/apache/iceberg/pull/11210#discussion_r1776029888 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java: ## @@ -193,7 +148,7 @@ public StagedTable stageCreate( } cat

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-25 Thread via GitHub
stevenzwu commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1776031911 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/maintenance/stream/ScheduledBuilderTestBase.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Sof

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375337828 is this the source of truth? https://iceberg.apache.org/spec/#partition-transforms

Re: [PR] Core: Add ContentFileSet and ContentFileWrapper [iceberg]

2024-09-25 Thread via GitHub
aokolnychyi commented on code in PR #11195: URL: https://github.com/apache/iceberg/pull/11195#discussion_r1776041008 ## api/src/main/java/org/apache/iceberg/util/ContentFileSet.java: ## @@ -0,0 +1,212 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-09-25 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1776062427 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -325,7 +341,15 @@ private ManifestFile filterManifest(Schema tableSchema, Manifes

Re: [PR] Arrow: add support for null vectors [iceberg]

2024-09-25 Thread via GitHub
slessard commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1776421822 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorHolder.java: ## @@ -140,12 +141,18 @@ public static class ConstantVectorHolder extends VectorHolder

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-09-25 Thread via GitHub
arkadius commented on PR #10179: URL: https://github.com/apache/iceberg/pull/10179#issuecomment-2375109539 Hi @rodmeneses, by everything works I meant that I did some manual tests and the results were the same as with the old one. Probably "everything" was an overkill here ;-) Yes, I can ta

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-09-25 Thread via GitHub
rodmeneses commented on PR #10179: URL: https://github.com/apache/iceberg/pull/10179#issuecomment-2375118465 > Hi @rodmeneses, by everything works I meant that I did some manual tests and the results were the same as with the old one. Probably "everything" was an overkill here ;-) Yes, I ca

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-09-25 Thread via GitHub
rodmeneses commented on PR #10179: URL: https://github.com/apache/iceberg/pull/10179#issuecomment-2375119421 > > Hi @rodmeneses, by everything works I meant that I did some manual tests and the results were the same as with the old one. Probably "everything" was an overkill here ;-) Yes, I

Re: [PR] Core: Add ContentFileSet and ContentFileWrapper [iceberg]

2024-09-25 Thread via GitHub
aokolnychyi commented on code in PR #11195: URL: https://github.com/apache/iceberg/pull/11195#discussion_r1776041636 ## api/src/main/java/org/apache/iceberg/util/ContentFileSet.java: ## @@ -0,0 +1,212 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] Spark: Added merge schema as spark configuration [iceberg]

2024-09-25 Thread via GitHub
RussellSpitzer merged PR #9640: URL: https://github.com/apache/iceberg/pull/9640

Re: [PR] Spark: Added merge schema as spark configuration [iceberg]

2024-09-25 Thread via GitHub
RussellSpitzer commented on PR #9640: URL: https://github.com/apache/iceberg/pull/9640#issuecomment-2375349511 Thanks for the PR @aleenamg21-1

[PR] Bump getdaft from 0.3.2 to 0.3.4 [iceberg-python]

2024-09-25 Thread via GitHub
dependabot[bot] opened a new pull request, #1209: URL: https://github.com/apache/iceberg-python/pull/1209 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.3.2 to 0.3.4. Release notes Sourced from [getdaft's releases](https://github.com/Eventual-Inc/Daft/releases).

Re: [PR] Bump getdaft from 0.3.2 to 0.3.3 [iceberg-python]

2024-09-25 Thread via GitHub
dependabot[bot] closed pull request #1204: Bump getdaft from 0.3.2 to 0.3.3 URL: https://github.com/apache/iceberg-python/pull/1204

Re: [PR] Bump getdaft from 0.3.2 to 0.3.3 [iceberg-python]

2024-09-25 Thread via GitHub
dependabot[bot] commented on PR #1204: URL: https://github.com/apache/iceberg-python/pull/1204#issuecomment-2375358770 Superseded by #1209.

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-09-25 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1776175392 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -325,7 +341,15 @@ private ManifestFile filterManifest(Schema tableSchema, Manifes

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375566385 I'm not 100% sure, perhaps the metadata table does the transformation. https://iceberg.apache.org/docs/latest/spark-queries/#partitions

Re: [PR] Introduces the new IcebergSink based on the new V2 Flink Sink Abstraction [iceberg]

2024-09-25 Thread via GitHub
stevenzwu commented on PR #10179: URL: https://github.com/apache/iceberg/pull/10179#issuecomment-2375754209 Yes, we should have a config to determine which sink implementation is used for Table API/SQL. The default should be the old `FlinkSink`. When the new v2 sink implementation becomes st

[I] Why does executing a sql "desc tableA" in hive command line report a error on a iceberg table with decimal(2,2) field type [iceberg]

2024-09-25 Thread via GitHub
denghaiy opened a new issue, #11211: URL: https://github.com/apache/iceberg/issues/11211 ### Apache Iceberg version 1.0.0 ### Query engine Spark ### Please describe the bug 🐞 We have created a iceberg table named "test.tableA" with a column type decimal(

Re: [PR] Build: Bump Spark 3.5 to 3.5.3 [iceberg]

2024-09-25 Thread via GitHub
manuzhang commented on PR #11160: URL: https://github.com/apache/iceberg/pull/11160#issuecomment-2375773222 The Spark community is [reverting changes](https://github.com/apache/spark/pull/48257), and we will skip `3.5.3` and wait for the next Spark 3.5 release.

Re: [I] What's the use of old metadata file， why not delete by default? [iceberg]

2024-09-25 Thread via GitHub
madeirak commented on issue #11206: URL: https://github.com/apache/iceberg/issues/11206#issuecomment-2375784911 > Keeping old metadata helps support [rollback & time travel](https://iceberg.apache.org/docs/latest/spark-queries/#time-travel). It's often useful to know what the state of the t

Re: [PR] Core: Add a util to compute partition stats [iceberg]

2024-09-25 Thread via GitHub
aokolnychyi commented on code in PR #11146: URL: https://github.com/apache/iceberg/pull/11146#discussion_r1776149156 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,137 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] OpenAPI: Add AppendDataFile models to openapi spec for fine grained metadata commits [iceberg]

2024-09-25 Thread via GitHub
amogh-jahagirdar commented on code in PR #10202: URL: https://github.com/apache/iceberg/pull/10202#discussion_r1776236397 ## open-api/rest-catalog-open-api.yaml: ## @@ -2893,6 +3003,37 @@ components: additionalProperties: type: string +AppendDataFil

Re: [PR] DO NOT MERGE WILL BREAK - Change BaseCatalog to Interface [iceberg]

2024-09-25 Thread via GitHub
manuzhang commented on PR #11210: URL: https://github.com/apache/iceberg/pull/11210#issuecomment-2375771669 Thanks @RussellSpitzer. Spark community is [reverting the changes](https://github.com/apache/spark/pull/48257) such that we don't need to change `BaseCatalog` now.

Re: [PR] PR #1169 [iceberg-python]

2024-09-25 Thread via GitHub
JE-Chen commented on code in PR #1206: URL: https://github.com/apache/iceberg-python/pull/1206#discussion_r1776310431 ## pyiceberg/io/pyarrow.py: ## @@ -1068,20 +1068,13 @@ def primitive(self, primitive: pa.DataType) -> PrimitiveType: return StringType() e

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinzwang commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375801600 > I'm not 100% sure, perhaps the metadata table does the transformation. > > https://iceberg.apache.org/docs/latest/spark-queries/#partitions I think you are correct

Re: [I] Enabling schema evolution feature using spark configuration like we have in Delta Lake [iceberg]

2024-09-25 Thread via GitHub
aleenamg21-1 closed issue #9651: Enabling schema evolution feature using spark configuration like we have in Delta Lake URL: https://github.com/apache/iceberg/issues/9651

Re: [I] Enabling schema evolution feature using spark configuration like we have in Delta Lake [iceberg]

2024-09-25 Thread via GitHub
aleenamg21-1 commented on issue #9651: URL: https://github.com/apache/iceberg/issues/9651#issuecomment-2375808924 Closing this issue since [#9640] got merged.

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375810823 I like that it's converted, it's more readable! Do you know where the transform happens? Is it only for the metadata table?

Re: [I] Why not use the profile name when initialising the S3FileSystem class? [iceberg-python]

2024-09-25 Thread via GitHub
wudihero2 commented on issue #1207: URL: https://github.com/apache/iceberg-python/issues/1207#issuecomment-2375828069 Hello, I am interested in this, do I need to tag the person who will assign this task to me?

Re: [PR] Core: Remove unused code for streaming position deletes [iceberg]

2024-09-25 Thread via GitHub
amogh-jahagirdar commented on PR #11175: URL: https://github.com/apache/iceberg/pull/11175#issuecomment-2375505084 Thanks @wypoon, merging!

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinzwang commented on code in PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#discussion_r1776140899 ## pyiceberg/transforms.py: ## @@ -517,9 +517,6 @@ def day_func(v: Any) -> int: def can_transform(self, source: IcebergType) -> bool: return isi

Re: [PR] Core: Remove unused code for streaming position deletes [iceberg]

2024-09-25 Thread via GitHub
amogh-jahagirdar merged PR #11175: URL: https://github.com/apache/iceberg/pull/11175

Re: [PR] Compatible with Spark4 （upgrade antlr4 to version 4.13.1 Compatible with jdk17 ） [iceberg]

2024-09-25 Thread via GitHub
manuzhang commented on PR #11204: URL: https://github.com/apache/iceberg/pull/11204#issuecomment-2375511027 Have you checked out https://github.com/apache/iceberg/pull/10622?

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-25 Thread via GitHub
stevenzwu commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1776132332 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DeleteFilesProcessor.java: ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Soft

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on code in PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#discussion_r1776147728 ## pyiceberg/transforms.py: ## @@ -517,9 +517,6 @@ def day_func(v: Any) -> int: def can_transform(self, source: IcebergType) -> bool: return isi

Re: [PR] Compatible with Spark4 （upgrade antlr4 to version 4.13.1 Compatible with jdk17 ） [iceberg]

2024-09-25 Thread via GitHub
awol2005ex commented on PR #11204: URL: https://github.com/apache/iceberg/pull/11204#issuecomment-2375517182 > Have you checked out #10622? No, I just see that

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinzwang commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375530341 Ok so interesting... Spark actually does store day transforms as date type in the metadata, which is why the integration test is failing. This is probably why this library had t

Re: [I] javax.net.ssl.SSLException: Connection reset on S3 w/ S3FileIO and Apache HTTP client [iceberg]

2024-09-25 Thread via GitHub
SandeepSinghGahir commented on issue #10340: URL: https://github.com/apache/iceberg/issues/10340#issuecomment-2375168340 @danielcweeks thanks a lot for the update and prioritizing the fix. Looking forward to the 1.7 release. @amogh-jahagirdar thanks for all the hard work 🙌

Re: [PR] Add Files metadata table [iceberg-python]

2024-09-25 Thread via GitHub
DieHertz commented on PR #614: URL: https://github.com/apache/iceberg-python/pull/614#issuecomment-2375186118 Hi guys, sorry if it's not the right place to ask this question. Do you know of a viable way to speed up `table.inspect.files()` for large tables? Maybe something in mind that

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on code in PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#discussion_r1776111931 ## pyiceberg/transforms.py: ## @@ -517,9 +517,6 @@ def day_func(v: Any) -> int: def can_transform(self, source: IcebergType) -> bool: return isi

Re: [PR] Flink: Avoid metaspace memory leak by not registering ShutdownHook for ExecutorService in Flink [iceberg]

2024-09-25 Thread via GitHub
stevenzwu commented on PR #11073: URL: https://github.com/apache/iceberg/pull/11073#issuecomment-2375462905 I think we should first add Javadoc to `ThreadPools.newWorkerPool` that it adds shutdown hook. It is not obvious from the method name. regarding `ThreadPools.newNonExitingWorker

Re: [I] DELETE fails with "java.lang.IllegalArgumentException: info must be ExtendedLogicalWriteInfo" [iceberg]

2024-09-25 Thread via GitHub
github-actions[bot] commented on issue #8926: URL: https://github.com/apache/iceberg/issues/8926#issuecomment-2375486924 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Vulnerabilities found on latest version - jackson, avro, openssl [iceberg]

2024-09-25 Thread via GitHub
github-actions[bot] commented on issue #8923: URL: https://github.com/apache/iceberg/issues/8923#issuecomment-2375486901 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Missing serialVersionUID in Serializable implementation [iceberg]

2024-09-25 Thread via GitHub
github-actions[bot] commented on issue #8929: URL: https://github.com/apache/iceberg/issues/8929#issuecomment-2375486956 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Spark write abort result in table miss metadata location file [iceberg]

2024-09-25 Thread via GitHub
github-actions[bot] commented on issue #8927: URL: https://github.com/apache/iceberg/issues/8927#issuecomment-2375486939 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Spark 3.5: Honor Spark conf spark.sql.files.maxPartitionBytes in read split [iceberg]

2024-09-25 Thread via GitHub
github-actions[bot] commented on PR #8922: URL: https://github.com/apache/iceberg/pull/8922#issuecomment-2375486872 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Iceberg streaming using checkpoint does not ignore the stream-from-timestamp option [iceberg]

2024-09-25 Thread via GitHub
github-actions[bot] commented on issue #8921: URL: https://github.com/apache/iceberg/issues/8921#issuecomment-2375486850 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Spark 3.5: Fix Migrate procedure renaming issue for custom catalog [iceberg]

2024-09-25 Thread via GitHub
github-actions[bot] commented on PR #8931: URL: https://github.com/apache/iceberg/pull/8931#issuecomment-2375486985 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] bug: FileScanTask project_field_ids order could be inconsistent with the RecordBatch schema [iceberg-rust]

2024-09-25 Thread via GitHub
liurenjie1024 commented on issue #627: URL: https://github.com/apache/iceberg-rust/issues/627#issuecomment-2375681475 I think this could be solved together with other problems like type promotion.

Re: [PR] OpenAPI: Add AppendDataFile models to openapi spec for fine grained metadata commits [iceberg]

2024-09-25 Thread via GitHub
amogh-jahagirdar commented on code in PR #10202: URL: https://github.com/apache/iceberg/pull/10202#discussion_r1776260258 ## open-api/rest-catalog-open-api.yaml: ## @@ -2893,6 +3003,37 @@ components: additionalProperties: type: string +AppendDataFil

Re: [PR] feat: Safer PartitionSpec & SchemalessPartitionSpec [iceberg-rust]

2024-09-25 Thread via GitHub
c-thiel commented on PR #645: URL: https://github.com/apache/iceberg-rust/pull/645#issuecomment-2376026264 Introducing `SchemalessPartitionSpec` might be our way to avoid https://github.com/apache/iceberg/issues/4563.

Re: [PR] Upgrade to Gradle 8.10.2 [iceberg]

2024-09-25 Thread via GitHub
nastra commented on code in PR #11212: URL: https://github.com/apache/iceberg/pull/11212#discussion_r1776455680 ## gradle/wrapper/gradle-wrapper.properties: ## @@ -1,7 +1,7 @@ distributionBase=GRADLE_USER_HOME distributionPath=wrapper/dists -distributionSha256Sum=1541fa36599e1

Re: [PR] Upgrade to Gradle 8.10.2 [iceberg]

2024-09-25 Thread via GitHub
jbonofre commented on code in PR #11212: URL: https://github.com/apache/iceberg/pull/11212#discussion_r1776457256 ## gradle/wrapper/gradle-wrapper.properties: ## @@ -1,7 +1,7 @@ distributionBase=GRADLE_USER_HOME distributionPath=wrapper/dists -distributionSha256Sum=1541fa36599

Re: [PR] Upgrade to Gradle 8.10.2 [iceberg]

2024-09-25 Thread via GitHub
jbonofre commented on code in PR #11212: URL: https://github.com/apache/iceberg/pull/11212#discussion_r1776464839 ## gradle/wrapper/gradle-wrapper.properties: ## @@ -1,7 +1,7 @@ distributionBase=GRADLE_USER_HOME distributionPath=wrapper/dists -distributionSha256Sum=1541fa36599

Re: [PR] Core: Add a util to compute partition stats [iceberg]

2024-09-25 Thread via GitHub
ajantha-bhat commented on PR #11146: URL: https://github.com/apache/iceberg/pull/11146#issuecomment-2376070543 @aokolnychyi: Thanks for the review and guidance. I have addressed the final nits.

Re: [PR] Arrow: add support for null vectors [iceberg]

2024-09-25 Thread via GitHub
slessard commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1776485623 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorHolder.java: ## @@ -140,12 +141,18 @@ public static class ConstantVectorHolder extends VectorHolder

Re: [I] `ALTER TABLE ... DROP COLUMN` allows dropping a column used by old PartitionSpecs [iceberg]

2024-09-25 Thread via GitHub
osscm commented on issue #4563: URL: https://github.com/apache/iceberg/issues/4563#issuecomment-2375198579 @hashhar @rdblue any conclusion on this issue, we saw this one with 421 and 438.

Re: [PR] Add Files metadata table [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on PR #614: URL: https://github.com/apache/iceberg-python/pull/614#issuecomment-2375285704 I think there's definitely room for improvement. @DieHertz do you mind opening an issue for this?

Re: [PR] PR #1169 [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on code in PR #1206: URL: https://github.com/apache/iceberg-python/pull/1206#discussion_r1776008439 ## pyiceberg/io/pyarrow.py: ## @@ -1068,20 +1068,13 @@ def primitive(self, primitive: pa.DataType) -> PrimitiveType: return StringType()

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-09-25 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1774350481 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -81,6 +81,7 @@ public String partition() { // cache filtered manifests to avo

Re: [PR] Core: Replace use of CharSequenceMap in DeleteFileIndex with Map [iceberg]

2024-09-25 Thread via GitHub
aokolnychyi commented on code in PR #11199: URL: https://github.com/apache/iceberg/pull/11199#discussion_r1776090183 ## core/src/main/java/org/apache/iceberg/util/ContentFileUtil.java: ## @@ -49,7 +50,21 @@ public static , K> K copy( } } + /** + * @deprecated since

Re: [PR] Core: Replace use of CharSequenceMap in DeleteFileIndex with Map [iceberg]

2024-09-25 Thread via GitHub
aokolnychyi commented on code in PR #11199: URL: https://github.com/apache/iceberg/pull/11199#discussion_r1776090858 ## core/src/main/java/org/apache/iceberg/util/ContentFileUtil.java: ## @@ -49,7 +50,21 @@ public static , K> K copy( } } + /** + * @deprecated since

Re: [PR] Core: Replace use of CharSequenceMap in DeleteFileIndex with Map [iceberg]

2024-09-25 Thread via GitHub
aokolnychyi commented on code in PR #11199: URL: https://github.com/apache/iceberg/pull/11199#discussion_r1776092317 ## core/src/main/java/org/apache/iceberg/DeleteFileIndex.java: ## @@ -458,14 +457,14 @@ DeleteFileIndex build() { } private void add( -CharSeq

Re: [PR] Core: Replace use of CharSequenceMap in DeleteFileIndex with Map [iceberg]

2024-09-25 Thread via GitHub
aokolnychyi commented on code in PR #11199: URL: https://github.com/apache/iceberg/pull/11199#discussion_r1776091909 ## core/src/main/java/org/apache/iceberg/util/ContentFileUtil.java: ## @@ -49,7 +50,21 @@ public static , K> K copy( } } + /** + * @deprecated since

Re: [PR] Core: Support merging in PositionDeleteIndex [iceberg]

2024-09-25 Thread via GitHub
aokolnychyi merged PR #11208: URL: https://github.com/apache/iceberg/pull/11208

Re: [PR] Core: Support merging in PositionDeleteIndex [iceberg]

2024-09-25 Thread via GitHub
aokolnychyi commented on PR #11208: URL: https://github.com/apache/iceberg/pull/11208#issuecomment-2375438906 Thank you, @singhpk234 @anuragmantri @amogh-jahagirdar!

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinzwang commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375442625 > is this the source of truth? https://iceberg.apache.org/spec/#partition-transforms Yup, precisely

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-09-25 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1776097229 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -78,9 +78,11 @@ public String partition() { private boolean failMissingDeletePa

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-25 Thread via GitHub
stevenzwu commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1776099559 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/maintenance/operator/TestExpireSnapshotsProcessor.java: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apac

Re: [PR] AWS: Fix AWS doc URL [iceberg]

2024-09-25 Thread via GitHub
nastra merged PR #11198: URL: https://github.com/apache/iceberg/pull/11198

Re: [PR] Build: Bump mkdocs-material from 9.5.34 to 9.5.36 [iceberg]

2024-09-25 Thread via GitHub
dependabot[bot] closed pull request #11190: Build: Bump mkdocs-material from 9.5.34 to 9.5.36 URL: https://github.com/apache/iceberg/pull/11190

Re: [PR] Build: Bump mkdocs-macros-plugin from 1.0.5 to 1.2.0 [iceberg]

2024-09-25 Thread via GitHub
nastra merged PR #11189: URL: https://github.com/apache/iceberg/pull/11189

[PR] Build: Bump mkdocs-material from 9.5.34 to 9.5.37 [iceberg]

2024-09-25 Thread via GitHub
dependabot[bot] opened a new pull request, #11205: URL: https://github.com/apache/iceberg/pull/11205 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.34 to 9.5.37. Release notes sourced from https://github.com/squidfunk/mkdocs-material/releases

Re: [PR] Build: Bump mkdocs-material from 9.5.34 to 9.5.36 [iceberg]

2024-09-25 Thread via GitHub
dependabot[bot] commented on PR #11190: URL: https://github.com/apache/iceberg/pull/11190#issuecomment-2373519743 Superseded by #11205.

Re: [PR] [DRAFT] Remove unused code for streaming position deletes [iceberg]

2024-09-25 Thread via GitHub
nastra commented on code in PR #11177: URL: https://github.com/apache/iceberg/pull/11177#discussion_r1774870225 ## core/src/main/java/org/apache/iceberg/deletes/Deletes.java: ## @@ -192,27 +185,43 @@ public static PositionDeleteIndex toPositionIndex(CloseableIterable posDel

Re: [PR] Core: Remove unused code for streaming position deletes [iceberg]

2024-09-25 Thread via GitHub
nastra commented on code in PR #11175: URL: https://github.com/apache/iceberg/pull/11175#discussion_r1774876614 ## core/src/main/java/org/apache/iceberg/deletes/Deletes.java: ## @@ -192,27 +185,43 @@ public static PositionDeleteIndex toPositionIndex(CloseableIterable posDel

Re: [PR] Core: Remove unused code for streaming position deletes [iceberg]

2024-09-25 Thread via GitHub
nastra commented on code in PR #11175: URL: https://github.com/apache/iceberg/pull/11175#discussion_r1774877596 ## core/src/main/java/org/apache/iceberg/deletes/Deletes.java: ## @@ -192,27 +185,43 @@ public static PositionDeleteIndex toPositionIndex(CloseableIterable posDel

Re: [PR] update PartitionSpec with snapshot'schema [iceberg]

2024-09-25 Thread via GitHub
nastra commented on PR #11196: URL: https://github.com/apache/iceberg/pull/11196#issuecomment-2373473827 > Could we have a test for all of the documented use-cases: > > ``` > -- time travel to October 26, 1986 at 01:21:00 -> uses the snapshot's schema > SELECT * FROM prod.db.tab

Re: [PR] update PartitionSpec with snapshot'schema [iceberg]

2024-09-25 Thread via GitHub
nastra commented on code in PR #11196: URL: https://github.com/apache/iceberg/pull/11196#discussion_r1774844030 ## core/src/test/java/org/apache/iceberg/DataTableScanTestBase.java: ## @@ -309,4 +310,14 @@ public void testManifestLocationsInScanWithDeleteFiles() throws IOExcepti

Re: [PR] Arrow: add support for null vectors [iceberg]

2024-09-25 Thread via GitHub
nastra commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1774851860 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorHolder.java: ## @@ -140,12 +141,18 @@ public static class ConstantVectorHolder extends VectorHolder {

Re: [PR] Spark: Add RewriteTablePath action interface [iceberg]

2024-09-25 Thread via GitHub
nastra commented on PR #10920: URL: https://github.com/apache/iceberg/pull/10920#issuecomment-2373499415 > @nastra can you rerun the workflows please? I fixed a formatting error The formatting error seems to still exist

Re: [I] Support Vended Credentials for Azure Data Lake Store [iceberg-python]

2024-09-25 Thread via GitHub
c-thiel commented on issue #1146: URL: https://github.com/apache/iceberg-python/issues/1146#issuecomment-2373244987 @sungwy sorry for the late reply, I overlooked this. Looking good now :)

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-09-25 Thread via GitHub
pvary commented on PR #10935: URL: https://github.com/apache/iceberg/pull/10935#issuecomment-2373258387 @RussellSpitzer, @aokolnychyi: would it be possible to take a look at this PR? I did my best to review it, but I'm not an expert in this part of the code. Thanks, Peter

Re: [PR] Core: Replace use of CharSequenceMap in DeleteFileIndex with Map [iceberg]

2024-09-25 Thread via GitHub
nastra commented on code in PR #11199: URL: https://github.com/apache/iceberg/pull/11199#discussion_r1774786143 ## core/src/main/java/org/apache/iceberg/util/ContentFileUtil.java: ## @@ -49,7 +50,21 @@ public static , K> K copy( } } + /** + * @deprecated since 1.7.0

Re: [PR] Flink: Avoid metaspace memory leak by not registering ShutdownHook for ExecutorService in Flink [iceberg]

2024-09-25 Thread via GitHub
pvary commented on PR #11073: URL: https://github.com/apache/iceberg/pull/11073#issuecomment-2373402484 > @fengjiajie I think Ryan had a good point in the email thread that we probably shouldn't be using the `ThreadPools.newWorkerPool()` with explicit lifecycle management. I think we can sw

Re: [PR] update PartitionSpec with snapshot'schema [iceberg]

2024-09-25 Thread via GitHub
pvary commented on PR #11196: URL: https://github.com/apache/iceberg/pull/11196#issuecomment-2373413818 Could we have a test for all of the documented use-cases: ``` -- time travel to October 26, 1986 at 01:21:00 -> uses the snapshot's schema SELECT * FROM prod.db.table TIMESTAMP AS

Re: [PR] update PartitionSpec with snapshot'schema [iceberg]

2024-09-25 Thread via GitHub
pvary commented on code in PR #11196: URL: https://github.com/apache/iceberg/pull/11196#discussion_r1774798448 ## core/src/test/java/org/apache/iceberg/DataTableScanTestBase.java: ## @@ -309,4 +310,14 @@ public void testManifestLocationsInScanWithDeleteFiles() throws IOExceptio

Re: [PR] update PartitionSpec with snapshot'schema [iceberg]

2024-09-25 Thread via GitHub
pvary commented on code in PR #11196: URL: https://github.com/apache/iceberg/pull/11196#discussion_r1774802463 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -719,7 +719,7 @@ public TableMetadata upgradeToFormatVersion(int newFormatVersion) { return new
