Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
nastra commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1490501986 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -438,6 +439,10 @@ public ViewMetadata build() { metadataLocation == null || changes.isE

Re: [PR] Core: Add property to prevent loss of view representation when replacing a view [iceberg]

2024-02-14 Thread via GitHub
nastra commented on code in PR #9620: URL: https://github.com/apache/iceberg/pull/9620#discussion_r1490495749 ## core/src/test/java/org/apache/iceberg/view/TestViewMetadata.java: ## @@ -799,4 +802,299 @@ public void lastAddedSchemaFailure() { .isInstanceOf(ValidationExc

Re: [I] refactor: Remove support of manifest list format as a list of file paths. [iceberg-rust]

2024-02-14 Thread via GitHub
Dysprosium0626 commented on issue #158: URL: https://github.com/apache/iceberg-rust/issues/158#issuecomment-1945426309 https://github.com/Dysprosium0626/iceberg-rust/commit/e25adbb07a1fb03a856dc40cfa90c3e253281af9 Hi @liurenjie1024, here is my version of refactor. I just remove the en

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-02-14 Thread via GitHub
stevenzwu commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1490382969 ## open-api/rest-catalog-open-api.yaml: ## @@ -532,6 +532,100 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-02-14 Thread via GitHub
stevenzwu commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1490380768 ## open-api/rest-catalog-open-api.yaml: ## @@ -532,6 +532,100 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-02-14 Thread via GitHub
stevenzwu commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1490373156 ## open-api/rest-catalog-open-api.yaml: ## @@ -532,6 +532,100 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-02-14 Thread via GitHub
stevenzwu commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1490356134 ## open-api/rest-catalog-open-api.py: ## @@ -209,6 +209,16 @@ class MetadataLog(BaseModel): __root__: List[MetadataLogItem] +class PlanTask(BaseModel): +

Re: [I] Core: checkpoint validation in BaseOverwriteFiles [iceberg]

2024-02-14 Thread via GitHub
hrishisd commented on issue #9718: URL: https://github.com/apache/iceberg/issues/9718#issuecomment-1945252225 Sounds good, I'll give it a shot this weekend -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] Spark 3.3: Add RemoveDanglingDeletes action [iceberg]

2024-02-14 Thread via GitHub
dramaticlly commented on code in PR #6581: URL: https://github.com/apache/iceberg/pull/6581#discussion_r1490257622 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/RemoveDanglingDeletesSparkAction.java: ## @@ -0,0 +1,227 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on PR #9725: URL: https://github.com/apache/iceberg/pull/9725#issuecomment-1945201021 Also, to keep things moving: +1 when the changes to `expireVersions` and `updateHistory` are removed. The simple fix to update history only once per commit looks good.

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1490244193 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -504,6 +517,11 @@ static List updateHistory(List history, Set< } } + // ke

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1490242520 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -479,21 +488,21 @@ public ViewMetadata build() { metadataLocation); } -stati

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1490236966 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -479,21 +488,21 @@ public ViewMetadata build() { metadataLocation); } -stati

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1490235686 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -243,6 +244,12 @@ public Builder setCurrentVersionId(int newVersionId) { changes.add(new

Re: [I] Core: complete FileScanTaskParser for other FileScanTask implementation classes (like StaticDataTask) [iceberg]

2024-02-14 Thread via GitHub
stevenzwu commented on issue #9597: URL: https://github.com/apache/iceberg/issues/9597#issuecomment-1945184620 > I think we should definitely use single-value serialization for the values in the structs when we convert to JSON. I probably wouldn't use objects, though. We could use a list an

Re: [PR] Core: Add property to prevent loss of view representation when replacing a view [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9620: URL: https://github.com/apache/iceberg/pull/9620#discussion_r1490232663 ## core/src/test/java/org/apache/iceberg/view/TestViewMetadata.java: ## @@ -799,4 +802,299 @@ public void lastAddedSchemaFailure() { .isInstanceOf(ValidationExc

Re: [PR] Core: Add property to prevent loss of view representation when replacing a view [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9620: URL: https://github.com/apache/iceberg/pull/9620#discussion_r1490231921 ## core/src/test/java/org/apache/iceberg/view/TestViewMetadata.java: ## @@ -334,6 +334,9 @@ public void viewMetadataAndMetadataChanges() { .addVersion(viewV

Re: [PR] Core: Add property to prevent loss of view representation when replacing a view [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9620: URL: https://github.com/apache/iceberg/pull/9620#discussion_r1490231465 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -510,5 +529,29 @@ static List updateHistory(List history, Set< private Stream changes(Class

Re: [PR] Core: Add property to prevent loss of view representation when replacing a view [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9620: URL: https://github.com/apache/iceberg/pull/9620#discussion_r1490231179 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -438,6 +440,16 @@ public ViewMetadata build() { metadataLocation == null || changes.isE

Re: [PR] Core: Add property to prevent loss of view representation when replacing a view [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9620: URL: https://github.com/apache/iceberg/pull/9620#discussion_r1490231023 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -479,6 +491,23 @@ public ViewMetadata build() { metadataLocation); } +privat

Re: [PR] Core: Add property to prevent loss of view representation when replacing a view [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9620: URL: https://github.com/apache/iceberg/pull/9620#discussion_r1490230498 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -479,6 +491,23 @@ public ViewMetadata build() { metadataLocation); } +privat

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-02-14 Thread via GitHub
danielcweeks commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1490219293 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -81,6 +87,7 @@ public class JdbcCatalog extends BaseMetastoreCatalog private final Functio

Re: [I] Core: complete FileScanTaskParser for other FileScanTask implementation classes (like StaticDataTask) [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on issue #9597: URL: https://github.com/apache/iceberg/issues/9597#issuecomment-1945166758 I think we should definitely use single-value serialization for the values in the structs when we convert to JSON. I probably wouldn't use objects, though. We could use a list and sen

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-02-14 Thread via GitHub
dramaticlly commented on code in PR #9724: URL: https://github.com/apache/iceberg/pull/9724#discussion_r1490217959 ## api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java: ## @@ -99,6 +99,20 @@ public interface RewriteDataFiles boolean USE_STARTING_SEQUENCE_NU

Re: [PR] Add pagination to open api spec for listing of namespaces, tables, views [iceberg]

2024-02-14 Thread via GitHub
jackye1995 commented on code in PR #9660: URL: https://github.com/apache/iceberg/pull/9660#discussion_r1490183371 ## open-api/rest-catalog-open-api.yaml: ## @@ -1482,6 +1490,33 @@ components: explode: false example: "vended-credentials,remote-signing" +page-t

Re: [I] Flink: Support Flink streaming reading [iceberg]

2024-02-14 Thread via GitHub
github-actions[bot] commented on issue #1383: URL: https://github.com/apache/iceberg/issues/1383#issuecomment-1945114441 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Spec: add task-type field to JSON serialization of file scan task. add JSON serialization for StaticDataTask. [iceberg]

2024-02-14 Thread via GitHub
stevenzwu commented on code in PR #9728: URL: https://github.com/apache/iceberg/pull/9728#discussion_r1490214139 ## format/spec.md: ## @@ -1239,15 +1239,34 @@ Content file (data or delete) is serialized as a JSON object according to the fo ### File Scan Task Serialization

Re: [PR] Open ap iworkflow [iceberg]

2024-02-14 Thread via GitHub
dramaticlly closed pull request #9727: Open ap iworkflow URL: https://github.com/apache/iceberg/pull/9727

Re: [PR] Spec: add task-type field to JSON serialization of file scan task. add JSON serialization for StaticDataTask. [iceberg]

2024-02-14 Thread via GitHub
rdblue commented on code in PR #9728: URL: https://github.com/apache/iceberg/pull/9728#discussion_r1490201923 ## format/spec.md: ## @@ -1239,15 +1239,34 @@ Content file (data or delete) is serialized as a JSON object according to the fo ### File Scan Task Serialization -Fi

Re: [PR] API: New API For sequential / streaming updates [iceberg]

2024-02-14 Thread via GitHub
jasonf20 commented on PR #9323: URL: https://github.com/apache/iceberg/pull/9323#issuecomment-1944914307 @rdblue Based on our discussions could you have another look at this?

Re: [PR] Make InMemoryFileIO map shared access across instances [iceberg]

2024-02-14 Thread via GitHub
jackye1995 commented on code in PR #9722: URL: https://github.com/apache/iceberg/pull/9722#discussion_r1490147254 ## core/src/main/java/org/apache/iceberg/inmemory/InMemoryFileIO.java: ## @@ -28,7 +28,7 @@ public class InMemoryFileIO implements FileIO { - private final Map

Re: [PR] Make InMemoryFileIO map shared access across instances [iceberg]

2024-02-14 Thread via GitHub
jackye1995 commented on code in PR #9722: URL: https://github.com/apache/iceberg/pull/9722#discussion_r1490146191 ## core/src/test/java/org/apache/iceberg/inmemory/TestInMemoryFileIO.java: ## @@ -27,11 +27,11 @@ import org.junit.jupiter.api.Test; public class TestInMemoryFil

Re: [PR] OpenAPI: Add AppendDataFileUpdate to TableUpdate for rest appends [iceberg]

2024-02-14 Thread via GitHub
jackye1995 commented on code in PR #9717: URL: https://github.com/apache/iceberg/pull/9717#discussion_r1490144793 ## open-api/rest-catalog-open-api.yaml: ## @@ -3324,6 +3348,97 @@ components: type: integer format: int64 +TypeValue: + oneOf: +

[PR] refactor: remove unwraps [iceberg-rust]

2024-02-14 Thread via GitHub
odysa opened a new pull request, #196: URL: https://github.com/apache/iceberg-rust/pull/196 Remove `unwrap` and `expect`. It looks like some `expect` are expected. For example ```rust pub fn current_snapshot(&self) -> Option<&SnapshotRef> { self.current_snapshot_id.map(

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1490117508 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -438,6 +439,10 @@ public ViewMetadata build() { metadataLocation == null || c

[PR] Open ap iworkflow [iceberg]

2024-02-14 Thread via GitHub
dramaticlly opened a new pull request, #9727: URL: https://github.com/apache/iceberg/pull/9727 test if workflow will trigger

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-02-14 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1490138784 ## api/src/main/java/org/apache/iceberg/types/Types.java: ## @@ -205,38 +208,74 @@ public String toString() { } public static class TimestampType extends Primitiv

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-02-14 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1490137388 ## api/src/main/java/org/apache/iceberg/types/Types.java: ## @@ -205,38 +208,74 @@ public String toString() { } public static class TimestampType extends Primitiv

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-02-14 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1490134160 ## open-api/rest-catalog-open-api.yaml: ## @@ -532,6 +532,103 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/na

Re: [I] `catalog.create_namespace(...)` throws both 404 and 409 which is unintuitive [iceberg-python]

2024-02-14 Thread via GitHub
seunggs commented on issue #430: URL: https://github.com/apache/iceberg-python/issues/430#issuecomment-1944765728 Hi @Fokko, hierarchical namespaces are not really an issue for me - the issue is attempting to create the same namespace from parallel calls. I have an ingestion pipeline that r

Re: [PR] OpenAPI: Add AppendDataFileUpdate to TableUpdate for rest appends [iceberg]

2024-02-14 Thread via GitHub
amogh-jahagirdar commented on code in PR #9717: URL: https://github.com/apache/iceberg/pull/9717#discussion_r1490110431 ## open-api/rest-catalog-open-api.yaml: ## @@ -3324,6 +3348,97 @@ components: type: integer format: int64 +TypeValue: + oneOf:

Re: [PR] Add pagination to open api spec for listing of namespaces, tables, views [iceberg]

2024-02-14 Thread via GitHub
rahil-c commented on code in PR #9660: URL: https://github.com/apache/iceberg/pull/9660#discussion_r1490085636 ## open-api/rest-catalog-open-api.yaml: ## @@ -1482,6 +1490,34 @@ components: explode: false example: "vended-credentials,remote-signing" +page-toke

Re: [I] Getting Original Schema of a DataFile in a FileScanTask? [iceberg-python]

2024-02-14 Thread via GitHub
Fokko commented on issue #401: URL: https://github.com/apache/iceberg-python/issues/401#issuecomment-1944638806 Hey @srilman sorry for not replying earlier. It slipped somewhere through the cracks, thanks for pinging me! > Is there a recommended way to getting the base / original sche

Re: [I] GCS support? [iceberg-go]

2024-02-14 Thread via GitHub
thorfour commented on issue #60: URL: https://github.com/apache/iceberg-go/issues/60#issuecomment-1944549686 Yea absolutely!

Re: [PR] Build: Bump arrow from 14.0.2 to 15.0.0 [iceberg]

2024-02-14 Thread via GitHub
Fokko merged PR #9574: URL: https://github.com/apache/iceberg/pull/9574

Re: [PR] Build: Bump software.amazon.awssdk:bom from 2.23.17 to 2.24.0 [iceberg]

2024-02-14 Thread via GitHub
Fokko merged PR #9701: URL: https://github.com/apache/iceberg/pull/9701

Re: [PR] Build: Bump tez010 from 0.10.2 to 0.10.3 [iceberg]

2024-02-14 Thread via GitHub
Fokko merged PR #9702: URL: https://github.com/apache/iceberg/pull/9702

Re: [PR] Build: Bump calcite from 1.10.0 to 1.35.0 [iceberg]

2024-02-14 Thread via GitHub
Fokko commented on PR #8239: URL: https://github.com/apache/iceberg/pull/8239#issuecomment-1944522980 @dependabot rebase

Re: [PR] Build: Bump org.assertj:assertj-core from 3.25.2 to 3.25.3 [iceberg]

2024-02-14 Thread via GitHub
Fokko merged PR #9706: URL: https://github.com/apache/iceberg/pull/9706

Re: [PR] Build: Bump nessie from 0.76.3 to 0.76.6 [iceberg]

2024-02-14 Thread via GitHub
dependabot[bot] closed pull request #9568: Build: Bump nessie from 0.76.3 to 0.76.6 URL: https://github.com/apache/iceberg/pull/9568

Re: [PR] Build: Bump nessie from 0.76.3 to 0.76.6 [iceberg]

2024-02-14 Thread via GitHub
dependabot[bot] commented on PR #9568: URL: https://github.com/apache/iceberg/pull/9568#issuecomment-1944517779 Looks like these dependencies are no longer a dependency, so this is no longer needed.

Re: [PR] Build: Bump nessie from 0.76.3 to 0.76.6 [iceberg]

2024-02-14 Thread via GitHub
Fokko commented on PR #9568: URL: https://github.com/apache/iceberg/pull/9568#issuecomment-1944517468 @dependabot rebase

Re: [PR] Upgrade Nessie to 0.77.1 [iceberg]

2024-02-14 Thread via GitHub
Fokko merged PR #9726: URL: https://github.com/apache/iceberg/pull/9726

Re: [I] Conversion between date/timestamp(tz) and integer for avro reader/writer [iceberg-python]

2024-02-14 Thread via GitHub
jqin61 commented on issue #398: URL: https://github.com/apache/iceberg-python/issues/398#issuecomment-1944482233 Yep yep totally agree, I am splitting one portion of the partitioned write for the features of partition path generation and PartitionKey class into a separate PR currently. Will

Re: [PR] OpenAPI: Add AppendDataFileUpdate to TableUpdate for rest appends [iceberg]

2024-02-14 Thread via GitHub
geruh commented on code in PR #9717: URL: https://github.com/apache/iceberg/pull/9717#discussion_r1489952698 ## open-api/rest-catalog-open-api.yaml: ## @@ -3324,6 +3348,97 @@ components: type: integer format: int64 +TypeValue: + oneOf: +-

Re: [PR] Hive: Use EnvironmentContext instead of Hive Locks to provide transactional commits after HIVE-26882 [iceberg]

2024-02-14 Thread via GitHub
kmensah-stripe commented on PR #6570: URL: https://github.com/apache/iceberg/pull/6570#issuecomment-1944371875 Looking at https://home.corp.stripe.com/compass/projects/removing-locking-for-iceberg-commits-in-hms I don't have context on why we're pulling in patches instead of upgrading. Not

Re: [PR] OpenAPI: Add AppendDataFileUpdate to TableUpdate for rest appends [iceberg]

2024-02-14 Thread via GitHub
jackye1995 commented on code in PR #9717: URL: https://github.com/apache/iceberg/pull/9717#discussion_r1489878930 ## open-api/rest-catalog-open-api.yaml: ## @@ -3324,6 +3348,97 @@ components: type: integer format: int64 +TypeValue: + oneOf: +

Re: [PR] Ability to override session adapter and auth when session object is… [iceberg-python]

2024-02-14 Thread via GitHub
cabhishek closed pull request #431: Ability to override session adapter and auth when session object is… URL: https://github.com/apache/iceberg-python/pull/431

[PR] Ability to override session adapter and auth when session object is… [iceberg-python]

2024-02-14 Thread via GitHub
cabhishek opened a new pull request, #431: URL: https://github.com/apache/iceberg-python/pull/431 * Pass a custom http session adapter via properties. * This lets users provide custom authentication mechanism.

Re: [I] Support setting a snapshot property in same commit as spark.sql [iceberg-python]

2024-02-14 Thread via GitHub
Gowthami03B commented on issue #368: URL: https://github.com/apache/iceberg-python/issues/368#issuecomment-1944327908 https://github.com/apache/iceberg-python/pull/419

Re: [I] Support reading and writing snapshot properties [iceberg-python]

2024-02-14 Thread via GitHub
Gowthami03B commented on issue #367: URL: https://github.com/apache/iceberg-python/issues/367#issuecomment-1944327574 https://github.com/apache/iceberg-python/pull/419

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-14 Thread via GitHub
paulpaul1076 commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1944320729 Discussed in slack that this is due to iceberg's streaming writer not being unique, this PR should fix this: https://github.com/apache/iceberg/pull/9255, waiting for iceberg 1.5

Re: [I] java.lang.IllegalArgumentException: requirement failed: length (-6235972) cannot be smaller than -1 [iceberg]

2024-02-14 Thread via GitHub
rjayapalan commented on issue #9689: URL: https://github.com/apache/iceberg/issues/9689#issuecomment-1944303986 Hi @amogh-jahagirdar Yes we were writing to the iceberg table initially using version 1.4.0 until we came across this same issue. So as per recommendation we upgraded to versio

[I] `catalog.create_namespace(...)` throws both 404 and 409 which is unintuitive [iceberg-python]

2024-02-14 Thread via GitHub
seunggs opened a new issue, #430: URL: https://github.com/apache/iceberg-python/issues/430 ### Apache Iceberg version 0.5.0 (latest release) ### Please describe the bug 🐞 I'm actually using the 0.6.0rc4 version, but that shouldn't matter in this case. During the i

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-02-14 Thread via GitHub
dramaticlly commented on code in PR #9724: URL: https://github.com/apache/iceberg/pull/9724#discussion_r1489817653 ## api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java: ## @@ -99,6 +99,20 @@ public interface RewriteDataFiles boolean USE_STARTING_SEQUENCE_NU

Re: [PR] Change Append/Overwrite API to accept snapshot properties [iceberg-python]

2024-02-14 Thread via GitHub
Fokko commented on code in PR #419: URL: https://github.com/apache/iceberg-python/pull/419#discussion_r1489812651 ## mkdocs/docs/api.md: ## @@ -446,6 +446,20 @@ table = table.transaction().remove_properties("abc").commit_transaction() assert table.properties == {} ``` +## S

Re: [I] Getting Original Schema of a DataFile in a FileScanTask? [iceberg-python]

2024-02-14 Thread via GitHub
srilman commented on issue #401: URL: https://github.com/apache/iceberg-python/issues/401#issuecomment-1944245508 @Fokko any thoughts?

Re: [PR] feat(catalog): add initial rest catalog impl [iceberg-go]

2024-02-14 Thread via GitHub
zeroshade commented on PR #58: URL: https://github.com/apache/iceberg-go/pull/58#issuecomment-1944206393 @nastra Added several issues as suggested

[I] Config File Handling [iceberg-go]

2024-02-14 Thread via GitHub
zeroshade opened a new issue, #62: URL: https://github.com/apache/iceberg-go/issues/62 ### Feature Request / Improvement Similar to `pyiceberg`'s [config file](https://py.iceberg.apache.org/configuration/) we should add handling for a config file to make it easier to manipulate the a

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-14 Thread via GitHub
paulpaul1076 commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1944183346 @nastra So, I seem to have discovered new info about what's going on. For some reason in Iceberg metadata there are 2 entries of the same file: ![image](https://github.co

Re: [I] GCS support? [iceberg-go]

2024-02-14 Thread via GitHub
zeroshade commented on issue #60: URL: https://github.com/apache/iceberg-go/issues/60#issuecomment-1944163455 That's definitely an interesting and valid idea. Would you be willing to sketch out a PR for it?

Re: [PR] feat(catalog): add initial rest catalog impl [iceberg-go]

2024-02-14 Thread via GitHub
nastra merged PR #58: URL: https://github.com/apache/iceberg-go/pull/58

Re: [PR] detect breaking changes [iceberg-python]

2024-02-14 Thread via GitHub
Fokko commented on PR #394: URL: https://github.com/apache/iceberg-python/pull/394#issuecomment-1944095694 Thanks for setting this up @syun64. This looks great. I think we can just give it a try after the 0.6.0 release and see how noisy it is.

Re: [I] Null values in metadata_log_entries [iceberg]

2024-02-14 Thread via GitHub
oneonestar commented on issue #9723: URL: https://github.com/apache/iceberg/issues/9723#issuecomment-1944060001 > latest_snapshot_id -> This is the snapshot ID that was the state of the table at the time the metadata file in the log entry was the metadata file for the table. By this

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-02-14 Thread via GitHub
manuzhang commented on code in PR #9724: URL: https://github.com/apache/iceberg/pull/9724#discussion_r1489633539 ## api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java: ## @@ -99,6 +99,20 @@ public interface RewriteDataFiles boolean USE_STARTING_SEQUENCE_NUMB

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-02-14 Thread via GitHub
manuzhang commented on code in PR #9724: URL: https://github.com/apache/iceberg/pull/9724#discussion_r1489632986 ## api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java: ## @@ -99,6 +99,20 @@ public interface RewriteDataFiles boolean USE_STARTING_SEQUENCE_NUMB

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
nastra commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1489631189 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -479,21 +488,21 @@ public ViewMetadata build() { metadataLocation); } -stati

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
nastra commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1489628489 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -479,21 +488,21 @@ public ViewMetadata build() { metadataLocation); } -stati

Re: [PR] Core: Common metadata for TableMetadata and ViewMetadata [iceberg]

2024-02-14 Thread via GitHub
nastra commented on PR #9682: URL: https://github.com/apache/iceberg/pull/9682#issuecomment-1944007197 let's hold off on this one until 1.5.0 is out and the Hive view PR is reviewed so that we can better decide in which direction we'd want to go

Re: [PR] MR: Migrate parameterized tests to JUni5 [iceberg]

2024-02-14 Thread via GitHub
lisirrx commented on PR #9711: URL: https://github.com/apache/iceberg/pull/9711#issuecomment-1943963031 > LGTM, thanks @lisirrx Thanks!

[I] Minimum Requirement of Data File Name in Apache Iceberg? [iceberg-python]

2024-02-14 Thread via GitHub
syun64 opened a new issue, #429: URL: https://github.com/apache/iceberg-python/issues/429 ### Question Hi folks, I was chatting with @jaychia the other day and we were both wondering about the importance of the file naming convention in Apache Iceberg. Currently, PyIceberg and Java c

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
nastra commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1489559055 ## core/src/test/java/org/apache/iceberg/view/TestViewMetadataParser.java: ## @@ -101,16 +101,20 @@ public void readAndWriteValidViewMetadata() throws Exception {

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-14 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1489556973 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metadata) {

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
nastra commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1489553738 ## core/src/test/java/org/apache/iceberg/view/TestViewMetadata.java: ## @@ -252,12 +301,12 @@ public void viewHistoryNormalization() { // the first build will not

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
nastra commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1489547283 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -504,6 +517,11 @@ static List updateHistory(List history, Set< } } + // ke

Re: [PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
nastra commented on code in PR #9725: URL: https://github.com/apache/iceberg/pull/9725#discussion_r1489543649 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -485,15 +496,17 @@ static List expireVersions( List ids = Lists.newArrayList(versionsById.k

[PR] Core: Only write view history when currentVersionId changes [iceberg]

2024-02-14 Thread via GitHub
nastra opened a new pull request, #9725: URL: https://github.com/apache/iceberg/pull/9725 While working on https://github.com/apache/iceberg/pull/9620 we've noticed that we currently add elements to the view history whenever a new view version has been added, which is wrong. Accordin

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-14 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1489527957 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metadata) {

Re: [PR] Core: Add property to prevent loss of view representation when replacing a view [iceberg]

2024-02-14 Thread via GitHub
nastra commented on code in PR #9620: URL: https://github.com/apache/iceberg/pull/9620#discussion_r1489508079 ## core/src/test/java/org/apache/iceberg/view/TestViewMetadata.java: ## @@ -334,6 +334,9 @@ public void viewMetadataAndMetadataChanges() { .addVersion(viewV

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-14 Thread via GitHub
paulpaul1076 commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1943746314 @nastra these are the logs from the driver that does compaction and fails with this content length exception, and from one of the executors: [logs.zip](https://github.com/

Re: [I] Null values in metadata_log_entries [iceberg]

2024-02-14 Thread via GitHub
findinpath commented on issue #9723: URL: https://github.com/apache/iceberg/issues/9723#issuecomment-1943680429 > I think this is expected behavior for the scenario you described because a replace will reset the history entries (the snapshot-log) Why would replace affect the history e

Re: [I] Parallel Table.append [iceberg-python]

2024-02-14 Thread via GitHub
Fokko commented on issue #428: URL: https://github.com/apache/iceberg-python/issues/428#issuecomment-1943481392 @bigluck Thanks for raising this. This is on my list to look into! Parallelization of this is always hard since it is hard to exactly know how big the Parquet file will be.

[I] Parallel Table.append [iceberg-python]

2024-02-14 Thread via GitHub
bigluck opened a new issue, #428: URL: https://github.com/apache/iceberg-python/issues/428 ### Apache Iceberg version main (development) ### Please describe the bug 🐞 While doing some tests with the latest RC (`v0.6.0rc5`), I generated a ~6.7GB arrow table and appended i

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-02-14 Thread via GitHub
ajantha-bhat commented on code in PR #9724: URL: https://github.com/apache/iceberg/pull/9724#discussion_r1489002574 ## api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java: ## @@ -99,6 +99,20 @@ public interface RewriteDataFiles boolean USE_STARTING_SEQUENCE_N

Re: [PR] feat(catalog): add initial rest catalog impl [iceberg-go]

2024-02-14 Thread via GitHub
nastra commented on code in PR #58: URL: https://github.com/apache/iceberg-go/pull/58#discussion_r1489134148 ## dev/provision.py: ## @@ -0,0 +1,344 @@ +# Licensed to the Apache Software Foundation (ASF) under one Review Comment: this can be removed now that we don't have a c

Re: [PR] feat(catalog): add initial rest catalog impl [iceberg-go]

2024-02-14 Thread via GitHub
nastra commented on code in PR #58: URL: https://github.com/apache/iceberg-go/pull/58#discussion_r1489129979 ## catalog/catalog.go: ## @@ -37,29 +40,148 @@ const ( var ( // ErrNoSuchTable is returned when a table does not exist in the catalog. - ErrNoSuchTable

Re: [PR] Fix environment variable parsing [iceberg-python]

2024-02-14 Thread via GitHub
Fokko merged PR #423: URL: https://github.com/apache/iceberg-python/pull/423