Re: [I] Adopt `Catalog` API to include references to the `TableMetadata` and the `metadata_location` in the `TableCommit` payload for the `update_table` method [iceberg-rust]

2023-10-06 Thread via GitHub
Xuanwo commented on issue #75: URL: https://github.com/apache/iceberg-rust/issues/75#issuecomment-1750114581 > But in this case the data is already in memory through the `load_table` operation. You're correct. By reusing the `metadata_location` and `TableMetadata`, we can eliminate 2

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
paulpaul1076 commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750131250 I just tried increasing the value of this setting tenfold (to 5368709120), and the file sizes are still around 100MB. -- This is an automated message from the Apache Git Serv

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
amogh-jahagirdar commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750162049 >So this setting is not really "file size", it's more like "task size"? I would not say that. The docs I linked earlier put it concisely when it says "When writing dat

Re: [I] Deprecate usages of AssertHelpers in codebase [iceberg]

2023-10-06 Thread via GitHub
gzagarwal commented on issue #7094: URL: https://github.com/apache/iceberg/issues/7094#issuecomment-1750169496 > @gzagarwal you can just search through the codebase and pick up any of them, such as the ones in `iceberg-aws` module I am running the existing test cases in the iceberg-aws module.

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-06 Thread via GitHub
Fokko commented on code in PR #8673: URL: https://github.com/apache/iceberg/pull/8673#discussion_r1348395371 ## core/src/main/java/org/apache/iceberg/V2Metadata.java: ## @@ -39,7 +39,7 @@ private V2Metadata() {} ManifestFile.MANIFEST_CONTENT.asRequired(), M

Re: [I] Docs: Add document how to export records from CDC/Upsert Stream into apache iceberg table. [iceberg]

2023-10-06 Thread via GitHub
shreyanshR7 commented on issue #3105: URL: https://github.com/apache/iceberg/issues/3105#issuecomment-1750174056 Hello @openinx, I am new to open source and want to work on this issue. Please guide me on the issue and the technical knowledge prerequisites needed to solve it, so that I can go

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
amogh-jahagirdar commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750178671 Also, what's your configured value for `spark.sql.adaptive.advisoryPartitionSizeInBytes`? That will also influence the Spark task size in your case (by default, the

Re: [I] Spec: Add `141: spec_id` and `142: schema_id` to the spec [iceberg]

2023-10-06 Thread via GitHub
amogh-jahagirdar commented on issue #8712: URL: https://github.com/apache/iceberg/issues/8712#issuecomment-1750184038 Just noticed this as well, will pick this up

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
gzagarwal commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750189687 I also had the same problem. I added a couple of Spark properties and the file size increased from around 50 MB to around 250 MB. As Amogh pointed out about the property "spark.sql.ada

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
paulpaul1076 commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750200744 @amogh-jahagirdar the value of spark.sql.adaptive.advisoryPartitionSizeInBytes is the default 64MB. 1) Do I understand it correctly that the size of the uncompressed data

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-06 Thread via GitHub
Fokko commented on code in PR #8673: URL: https://github.com/apache/iceberg/pull/8673#discussion_r1348436509 ## core/src/main/java/org/apache/iceberg/GenericManifestFile.java: ## @@ -84,15 +84,24 @@ public GenericManifestFile(Schema avroSchema) { } } + /** + * @depr

Re: [PR] Add ASF DOAP rdf file [iceberg]

2023-10-06 Thread via GitHub
jbonofre commented on PR #8586: URL: https://github.com/apache/iceberg/pull/8586#issuecomment-1750237175 FYI, I'm updating this PR with all Iceberg releases (for the record).

Re: [PR] Add ASF DOAP rdf file [iceberg]

2023-10-06 Thread via GitHub
Fokko commented on code in PR #8586: URL: https://github.com/apache/iceberg/pull/8586#discussion_r1348448229 ## doap.rdf: ## @@ -0,0 +1,55 @@ + + +http://usefulinc.com/ns/doap#"; + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"; + xmlns:asfext="http://

Re: [PR] Add ASF DOAP rdf file [iceberg]

2023-10-06 Thread via GitHub
jbonofre commented on code in PR #8586: URL: https://github.com/apache/iceberg/pull/8586#discussion_r1348449919 ## doap.rdf: ## @@ -0,0 +1,55 @@ + + +http://usefulinc.com/ns/doap#"; + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"; + xmlns:asfext="http

Re: [PR] Rename master branch to main [iceberg]

2023-10-06 Thread via GitHub
jbonofre commented on PR #8722: URL: https://github.com/apache/iceberg/pull/8722#issuecomment-1750244656 Thanks, I will clean up this PR and create an INFRA ticket to actually rename the branch.

[PR] Spec: Clarify spec_id field in Data File [iceberg]

2023-10-06 Thread via GitHub
amogh-jahagirdar opened a new pull request, #8730: URL: https://github.com/apache/iceberg/pull/8730 Fixes #8712 . This clarifies the `spec_id` partition spec field in DataFile. Note that in practice this field looks to not be written and is simply inherited from the manifest file. We alread

Re: [PR] Spec: Clarify spec_id field in Data File [iceberg]

2023-10-06 Thread via GitHub
amogh-jahagirdar commented on code in PR #8730: URL: https://github.com/apache/iceberg/pull/8730#discussion_r1348465225 ## format/spec.md: ## @@ -443,13 +443,13 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | **`1

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
paulpaul1076 commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750288055 I increased spark.sql.adaptive.advisoryPartitionSizeInBytes and the files are still around 100MB in size.

Re: [PR] Add ASF DOAP rdf file [iceberg]

2023-10-06 Thread via GitHub
nastra commented on code in PR #8586: URL: https://github.com/apache/iceberg/pull/8586#discussion_r1348489076 ## doap.rdf: ## @@ -0,0 +1,55 @@ + + +http://usefulinc.com/ns/doap#"; + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"; + xmlns:asfext="http:/

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
nastra commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348491518 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
nastra commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348492147 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
nastra commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348492834 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
nastra commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348498134 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] Add ASF DOAP rdf file [iceberg]

2023-10-06 Thread via GitHub
jbonofre commented on code in PR #8586: URL: https://github.com/apache/iceberg/pull/8586#discussion_r1348498937 ## doap.rdf: ## @@ -0,0 +1,55 @@ + + +http://usefulinc.com/ns/doap#"; + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"; + xmlns:asfext="http

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
nastra commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348499538 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
nastra commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348499766 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
nastra commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348501637 ## table/refs_test.go: ## @@ -0,0 +1,37 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
nastra commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348502543 ## table/snapshots.go: ## @@ -0,0 +1,186 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE fil

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
nastra commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348503065 ## table/snapshots.go: ## @@ -0,0 +1,186 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE fil

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-06 Thread via GitHub
nastra closed pull request #8715: Thread.sleep() method is replaced with Awaitility URL: https://github.com/apache/iceberg/pull/8715

Re: [I] Drop the SQL issue when attempting to drop an Iceberg table whose location does not exist [iceberg]

2023-10-06 Thread via GitHub
maxfirman commented on issue #7227: URL: https://github.com/apache/iceberg/issues/7227#issuecomment-1750323978 I've just hit this issue. I see there are two open PRs to resolve this: #7228 and #6786. My two cents is that modifying the behaviour of `DROP TABLE` to succeed even if the metadat

[PR] Convert the Logical to Physical map to a visitor [iceberg-python]

2023-10-06 Thread via GitHub
Fokko opened a new pull request, #43: URL: https://github.com/apache/iceberg-python/pull/43 I noticed that the FixedType was missing.

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-06 Thread via GitHub
swat1234 commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1750556745 Can someone please advise?

Re: [I] Could there be duplicate values in the result returned by the findOrphanFiles method? [iceberg]

2023-10-06 Thread via GitHub
nk1506 commented on issue #8670: URL: https://github.com/apache/iceberg/issues/8670#issuecomment-1750610633 Hi @hwfff, I tried to reproduce this with duplicate files as well as invalid files. Iceberg is catching all the errors while deleting the files. Although it has BulkDelete as well as single dele

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-06 Thread via GitHub
jhchee commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1750689951 @swat1234 Can you try something like: The result with UNCOMPRESSED codec looks unusual (and it shouldn't be smaller than snappy). Are you sure that you are using this config in your

[PR] Core: Allows UnionByName To Insert New Elements in Order [iceberg]

2023-10-06 Thread via GitHub
RussellSpitzer opened a new pull request, #8731: URL: https://github.com/apache/iceberg/pull/8731 When unionByName is called with a schema update, we now first check whether all the elements in the struct are in the same order (while ignoring if additional columns have been added

Re: [PR] Core: Allows UnionByName To Insert New Elements in Order [iceberg]

2023-10-06 Thread via GitHub
RussellSpitzer commented on PR #8731: URL: https://github.com/apache/iceberg/pull/8731#issuecomment-1750767403 Another option here is we take this whole behavior and make a new api "EvolveByName" or something that more strictly attempts to fit the new schema.

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-06 Thread via GitHub
Fokko commented on code in PR #8673: URL: https://github.com/apache/iceberg/pull/8673#discussion_r1348841775 ## core/src/main/java/org/apache/iceberg/GenericManifestFile.java: ## @@ -84,15 +84,24 @@ public GenericManifestFile(Schema avroSchema) { } } + /** + * @depr

Re: [I] Replace Thread.sleep() usage in test code with Awaitility [iceberg]

2023-10-06 Thread via GitHub
shreyanshR7 commented on issue #7154: URL: https://github.com/apache/iceberg/issues/7154#issuecomment-1750854267 @nastra please consider PR #8725

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-06 Thread via GitHub
dungdm93 commented on PR #8701: URL: https://github.com/apache/iceberg/pull/8701#issuecomment-1750915594 I'm developing the [`alluvial`](https://github.com/dungdm93/alluvial) project, which is used to stream change logs from Kafka in `debezium` format to Iceberg tables. Can't wait until this PR g

Re: [PR] Core: Add AsyncFileIO [iceberg]

2023-10-06 Thread via GitHub
dungdm93 commented on code in PR #8644: URL: https://github.com/apache/iceberg/pull/8644#discussion_r1348892099 ## core/src/main/java/org/apache/iceberg/io/AsyncFileIO.java: ## @@ -0,0 +1,269 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contr

Re: [PR] Core: Add AsyncFileIO [iceberg]

2023-10-06 Thread via GitHub
dungdm93 commented on code in PR #8644: URL: https://github.com/apache/iceberg/pull/8644#discussion_r1348893096 ## core/src/main/java/org/apache/iceberg/io/AsyncFileIO.java: ## @@ -0,0 +1,269 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contr

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
zeroshade commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348900025 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE f

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
zeroshade commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1348905234 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE f

Re: [PR] GCP: Add Iceberg Catalog for GCP BigLake Metastore [iceberg]

2023-10-06 Thread via GitHub
coufon commented on PR #7412: URL: https://github.com/apache/iceberg/pull/7412#issuecomment-1750974517 > Great work, this feature is exactly what my team needs. Are there any updates? Zhou Fang Ryan Blue We released this code here (https://cloud.google.com/bigquery/docs/manage-open-

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-06 Thread via GitHub
nastra commented on code in PR #8725: URL: https://github.com/apache/iceberg/pull/8725#discussion_r1348922171 ## flink/v1.15/flink/src/test/java/org/apache/iceberg/flink/source/TestStreamingMonitorFunction.java: ## @@ -113,7 +114,7 @@ public void testConsumeWithoutStartSnapshotI

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-06 Thread via GitHub
shreyanshR7 commented on code in PR #8725: URL: https://github.com/apache/iceberg/pull/8725#discussion_r1348940705 ## flink/v1.15/flink/src/test/java/org/apache/iceberg/flink/source/TestStreamingMonitorFunction.java: ## @@ -113,7 +114,7 @@ public void testConsumeWithoutStartSnap

[PR] Add full docs for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
aokolnychyi opened a new pull request, #278: URL: https://github.com/apache/iceberg-docs/pull/278 This PR adds release notes for 1.4.0 and its jars.

Re: [PR] Add full docs for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
aokolnychyi commented on PR #278: URL: https://github.com/apache/iceberg-docs/pull/278#issuecomment-1751095463 @amogh-jahagirdar @Fokko @nastra @rdblue @danielcweeks @RussellSpitzer @flyrain

Re: [PR] Add full docs for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
Fokko commented on code in PR #278: URL: https://github.com/apache/iceberg-docs/pull/278#discussion_r1348969309 ## landing-page/content/common/multi-engine-support.md: ## @@ -66,10 +66,11 @@ Each engine version undergoes the following lifecycle stages: | -- | -

[PR] Javadoc for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
aokolnychyi opened a new pull request, #279: URL: https://github.com/apache/iceberg-docs/pull/279 (no comment)

Re: [PR] Javadoc for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
aokolnychyi commented on PR #279: URL: https://github.com/apache/iceberg-docs/pull/279#issuecomment-1751104235 @amogh-jahagirdar @Fokko @nastra @rdblue @danielcweeks @RussellSpitzer @flyrain

Re: [PR] Update roadmap.md [iceberg-docs]

2023-10-06 Thread via GitHub
Fokko commented on PR #272: URL: https://github.com/apache/iceberg-docs/pull/272#issuecomment-1751103769 @bitsondatadev I just noticed this one, can you resolve the pending issues? You put in so much effort already, it is a waste to keep this lingering here.

Re: [PR] Add full docs for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
aokolnychyi commented on code in PR #278: URL: https://github.com/apache/iceberg-docs/pull/278#discussion_r1348980277 ## landing-page/content/common/release-notes.md: ## @@ -26,10 +26,10 @@ disableSidebar: true The latest version of Iceberg is [{{% icebergVersion %}}](https://

Re: [PR] Add full docs for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
aokolnychyi merged PR #278: URL: https://github.com/apache/iceberg-docs/pull/278

Re: [PR] Add full docs for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
aokolnychyi commented on code in PR #278: URL: https://github.com/apache/iceberg-docs/pull/278#discussion_r1348980483 ## docs/config.toml: ## @@ -24,6 +24,7 @@ home = [ "HTML", "RSS", "SearchIndex" ] [menu] versions = [ { name = "latest", pre = "relative", url = "../lat

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-06 Thread via GitHub
zeroshade commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1349140042 ## table/refs_test.go: ## @@ -0,0 +1,37 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE f

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-06 Thread via GitHub
rdblue commented on code in PR #8659: URL: https://github.com/apache/iceberg/pull/8659#discussion_r1349160440 ## docs-new/.github/bin/deploy_docs.sh: ## @@ -0,0 +1,39 @@ +#!/bin/bash + +while [[ "$#" -gt 0 ]]; do +case $1 in +-v|--version) ICEBERG_VERSION="$2"; shift

Re: [PR] Javadoc for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
aokolnychyi commented on PR #279: URL: https://github.com/apache/iceberg-docs/pull/279#issuecomment-1751218207 I checked this locally and it seems to include some recent changes. I'll merge this to finish the release. If anything comes up, we can fix it later.

Re: [PR] Javadoc for 1.4.0 [iceberg-docs]

2023-10-06 Thread via GitHub
aokolnychyi merged PR #279: URL: https://github.com/apache/iceberg-docs/pull/279

Re: [PR] Convert the Logical to Physical map to a visitor [iceberg-python]

2023-10-06 Thread via GitHub
rdblue commented on code in PR #43: URL: https://github.com/apache/iceberg-python/pull/43#discussion_r1349207476 ## pyiceberg/io/pyarrow.py: ## @@ -1099,21 +1099,70 @@ def map_value_partner(self, partner_map: Optional[pa.Array]) -> Optional[pa.Arra return partner_map.i

Re: [PR] Convert the Logical to Physical map to a visitor [iceberg-python]

2023-10-06 Thread via GitHub
rdblue commented on code in PR #43: URL: https://github.com/apache/iceberg-python/pull/43#discussion_r1349208232 ## pyiceberg/io/pyarrow.py: ## @@ -1099,21 +1099,70 @@ def map_value_partner(self, partner_map: Optional[pa.Array]) -> Optional[pa.Arra return partner_map.i

Re: [PR] Convert the Logical to Physical map to a visitor [iceberg-python]

2023-10-06 Thread via GitHub
Fokko commented on code in PR #43: URL: https://github.com/apache/iceberg-python/pull/43#discussion_r1349234820 ## pyiceberg/io/pyarrow.py: ## @@ -1099,21 +1099,70 @@ def map_value_partner(self, partner_map: Optional[pa.Array]) -> Optional[pa.Arra return partner_map.it

Re: [PR] Convert the Logical to Physical map to a visitor [iceberg-python]

2023-10-06 Thread via GitHub
Fokko commented on code in PR #43: URL: https://github.com/apache/iceberg-python/pull/43#discussion_r1349235131 ## pyiceberg/io/pyarrow.py: ## @@ -1099,21 +1099,70 @@ def map_value_partner(self, partner_map: Optional[pa.Array]) -> Optional[pa.Arra return partner_map.it

[PR] Fix the TableIdentifier [iceberg-python]

2023-10-06 Thread via GitHub
Fokko opened a new pull request, #44: URL: https://github.com/apache/iceberg-python/pull/44 We were sending a table identifier before in `['accounting', 'tax', 'invoices']`, but it has to be `{'namespace': ['accounting', 'tax'], 'name': 'invoices'}` 😭
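
The two identifier shapes mentioned in this pull request description can be illustrated with a minimal sketch. It assumes the last element of the flat list is the table name and everything before it is the namespace; `split_identifier` is a hypothetical helper written for illustration only, not a function from pyiceberg.

```python
from typing import Dict, List, Sequence, Union


def split_identifier(identifier: Sequence[str]) -> Dict[str, Union[List[str], str]]:
    """Hypothetical helper: split a flat identifier into namespace and name."""
    if len(identifier) < 2:
        raise ValueError("expected at least one namespace level and a table name")
    # Everything except the last element is treated as the namespace (assumption).
    *namespace, name = identifier
    return {"namespace": list(namespace), "name": name}


payload = split_identifier(["accounting", "tax", "invoices"])
print(payload)
```

Keeping the split in one small function makes the assumed rule (last element is the table name) easy to test in isolation.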

Re: [PR] Spark: Clean up FileIO instances on executors [iceberg]

2023-10-06 Thread via GitHub
rdblue commented on PR #8685: URL: https://github.com/apache/iceberg/pull/8685#issuecomment-1751409600 Thanks for fixing this, @aokolnychyi! Good to have this cleaned up.

Re: [PR] Spark: Clean up FileIO instances on executors [iceberg]

2023-10-06 Thread via GitHub
bryanck commented on PR #8685: URL: https://github.com/apache/iceberg/pull/8685#issuecomment-1751417119 Just to mention it, auto close caused an issue related to broadcasts that was fixed in this PR - https://github.com/apache/iceberg/pull/7263

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-06 Thread via GitHub
rdblue commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1349331583 ## kafka-connect/kafka-connect-events/src/test/java/org/apache/iceberg/connect/events/EventTestUtil.java: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-06 Thread via GitHub
rdblue commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1349332017 ## settings.gradle: ## @@ -200,3 +202,6 @@ if (JavaVersion.current() == JavaVersion.VERSION_1_8) { } } +include ":iceberg-kafka-connect:kafka-connect-events" Revi

Re: [PR] Spark: Clean up FileIO instances on executors [iceberg]

2023-10-06 Thread via GitHub
rdblue merged PR #8685: URL: https://github.com/apache/iceberg/pull/8685

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-06 Thread via GitHub
bryanck commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1349335413 ## settings.gradle: ## @@ -200,3 +202,6 @@ if (JavaVersion.current() == JavaVersion.VERSION_1_8) { } } +include ":iceberg-kafka-connect:kafka-connect-events" Rev

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-06 Thread via GitHub
bitsondatadev commented on code in PR #8659: URL: https://github.com/apache/iceberg/pull/8659#discussion_r1349336499 ## docs-new/mkdocs.yml: ## @@ -0,0 +1,96 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTI

[PR] allow override env-variables in load_catalog [iceberg-python]

2023-10-06 Thread via GitHub
bdilday opened a new pull request, #45: URL: https://github.com/apache/iceberg-python/pull/45 I found that I was unable to change a catalog URI after it had been configured by an environment variable. Example, ``` import os os.environ["PYICEBERG_CATALOG__SOMEDB__URI"] = "https:

Re: [PR] Spark: Clean up FileIO instances on executors [iceberg]

2023-10-06 Thread via GitHub
bryanck commented on PR #8685: URL: https://github.com/apache/iceberg/pull/8685#issuecomment-1751438110 After looking through the code, I see https://github.com/apache/iceberg/pull/7263 shouldn't be an issue because the FileIO isn't closed on the driver.

Re: [PR] Test: Add a test utility method to programmatically create expected partition specs [iceberg]

2023-10-06 Thread via GitHub
jerqi commented on PR #8467: URL: https://github.com/apache/iceberg/pull/8467#issuecomment-1751557690 I have rebased onto the master branch and removed the Spark 3.1 code. @aokolnychyi @RussellSpitzer Could you have a look if you have time?

Re: [I] Have a test utility method to programmatically create expected specs [iceberg]

2023-10-06 Thread via GitHub
jerqi commented on issue #8434: URL: https://github.com/apache/iceberg/issues/8434#issuecomment-1751557823 @aokolnychyi I have raised PR #8467.

Re: [I] Support Hudi `DeltaStreamer` compatible feature [iceberg]

2023-10-06 Thread via GitHub
LittleWat commented on issue #8724: URL: https://github.com/apache/iceberg/issues/8724#issuecomment-1751560189 KafkaConnect is cool! Deltastreamer also supports [the distributed-file-system ingestion](https://hudi.apache.org/docs/0.13.1/hoodie_deltastreamer/#distributed-file-system-dfs). U

[I] v1 table data file spec id is None [iceberg-python]

2023-10-06 Thread via GitHub
puchengy opened a new issue, #46: URL: https://github.com/apache/iceberg-python/issues/46 ### Apache Iceberg version None ### Please describe the bug 🐞 The v1 data file spec_id is optional, but it seems Spark is able to recognize the spec_id while pyiceberg can't. Any idea

Re: [I] Spark fails to write into an iceberg table after updating its schema [iceberg]

2023-10-06 Thread via GitHub
paulpaul1076 commented on issue #8721: URL: https://github.com/apache/iceberg/issues/8721#issuecomment-1751575139 Turns out that the problem is due to Spark caching its catalog, and since I updated the schema through iceberg's Java API instead of Spark, the cached value didn't get updated.

Re: [I] Spark fails to write into an iceberg table after updating its schema [iceberg]

2023-10-06 Thread via GitHub
paulpaul1076 closed issue #8721: Spark fails to write into an iceberg table after updating its schema URL: https://github.com/apache/iceberg/issues/8721

Re: [I] Manifest List Writer Design [iceberg-rust]

2023-10-06 Thread via GitHub
liurenjie1024 commented on issue #72: URL: https://github.com/apache/iceberg-rust/issues/72#issuecomment-1751583913 > Thanks! We can integrate V1 and V2 into a writer > > ``` > struct ManifestListWriter { >... > } > > impl ManifestListWriter { > fn write_v1(&mu

Re: [PR] feat: add builder to TableMetadata interface [iceberg-rust]

2023-10-06 Thread via GitHub
liurenjie1024 commented on code in PR #62: URL: https://github.com/apache/iceberg-rust/pull/62#discussion_r1349453405 ## crates/iceberg/src/spec/table_metadata.rs: ## @@ -93,22 +112,291 @@ pub struct TableMetadata { /// previous metadata file location should be added to the

Re: [PR] feat: add builder to TableMetadata interface [iceberg-rust]

2023-10-06 Thread via GitHub
liurenjie1024 commented on PR #62: URL: https://github.com/apache/iceberg-rust/pull/62#issuecomment-1751587849 Hi, @y0psolo Thanks for the effort! But I'm a little worried about the api here, since it's error prone, e.g. it's quite easy to construct an invalid table metadata. My suggestion

Re: [I] Adopt `Catalog` API to include references to the `TableMetadata` and the `metadata_location` in the `TableCommit` payload for the `update_table` method [iceberg-rust]

2023-10-06 Thread via GitHub
liurenjie1024 commented on issue #75: URL: https://github.com/apache/iceberg-rust/issues/75#issuecomment-1751595418 > > But in this case the data is already in memory through the `load_table` operation. > > You're correct. By reusing the `metadata_location` and `TableMetadata`, we ca

Re: [I] Implement `Debug` trait for public structs [iceberg-rust]

2023-10-06 Thread via GitHub
liurenjie1024 commented on issue #73: URL: https://github.com/apache/iceberg-rust/issues/73#issuecomment-1751595801 +1 for this proposal.

Re: [I] Support Hudi `DeltaStreamer` compatible feature [iceberg]

2023-10-06 Thread via GitHub
pvary commented on issue #8724: URL: https://github.com/apache/iceberg/issues/8724#issuecomment-1751614785 We use Flink for the ingestion. Flink supports a wide range of sources, and the Iceberg FlinkSink enables you to write them to Iceberg tables.

Re: [I] Migrate Files using TestRule in dell package to Junit5 [iceberg]

2023-10-06 Thread via GitHub
DaVincii commented on issue #7888: URL: https://github.com/apache/iceberg/issues/7888#issuecomment-1751624255 > Hi @DaVincii, Please provide your feedbacks on the PR for this issue. Hi @ashutosh-roy, the updates look fine to me, but Eduard Tudenhoefner (https://github.com/nastra) would be

Re: [PR] Fix the TableIdentifier [iceberg-python]

2023-10-06 Thread via GitHub
amogh-jahagirdar commented on code in PR #44: URL: https://github.com/apache/iceberg-python/pull/44#discussion_r1349475349 ## tests/test_integration.py: ## @@ -104,25 +104,25 @@ def table(catalog: Catalog) -> Table: @pytest.mark.integration def test_table_properties(table: T