Re: [I] PyIceberg appending data creates snapshots incompatible with Athena/Spark [iceberg-python]

2024-12-16 Thread via GitHub
Fokko commented on issue #1424: URL: https://github.com/apache/iceberg-python/issues/1424#issuecomment-2544900271 I'm also not seeing how this could happen, to test this, I also ran this script: ``` Python 3.10.14 (main, Mar 19 2024, 21:46:16) [Clang 15.0.0 (clang-1500.3.9.4)]

Re: [PR] Doc: Do Not Modify the Source Data Table During MergeIntoCommand Exec… [iceberg]

2024-12-16 Thread via GitHub
BsoBird commented on code in PR #11787: URL: https://github.com/apache/iceberg/pull/11787#discussion_r1886334243 ## docs/docs/spark-writes.md: ## @@ -101,6 +101,9 @@ Spark 3.5 added support for `WHEN NOT MATCHED BY SOURCE ... THEN ...` to update WHEN NOT MATCHED BY SOURCE THEN

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886483368 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886487094 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add license checker [iceberg-cpp]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #10: URL: https://github.com/apache/iceberg-cpp/pull/10#discussion_r1886489090 ## .github/workflows/license_check.yml: ## @@ -0,0 +1,26 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886465895 ## tests/integration/test_reads.py: ## @@ -873,3 +874,76 @@ def test_table_scan_empty_table(catalog: Catalog) -> None: result_table = tbl.scan().to_arrow()

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886464171 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2545047787 Thanks @kevinjqliu @corleyma for your review. Pls take another look, thanks a lot. -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] Add license checker [iceberg-cpp]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #10: URL: https://github.com/apache/iceberg-cpp/pull/10#discussion_r1886494679 ## .github/workflows/license_check.yml: ## @@ -0,0 +1,26 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

Re: [PR] feat: Add RemovePartitionSpecs table update [iceberg-rust]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #804: URL: https://github.com/apache/iceberg-rust/pull/804#discussion_r1886516435 ## crates/iceberg/src/spec/table_metadata_builder.rs: ## @@ -740,6 +740,32 @@ impl TableMetadataBuilder { .set_default_partition_spec(Self::LAST_ADDED)

Re: [I] Generate release notes on website. [iceberg-rust]

2024-12-16 Thread via GitHub
Fokko commented on issue #593: URL: https://github.com/apache/iceberg-rust/issues/593#issuecomment-2545128114 I'm going to remove the milestone to unblock the 0.4.0 release. I'm happy to help here. In general, I think it would be great to add more to the website (getting started etc) since it

Re: [I] Kafka Connect Sporadic Commit Delay [iceberg]

2024-12-16 Thread via GitHub
trolle4 commented on issue #11796: URL: https://github.com/apache/iceberg/issues/11796#issuecomment-2545131515 Also added some debug logging inside the class `org.apache.iceberg.connect.channel.CommitterImpl` ```java @Override public void save(Collection sinkRecords) {

Re: [PR] feat: Store file io props to allow re-build it [iceberg-rust]

2024-12-16 Thread via GitHub
Xuanwo commented on code in PR #802: URL: https://github.com/apache/iceberg-rust/pull/802#discussion_r1887055534 ## crates/iceberg/src/io/file_io.rs: ## @@ -165,7 +175,7 @@ impl FileIOBuilder { /// Fetch the scheme string. /// /// The scheme_str will be empty if i

Re: [PR] feat: Add RemovePartitionSpecs table update [iceberg-rust]

2024-12-16 Thread via GitHub
c-thiel commented on PR #804: URL: https://github.com/apache/iceberg-rust/pull/804#issuecomment-2546189749 @Fokko resolved merge conflicts. Could you re-approve? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Core: Fix numeric overflow of timestamp nano literal [iceberg]

2024-12-16 Thread via GitHub
amogh-jahagirdar commented on code in PR #11775: URL: https://github.com/apache/iceberg/pull/11775#discussion_r1887202656 ## api/src/main/java/org/apache/iceberg/expressions/Literals.java: ## @@ -300,8 +300,7 @@ public Literal to(Type type) { case TIMESTAMP:

Re: [PR] Auth Manager API part 1: HTTPRequest, HTTPHeader [iceberg]

2024-12-16 Thread via GitHub
danielcweeks commented on code in PR #11769: URL: https://github.com/apache/iceberg/pull/11769#discussion_r1887271043 ## core/src/main/java/org/apache/iceberg/rest/HTTPHeaders.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] Auth Manager API part 1: HTTPRequest, HTTPHeader [iceberg]

2024-12-16 Thread via GitHub
danielcweeks commented on code in PR #11769: URL: https://github.com/apache/iceberg/pull/11769#discussion_r1887271278 ## core/src/main/java/org/apache/iceberg/rest/HTTPHeaders.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] Core: Add Variant implementation to read serialized objects [iceberg]

2024-12-16 Thread via GitHub
rdblue commented on code in PR #11415: URL: https://github.com/apache/iceberg/pull/11415#discussion_r1887270705 ## core/src/main/java/org/apache/iceberg/variants/Variants.java: ## @@ -0,0 +1,276 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more co

Re: [PR] Bump `pyiceberg_core` to 0.4.0 [iceberg-rust]

2024-12-16 Thread via GitHub
Xuanwo merged PR #808: URL: https://github.com/apache/iceberg-rust/pull/808 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [PR] Implementing namespace_exists function on the REST Catalog [iceberg-python]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #1434: URL: https://github.com/apache/iceberg-python/pull/1434#discussion_r1887152140 ## tests/catalog/test_rest.py: ## @@ -681,6 +681,51 @@ def test_update_namespace_properties_200(rest_mock: Mocker) -> None: assert response == Propertie

Re: [I] Tracking issues of iceberg rust v0.4.0 Release [iceberg-rust]

2024-12-16 Thread via GitHub
Xuanwo commented on issue #739: URL: https://github.com/apache/iceberg-rust/issues/739#issuecomment-2546101308 > Could someone help review this PR so we can get the versions synced to 0.4.0? #808 Merged. Let's move! -- This is an automated message from the Apache Git Service. To re

Re: [PR] Deserialize NestedField initial-default and write-default Attributes [iceberg-python]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #1432: URL: https://github.com/apache/iceberg-python/pull/1432#discussion_r1887143738 ## pyiceberg/types.py: ## @@ -328,8 +328,8 @@ def __init__( data["type"] = data["type"] if "type" in data else field_type data["required"] =

Re: [PR] fix: gurantee the deserialize order of struct is same as the struct type [iceberg-rust]

2024-12-16 Thread via GitHub
Xuanwo commented on code in PR #795: URL: https://github.com/apache/iceberg-rust/pull/795#discussion_r1887090289 ## crates/iceberg/src/spec/values.rs: ## @@ -3604,4 +3608,29 @@ mod tests { assert_eq!(result, expected); } + +#[test] +fn test_record_ser_de(

Re: [PR] fix: gurantee the deserialize order of struct is same as the struct type [iceberg-rust]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #795: URL: https://github.com/apache/iceberg-rust/pull/795#discussion_r1887083904 ## crates/iceberg/src/spec/values.rs: ## @@ -3604,4 +3608,29 @@ mod tests { assert_eq!(result, expected); } + +#[test] +fn test_record_ser_de()

Re: [I] Failed to read iceberg TPCH generated by snowflake [iceberg-rust]

2024-12-16 Thread via GitHub
Xuanwo commented on issue #790: URL: https://github.com/apache/iceberg-rust/issues/790#issuecomment-2546018588 > @Xuanwo are you running into anything else? Curious to learn if it works :) Hi, it works great! Thank you for the fix (just in time I needed it!). Let's close. -- This i

Re: [I] Failed to read iceberg TPCH generated by snowflake [iceberg-rust]

2024-12-16 Thread via GitHub
Xuanwo closed issue #790: Failed to read iceberg TPCH generated by snowflake URL: https://github.com/apache/iceberg-rust/issues/790 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[PR] Bump `pyiceberg_core` to 0.4.0 [iceberg-rust]

2024-12-16 Thread via GitHub
sungwy opened a new pull request, #808: URL: https://github.com/apache/iceberg-rust/pull/808 We are releasing the bindings simultaneously with the core rust crates. Bump the version to be consistent with the core rust crates for simplicity. -- This is an automated message from the A

Re: [PR] Core: Add Variant implementation to read serialized objects [iceberg]

2024-12-16 Thread via GitHub
rdblue commented on code in PR #11415: URL: https://github.com/apache/iceberg/pull/11415#discussion_r1887268226 ## core/src/main/java/org/apache/iceberg/variants/PrimitiveWrapper.java: ## @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Spark 3.4: Add REST catalog to Spark integration tests [iceberg]

2024-12-16 Thread via GitHub
danielcweeks merged PR #11698: URL: https://github.com/apache/iceberg/pull/11698 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceb

Re: [PR] Feat: support aliyun oss backend. [iceberg-go]

2024-12-16 Thread via GitHub
zeroshade commented on PR #216: URL: https://github.com/apache/iceberg-go/pull/216#issuecomment-2546599302 the Java iceberg impl has some mocking and test setups for Aliyun as seen [here](https://github.com/apache/iceberg/tree/main/aliyun/src/test/java/org/apache/iceberg/aliyun/oss) would i

Re: [PR] Prep 0.4.0 release [iceberg-rust]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #809: URL: https://github.com/apache/iceberg-rust/pull/809#discussion_r1887538305 ## crates/iceberg/src/spec/snapshot.rs: ## @@ -192,13 +191,6 @@ impl Snapshot { partition_type_provider, ) } - -pub(crate) fn log(&

Re: [PR] Prep 0.4.0 release [iceberg-rust]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #809: URL: https://github.com/apache/iceberg-rust/pull/809#discussion_r1887539719 ## crates/iceberg/src/spec/table_metadata.rs: ## @@ -398,33 +397,6 @@ impl TableMetadata { self.partition_statistics.get(&snapshot_id) } -///

Re: [PR] Docs: add note for `day` transform [iceberg]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #11749: URL: https://github.com/apache/iceberg/pull/11749#discussion_r1887580518 ## format/spec.md: ## @@ -454,7 +454,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | **`truncate[W]`** | Value truncated

Re: [PR] Implementing namespace_exists function on the REST Catalog [iceberg-python]

2024-12-16 Thread via GitHub
sungwy commented on PR #1434: URL: https://github.com/apache/iceberg-python/pull/1434#issuecomment-2546756342 Hi @AhmedNader42 - thank you very much for picking up this issue and getting a working solution up already! I'm in agreement with @kevinjqliu 's comment, that it would be grea

[PR] Fix comment on `WRITE_OBJECT_STORE_PARTITIONED_PATHS` table property [iceberg]

2024-12-16 Thread via GitHub
smaheshwar-pltr opened a new pull request, #11798: URL: https://github.com/apache/iceberg/pull/11798 The code comment above the `WRITE_OBJECT_STORE_PARTITIONED_PATHS` constant in `TableProperties` was incorrect - partition values are excluded when this property is set to *false*, not true,

Re: [PR] Avro: Support default values for generic data [iceberg]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #11786: URL: https://github.com/apache/iceberg/pull/11786#discussion_r1887585686 ## core/src/main/java/org/apache/iceberg/data/avro/PlannedDataReader.java: ## @@ -0,0 +1,181 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] Spec: Support geo type [iceberg]

2024-12-16 Thread via GitHub
jiayuasu commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1887549063 ## format/spec.md: ## @@ -584,8 +589,8 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _optional_ |

Re: [PR] fix: gurantee the deserialize order of struct is same as the struct type [iceberg-rust]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #795: URL: https://github.com/apache/iceberg-rust/pull/795#discussion_r1887570509 ## crates/iceberg/src/spec/values.rs: ## @@ -3604,4 +3608,29 @@ mod tests { assert_eq!(result, expected); } + +#[test] +fn test_record_ser_de()

Re: [PR] Parquet: Implement defaults for generic data [iceberg]

2024-12-16 Thread via GitHub
rdblue merged PR #11785: URL: https://github.com/apache/iceberg/pull/11785 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Parquet: Implement defaults for generic data [iceberg]

2024-12-16 Thread via GitHub
rdblue commented on PR #11785: URL: https://github.com/apache/iceberg/pull/11785#issuecomment-2546711840 Merging this. Thanks for the review, @Fokko! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Avro: Support default values for generic data [iceberg]

2024-12-16 Thread via GitHub
rdblue commented on code in PR #11786: URL: https://github.com/apache/iceberg/pull/11786#discussion_r1887573521 ## core/src/main/java/org/apache/iceberg/data/avro/DataReader.java: ## @@ -36,6 +36,10 @@ import org.apache.iceberg.types.Type; import org.apache.iceberg.types.Types

Re: [PR] Prep 0.4.0 release [iceberg-rust]

2024-12-16 Thread via GitHub
sungwy commented on code in PR #809: URL: https://github.com/apache/iceberg-rust/pull/809#discussion_r1887569858 ## crates/iceberg/src/spec/table_metadata.rs: ## @@ -398,33 +397,6 @@ impl TableMetadata { self.partition_statistics.get(&snapshot_id) } -/// Appe

Re: [PR] Spec: Support geo type [iceberg]

2024-12-16 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1887625905 ## format/spec.md: ## @@ -584,8 +589,8 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _optional_

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886441268 ## pyiceberg/manifest.py: ## @@ -105,6 +105,9 @@ def _missing_(cls, value: object) -> Union[None, str]: return member return None +

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886445512 ## tests/integration/test_reads.py: ## @@ -873,3 +873,48 @@ def test_table_scan_empty_table(catalog: Catalog) -> None: result_table = tbl.scan().to_arrow()

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886444111 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886443611 ## pyiceberg/table/__init__.py: ## @@ -191,6 +193,15 @@ class TableProperties: DELETE_MODE_MERGE_ON_READ = "merge-on-read" DELETE_MODE_DEFAULT = DELET

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886446277 ## pyiceberg/table/__init__.py: ## @@ -1229,7 +1240,8 @@ def with_case_sensitive(self: S, case_sensitive: bool = True) -> S: class ScanTask(ABC): -pas

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886446918 ## pyiceberg/table/__init__.py: ## @@ -1253,6 +1265,22 @@ def __init__( self.start = start or 0 self.length = length or data_file.file_size_in

Re: [PR] feat: support `arrow_struct_to_iceberg_struct` [iceberg-rust]

2024-12-16 Thread via GitHub
ZENOTME commented on PR #731: URL: https://github.com/apache/iceberg-rust/pull/731#issuecomment-2545020315 > Thanks @ZENOTME 's effort. I saw that both java/python have schema with partner visitor: > > 1. [SchemaWithPartnerVisitor](https://github.com/apache/iceberg/blob/c07f2aabc0a1d

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2024-12-16 Thread via GitHub
sopel39 commented on code in PR #11781: URL: https://github.com/apache/iceberg/pull/11781#discussion_r1887631464 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -257,17 +257,17 @@ private static class Task implements Supplier>>, Closeable { @Over

Re: [PR] fix: field id in name mapping should be optional [iceberg-python]

2024-12-16 Thread via GitHub
barronw commented on code in PR #1426: URL: https://github.com/apache/iceberg-python/pull/1426#discussion_r1887662401 ## pyiceberg/table/name_mapping.py: ## @@ -333,8 +334,8 @@ def struct(self, struct: StructType, struct_partner: Optional[MappedField], fiel return Stru

Re: [PR] Avro: Support default values for generic data [iceberg]

2024-12-16 Thread via GitHub
rdblue commented on code in PR #11786: URL: https://github.com/apache/iceberg/pull/11786#discussion_r1887662465 ## core/src/main/java/org/apache/iceberg/data/avro/PlannedDataReader.java: ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Avro: Support default values for generic data [iceberg]

2024-12-16 Thread via GitHub
rdblue commented on PR #11786: URL: https://github.com/apache/iceberg/pull/11786#issuecomment-2546993293 Thanks for the review, @Fokko! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Avro: Support default values for generic data [iceberg]

2024-12-16 Thread via GitHub
rdblue merged PR #11786: URL: https://github.com/apache/iceberg/pull/11786 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Docs: Change to Flink directory for instructions [iceberg]

2024-12-16 Thread via GitHub
szehon-ho commented on PR #11031: URL: https://github.com/apache/iceberg/pull/11031#issuecomment-2546995771 Whoops sorry, I must have missed this. I think it makes sense to me. cc @pvary @stevenzwu -- This is an automated message from the Apache Git Service. To respond to the message,

[PR] chore(docs): Update Readme - Lakekeeper repository moved [iceberg-rust]

2024-12-16 Thread via GitHub
c-thiel opened a new pull request, #810: URL: https://github.com/apache/iceberg-rust/pull/810 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e

[PR] Bump moto from 5.0.22 to 5.0.23 [iceberg-python]

2024-12-16 Thread via GitHub
dependabot[bot] opened a new pull request, #1435: URL: https://github.com/apache/iceberg-python/pull/1435 Bumps [moto](https://github.com/getmoto/moto) from 5.0.22 to 5.0.23. Changelog: sourced from [moto's changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md).

[PR] Bump mkdocs-material from 9.5.48 to 9.5.49 [iceberg-python]

2024-12-16 Thread via GitHub
dependabot[bot] opened a new pull request, #1437: URL: https://github.com/apache/iceberg-python/pull/1437 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.48 to 9.5.49. Release notes: sourced from https://github.com/squidfunk/mkdocs-material/releases

[PR] Bump adlfs from 2024.7.0 to 2024.12.0 [iceberg-python]

2024-12-16 Thread via GitHub
dependabot[bot] opened a new pull request, #1436: URL: https://github.com/apache/iceberg-python/pull/1436 Bumps [adlfs](https://github.com/fsspec/adlfs) from 2024.7.0 to 2024.12.0. Changelog: sourced from [adlfs's changelog](https://github.com/fsspec/adlfs/blob/main/CHANGELOG.md).

Re: [PR] Fix comment on `WRITE_OBJECT_STORE_PARTITIONED_PATHS` table property [iceberg]

2024-12-16 Thread via GitHub
ebyhr commented on PR #11798: URL: https://github.com/apache/iceberg/pull/11798#issuecomment-2547049649 I believe this change is correct. The usage is: https://github.com/apache/iceberg/blob/b9b61b1d72ebb192d5e90453ff7030ece73d2603/core/src/main/java/org/apache/iceberg/LocationProviders.

Re: [PR] Core: Allow adding files to multiple partition specs in FastAppend [iceberg]

2024-12-16 Thread via GitHub
anuragmantri commented on code in PR #11771: URL: https://github.com/apache/iceberg/pull/11771#discussion_r1887709101 ## core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java: ## @@ -1590,13 +1590,15 @@ public void testCompleteCreateTransactionMultipleSchemas() {

Re: [PR] Core: Fix numeric overflow of timestamp nano literal [iceberg]

2024-12-16 Thread via GitHub
ebyhr commented on code in PR #11775: URL: https://github.com/apache/iceberg/pull/11775#discussion_r1887709351 ## api/src/test/java/org/apache/iceberg/types/TestConversions.java: ## @@ -111,9 +111,9 @@ public void testByteBufferConversions() { assertConversion( 400

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2024-12-16 Thread via GitHub
sopel39 commented on code in PR #11781: URL: https://github.com/apache/iceberg/pull/11781#discussion_r1886884393 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -257,17 +257,17 @@ private static class Task implements Supplier>>, Closeable { @Over

Re: [PR] Core: Unimplement Map from CharSequenceMap to obey contract [iceberg]

2024-12-16 Thread via GitHub
findepi commented on PR #11704: URL: https://github.com/apache/iceberg/pull/11704#issuecomment-2545942666 @nastra @Fokko can you please take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[PR] Implementing namespace_exists function on the REST Catalog [iceberg-python]

2024-12-16 Thread via GitHub
AhmedNader42 opened a new pull request, #1434: URL: https://github.com/apache/iceberg-python/pull/1434 I have added the namespace_exists functionality for REST Catalog. Here's an example usage similar to the table_exists function of the same class ![Screenshot from 2024-12-16

Re: [I] i cant table create dayTransform() and monthTransform() togetger for single field in schema [iceberg]

2024-12-16 Thread via GitHub
Fokko commented on issue #11788: URL: https://github.com/apache/iceberg/issues/11788#issuecomment-2545929169 @dvnageshpatil If you have a timestamp value: ``` ts = '2024-10-22T19:25:00' ``` Then the transforms will produce: ``` day(ts) = '2024-10-22' month(ts)
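
The preview above is cut off before the `month(ts)` result, so here is a minimal sketch of what the two transforms compute for that timestamp. It assumes the definitions in the Iceberg table spec (`day` is days from 1970-01-01, `month` is months from 1970-01-01). The helper names `day_transform` and `month_transform` are hypothetical, written with the Python standard library only; this is not PyIceberg's or the Java library's API.

```python
from datetime import date, datetime

# Iceberg's date/time transforms count from the Unix epoch.
EPOCH = date(1970, 1, 1)


def day_transform(ts: str) -> int:
    """Hypothetical helper: days from 1970-01-01 for an ISO-8601 timestamp."""
    return (datetime.fromisoformat(ts).date() - EPOCH).days


def month_transform(ts: str) -> int:
    """Hypothetical helper: months from 1970-01 for an ISO-8601 timestamp."""
    parsed = datetime.fromisoformat(ts)
    return (parsed.year - EPOCH.year) * 12 + (parsed.month - 1)


ts = "2024-10-22T19:25:00"
print(day_transform(ts))    # 20018, i.e. the calendar day 2024-10-22
print(month_transform(ts))  # 657, i.e. the calendar month 2024-10
```

Every row that falls on the same day necessarily falls in the same month, so the `day` value already determines the `month` value for a given timestamp.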

Re: [I] i cant table create dayTransform() and monthTransform() togetger for single field in schema [iceberg]

2024-12-16 Thread via GitHub
Fokko closed issue #11788: i cant table create dayTransform() and monthTransform() togetger for single field in schema URL: https://github.com/apache/iceberg/issues/11788 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1887034502 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries

Re: [I] Implement `namespace_exists` function on the REST Catalog [iceberg-python]

2024-12-16 Thread via GitHub
AhmedNader42 commented on issue #1430: URL: https://github.com/apache/iceberg-python/issues/1430#issuecomment-2545954476 Submitted #1434 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

Re: [I] [DISCUSS] Exceptions vs status codes [iceberg-cpp]

2024-12-16 Thread via GitHub
wgtmac commented on issue #14: URL: https://github.com/apache/iceberg-cpp/issues/14#issuecomment-2545956079 As an Arrow developer who has experienced both status and exception in the same repo, I feel inclined to use exception to make the code shorter and easier to debug. It is preferred (p

Re: [PR] Integrate Test Framework [iceberg-cpp]

2024-12-16 Thread via GitHub
pitrou commented on PR #13: URL: https://github.com/apache/iceberg-cpp/pull/13#issuecomment-2545484399 > I found this link: https://yurigeronimus.medium.com/guide-for-choosing-a-test-framework-for-your-c-project-2a7741b53317 Seems like a lot of fluff with no substance, unfortunately.

Re: [PR] feat: Add RemovePartitionSpecs table update [iceberg-rust]

2024-12-16 Thread via GitHub
c-thiel commented on code in PR #804: URL: https://github.com/apache/iceberg-rust/pull/804#discussion_r1886757422 ## crates/iceberg/src/spec/table_metadata_builder.rs: ## @@ -740,6 +740,32 @@ impl TableMetadataBuilder { .set_default_partition_spec(Self::LAST_ADDED)

Re: [PR] feat: TableMetadata Statistic Files [iceberg-rust]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #799: URL: https://github.com/apache/iceberg-rust/pull/799#discussion_r1886819635 ## crates/iceberg/src/catalog/mod.rs: ## @@ -446,6 +446,30 @@ pub enum TableUpdate { /// Properties to remove removals: Vec, }, +/// Set sta

Re: [PR] add Status data structure [iceberg-cpp]

2024-12-16 Thread via GitHub
gaborkaszab commented on PR #8: URL: https://github.com/apache/iceberg-cpp/pull/8#issuecomment-2545642065 > There is an old stackoverflow question which I think we can take a look There is a stackoverflow for everything and for the opposite too. I just googled for "Status codes vs excepti

Re: [PR] feat: TableMetadata Statistic Files [iceberg-rust]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #799: URL: https://github.com/apache/iceberg-rust/pull/799#discussion_r1886823887 ## crates/iceberg/src/spec/table_metadata_builder.rs: ## @@ -524,6 +526,52 @@ impl TableMetadataBuilder { self } +/// Set statistics for a snapshot

[I] Exceptions vs status codes [iceberg-cpp]

2024-12-16 Thread via GitHub
gaborkaszab opened a new issue, #14: URL: https://github.com/apache/iceberg-cpp/issues/14 I'm pretty sure there are pros and cons for each side and people might get into religious fights on this topic. Let's try to come to a conclusion which one we should use in this project. -- This is

Re: [PR] add Status data structure [iceberg-cpp]

2024-12-16 Thread via GitHub
wgtmac commented on PR #8: URL: https://github.com/apache/iceberg-cpp/pull/8#issuecomment-2545814446 I agree with @gaborkaszab that it would be better to discuss a concrete API design (e.g. Table, FileIO, etc.) before introducing a full-functional status implementation. If we decide to go w

Re: [PR] feat: Add RemovePartitionSpecs table update [iceberg-rust]

2024-12-16 Thread via GitHub
c-thiel commented on code in PR #804: URL: https://github.com/apache/iceberg-rust/pull/804#discussion_r1886939579 ## crates/iceberg/src/spec/table_metadata_builder.rs: ## @@ -740,6 +740,35 @@ impl TableMetadataBuilder { .set_default_partition_spec(Self::LAST_ADDED)

Re: [PR] Build: Bump mkdocs-material from 9.5.47 to 9.5.48 [iceberg]

2024-12-16 Thread via GitHub
Fokko merged PR #11790: URL: https://github.com/apache/iceberg/pull/11790 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

Re: [I] [Discussion] googletest(gtest) or Catch2 [iceberg-cpp]

2024-12-16 Thread via GitHub
pitrou commented on issue #12: URL: https://github.com/apache/iceberg-cpp/issues/12#issuecomment-2545837400 Arrow C++ and Parquet C++ developers are certainly familiar with GTest too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] feat: TableMetadata Statistic Files [iceberg-rust]

2024-12-16 Thread via GitHub
c-thiel commented on code in PR #799: URL: https://github.com/apache/iceberg-rust/pull/799#discussion_r1886956770 ## crates/iceberg/src/spec/table_metadata_builder.rs: ## @@ -524,6 +526,52 @@ impl TableMetadataBuilder { self } +/// Set statistics for a snapsh

Re: [I] [DISCUSS] Exceptions vs status codes [iceberg-cpp]

2024-12-16 Thread via GitHub
pitrou commented on issue #14: URL: https://github.com/apache/iceberg-cpp/issues/14#issuecomment-2545841290 The main thing I like about the "Status" style is that it makes error propagation and handling (or lack thereof) explicit. However, I have no religious preference :)

Re: [I] [Discussion] googletest(gtest) or Catch2 [iceberg-cpp]

2024-12-16 Thread via GitHub
gaborkaszab commented on issue #12: URL: https://github.com/apache/iceberg-cpp/issues/12#issuecomment-2545662593 No strong opinions from me. I myself gravitate towards gtest just because I have experience on that, but could live with any of them.

Re: [PR] Spark3.4,3.5: In describe extended view command: fix wrong view catal… [iceberg]

2024-12-16 Thread via GitHub
nastra commented on code in PR #11751: URL: https://github.com/apache/iceberg/pull/11751#discussion_r1886968667 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestViews.java: ## @@ -1414,7 +1414,42 @@ public void describeExtendedView() {

Re: [PR] Deserialize NestedField initial-default and write-default Attributes [iceberg-python]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #1432: URL: https://github.com/apache/iceberg-python/pull/1432#discussion_r1886979577 ## tests/conftest.py: ## @@ -149,6 +149,35 @@ def table_schema_simple() -> Schema: ) +@pytest.fixture(scope="session") Review Comment: nit: wydt

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-16 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1886971501 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/VendedAdlsCredentialProvider.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [I] [Discussion] googletest(gtest) or Catch2 [iceberg-cpp]

2024-12-16 Thread via GitHub
wgtmac commented on issue #12: URL: https://github.com/apache/iceberg-cpp/issues/12#issuecomment-2545859560 I don't have experience with Catch2 either. But I don't object to adopt it if it brings good stuff.

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-16 Thread via GitHub
ChaladiMohanVamsi commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1886977070 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/VendedAdlsCredentialProvider.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Deserialize NestedField initial-default and write-default Attributes [iceberg-python]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #1432: URL: https://github.com/apache/iceberg-python/pull/1432#discussion_r1886982570 ## pyiceberg/types.py: ## @@ -328,8 +328,8 @@ def __init__( data["type"] = data["type"] if "type" in data else field_type data["required"] =

Re: [PR] Deserialize NestedField initial-default and write-default Attributes [iceberg-python]

2024-12-16 Thread via GitHub
paulcichonski commented on code in PR #1432: URL: https://github.com/apache/iceberg-python/pull/1432#discussion_r1886986206 ## tests/conftest.py: ## @@ -149,6 +149,35 @@ def table_schema_simple() -> Schema: ) +@pytest.fixture(scope="session") Review Comment: Sure, I

Re: [PR] Deserialize NestedField initial-default and write-default Attributes [iceberg-python]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #1432: URL: https://github.com/apache/iceberg-python/pull/1432#discussion_r1886988485 ## tests/conftest.py: ## @@ -149,6 +149,35 @@ def table_schema_simple() -> Schema: ) +@pytest.fixture(scope="session") Review Comment: +1 if it b

Re: [PR] Deserialize NestedField initial-default and write-default Attributes [iceberg-python]

2024-12-16 Thread via GitHub
paulcichonski commented on code in PR #1432: URL: https://github.com/apache/iceberg-python/pull/1432#discussion_r1886987316 ## pyiceberg/types.py: ## @@ -328,8 +328,8 @@ def __init__( data["type"] = data["type"] if "type" in data else field_type data["required"

Re: [I] PyIceberg appending data creates snapshots incompatible with Athena/Spark [iceberg-python]

2024-12-16 Thread via GitHub
kevinjqliu commented on issue #1424: URL: https://github.com/apache/iceberg-python/issues/1424#issuecomment-2545885155 The above replicates the logic of `_generate_snapshot_id` https://github.com/apache/iceberg-python/blob/b981780d313f7fa6fb911381962fe00017073cfe/pyiceberg/table/metadat

Re: [PR] Deserialize NestedField initial-default and write-default Attributes [iceberg-python]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #1432: URL: https://github.com/apache/iceberg-python/pull/1432#discussion_r1886989722 ## pyiceberg/types.py: ## @@ -328,8 +328,8 @@ def __init__( data["type"] = data["type"] if "type" in data else field_type data["required"] =

Re: [PR] Deserialize NestedField initial-default and write-default Attributes [iceberg-python]

2024-12-16 Thread via GitHub
paulcichonski commented on code in PR #1432: URL: https://github.com/apache/iceberg-python/pull/1432#discussion_r1886992009 ## pyiceberg/types.py: ## @@ -328,8 +328,8 @@ def __init__( data["type"] = data["type"] if "type" in data else field_type data["required"
