Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-27 Thread via GitHub
kevinzwang commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2379885390 Perhaps let's just wait on the response on the devlist first. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-26 Thread via GitHub
kevinjqliu commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2378222143 > Basically, Spark stores day transformed partition values incorrectly in the metadata Thats an interesting find... The core Iceberg library is using `DateType` as the Re

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-26 Thread via GitHub
kevinjqliu commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2378238683 devlist https://lists.apache.org/thread/2gq7b54nvc9q6f1j08l9lnzgm5onkmx5 -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-26 Thread via GitHub
kevinzwang commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2377884626 I played around with Spark and inspected the generated metadata. I believe the partition value is actually stored in the metadata as date type for day transformed partitions.

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-26 Thread via GitHub
kevinjqliu commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-237771 The partition value is stored in the metadata as int, but the "partition metadata table" (https://iceberg.apache.org/docs/latest/spark-queries/#partitions) shows the partition

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-26 Thread via GitHub
kevinzwang commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2377581839 It's more readable sure, but just does not conform to the spec. I'm not entirely sure what you mean by your question @kevinjqliu, I believe partition values are only stored in m

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375810823 I like that its converted, its more readable! Do you know where the transform happens? Is it only for the metadata table? -- This is an automated message from the Apache Git

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinzwang commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375801600 > Im not 100% sure, perhaps the metadata table does the transformation. > > https://iceberg.apache.org/docs/latest/spark-queries/#partitions I think you are correct

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375566385 Im not 100% sure, perhaps the metadata table does the transformation. https://iceberg.apache.org/docs/latest/spark-queries/#partitions -- This is an automated message from

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinzwang commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375530341 Ok so interesting... Spark actually does store day transforms as date type in the metadata, which is why the integration test is failing. This is probably why this library had t

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on code in PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#discussion_r1776147728 ## pyiceberg/transforms.py: ## @@ -517,9 +517,6 @@ def day_func(v: Any) -> int: def can_transform(self, source: IcebergType) -> bool: return isi

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinzwang commented on code in PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#discussion_r1776140899 ## pyiceberg/transforms.py: ## @@ -517,9 +517,6 @@ def day_func(v: Any) -> int: def can_transform(self, source: IcebergType) -> bool: return isi

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on code in PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#discussion_r1776111931 ## pyiceberg/transforms.py: ## @@ -517,9 +517,6 @@ def day_func(v: Any) -> int: def can_transform(self, source: IcebergType) -> bool: return isi

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinzwang commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375442625 > is this the source of truth? https://iceberg.apache.org/spec/#partition-transforms Yup, precisely -- This is an automated message from the Apache Git Service. To resp

Re: [PR] fix: DayTransform result type override and docs [iceberg-python]

2024-09-25 Thread via GitHub
kevinjqliu commented on PR #1208: URL: https://github.com/apache/iceberg-python/pull/1208#issuecomment-2375337828 is this the source of truth? https://iceberg.apache.org/spec/#partition-transforms -- This is an automated message from the Apache Git Service. To respond to the messag