DinGo4DEV opened a new issue, #2002: URL: https://github.com/apache/iceberg-python/issues/2002
### Apache Iceberg version

0.9.0 (latest release)

### Please describe the bug 🐞

## Description

When a `UUIDType` column is partitioned with `BucketTransform`, table operations such as upsert fail. The issue appears to be related to the partition value changing from `int` to `str`, which causes a type mismatch when the Avro encoder attempts to write an integer.

## Steps to Reproduce

1. Create a table with a `UUIDType` column
2. Configure the table to use `BucketTransform` on that column for partitioning
3. Attempt to upsert data into the table

## Current Behavior

The operation fails with a `TypeError` because the Avro encoder attempts to perform integer operations on a string value.

## Expected Behavior

The operation should handle `UUIDType` columns correctly when they are used with `BucketTransform` partitioning. The UUID bucket partition value should be `1` instead of `"1"`.

## Error Stack Trace

```python
Traceback (most recent call last):
  File "test_upsert.py", line 248, in <module>
    result = table.upsert(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv\Lib\site-packages\pyiceberg\table\__init__.py", line 1216, in upsert
    tx.append(rows_to_insert)
  File ".venv\Lib\site-packages\pyiceberg\table\__init__.py", line 470, in append
    with self._append_snapshot_producer(snapshot_properties) as append_files:
  File ".venv\Lib\site-packages\pyiceberg\table\update\__init__.py", line 71, in __exit__
    self.commit()
  File ".venv\Lib\site-packages\pyiceberg\table\update\__init__.py", line 67, in commit
    self._transaction._apply(*self._commit())
                              ^^^^^^^^^^^^^^
  File ".venv\Lib\site-packages\pyiceberg\table\update\snapshot.py", line 242, in _commit
    new_manifests = self._manifests()
                    ^^^^^^^^^^^^^^^^^
  File ".venv\Lib\site-packages\pyiceberg\table\update\snapshot.py", line 201, in _manifests
    return self._process_manifests(added_manifests.result() + delete_manifests.result() + existing_manifests.result())
                                   ^^^^^^^^^^^^^^^^^^^^^^^^
  File "~\Python312\Lib\concurrent\futures\_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "~\Python312\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "~\Python312\Lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv\Lib\site-packages\pyiceberg\table\update\snapshot.py", line 159, in _write_added_manifest
    writer.add(
  File ".venv\Lib\site-packages\pyiceberg\manifest.py", line 847, in add
    self.add_entry(self._reused_entry_wrapper._wrap_append(self._snapshot_id, None, entry.data_file))
  File ".venv\Lib\site-packages\pyiceberg\manifest.py", line 840, in add_entry
    self._writer.write_block([self.prepare_entry(entry)])
  File ".venv\Lib\site-packages\pyiceberg\avro\file.py", line 281, in write_block
    self.writer.write(block_content_encoder, obj)
    writer.write(encoder, val[pos] if pos is not None else None)
  File ".venv\Lib\site-packages\pyiceberg\avro\writer.py", line 176, in write
    writer.write(encoder, val[pos] if pos is not None else None)
    writer.write(encoder, val[pos] if pos is not None else None)
  File ".venv\Lib\site-packages\pyiceberg\avro\writer.py", line 176, in write
    writer.write(encoder, val[pos] if pos is not None else None)
  File ".venv\Lib\site-packages\pyiceberg\avro\writer.py", line 66, in write
    encoder.write_int(val)
  File ".venv\Lib\site-packages\pyiceberg\avro\encoder.py", line 45, in write_int
    datum = (integer << 1) ^ (integer >> 63)
```

## Potential Fix

The issue appears to be in the type handling in the `partition_record_value` function when a `PartitionKey` is initialized with the `PartitionFieldValue`.

https://github.com/apache/iceberg-python/blob/996a7ba4dbf4afdb3d46689f1715206b1c355f2a/pyiceberg/partitioning.py#L385-L406

I would add a Union type for `value` to handle the **transformed** value.
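For illustration, the failure in the last traceback frame can be reproduced without pyiceberg. The sketch below is a minimal stand-in, not the library's code: the function name `zigzag` is hypothetical, and only the expression `(integer << 1) ^ (integer >> 63)` is taken from the traceback. It shows that the expression works for the integer bucket value `1` and raises `TypeError` for the string `"1"`.

```python
def zigzag(integer):
    """Hypothetical stand-in for the expression shown in the last traceback frame."""
    # Same expression as in pyiceberg/avro/encoder.py, line 45 of the traceback.
    return (integer << 1) ^ (integer >> 63)


# An integer bucket value encodes without error.
print(zigzag(1))

# A stringified bucket value fails, because `<<` is not defined for str.
try:
    zigzag("1")
except TypeError as exc:
    print(f"TypeError: {exc}")
```

This only demonstrates why a `str` partition value cannot reach the integer writer; where the value becomes a string is the open question raised in this issue.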
https://github.com/apache/iceberg-python/blob/996a7ba4dbf4afdb3d46689f1715206b1c355f2a/pyiceberg/partitioning.py#L469-L471

### Willingness to contribute

- [x] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org