advancedxy commented on code in PR #8579:
URL: https://github.com/apache/iceberg/pull/8579#discussion_r1451452956

##########
format/spec.md:
##########
@@ -329,19 +329,35 @@ The `void` transform may be used to replace the transform in an existing partiti
 
 #### Bucket Transform Details
 
-Bucket partition transforms use a 32-bit hash of the source value. The 32-bit hash implementation is the 32-bit Murmur3 hash, x86 variant, seeded with 0.
+Bucket partition transforms use a 32-bit hash of the source value(s). The 32-bit hash implementation is the 32-bit Murmur3 hash, x86 variant, seeded with 0.
 
 Transforms are parameterized by a number of buckets [1], `N`. The hash mod `N` must produce a positive value by first discarding the sign bit of the hash value. In pseudo-code, the function is:
 
 ```
   def bucket_N(x) = (murmur3_x86_32_hash(x) & Integer.MAX_VALUE) % N
 ```
 
+When bucket transform is applied on a list of values, the input is treated as concatenated bytes of each value. In pseudo-code, the function is:

Review Comment:
   Great, updated


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
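
For readers following the hunk above, here is a minimal Python sketch of the rule the diff proposes: hash the concatenated bytes of all source values with 32-bit Murmur3 (x86 variant, seed 0), discard the sign bit, and take the result mod `N`. The names `murmur3_x86_32` and `bucket_n` are hypothetical, and the sketch assumes each value has already been serialized to bytes by the caller. It illustrates the wording in the diff and is not the implementation from the PR.

```python
from typing import Sequence


def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3, x86 variant. Returns an unsigned 32-bit integer."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF
    h = seed & mask
    n = len(data)
    body_len = n & ~3  # length of the part made of whole 4-byte blocks

    def mix_k(k: int) -> int:
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask  # rotate left by 15
        return (k * c2) & mask

    # Body: consume 4-byte little-endian blocks.
    for i in range(0, body_len, 4):
        h ^= mix_k(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & mask  # rotate left by 13
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[body_len:]
    if tail:
        h ^= mix_k(int.from_bytes(tail, "little"))

    # Finalization (avalanche).
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def bucket_n(values: Sequence[bytes], n: int) -> int:
    """Bucket for one or more pre-serialized values (hypothetical helper).

    Assumption: the values are concatenated in order with no separator,
    as the sentence added by the diff describes.
    """
    h = murmur3_x86_32(b"".join(values), seed=0)
    # Masking with 0x7FFFFFFF matches `& Integer.MAX_VALUE` in the pseudo-code.
    return (h & 0x7FFFFFFF) % n
```

With a single value this reduces to the existing `bucket_N(x)` definition. One consequence of plain concatenation is that inputs with the same joined bytes always collide: `bucket_n([b"ab", b"c"], 16)` and `bucket_n([b"a", b"bc"], 16)` return the same bucket.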