wgtmac commented on code in PR #653:
URL: https://github.com/apache/iceberg-cpp/pull/653#discussion_r3272689931


##########
src/iceberg/util/bucket_util.cc:
##########
@@ -63,6 +63,16 @@ int32_t HashLiteral<TypeId::kTimestampTz>(const Literal& 
literal) {
   return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
 }
 
+template <>
+int32_t HashLiteral<TypeId::kTimestampNs>(const Literal& literal) {
+  return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
+}
+
+template <>
+int32_t HashLiteral<TypeId::kTimestampTzNs>(const Literal& literal) {
+  return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
+}

Review Comment:
   According to the Iceberg V3 spec and the Java implementation 
(`BucketTimestampNano.java`), nanosecond timestamps must be converted to 
microseconds (divided by 1000) before hashing. This ensures that bucket 
partitioning is consistent between microsecond and nanosecond precision types 
for the same logical time.
   
   ```cpp
   return BucketUtils::HashLong(std::get<int64_t>(literal.value()) / 1000);
   ```



##########
src/iceberg/util/transform_util.cc:
##########
@@ -117,6 +199,21 @@ std::string TransformUtil::HumanTimestamp(int64_t 
timestamp_micros) {
   }
 }
 
+std::string TransformUtil::HumanTimestampNs(int64_t timestamp_nanos) {
+  auto tp = std::chrono::time_point<std::chrono::system_clock, 
std::chrono::seconds>{
+      std::chrono::seconds(timestamp_nanos / kNanosPerSecond)};
+  auto nanos = timestamp_nanos % kNanosPerSecond;

Review Comment:
   For negative timestamps (pre-1970), C++'s division (`/`) and modulo (`%`) 
operators truncate towards zero. This causes `ParseTimestampNs` and 
`HumanTimestampNs` to compute an incorrect base time point and a negative 
fractional part, breaking the string formatting and parsing.
   
   For example, `1969-12-31T23:59:59.123456789` parses to `-876543211` nanos. 
Passing this back here yields `0` for seconds and `-876543211` for nanos, 
resulting in `1970-01-01T00:00:00.-876543211`.
   
   Consider using `std::chrono::floor` to handle the negative values correctly 
(note: the original microsecond `HumanTimestamp` and `ParseTimestamp` also 
suffer from this issue).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to