coderfender commented on issue #21116:
URL: https://github.com/apache/datafusion/issues/21116#issuecomment-4152374181

   IMHO there is no single right way to represent timestamps with varying 
offsets while still taking advantage of vectorized (SIMD) execution. The two 
options I see: 
   1. Store every timestamp together with its timezone info. 
   2. Group timestamps that share a timezone offset so each group can be 
processed with SIMD, though this could increase memory usage. 
   We could support both approaches plus a hybrid implementation that picks 
between them based on the variance in timezones per batch, with a 
`CARDINALITY_FACTOR` kind of parameter deciding which approach is better. Let 
me raise a draft PR in the following week to provide a POC with some benchmarks.
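
   To make the hybrid idea concrete, here is a rough std-only sketch of how a 
per-batch decision could look. None of this is DataFusion or Arrow API: the 
`Layout` enum, the function names, the 0.1 threshold, and the use of a plain 
"add the offset in seconds" step as a stand-in for the real kernel are all 
assumptions made for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical threshold: the largest ratio of distinct offsets to rows
/// at which grouping by offset is assumed to be worth the extra memory.
const CARDINALITY_FACTOR: f64 = 0.1;

#[derive(Debug, PartialEq)]
enum Layout {
    /// Approach 1: keep one offset next to every timestamp.
    PerRowOffset,
    /// Approach 2: group rows that share an offset.
    GroupedByOffset,
}

/// Pick a layout for one batch from its per-row UTC offsets (in seconds).
fn choose_layout(offsets: &[i32]) -> Layout {
    if offsets.is_empty() {
        return Layout::PerRowOffset;
    }
    let distinct: HashSet<i32> = offsets.iter().copied().collect();
    let ratio = distinct.len() as f64 / offsets.len() as f64;
    if ratio <= CARDINALITY_FACTOR {
        Layout::GroupedByOffset
    } else {
        Layout::PerRowOffset
    }
}

/// Approach 1: apply each row's own offset.
fn to_local_per_row(utc: &[i64], offsets: &[i32]) -> Vec<i64> {
    utc.iter()
        .zip(offsets)
        .map(|(&t, &o)| t + i64::from(o))
        .collect()
}

/// Approach 2: gather rows with the same offset into a contiguous buffer,
/// apply one constant to the whole buffer, then scatter the results back.
fn to_local_grouped(utc: &[i64], offsets: &[i32]) -> Vec<i64> {
    assert_eq!(utc.len(), offsets.len());
    let mut groups: HashMap<i32, Vec<usize>> = HashMap::new();
    for (row, &off) in offsets.iter().enumerate() {
        groups.entry(off).or_default().push(row);
    }
    let mut out = vec![0i64; utc.len()];
    for (off, rows) in groups {
        let off = i64::from(off);
        // The gathered copy is where the extra memory usage comes from.
        let mut buf: Vec<i64> = rows.iter().map(|&r| utc[r]).collect();
        // Constant offset over a contiguous slice: a loop the compiler can
        // auto-vectorize.
        for v in buf.iter_mut() {
            *v += off;
        }
        for (&r, v) in rows.iter().zip(buf) {
            out[r] = v;
        }
    }
    out
}
```

   Both paths return the same values, so the threshold only affects cost, not 
results. A batch of 100 rows that all share one offset has a ratio of 0.01 and 
would go down the grouped path, while a batch where every row has a different 
offset has a ratio of 1.0 and would keep per-row offsets. Whether 0.1 is a 
sensible cutoff is exactly what the benchmarks would need to show.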


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

