airborne12 opened a new pull request, #61120: URL: https://github.com/apache/doris/pull/61120
## Proposed changes

Fix `reinterpret_cast<std::string*>` on `StringRef*`, which causes a buffer overflow on ARM64, where `std::string` is 24 bytes but `StringRef` is only 16 bytes.

### 1. `function_multi_match.cpp`

Convert `StringRef` to `std::string` before passing it as `query_value`. Downstream, `FullTextIndexReader::query()` does `reinterpret_cast<const std::string*>(query_value)`, which reads 8 bytes past the end of the `StringRef` object.

### 2. `in_list_predicate.h`

Fix 3 sites where the `HybridSet` iterator's `get_value()` returns `StringRef*` but the code casts it to `std::string*`. Added an `if constexpr (is_string_type(Type))` guard to safely construct a `std::string` from `StringRef::data`/`StringRef::size`.

## Problem Summary

| Type | `sizeof` (ARM64) | Issue |
|------|------------------|-------|
| `std::string` | 24 bytes | Expected by the `reinterpret_cast` receiver |
| `StringRef` | 16 bytes | Actual type behind the `void*` |

Reading a 16-byte `StringRef` as a 24-byte `std::string` causes a stack-buffer-overflow (detected by ASAN).

## Checklist

- [x] Fixes buffer overflow detected by ASAN on ARM64
- [x] No behavioral change; only fixes type safety at the boundary

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
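The type mismatch described in this PR can be illustrated with a minimal sketch. This is not the Doris code: the `StringRef` struct below is a hypothetical stand-in with only the `data` and `size` members named in the description, and `to_std_string` and `load_value` are illustrative helper names. The `bool` template parameter plays the role that `is_string_type(Type)` plays in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for StringRef: a non-owning pointer + length view.
struct StringRef {
    const char* data;
    std::size_t size;
};

// The view is smaller than std::string on common standard libraries, so
// reading it through a std::string* touches bytes past the end of the object.
static_assert(sizeof(StringRef) < sizeof(std::string),
              "assumed for this sketch: StringRef is smaller than std::string");

// Unsafe pattern (the bug): the pointer really refers to a StringRef.
//   const auto* s = reinterpret_cast<const std::string*>(opaque);

// Safe pattern: cast back to the real type, then copy the bytes into an
// owning std::string.
std::string to_std_string(const void* opaque) {
    assert(opaque != nullptr);
    const auto* ref = static_cast<const StringRef*>(opaque);
    return std::string(ref->data, ref->size);
}

// Compile-time guard: string-typed values go through StringRef, everything
// else is read as its own type T. Requires C++17 for `if constexpr`.
template <bool IsString, typename T>
auto load_value(const void* opaque) {
    if constexpr (IsString) {
        return to_std_string(opaque);
    } else {
        return *static_cast<const T*>(opaque);
    }
}
```

With this sketch, a caller holding a `StringRef` over a larger buffer would call `load_value<true, std::string>(&ref)` and receive an owning copy of exactly `ref.size` bytes, while a numeric value would be read with `load_value<false, int>(&v)`.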
