airborne12 opened a new pull request, #61120:
URL: https://github.com/apache/doris/pull/61120

   ## Proposed changes
   
   Fix `reinterpret_cast<std::string*>` applied to a `StringRef*`, which causes
an out-of-bounds read on ARM64, where `std::string` is 24 bytes but `StringRef`
is only 16 bytes.
   
   ### 1. `function_multi_match.cpp`
   Convert the `StringRef` to a `std::string` before passing it as `query_value`.
Downstream, `FullTextIndexReader::query()` does
`reinterpret_cast<const std::string*>(query_value)`, which reads 8 bytes past
the end of the `StringRef` object.
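
   A minimal sketch of the caller-side fix, not the actual Doris code. The
`StringRef` struct, `query_length()` and `query_from_ref()` below are
hypothetical stand-ins; only the cast to `const std::string*` mirrors the
behaviour described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for StringRef: a non-owning pointer plus a length.
struct StringRef {
    const char* data;
    std::size_t size;
};

// Hypothetical receiver modelled on the description of
// FullTextIndexReader::query(): it assumes the void* points at a std::string.
std::size_t query_length(const void* query_value) {
    assert(query_value != nullptr);
    const auto* s = reinterpret_cast<const std::string*>(query_value);
    return s->size();
}

// Caller-side fix: materialize a real std::string, then pass its address.
// Passing &ref directly would make the receiver read a StringRef as if it
// were a std::string, which is the bug this PR removes.
std::size_t query_from_ref(const StringRef& ref) {
    std::string owned(ref.data, ref.size);  // copies exactly ref.size bytes
    return query_length(&owned);            // the cast now sees a std::string
}
```

   The copy costs one allocation for long strings, but it keeps the `void*`
contract of the receiver unchanged.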
   
   ### 2. `in_list_predicate.h`
   Fix 3 sites where the `HybridSet` iterator's `get_value()` returns a
`StringRef*` but the code casts it to `std::string*`. Add an
`if constexpr (is_string_type(Type))` guard that safely constructs a
`std::string` from `StringRef::data` and `StringRef::size`.
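
   The guard can be sketched roughly as follows. This is an illustration
under assumptions, not the code in `in_list_predicate.h`: the reduced
`PrimitiveType` enum, the `StringRef` struct, this definition of
`is_string_type()` and the helper `value_to_string()` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, reduced type tag; the real enum has many more members.
enum class PrimitiveType { TYPE_INT, TYPE_STRING };

// Hypothetical constexpr predicate so it can be used in `if constexpr`.
constexpr bool is_string_type(PrimitiveType t) {
    return t == PrimitiveType::TYPE_STRING;
}

// Hypothetical stand-in for StringRef: a non-owning pointer plus a length.
struct StringRef {
    const char* data;
    std::size_t size;
};

// `value` plays the role of the untyped pointer returned by get_value().
template <PrimitiveType Type>
std::string value_to_string(const void* value) {
    assert(value != nullptr);
    if constexpr (is_string_type(Type)) {
        // String sets store StringRef, so read it as StringRef and copy.
        const auto* ref = reinterpret_cast<const StringRef*>(value);
        return std::string(ref->data, ref->size);
    } else {
        // Non-string branch, shown here for int only.
        return std::to_string(*reinterpret_cast<const int*>(value));
    }
}
```

   Because the branch is resolved at compile time, the non-string
instantiations never contain the `StringRef` access at all.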
   
   ## Problem Summary
   
   | Type | `sizeof` (ARM64) | Issue |
   |------|-----------------|-------|
   | `std::string` | 24 bytes | Expected by `reinterpret_cast` receiver |
   | `StringRef` | 16 bytes | Actual type behind `void*` |
   
   Reading a 16-byte `StringRef` as a 24-byte `std::string` reads 8 bytes past
the end of the object, which ASAN reports as a stack-buffer-overflow.
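
   The size mismatch can be checked at compile time with a small sketch like
the one below. The `StringRef` struct and `overread_bytes()` are hypothetical
stand-ins, and `sizeof(std::string)` depends on the standard library
implementation in use, so the result may differ from the 8 bytes in the table
above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for StringRef: a non-owning pointer plus a length.
struct StringRef {
    const char* data;
    std::size_t size;
};

// On common 64-bit targets this is 8 + 8 = 16 bytes with no padding.
static_assert(sizeof(StringRef) == sizeof(const char*) + sizeof(std::size_t),
              "StringRef is expected to be exactly pointer + length");

// Number of bytes a std::string-typed read would touch beyond a StringRef.
constexpr std::size_t overread_bytes() {
    return sizeof(std::string) > sizeof(StringRef)
                   ? sizeof(std::string) - sizeof(StringRef)
                   : 0;
}
```

   Any non-zero result means that treating a `StringRef` object as a
`std::string` touches memory outside the object.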
   
   ## Checklist
   
   - [x] Fixes buffer overflow detected by ASAN on ARM64
   - [x] No behavioral change; only fixes type safety at the boundary


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

