liujiwen-up commented on code in PR #41681:
URL: https://github.com/apache/doris/pull/41681#discussion_r1800408496


##########
be/src/vec/functions/function_string.cpp:
##########
@@ -535,6 +545,42 @@ struct TrimUtil {
         return Status::OK();
     }
 };
+template <bool is_ltrim_in, bool is_rtrim_in, bool trim_single>
+struct TrimInUtil {
+    static Status vector(const ColumnString::Chars& str_data,
+                         const ColumnString::Offsets& str_offsets, const StringRef& remove_str,
+                         ColumnString::Chars& res_data, ColumnString::Offsets& res_offsets) {
+        const size_t offset_size = str_offsets.size();
+        res_offsets.resize(offset_size);
+        res_data.reserve(str_data.size());
+        std::bitset<256> char_lookup;

Review Comment:
   There are currently two code paths.
   1. When `simd::VStringFunctions::is_ascii` reports that the string is
entirely ASCII, a `bitset<128>` is used as the lookup table. Standard ASCII
covers the values 0 to 127, so 128 bits are exactly enough to represent every
standard ASCII character.
   2. When the string is not entirely standard ASCII, the UTF-8 path is used.
Trimming on the right needs particular care: under the UTF-8 rules a trailing
byte has the form 10xxxxxx, so `(*prev_char_pos & 0xC0) == 0x80` tests whether
the current byte is a trailing byte. Scanning backward, the first byte that is
not a trailing byte is the starting byte of the current character.
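
   A minimal sketch of the two paths described above, not the code in this PR.
   The names `is_all_ascii`, `utf8_char_len`, `contains_char`, and `trim_in` are
   hypothetical and exist only for illustration; `is_all_ascii` stands in for
   `simd::VStringFunctions::is_ascii` without any SIMD. The sketch also assumes
   the ASCII path is taken only when both the input and the set of characters
   to remove are ASCII, and that the input is well-formed UTF-8.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical scalar stand-in for simd::VStringFunctions::is_ascii.
inline bool is_all_ascii(std::string_view s) {
    for (unsigned char c : s) {
        if (c & 0x80) return false;  // high bit set: not standard ASCII
    }
    return true;
}

// Byte length of the UTF-8 character introduced by lead byte `b`.
// Unexpected lead bytes are treated as one byte (assumption of this sketch).
inline std::size_t utf8_char_len(unsigned char b) {
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;
}

// True when the UTF-8 character `ch` is one of the characters in `set`.
inline bool contains_char(std::string_view set, std::string_view ch) {
    for (std::size_t i = 0; i < set.size();) {
        std::size_t len =
                std::min(utf8_char_len(static_cast<unsigned char>(set[i])), set.size() - i);
        if (set.substr(i, len) == ch) return true;
        i += len;
    }
    return false;
}

inline std::string_view trim_in(std::string_view str, std::string_view remove_chars,
                                bool left = true, bool right = true) {
    std::size_t begin = 0;
    std::size_t end = str.size();
    if (is_all_ascii(str) && is_all_ascii(remove_chars)) {
        // Path 1: every byte is in 0..127, so a 128-bit table covers all of them.
        std::bitset<128> lookup;
        for (unsigned char c : remove_chars) lookup.set(c);
        if (left) {
            while (begin < end && lookup.test(static_cast<unsigned char>(str[begin]))) ++begin;
        }
        if (right) {
            while (end > begin && lookup.test(static_cast<unsigned char>(str[end - 1]))) --end;
        }
    } else {
        // Path 2: step one whole UTF-8 character at a time.
        if (left) {
            while (begin < end) {
                std::size_t len = std::min(
                        utf8_char_len(static_cast<unsigned char>(str[begin])), end - begin);
                if (!contains_char(remove_chars, str.substr(begin, len))) break;
                begin += len;
            }
        }
        if (right) {
            while (end > begin) {
                // Walk back over trailing bytes (10xxxxxx) to the starting byte.
                std::size_t pos = end - 1;
                while (pos > begin && (static_cast<unsigned char>(str[pos]) & 0xC0) == 0x80) {
                    --pos;
                }
                if (!contains_char(remove_chars, str.substr(pos, end - pos))) break;
                end = pos;
            }
        }
    }
    assert(begin <= end);
    return str.substr(begin, end - begin);
}
```

   For example, `trim_in("xxhelloyy", "xy")` takes the bitset path, while an
   input containing a multi-byte character such as `"\xC3\xA9"` takes the UTF-8
   path and is never split in the middle of a character when trimming on the
   right.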



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

