deemoliu commented on code in PR #12392:
URL: https://github.com/apache/pinot/pull/12392#discussion_r1487311632


##########
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/StringFunctions.java:
##########
@@ -570,6 +572,81 @@ public static String[] split(String input, String delimiter, int limit) {
     return StringUtils.splitByWholeSeparator(input, delimiter, limit);
   }
 
+  /**
+   * @param input an input string for prefix strings generations.
+   * @param length the max length of the prefix strings for the string.
+   * @param regexChar the character for regex matching to be added to prefix strings generated. e.g. '^'

Review Comment:
   hmm, the idea here is to support use cases that need to union the prefixes, 
suffixes, and ngrams into one column, so the index size will be reduced and 
the index will fit into memory more easily.
   I feel generating the prefix matcher and suffix matcher is a relatively 
common use case, so I added the regex character as an optional parameter. 
   
   Alternatively, we can split this into two functions:
   ```
   prefixes(String input, int minLength, int maxLength)
   prefixMatchers(String input, int minLength, int maxLength, String regexChar)
   ```
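
   To make the proposal concrete, here is a rough sketch of how the two 
signatures could behave. This is not the code in the PR: the class name 
`PrefixSketch`, the clamping of `minLength` to 1, and prepending `regexChar` 
to each prefix are all assumptions made for illustration.

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical sketch of the two proposed functions; not the PR's implementation.
   final class PrefixSketch {
     private PrefixSketch() {
     }

     // Returns every prefix of input whose length is in [minLength, maxLength].
     // Assumption: minLength is clamped to at least 1 and maxLength to input.length().
     static String[] prefixes(String input, int minLength, int maxLength) {
       List<String> result = new ArrayList<>();
       int start = Math.max(minLength, 1);
       int end = Math.min(maxLength, input.length());
       for (int len = start; len <= end; len++) {
         result.add(input.substring(0, len));
       }
       return result.toArray(new String[0]);
     }

     // Same prefixes, each with regexChar (e.g. "^") prepended.
     // Assumption: the regex character goes in front of the prefix.
     static String[] prefixMatchers(String input, int minLength, int maxLength, String regexChar) {
       String[] result = prefixes(input, minLength, maxLength);
       for (int i = 0; i < result.length; i++) {
         result[i] = regexChar + result[i];
       }
       return result;
     }
   }
   ```

   Under these assumptions, `prefixMatchers("pinot", 1, 3, "^")` would give 
`["^p", "^pi", "^pin"]`, and `prefixes` stays usable on its own when no regex 
character is needed.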
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

