airborne12 opened a new pull request, #22313:
URL: https://github.com/apache/doris/pull/22313

   ## Proposed changes
   
   Issue Number: close #xxx
   Refactor the `tokenize` function. It is used like this:
   ```
   mysql> select tokenize("i love china","'parser'='english'");
   +----------------------------------------------------+
   | tokenize('i love china', '''parser''=''english''') |
   +----------------------------------------------------+
   | ["i", "love", "china"]                             |
   +----------------------------------------------------+
   1 row in set (0.02 sec)
   
   mysql> select tokenize("我爱北京天安门","'parser'='unicode'");
   +-------------------------------------------------------------+
   | tokenize('我爱北京天安门', '''parser''=''unicode''')        |
   +-------------------------------------------------------------+
   | ["我", "爱", "北", "京", "天", "安", "门"]                  |
   +-------------------------------------------------------------+
   1 row in set (0.01 sec)
   
   mysql> select tokenize("我爱北京天安门","'parser'='chinese'");
   +-------------------------------------------------------------+
   | tokenize('我爱北京天安门', '''parser''=''chinese''')        |
   +-------------------------------------------------------------+
   | ["爱", "北京", "天安门"]                                    |
   +-------------------------------------------------------------+
   1 row in set (0.01 sec)
   
   mysql> select tokenize("我爱北京天安门","'parser'='chinese',parser_mode='fine_grained'");
   +------------------------------------------------------------------------------------------+
   | tokenize('我爱北京天安门', '''parser''=''chinese'',parser_mode=''fine_grained''')        |
   +------------------------------------------------------------------------------------------+
   | ["爱", "北京", "天安", "天安门"]                                                         |
   +------------------------------------------------------------------------------------------+
   1 row in set (0.01 sec)
   ```
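
For readers unfamiliar with the second argument, the sessions above suggest it is a comma-separated list of `key=value` properties in which the single quotes are optional. The Python sketch below is only an illustration of that reading and is not Doris's implementation. The names `parse_properties` and `toy_tokenize` are hypothetical, the `english` and `unicode` branches are rough approximations inferred from the sample output, and the `chinese` parser (which needs a dictionary) is deliberately not modeled.

```python
import re


def parse_properties(spec: str) -> dict[str, str]:
    """Parse a string such as "'parser'='chinese',parser_mode='fine_grained'".

    Assumption: properties are comma-separated key=value pairs and the
    quotes around keys and values are optional, as in the examples above.
    """
    props: dict[str, str] = {}
    for part in spec.split(","):
        if not part.strip():
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {part!r}")
        props[key.strip().strip("'\"")] = value.strip().strip("'\"")
    return props


def toy_tokenize(text: str, spec: str) -> list[str]:
    """Approximate the 'english' and 'unicode' outputs shown above."""
    parser = parse_properties(spec).get("parser")
    if parser == "english":
        # Assumption: tokens are runs of ASCII letters and digits.
        return re.findall(r"[A-Za-z0-9]+", text)
    if parser == "unicode":
        # Assumption: ASCII runs stay whole, every non-ASCII character
        # becomes its own token.
        return re.findall(r"[A-Za-z0-9]+|[^\x00-\x7f]", text)
    raise ValueError(f"parser {parser!r} is not modeled in this sketch")


if __name__ == "__main__":
    print(toy_tokenize("i love china", "'parser'='english'"))
    print(toy_tokenize("我爱北京天安门", "'parser'='unicode'"))
```

The sketch reproduces the first two results only. Note that the `chinese` parser in the output above drops "我", which a character-level split cannot explain.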
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

