Yes, there is one and only one tokenizer allowed.

Best,
Erick

On Wed, Mar 16, 2016 at 7:51 PM, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> Thanks Shawn for your reply.
>
> Yes, I'm looking to see if we can implement a combination of tokenizers
> and filters.
>
> However, I found earlier that we can only use one tokenizer for each
> fieldType. So is it true that I have to stick to one tokenizer, and the
> rest has to be done either with filters or by customising the tokenizer
> in order to possibly achieve what I want?
>
> Regards,
> Edwin
>
>
> On 17 March 2016 at 09:34, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 3/16/2016 4:33 AM, Zheng Lin Edwin Yeo wrote:
>> > I found that HMMChineseTokenizer will split a string that consists of
>> > numbers and characters (alphanumeric). For example, if I have a code
>> > that looks like "1a2b3c4d", it will be split to
>> > 1 | a | 2 | b | 3 | c | 4 | d
>> > This has caused the search query speed to slow quite tremendously
>> > (like at least 10 seconds slower), as it has to search through
>> > individual tokens.
>> >
>> > Would like to check, is there any way that we can solve this issue
>> > without re-indexing? We have quite a lot of codes in the index which
>> > consist of alphanumeric characters, and we have more than 10 million
>> > documents in the index, so re-indexing with another tokenizer or
>> > pipeline is quite a huge process.
>>
>> ANY change you make to index analysis will require reindexing.
>>
>> I have no idea what the advantages and disadvantages are in the various
>> tokenizers and filters for Asian characters. There may be a combination
>> of tokenizer and filters that will do what you want.
>>
>> We do have an index for a company in Japan. I'm using ICUTokenizer with
>> some of the CJK filters, and in some cases I'm using
>> ICUFoldingFilterFactory for lowercasing and normalization. The jars
>> required for ICU analysis components can be found in the contrib folder
>> in the Solr download.
>>
>> There are ways to create a whole new index and then move it into place
>> to replace your existing index. For SolrCloud mode, you would use the
>> collection alias feature. For standalone Solr, you can swap cores.
>>
>> Thanks,
>> Shawn
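
For reference, the "build a new index, then move it into place" step Shawn mentions could look roughly like the sketch below. It only builds the request URLs for the CoreAdmin SWAP action (standalone Solr) and the Collections API CREATEALIAS action (SolrCloud). The base URL and the core, collection and alias names (docs, docs_rebuild, docs_v2) are made up for illustration and are not taken from this thread, so adjust them to your own setup and check the parameters against the reference guide for your Solr version.

```python
from urllib.parse import urlencode

# Assumption: Solr listens on the default port on localhost.
SOLR_BASE = "http://localhost:8983/solr"


def swap_cores_url(live_core: str, rebuilt_core: str, base: str = SOLR_BASE) -> str:
    """Standalone Solr: CoreAdmin SWAP exchanges the names of two cores."""
    params = {"action": "SWAP", "core": live_core, "other": rebuilt_core}
    return f"{base}/admin/cores?{urlencode(params)}"


def create_alias_url(alias: str, collection: str, base: str = SOLR_BASE) -> str:
    """SolrCloud: CREATEALIAS points an alias name at a collection."""
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{base}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical names: "docs" is what clients query, the others hold
    # the freshly reindexed data.
    print(swap_cores_url("docs", "docs_rebuild"))
    print(create_alias_url("docs", "docs_v2"))
```

The idea is that clients keep querying the same name (the core name or the alias) while the reindexed data is swapped in behind it, so the long reindex does not have to take the existing index offline.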