Or have the indexing client split the data at these delimiters and just use the CJKAnalyzer.
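Erik's client-side suggestion could be sketched roughly like this: split the raw text at the ";" delimiter before sending it to Solr, so each piece can go into a multiValued field analyzed with the CJKAnalyzer alone (a minimal sketch, not from the thread; the class and method names are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class PreSplitter {

    // Split the raw input at ";" and trim surrounding whitespace.
    // Each resulting piece would be indexed as one value of a
    // multiValued field whose analyzer is just the CJKAnalyzer.
    public static List<String> splitAtDelimiter(String raw) {
        List<String> pieces = new ArrayList<>();
        for (String piece : raw.split(";")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                pieces.add(trimmed);
            }
        }
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(splitAtDelimiter("東京 ; 大阪 ;; 京都"));
    }
}
```

With this approach the schema needs no custom analysis code at all; the delimiter handling lives entirely in the indexing client.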

        Erik

On Apr 10, 2009, at 7:30 AM, Grant Ingersoll wrote:

The only thing that comes to mind in the short term is writing two TokenFilter implementations that wrap the second and third tokenizers.
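If such wrapping TokenFilters existed, the schema could then use a single real tokenizer followed by filters. A hypothetical sketch of what that chain might look like (the two filter factory class names below are invented — Solr allows only one <tokenizer> per analyzer chain, so the wrappers would have to be written and registered first):

```xml
<fieldType name="text_ja_split" class="solr.TextField">
  <analyzer>
    <!-- the single real tokenizer: split on ";" -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern=";"/>
    <!-- hypothetical TokenFilters wrapping WhitespaceTokenizer and CJKTokenizer -->
    <filter class="com.example.WhitespaceSplittingFilterFactory"/>
    <filter class="com.example.CJKWrappingFilterFactory"/>
  </analyzer>
</fieldType>
```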

On Apr 9, 2009, at 11:00 PM, Ashish P wrote:


I want to analyze text by splitting it on the pattern ";" and also on whitespace, and since it is Japanese text I want to use the CJKAnalyzer and its tokenizer as well.
In short, I want to do:
        <analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer">
                <tokenizer class="solr.PatternTokenizerFactory" pattern=";" />
                <tokenizer class="solr.WhitespaceTokenizerFactory" />
                <tokenizer class="org.apache.lucene.analysis.cjk.CJKTokenizer" />
        </analyzer>
Can anyone please tell me how to achieve this? The above syntax is not possible at all.
--
View this message in context: 
http://www.nabble.com/multiple-tokenizers-needed-tp22982382p22982382.html
Sent from the Solr - User mailing list archive at Nabble.com.


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
