Hi,
Append &debugQuery=true to your request URLs to see how Solr parses and
scores the query.
Here is something I've used in the past. I suggest you strip the analyzer
down to just the n-gram tokenizer (no synonym or other filters) while
you're debugging.
<!-- n-gram tokenization -->
<fieldType name="unigram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="org.apache.solr.analysis.NGramTokenizerFactory"
               minGramSize="1" maxGramSize="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.apache.solr.analysis.NGramTokenizerFactory"
               minGramSize="1" maxGramSize="1"/>
  </analyzer>
</fieldType>
...
...
<field name="text_cn" type="unigram" indexed="true" stored="true"
       required="true"/>
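
Once the unigram field behaves as expected, the same tokenizer factory can be given a larger maxGramSize. The following is only a minimal sketch under that assumption: the names "bigram" and "text_cn_bigram" are hypothetical and not part of the schema above, and whether this helps with CJK substring matching on your data is something to check in the admin analysis page and with debugQuery.

```xml
<!-- Hypothetical variant: emits 1-grams and 2-grams instead of 1-grams only.
     A single <analyzer> without a type is used for both indexing and querying. -->
<fieldType name="bigram" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="org.apache.solr.analysis.NGramTokenizerFactory"
               minGramSize="1" maxGramSize="2"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field name, added only for illustration -->
<field name="text_cn_bigram" type="bigram" indexed="true" stored="true"/>
```

Keeping index-time and query-time analysis identical, as the unigram type does, makes it easier to compare the tokens produced on both sides while debugging.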
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
> From: Christian Wittern <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Friday, February 22, 2008 4:32:08 AM
> Subject: help with using ngram analyser needed
>
> Hi Solr users,
>
> This is my first posting to this list, after experimenting with Solr
> for a few days. Please bear with me.
>
> I am trying to set up a text field for searching CJK text. At the
> moment, I am trying to use the ngram tokenizer factory, defined in the
> schema.xml as follows:
>
> [...]
> synonyms="variants.txt" ignoreCase="true" expand="true"/>
> [...]
>
> I can test this in the administrative interface and it seems to work.
> However, when I do searches, I only get matches for single character
> searches, or for searches that match a complete text field. What I am
> trying to achieve is a substring match that would match any sequence
> of characters in the target field.
>
> Any help appreciated,
>
> Christian
>
> --
> Christian Wittern, Kyoto
>