Hi Aleksey,

KeywordTokenizerFactory emits the entire input as a single token, so
ShingleFilterFactory has no adjacent tokens to combine into shingles.
You probably want something like WhitespaceTokenizerFactory instead - it
creates tokens at whitespace boundaries.
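
A minimal sketch of that change, assuming the rest of your field type stays
as posted (only the tokenizer class differs, in both analyzers):

```xml
<fieldType name="my_type" class="solr.TextField">
    <analyzer type="index">
        <!-- Split on whitespace so the shingle filter sees several tokens -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ShingleFilterFactory"
                maxShingleSize="2" outputUnigrams="true"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ShingleFilterFactory"
                maxShingleSize="2" outputUnigrams="true"/>
    </analyzer>
</fieldType>
```

With outputUnigrams="true" you should see the single words as well as the
two-word shingles; set it to "false" if you only want the shingles. Because
LowerCaseFilterFactory runs first, the last shingle would come out as
"ad345 lcd" rather than "ad345 LCD".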

Steve

On 10/13/2008 at 5:30 PM, Aleksey Gogolev wrote:
> 
> Hello.
> 
> I use Solr 1.3 and I have a problem with ShingleFilterFactory.
> 
> I read about ShingleFilterFactory and decided to try it.
> I created a new type, just for experimenting:
> 
>     <fieldType name="my_type" class="solr.TextField">
>         <analyzer type="index">
>             <tokenizer class="solr.KeywordTokenizerFactory"/>
>             <filter class="solr.LowerCaseFilterFactory" />
>             <filter class="solr.ShingleFilterFactory"
>                     maxShingleSize="2" outputUnigrams="true"/>
>         </analyzer>
>         <analyzer type="query">
>             <tokenizer class="solr.KeywordTokenizerFactory"/>
>             <filter class="solr.LowerCaseFilterFactory" />
>             <filter class="solr.ShingleFilterFactory"
>                     maxShingleSize="2" outputUnigrams="true"/>
> 
>         </analyzer>
>     </fieldType>
> 
> Then I went to the Solr analysis page and typed "samsung monitor ad345
> LCD" into the Index Field. I expected to get three tokens after passing
> ShingleFilterFactory: "samsung monitor", "monitor ad345", "ad345 LCD".
> But after passing ShingleFilterFactory the token didn't change and
> remained the same: "samsung monitor ad345 lcd".
> 
> I have no idea what I did wrong :(
> 
> Thank you in advance.
> 
> --
> Aleksey Gogolev
> developer,
> dev.co.ua
> Aleksey
> 
>

 
