Re: How to make edge_ngram work with number, underscores, dashes and space

Alexandre Rafalovitch Tue, 20 Jan 2015 20:48:07 -0800

So, try the suggested tokenizer and filter, and drop the ngram filter
from the query analyzer. See what happens. Ask a separate question with
the corrected config/output if you still have issues.
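
As a rough sketch of what that could look like (not tested against your
data): the field type name is copied from your schema, the tokenizer is
swapped for KeywordTokenizer plus LowerCaseFilter, and the ngram filter
is kept on the index side only. I left out the side attribute on the
assumption that front is the default in your version.

```xml
<!-- Sketch only: same field type name as in the quoted schema, with the
     tokenizer swapped and the ngram filter kept on the index side only. -->
<fieldType name="text_general_edge_ngram" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <!-- Keep the whole value as one token, so digits, underscores,
         dashes and spaces are not treated as split points. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- "ABC_12DE" becomes a, ab, abc, abc_, abc_1, ... -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
            maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <!-- No ngram filter here: the typed text is lowercased and matched
         as-is against the indexed prefixes. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

You would need to reindex after changing the index analyzer. Also note
that with maxGramSize="15", typed text longer than 15 characters will
not match any indexed prefix.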


Regards,
   Alex.
----
Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 20 January 2015 at 23:08, Vishal Swaroop <vishal....@gmail.com> wrote:
> Thanks for the response..
> a) I am trying to make it non-case-sensitive... itemName data is indexed in
> upper case
>
> b) I am looking to display the result as type-ahead suggestion which might
> include space, underscore, number...
>
> - "ABC12DE" : It does not work as soon as I type 1.. i.e. ABC1
> Output expected "A", "AB", "ABC", "ABC1"... so on
> Data can also have underscores, dashes
> - "ABC_12DE" : Output expected "A", "AB", "ABC", "ABC_", "ABC_1"... so on
>
> Field name & type defined in schema :
> <field name="itemName" type="text_general_edge_ngram" indexed="true"
> stored="true" multiValued="false" />
>
> <fieldType name="text_general_edge_ngram" class="solr.TextField"
> positionIncrementGap="100">
>    <analyzer type="index">
>     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
> maxGramSize="15" side="front"/>
>    </analyzer>
>    <analyzer type="query">
>     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
> maxGramSize="15" side="front"/>
>    </analyzer>
> </fieldType>
>
> On Tue, Jan 20, 2015 at 9:53 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
>> Were you actually trying to "...divides text at non-letters and
>> converts them to lower case"? Or were you trying to make it
>> non-case-sensitive, which would be KeywordTokenizer and
>> LowerCaseFilter?
>>
>> Also, normally we do not use the NGram filter on both index and query.
>> That just makes things match on common prefixes instead of matching
>> what you are searching for against a prefix of the original word.
>>
>> Regards,
>>     Alex.
>> ----
>> Sign up for my Solr resources newsletter at http://www.solr-start.com/
>>
>>
>> On 20 January 2015 at 21:47, Vishal Swaroop <vishal....@gmail.com> wrote:
>> > Hi,
>> >
>> > Maybe this is basic, but I am trying to understand which Tokenizer and
>> > Filter to use. I followed some examples as mentioned in the Solr wiki but
>> > type-ahead does not show the expected suggestions.
>> >
>> > Example itemName data can be :
>> > - "ABC12DE" : It does not work as soon as I type 1.. i.e. ABC1
>> > - "ABC_12DE", "ABC 12DE"
>> > - Data can also have underscores, dashes
>> > - I am trying to get case-insensitive auto-suggest
>> >
>> > Field name & type defined in schema :
>> > <field name="itemName" type="text_general_edge_ngram" indexed="true"
>> > stored="true" multiValued="false" />
>> >
>> > <fieldType name="text_general_edge_ngram" class="solr.TextField"
>> > positionIncrementGap="100">
>> >    <analyzer type="index">
>> >     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>> >     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
>> > maxGramSize="15" side="front"/>
>> >    </analyzer>
>> >    <analyzer type="query">
>> >     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>> >     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
>> > maxGramSize="15" side="front"/>
>> >    </analyzer>
>> > </fieldType>
>>
