Re: Why I get a hit on %, &, but not on !, @, #, $, ^, *

Steven White Tue, 14 Jul 2015 10:08:46 -0700

Thanks Jack.

Can you provide me with a concrete example of how to:


1) Be able to search and find "$10" (without quotes).  This will get me
started on how to add all other variations for !, @, etc. and be able to
search on them.  In this case, a search for "$10" should give me a hit on
the text "$10" but not on "10", and a search for "10" should give me a hit
on "10" but not on "$10".

2) Prevent a hit on "10%" (without quotes).  This will get me started on
how to prevent a hit on %, &, etc.  In this case, a search for "%" or "10%"
should give me 0 hits, but a search for "10" should give me a hit on "10" or
"10%".

Do you see where I'm going with this?  Are both of those configurations
possible?  This would let me customize Solr to meet customer needs.
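
For (1), my untested reading of the "types" suggestion is something like
the sketch below.  The file name wdfftypes.txt is a placeholder I made up,
and mapping "$" to DIGIT rather than ALPHA is my own assumption: with
splitOnNumerics="1", an ALPHA "$" would presumably still be split away from
"10", whereas a DIGIT "$" should keep "$10" together as one token.

```xml
<!-- Sketch only: the same filter as in my schema, plus a "types" attribute.
     wdfftypes.txt is a hypothetical file name, assumed to sit next to the
     schema in the conf directory. -->
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="1" catenateNumbers="1"
        catenateAll="1" splitOnCaseChange="0" splitOnNumerics="1"
        stemEnglishPossessive="1" preserveOriginal="1"
        types="wdfftypes.txt"/>

<!-- Assumed contents of wdfftypes.txt, one mapping per line:
       $ => DIGIT
-->
```

I assume I would have to reindex after a change like this, since it alters
the tokens produced at index time.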

Thanks.

Steve

On Mon, Jul 13, 2015 at 11:12 PM, Jack Krupansky <[email protected]>
wrote:

> Oops... that's the "types" attribute.
>
> -- Jack Krupansky
>
> On Mon, Jul 13, 2015 at 11:11 PM, Jack Krupansky <[email protected]>
> wrote:
>
> > The word delimiter filter is removing special characters. You can add a
> > file containing a list of the special characters that you wish to treat
> > as alpha, using the "type" parameter.
> >
> > -- Jack Krupansky
> >
> > On Mon, Jul 13, 2015 at 6:43 PM, Steven White <[email protected]>
> > wrote:
> >
> >> Hi Everyone,
> >>
> >> I think the subject line said it all.  Here is the schema I'm using:
> >>
> >> <fieldType name="my_text" class="solr.TextField"
> >>            positionIncrementGap="100"
> >>            autoGeneratePhraseQueries="true">
> >>   <analyzer>
> >>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >>             words="lang/stopwords_en.txt"/>
> >>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> >>             catenateAll="1" splitOnCaseChange="0" splitOnNumerics="1"
> >>             stemEnglishPossessive="1" preserveOriginal="1"/>
> >>     <filter class="solr.LowerCaseFilterFactory"/>
> >>     <filter class="solr.KeywordMarkerFilterFactory"
> >>             protected="protwords.txt"/>
> >>     <filter class="solr.PorterStemFilterFactory"/>
> >>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >>   </analyzer>
> >> </fieldType>
> >>
> >> I'm guessing this is due to how solr.WhitespaceTokenizerFactory works,
> >> and that the characters it is not indexing are removed because they are
> >> considered "white-space"?  If so, how can I add %, &, etc. to this
> >> non-indexed list?  I would rather see all of these not indexed than have
> >> some indexed and some not, which confuses my users.
> >>
> >> Thanks
> >>
> >> Steve
> >>
> >
> >
>
