Thank you all for helping on this topic. I'm going to play with this and might come back with more questions.

Steve

On Tue, Jul 14, 2015 at 1:57 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Steve:
>
> Simplest solution:
> remove WordDelimiterFilterFactory.
> Use something like PatternReplaceCharFilterFactory or
> PatternReplaceFilterFactory to selectively remove the characters you
> don't care about and leave in the ones you do care about.
>
> You might also want to do this kind of thing in a copyField and search
> one or the other selectively as desired, or perhaps boost or...
>
> NOTE: one side effect of WDFF is that punctuation is removed, so you
> have to consider what you want to do with periods at the end of a
> sentence, apostrophes and the like.
>
> Best,
> Erick
>
> On Tue, Jul 14, 2015 at 10:08 AM, Steven White <swhite4...@gmail.com> wrote:
> > Thanks Jack.
> >
> > Can you provide me with a concrete example of how to:
> >
> > 1) Be able to search and find "$10" (without quotes). This will get me
> > started on how to add all other variations for !, @, etc. and be able to
> > search on them. In this case, a search for "$10" will give me a hit on
> > text of "$10", but not "10", and a search on "10" will give me a hit on
> > "10" but not "$10".
> >
> > 2) Prevent a hit on "10%" (without quotes). This will get me started on
> > how to prevent a hit on %, &, etc. In this case, a search for "%" or
> > "10%" will give me 0 hits, but a search on "10" will give me a hit on
> > "10" or "10%".
> >
> > Do you see where I'm going with this? Are both of those configurations
> > possible? This will let me customize Solr to meet customer needs.
> >
> > Thanks.
> >
> > Steve
> >
> > On Mon, Jul 13, 2015 at 11:12 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
> >
> >> Oops... that's the "types" attribute.
> >>
> >> -- Jack Krupansky
> >>
> >> On Mon, Jul 13, 2015 at 11:11 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
> >>
> >> > The word delimiter filter is removing special characters. You can add a
> >> > file containing a list of the special characters that you wish to treat
> >> > as alpha, using the "type" parameter.
> >> >
> >> > -- Jack Krupansky
> >> >
> >> > On Mon, Jul 13, 2015 at 6:43 PM, Steven White <swhite4...@gmail.com> wrote:
> >> >
> >> >> Hi Everyone,
> >> >>
> >> >> I think the subject line said it all. Here is the schema I'm using:
> >> >>
> >> >> <fieldType name="my_text" class="solr.TextField"
> >> >>     positionIncrementGap="100"
> >> >>     autoGeneratePhraseQueries="true">
> >> >>   <analyzer>
> >> >>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >>     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> >>         words="lang/stopwords_en.txt"/>
> >> >>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >> >>         generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> >> >>         catenateAll="1" splitOnCaseChange="0" splitOnNumerics="1"
> >> >>         stemEnglishPossessive="1" preserveOriginal="1"/>
> >> >>     <filter class="solr.LowerCaseFilterFactory"/>
> >> >>     <filter class="solr.KeywordMarkerFilterFactory"
> >> >>         protected="protwords.txt"/>
> >> >>     <filter class="solr.PorterStemFilterFactory"/>
> >> >>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >> >>   </analyzer>
> >> >> </fieldType>
> >> >>
> >> >> I'm guessing this is due to how solr.WhitespaceTokenizerFactory works,
> >> >> and that the characters it is not indexing are removed because they are
> >> >> considered "white-spaces"? If so, how can I include %, &, etc. in this
> >> >> non-indexed list? I would rather see none of these indexed than have
> >> >> some indexed and some not, which is confusing to my users.
> >> >>
> >> >> Thanks
> >> >>
> >> >> Steve
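
For reference, Erick's suggestion in the thread (drop WordDelimiterFilterFactory and strip only the unwanted characters before tokenizing) might look roughly like the sketch below. The field type name "my_text_symbols" and the character class [%&] are assumptions chosen for illustration, not something anyone in the thread posted, and the stop word, protected word and stemming filters from the original schema are left out for brevity. It has not been tested against a particular Solr version.

```xml
<!-- Hypothetical field type: name and pattern are illustrative assumptions. -->
<fieldType name="my_text_symbols" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer>
    <!-- Char filters run before the tokenizer. This one deletes % and &
         from the raw text and leaves $ (and everything else) untouched.
         The ampersand has to be escaped as &amp; inside an XML attribute. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
        pattern="[%&amp;]" replacement=""/>
    <!-- Split on whitespace only, so "$10" stays a single token. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a chain like this, text of "10%" should be indexed as "10" and text of "$10" as "$10", so a search on "10" would match the first but not the second. Two caveats: because the single <analyzer> applies at query time as well, a query for "10%" is also reduced to "10" and would still match unless a separate query analyzer is defined; and, as Erick notes, punctuation that WDFF used to remove (a trailing period, for example) now stays attached to its token unless the pattern covers it too.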