My guess is that PatternReplaceFilterFactory is most likely the cause. Also, based on your query, you might want to set preserveOriginal="1".
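For reference, here is a sketch of the query-side WordDelimiterFilterFactory from your fieldtype with only preserveOriginal changed to "1". Every other attribute is copied unchanged from your config, and I have not verified that this change alone keeps the wildcard, so treat it as something to try in the analysis tool rather than a confirmed fix:

```xml
<!-- Sketch only: same filter as in the posted query analyzer,
     with preserveOriginal switched from "0" to "1". -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="0"
        splitOnCaseChange="0"
        splitOnNumerics="1"
        generateNumberParts="0"
        catenateWords="0"
        catenateNumbers="1"
        catenateAll="0"
        preserveOriginal="1"
        stemEnglishPossessive="0"/>
```

With preserveOriginal="1" the filter emits the unsplit original token in addition to the tokens it generates.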
You can take one filter out at a time and see which one is altering the query.

On Wed, Jul 26, 2017 at 11:13 AM, Webster Homer <webster.ho...@sial.com> wrote:

> 1. KeywordTokenizer - we want to treat the entire field as a single term to
> parse
> 2. preserveOriginal = "0" Thought about changing this to 1
> 3. 6.2.2
>
> This is the fieldtype
> <fieldType name="cas_num_tokenizer" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0"
>             splitOnCaseChange="0"
>             splitOnNumerics="1"
>             generateNumberParts="0"
>             catenateWords="0"
>             catenateNumbers="1"
>             catenateAll="0"
>             preserveOriginal="0"
>             stemEnglishPossessive="0"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"
>             tokenizerFactory="solr.KeywordTokenizerFactory"/>
>     <!-- remove non-cas queries and junk from synonyms -->
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="^.*([^- 0-9*]+).*$" replacement="" replace="all"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0"
>             splitOnCaseChange="0"
>             splitOnNumerics="1"
>             generateNumberParts="0"
>             catenateWords="0"
>             catenateNumbers="1"
>             catenateAll="0"
>             preserveOriginal="0"
>             stemEnglishPossessive="0"/>
>   </analyzer>
> </fieldType>
>
>
> On Wed, Jul 26, 2017 at 12:56 PM, Saurabh Sethi <saurabh.se...@sendgrid.com>
> wrote:
>
> > 1. What tokenizer are you using?
> > 2. Do you have preserveOriginal="1" flag set in your filter?
> > 3. Which version of solr are you using?
> >
> > On Wed, Jul 26, 2017 at 10:48 AM, Webster Homer <webster.ho...@sial.com>
> > wrote:
> >
> > > I have several fieldtypes that use the WordDelimiterFilterFactory.
> > >
> > > We have a fieldtype for CAS numbers, which look like 1234-12-1: numbers
> > > separated by hyphens. Users often leave out the hyphens and either use
> > > spaces or just string the numbers together.
> > >
> > > The WDF seemed like a great solution, especially as it gave partial
> > > matches. However, a query like 1234-12-* fails. The analyzer tool shows
> > > the wildcard getting stripped off.
> > > Is there any way to preserve the wildcard in the query analyzer when
> > > using the WordDelimiterFilterFactory?
> > >
> > > --
> > >
> > > This message and any attachment are confidential and may be privileged
> > > or otherwise protected from disclosure. If you are not the intended
> > > recipient, you must not copy this message or attachment or disclose the
> > > contents to any other person. If you have received this transmission in
> > > error, please notify the sender immediately and delete the message and
> > > any attachment from your system. Merck KGaA, Darmstadt, Germany and any
> > > of its subsidiaries do not accept liability for any omissions or errors
> > > in this message which may arise as a result of E-Mail-transmission or
> > > for damages resulting from any unauthorized changes of the content of
> > > this message and any attachment thereto. Merck KGaA, Darmstadt, Germany
> > > and any of its subsidiaries do not guarantee that this message is free
> > > of viruses and does not accept liability for any damages caused by any
> > > virus transmitted therewith.
> > >
> > > Click http://www.emdgroup.com/disclaimer to access the German, French,
> > > Spanish and Portuguese versions of this disclaimer.
> >
> > --
> > Saurabh Sethi
> > Principal Engineer I | Engineering

--
Saurabh Sethi
Principal Engineer I | Engineering