I have been reading threads all day regarding this topic and nothing seems to work the way it says it should. :) I appreciate any and all help in this matter.
Solr 4 is working perfectly for me in all regards, with this one exception. My requirement from Solr 4 is very simple. I am storing a document, such as a job description, in a text_general field. I have added a SynonymFilterFactory filter so that I can map C++ => cplusplus and c# => csharp during indexing and querying. Here is the field definition:

    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="punctuation-whitelist.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="punctuation-whitelist.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

Here are the contents of punctuation-whitelist.txt:

    c++ => cplusplus
    C# => csharp

I have only one document indexed for the purpose of this test. When I search for resume_text:C++, I get the following result, which is the same result I get when I search for just resume_text:c. You can see from the highlighting that Solr is matching on the "C" only:

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">20</int>
      </lst>
      <result name="response" numFound="1" start="0" maxScore="0.16273327">
        <doc>
          <arr name="resume_text">
            <str>C++ Developer with c# experience, including .net</str>
          </arr>
        </doc>
      </result>
      <lst name="highlighting">
        <lst name="208645">
          <arr name="resume_text">
            <str><em>C</em>++ Developer with <em>c</em># experience, including .net</str>
          </arr>
        </lst>
      </lst>
    </response>

If I use the Analysis tool in the Solr Web UI, putting "C#" or "C++" into the Index or Query boxes yields just "C" at every tokenizer and filter stage of the analysis output.

Can someone please explain the _Best_ way to accomplish what I am trying to do, which is to accurately index, search, and highlight text containing words like C++ and C#? I am looking for the "right way", and it's okay if I have started down the wrong path. :)

Thank you.
Dave
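
For anyone trying to reproduce the Analysis tool observation, the ordering of the chain can be approximated with a small Python sketch. This is not Solr code and makes several assumptions: `standard_like_tokenize` is a hypothetical, much-simplified stand-in for StandardTokenizerFactory (it only keeps runs of ASCII letters and digits), `SYNONYMS` mirrors punctuation-whitelist.txt, and the stop filter is left out. It only illustrates that, with the tokenizer running first, the synonym step never sees "c++" or "c#".

```python
import re

# Hypothetical stand-in for punctuation-whitelist.txt (keys lowercased,
# mimicking ignoreCase="true").
SYNONYMS = {"c++": "cplusplus", "c#": "csharp"}


def standard_like_tokenize(text):
    """Rough approximation only: keep runs of letters/digits, drop punctuation."""
    return re.findall(r"[A-Za-z0-9]+", text)


def apply_synonyms(tokens):
    """Replace a token when its lowercased form is a key in SYNONYMS."""
    return [SYNONYMS.get(tok.lower(), tok) for tok in tokens]


def analyze(text):
    """Tokenize first, then apply synonyms, then lowercase (stop filter omitted)."""
    return [tok.lower() for tok in apply_synonyms(standard_like_tokenize(text))]


print(analyze("C++ Developer with c# experience"))
```

Under these assumptions, both "C++" and "c#" come out as the single token "c", because the punctuation is already gone by the time the synonym lookup runs. That matches what the Analysis tool shows for the field definition above.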