Update: RESOLVED

On a hunch, I decided to forgo trying to isolate the EdgeNGramFilterFactory to this one column and instead apply it to all columns that are copied into the 'text' field that Solr uses for searching. I moved the filter factory into fieldType 'text_general', which is the type that 'text' uses. Everything worked! Thanks for your help, Jack!
-Teague

-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Wednesday, February 05, 2014 6:07 PM
To: solr-user@lucene.apache.org
Subject: Re: Partial Word Search

1. The ngramming occurs in the index, but does not modify the original, "stored" value that a query will return. So, "Example" will be returned even though the index will have all the sub-terms indexed (but not stored.)

2. You need the ngram filters to be asymmetric with regard to indexing and query - the index analyzer does ngramming, but the query analyzer will not. You have a single analyzer, which means that the query will be expanded into a sequence of sub-terms, which will be ORed or ANDed depending on your default query operator. OR will generally work since it will query for all the sub-terms, but AND will only work if all the sub-terms occur in the document field.

-- Jack Krupansky

-----Original Message-----
From: Teague James
Sent: Wednesday, February 5, 2014 4:52 PM
To: solr-user@lucene.apache.org
Subject: Partial Word Search

I cannot get Solr 4.6.0 to do partial word search on a particular field that is used for faceting. Most of the information I have found suggests modifying the fieldType "text" to include either the NGramFilterFactory or EdgeNGramFilterFactory in the filter. However since I am copying many other fields to "text" for searching my expectation is that the NGramFilterFactory would create ngrams for everything sent to it, which is unnecessary and probably costly - right?

In an effort to try and troubleshoot the issue I created a new field in the schema and stored it so that I could see what was getting populated. However, what I'm finding is that no ngrams are being generated, just the actual data that gets indexed from the database.
Here's what my setup looks like:

NOTE: Every record in my test environment has the same value, "Example".

<field name="PartialSubject" type="partialWord" indexed="true" stored="true" multiValued="true" />
<copyField source="PartialSubject" dest="text" />

<fieldType name="partialWord" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="10" side="front"/>
  </analyzer>
</fieldType>

When I query Solr it reports:

<arr name="PartialSubject">
  <str>Example</str>
</arr>

I was expecting exa, exam, examp, exampl, example to be the values for PartialSubject so that a search for "exam" would turn up all of the records in this test index. Instead I get 0 results. Can anyone provide any guidance on this please?
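
For reference, the asymmetric setup from point 2 of Jack's reply might look roughly like the sketch below. It reuses the partialWord field type from the original question. The split into type="index" and type="query" analyzers is an assumption about how that advice could be applied, not a configuration that was posted in this thread.

```xml
<!-- Sketch only: the partialWord type from the question, split into two analyzers -->
<fieldType name="partialWord" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: "Example" is lowercased and expanded to exa, exam, examp, exampl, example -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="10"/>
  </analyzer>
  <!-- Query time: no ngram filter, so a query for "exam" stays a single term -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a type like this, a query aimed at the field itself, such as q=PartialSubject:exam, should match the indexed gram once the data has been reindexed. Note that copyField passes the original input value to the destination field, not the analyzed tokens, so a search against 'text' only sees ngrams if the type used by 'text' produces them. That is consistent with the fix described in the update at the top of the thread.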