This is the new configuration:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
            outputUnigrams="true" tokenSeparator=""/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="English" protected="protwords.txt"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="stemmed_synonyms_text_prime_index.txt" ignoreCase="true"
            expand="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_text_prime_search.txt" enablePositionIncrements="true"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
            outputUnigrams="true" tokenSeparator=""/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

These are the current docs in my index:

<result name="response" numFound="3" start="0">
  <doc>
    <str name="id">2</str>
    <str name="title">Icecream</str>
    <long name="_version_">1475063961342705664</long>
  </doc>
  <doc>
    <str name="id">3</str>
    <str name="title">Ice-cream</str>
    <long name="_version_">1475063961344802816</long>
  </doc>
  <doc>
    <str name="id">1</str>
    <str name="title">Ice Cream</str>
    <long name="_version_">1475063961203245056</long>
  </doc>
</result>
</response>

Query:

http://localhost:8983/solr/collection1/select?q=title:ice+cream&debug=true

Response:

<result name="response" numFound="2" start="0">
  <doc>
    <str name="id">1</str>
    <str name="title">Ice Cream</str>
    <long name="_version_">1475063961203245056</long>
  </doc>
  <doc>
    <str name="id">3</str>
    <str name="title">Ice-cream</str>
    <long name="_version_">1475063961344802816</long>
  </doc>
</result>
<lst name="debug">
  <str name="rawquerystring">title:ice cream</str>
  <str name="querystring">title:ice cream</str>
  <str name="parsedquery">
    (+(title:ice DisjunctionMaxQuery((title:cream))))/no_coord
  </str>
  <str name="parsedquery_toString">+(title:ice (title:cream))</str>
  <lst name="explain">
    <str name="1">
0.875 = (MATCH) sum of:
  0.4375 = (MATCH) weight(title:ice in 0) [DefaultSimilarity], result of:
    0.4375 = score(doc=0,freq=2.0 = termFreq=2.0 ), product of:
      0.70710677 = queryWeight, product of:
        1.0 = idf(docFreq=2, maxDocs=3)
        0.70710677 = queryNorm
      0.61871845 = fieldWeight in 0, product of:
        1.4142135 = tf(freq=2.0), with freq of:
          2.0 = termFreq=2.0
        1.0 = idf(docFreq=2, maxDocs=3)
        0.4375 = fieldNorm(doc=0)
  0.4375 = (MATCH) weight(title:cream in 0) [DefaultSimilarity], result of:
    0.4375 = score(doc=0,freq=2.0 = termFreq=2.0 ), product of:
      0.70710677 = queryWeight, product of:
        1.0 = idf(docFreq=2, maxDocs=3)
        0.70710677 = queryNorm
      0.61871845 = fieldWeight in 0, product of:
        1.4142135 = tf(freq=2.0), with freq of:
          2.0 = termFreq=2.0
        1.0 = idf(docFreq=2, maxDocs=3)
        0.4375 = fieldNorm(doc=0)
    </str>
    <str name="3">
0.70710677 = (MATCH) sum of:
  0.35355338 = (MATCH) weight(title:ice in 2) [DefaultSimilarity], result of:
    0.35355338 = score(doc=2,freq=1.0 = termFreq=1.0 ), product of:
      0.70710677 = queryWeight, product of:
        1.0 = idf(docFreq=2, maxDocs=3)
        0.70710677 = queryNorm
      0.5 = fieldWeight in 2, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        1.0 = idf(docFreq=2, maxDocs=3)
        0.5 = fieldNorm(doc=2)
  0.35355338 = (MATCH) weight(title:cream in 2) [DefaultSimilarity], result of:
    0.35355338 = score(doc=2,freq=1.0 = termFreq=1.0 ), product of:
      0.70710677 = queryWeight, product of:
        1.0 = idf(docFreq=2, maxDocs=3)
        0.70710677 = queryNorm
      0.5 = fieldWeight in 2, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        1.0 = idf(docFreq=2, maxDocs=3)
        0.5 = fieldNorm(doc=2)
    </str>
  </lst>

It is still not working.

On Fri, May 30, 2014 at 9:21 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> I'd spend some time with the admin/analysis page to understand the exact
> tokenization going on here. For instance, sequencing the
> ShingleFilterFactory before the WordDelimiterFilterFactory may produce
> "interesting" results. And then throwing the Snowball factory at it and
> putting synonyms in front.... I suspect you're not indexing or searching
> what you think you are.
>
> Second, what happens when you query with &debug=query? That'll show you
> what the search string looks like.
>
> If that doesn't help, please post the results of looking at those things
> here; that'll provide some information for us to work with.
>
> Best,
> Erick
>
>
> On Fri, May 30, 2014 at 3:32 AM, sunshine glass <
> sunshineglassof2...@gmail.com> wrote:
>
> > Hi Folks,
> >
> > Any updates?
> >
> >
> > On Wed, May 28, 2014 at 12:13 PM, sunshine glass <
> > sunshineglassof2...@gmail.com> wrote:
> >
> > > Dear Team,
> > >
> > > How can I handle compound word searches in Solr?
> > > How can I search for "hand bag" if I have "handbag" in my index?
> > > While using a shingle filter in the query analyzer, the query
> > > "ice cube" creates three tokens: "ice", "cube" and "icecube". Only
> > > "ice" and "cube" are searched, not "icecube"; i.e. it does not work
> > > for the pair even though I am using the shingle filter.
> > >
> > > Here's the schema config.
> > >
> > > <fieldType name="text" class="solr.TextField"
> > >            positionIncrementGap="100">
> > >   <analyzer type="index">
> > >     <filter class="solr.SynonymFilterFactory"
> > >             synonyms="synonyms_text_prime_index.txt" ignoreCase="true"
> > >             expand="true"/>
> > >     <charFilter class="solr.HTMLStripCharFilterFactory"/>
> > >     <tokenizer class="solr.StandardTokenizerFactory"/>
> > >     <filter class="solr.ShingleFilterFactory"
> > >             maxShingleSize="2" outputUnigrams="true" tokenSeparator=""/>
> > >     <filter class="solr.WordDelimiterFilterFactory"
> > >             catenateWords="1" catenateNumbers="1" catenateAll="1"
> > >             preserveOriginal="1"
> > >             generateWordParts="1" generateNumberParts="1"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.SnowballPorterFilterFactory"
> > >             language="English" protected="protwords.txt"/>
> > >   </analyzer>
> > >   <analyzer type="query">
> > >     <tokenizer class="solr.StandardTokenizerFactory"/>
> > >     <filter class="solr.SynonymFilterFactory"
> > >             synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> > >     <filter class="solr.ShingleFilterFactory"
> > >             maxShingleSize="2" outputUnigrams="true" tokenSeparator=""/>
> > >     <filter class="solr.WordDelimiterFilterFactory"
> > >             preserveOriginal="1"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.SnowballPorterFilterFactory"
> > >             language="English" protected="protwords.txt"/>
> > >   </analyzer>
> > > </fieldType>
> > >
> > > Any help is appreciated.
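
A possible reading of the parsed query in the debug output at the top of this thread, offered as a sketch rather than a diagnosis: in Solr 4.x the query parser splits the query string on whitespace before the field's analyzer runs, so the query-side ShingleFilterFactory sees "ice" and "cream" one at a time and never gets the chance to build "icecream". The Python below is only a toy model of that behaviour. The helpers `shingle`, `analyze` and `analyze_per_chunk` are hypothetical names, not Solr APIs, and the tokenizer is simplified to a whitespace split.

```python
def shingle(tokens, max_size=2, separator=""):
    """Toy stand-in for a shingle filter that also emits unigrams."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)  # the unigram itself
        for n in range(2, max_size + 1):
            if i + n <= len(tokens):
                # join n adjacent tokens, e.g. "ice" + "cream" -> "icecream"
                out.append(separator.join(tokens[i:i + n]))
    return out


def analyze(text):
    """Toy analyzer (assumption): whitespace split, lowercase, shingles."""
    return shingle(text.lower().split())


def analyze_per_chunk(query):
    """Assumed parser behaviour: split on whitespace, analyze each chunk alone."""
    terms = []
    for chunk in query.split():
        terms.extend(analyze(chunk))
    return terms


whole = analyze("ice cream")              # analyzer sees the full text
per_chunk = analyze_per_chunk("ice cream")  # analyzer sees one word at a time

print(whole)      # ['ice', 'icecream', 'cream']
print(per_chunk)  # ['ice', 'cream']
```

If this reading is right, it would also explain why the admin/analysis page can look fine while the query still misses "Icecream": the analysis page passes the whole string through the analyzer, whereas the parsed query above contains only title:ice and title:cream.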