Yeah I am about to try turning one on at a time and see what happens. I had a meeting so couldn't do it yet... (darn those meetings) (lol)
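
A minimal, untested sketch of what the first trial could look like on the query side. It assumes the three flags Otis suggests in the message quoted below (splitOnCaseChange, generateWordParts, preserveOriginal) and leaves every catenate option off, in line with Yonik's warning about mixing generate and catenate at query time. Whether the query then matches still depends on how the index-side analyzer is configured.

```xml
<!-- Query-side WDF, first trial (assumption: not tested against this index). -->
<!-- Only the flags suggested in the thread are switched on. -->
<filter class="solr.WordDelimiterFilterFactory"
        splitOnCaseChange="1"
        generateWordParts="1"
        preserveOriginal="1"
        generateNumberParts="0"
        catenateWords="0"
        catenateNumbers="0"
        catenateAll="0"
/>
```

Re-running the analysis page for McAfee after each single change should show which flag brings the term back.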
-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Tuesday, April 26, 2011 2:37 PM
To: solr-user@lucene.apache.org
Subject: Re: term position question from analyzer stack for WordDelimiterFilterFactory

Hi Robert,

I'm no WDFF expert, but all these zeros look suspicious:

org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=0, generateNumberParts=0, catenateWords=0, generateWordParts=0, catenateAll=0, catenateNumbers=0}

A quick visit to
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
makes me think you want:

splitOnCaseChange=1 (if you want Mc Afee for some reason?)
generateWordParts=1 (if you want Mc Afee for some reason?)
preserveOriginal=1

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Robert Petersen <rober...@buy.com>
> To: solr-user@lucene.apache.org; yo...@lucidimagination.com
> Sent: Tue, April 26, 2011 4:39:49 PM
> Subject: RE: term position question from analyzer stack for WordDelimiterFilterFactory
>
> OK this is even more weird... everything is working much better except
> for one thing: I was testing use cases with our top query terms to make
> sure the below query settings wouldn't break any existing behavior, and
> got this most unusual result. The analyzer stack completely eliminated
> the word McAfee from the query terms! I'm like huh? Here is the
> analyzer page output for that search term:
>
> Query Analyzer
>
> org.apache.solr.analysis.WhitespaceTokenizerFactory {}
> term position 1
> term text McAfee
> term type word
> source start,end 0,6
> payload
>
> org.apache.solr.analysis.SynonymFilterFactory {synonyms=query_synonyms.txt, expand=true, ignoreCase=true}
> term position 1
> term text McAfee
> term type word
> source start,end 0,6
> payload
>
> org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt, ignoreCase=true}
> term position 1
> term text McAfee
> term type word
> source start,end 0,6
> payload
>
> org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=0, generateNumberParts=0, catenateWords=0, generateWordParts=0, catenateAll=0, catenateNumbers=0}
> term position
> term text
> term type
> source start,end
> payload
>
> org.apache.solr.analysis.LowerCaseFilterFactory {}
> term position
> term text
> term type
> source start,end
> payload
>
> com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory {protected=protwords.txt}
> term position
> term text
> term type
> source start,end
> payload
>
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
> term position
> term text
> term type
> source start,end
> payload
>
>
> -----Original Message-----
> From: Robert Petersen [mailto:rober...@buy.com]
> Sent: Monday, April 25, 2011 11:27 AM
> To: solr-user@lucene.apache.org; yo...@lucidimagination.com
> Subject: RE: term position question from analyzer stack for WordDelimiterFilterFactory
>
> Aha! I knew something must be awry, but when I looked at the analysis
> page output, well it sure looked like it should match. :)
>
> OK here is the query side WDF that finally works, I just turned
> everything off. (yay) First I tried just completely removing WDF from
> the query side analyzer stack but that didn't work. So anyway I suppose
> I should turn off the catenate all plus the preserve original settings,
> reindex, and see if I still get a match huh? (PS thank you very much
> for the help!!!)
>
> <filter class="solr.WordDelimiterFilterFactory"
>         generateWordParts="0"
>         generateNumberParts="0"
>         catenateWords="0"
>         catenateNumbers="0"
>         catenateAll="0"
>         preserveOriginal="0"
> />
>
>
> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
> Sent: Monday, April 25, 2011 9:24 AM
> To: solr-user@lucene.apache.org
> Subject: Re: term position question from analyzer stack for WordDelimiterFilterFactory
>
> On Mon, Apr 25, 2011 at 12:15 PM, Robert Petersen <rober...@buy.com> wrote:
> > The search and index analyzer stack are the same.
>
> Ahhh, they should not be!
> Using both generate and catenate in WDF at query time is a no-no.
> Same reason you can't have multi-word synonyms at query time:
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>
> I'd recommend going back to the WDF settings in the solr example
> server as a starting point.
>
>
> -Yonik
> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
> 25-26, San Francisco
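
Yonik's point about keeping the index and query stacks different can be sketched roughly as follows. This is an illustration only, not the configuration used in the thread: the field type name text_wdf is hypothetical, the flag values are an assumption loosely modeled on the kind of settings the Solr example schema uses, and the synonym, stopword and stemming filters from the original stack are omitted for brevity. Compare with the schema.xml that ships with your Solr version before relying on any of it.

```xml
<!-- Hypothetical field type: name and flag values are assumptions, not from the thread. -->
<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <!-- Index side: generate the parts AND the catenated forms. -->
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- Query side: generate the parts only; no catenate options. -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Any change to the index-side analyzer requires a reindex before the analysis page and real queries will agree.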