Hi Robert,
I'm no WDFF expert, but all these zeros look suspicious:
org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=0,
generateNumberParts=0, catenateWords=0, generateWordParts=0,
catenateAll=0, catenateNumbers=0}
A quick visit to
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
makes me think you want:
splitOnCaseChange=1 (if you want Mc Afee for some reason?)
generateWordParts=1 (if you want Mc Afee for some reason?)
preserveOriginal=1
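
In schema.xml terms, the query-side filter would then look something like this (an untested sketch: the three attributes above are set to 1, and I'm assuming the rest stay at 0 as in your current config):

```xml
<!-- Sketch only: query-side WDF with the three settings above enabled.
     The remaining attributes are assumed to stay at 0, as in the current config. -->
<filter class="solr.WordDelimiterFilterFactory"
        splitOnCaseChange="1"
        generateWordParts="1"
        generateNumberParts="0"
        catenateWords="0"
        catenateNumbers="0"
        catenateAll="0"
        preserveOriginal="1"
/>
```

With generateWordParts=1 the McAfee token should come out as Mc and Afee instead of vanishing, and preserveOriginal=1 should keep McAfee itself. Worth re-checking on the analysis page before relying on it.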
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
----- Original Message ----
> From: Robert Petersen <[email protected]>
> To: [email protected]; [email protected]
> Sent: Tue, April 26, 2011 4:39:49 PM
> Subject: RE: term position question from analyzer stack for WordDelimiterFilterFactory
>
> OK this is even more weird... everything is working much better except
> for one thing: I was testing use cases with our top query terms to make
> sure the below query settings wouldn't break any existing behavior, and
> got this most unusual result. The analyzer stack completely eliminated
> the word McAfee from the query terms! I'm like huh? Here is the
> analyzer page output for that search term:
>
> Query Analyzer
> org.apache.solr.analysis.WhitespaceTokenizerFactory {}
> term position 1
> term text McAfee
> term type word
> source start,end 0,6
> payload
> org.apache.solr.analysis.SynonymFilterFactory
> {synonyms=query_synonyms.txt, expand=true, ignoreCase=true}
> term position 1
> term text McAfee
> term type word
> source start,end 0,6
> payload
> org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
> ignoreCase=true}
> term position 1
> term text McAfee
> term type word
> source start,end 0,6
> payload
> org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=0,
> generateNumberParts=0, catenateWords=0, generateWordParts=0,
> catenateAll=0, catenateNumbers=0}
> term position
> term text
> term type
> source start,end
> payload
> org.apache.solr.analysis.LowerCaseFilterFactory {}
> term position
> term text
> term type
> source start,end
> payload
> com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory
> {protected=protwords.txt}
> term position
> term text
> term type
> source start,end
> payload
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
> term position
> term text
> term type
> source start,end
> payload
>
>
>
> -----Original Message-----
> From: Robert Petersen [mailto:[email protected]]
> Sent: Monday, April 25, 2011 11:27 AM
> To: [email protected]; [email protected]
> Subject: RE: term position question from analyzer stack for
> WordDelimiterFilterFactory
>
> Aha! I knew something must be awry, but when I looked at the analysis
> page output, well it sure looked like it should match. :)
>
> OK, here is the query-side WDF that finally works: I just turned
> everything off. (yay) First I tried completely removing WDF from
> the query-side analyzer stack, but that didn't work. So anyway, I suppose
> I should turn off the catenateAll and preserveOriginal settings on the
> index side, reindex, and see if I still get a match, huh? (PS thank you
> very much for the help!!!)
>
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="0"
> generateNumberParts="0"
> catenateWords="0"
> catenateNumbers="0"
> catenateAll="0"
> preserveOriginal="0"
> />
>
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Yonik
> Seeley
> Sent: Monday, April 25, 2011 9:24 AM
> To: [email protected]
> Subject: Re: term position question from analyzer stack for
> WordDelimiterFilterFactory
>
> On Mon, Apr 25, 2011 at 12:15 PM, Robert Petersen <[email protected]>
> wrote:
> > The search and index analyzer stack are the same.
>
> Ahhh, they should not be!
> Using both generate and catenate in WDF at query time is a no-no.
> Same reason you can't have multi-word synonyms at query time:
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>
> I'd recommend going back to the WDF settings in the solr example
> server as a starting point.
>
>
> -Yonik
> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
> 25-26, San Francisco
>