Re: Providing token variants at index time

2010-07-22 Thread Jonathan Rochkind
Paul Dlug wrote: On Thu, Jul 22, 2010 at 4:01 PM, Jonathan Rochkind wrote: The synonym approach won't work as I need to provide them in a file. The variants may be more dynamic and not known in advance, the process creating the documents to index does have that logic and could easily put th

Re: Providing token variants at index time

2010-07-22 Thread Paul Dlug
On Thu, Jul 22, 2010 at 4:01 PM, Jonathan Rochkind wrote: > I think the Synonym filter should actually do exactly what you want, no? > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory > > Hmm, maybe not exactly what you want as you describe it. It comes close,

Re: Providing token variants at index time

2010-07-22 Thread Jonathan Rochkind
I think the Synonym filter should actually do exactly what you want, no? http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory Hmm, maybe not exactly what you want as you describe it. It comes close, maybe good enough. Do you REALLY need to support "I Business M

Providing token variants at index time

2010-07-22 Thread Paul Dlug
Is there a tokenizer that supports providing variants of the tokens at index time? I'm looking for something that could take a syntax like "International|I Business|B Machines|M", which would take each pipe-delimited token and preserve its position so that phrase queries work properly. The above wo
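
To make the requested behaviour concrete, here is a minimal sketch in plain Python. It is not Solr or Lucene code, and the function names (expand_variants, position_increments, phrase_matches) are hypothetical, chosen only for illustration. It assumes whitespace separates positions and "|" separates variants of one position, as in the example above. In Lucene terms, a variant stacked on the same position would carry a position increment of 0.

```python
from typing import Dict, List, Set, Tuple


def expand_variants(text: str, sep: str = "|") -> List[Tuple[str, int]]:
    """Split on whitespace into positions, then on `sep` into variants.

    Every variant of a group is emitted with the same position, so
    "International|I" yields two tokens that both sit at position 0.
    """
    pairs = []
    for position, group in enumerate(text.split()):
        for variant in group.split(sep):
            if variant:  # skip empty pieces such as a trailing "|"
                pairs.append((variant, position))
    return pairs


def position_increments(pairs: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Convert absolute positions to increments (1 = advance, 0 = stacked)."""
    increments = []
    previous = -1
    for token, position in pairs:
        increments.append((token, position - previous))
        previous = position
    return increments


def phrase_matches(pairs: List[Tuple[str, int]], phrase: List[str]) -> bool:
    """Return True if the phrase words occur at consecutive positions."""
    by_position: Dict[int, Set[str]] = {}
    for token, position in pairs:
        by_position.setdefault(position, set()).add(token)
    return any(
        all(word in by_position.get(start + offset, set())
            for offset, word in enumerate(phrase))
        for start in by_position
    )


if __name__ == "__main__":
    tokens = expand_variants("International|I Business|B Machines|M")
    print(position_increments(tokens))
    print(phrase_matches(tokens, ["I", "Business", "M"]))
```

Because the variants share positions, a phrase that mixes full words and abbreviations (for example "I Business M") lines up on consecutive positions, while the same words in the wrong order do not.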