I've quickly put together something based on what Ahmet sent, and I believe it preserves the original token as well as the stemmed version. I didn't go as far as weighting them differently using payloads, however. I am not sure how to use the preserveOriginal attribute from WordDelimiterFilterFactory; can anyone provide guidance on that?
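
To make the idea concrete, here is a minimal standalone sketch of what I mean by preserving the original alongside the stem. It is not the actual filter and has no Lucene dependency: the class name KeepOriginalStemSketch, the Token holder, and the toyStem method are all made up for illustration, and toyStem is only a crude stand-in for a real stemmer. The one piece taken from Lucene is the convention that a stacked token is emitted with a position increment of 0, so it occupies the same position as the original.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeepOriginalStemSketch {

    /** A term plus its position increment (modeled on Lucene's position increment concept). */
    public static final class Token {
        public final String term;
        public final int positionIncrement;

        public Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /** Hypothetical toy stemmer: strips one of a few suffixes. Not a real stemming algorithm. */
    public static String toyStem(String term) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            // Require at least three characters to remain after stripping.
            if (term.length() > suffix.length() + 2 && term.endsWith(suffix)) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    /**
     * Emits every original term with increment 1, followed by its stem with
     * increment 0 when the stem differs, so both sit at the same position.
     */
    public static List<Token> expand(List<String> terms) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            out.add(new Token(term, 1));
            String stem = toyStem(term);
            if (!stem.equals(term)) {
                out.add(new Token(stem, 0)); // stacked on the original
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand(Arrays.asList("jumping", "cats", "dog")));
    }
}
```

Skipping the stacked token when the stem equals the original avoids indexing the same term twice at one position. Weighting the two forms differently would need payloads on top of this, which the sketch does not attempt.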

On Fri, Mar 9, 2012 at 2:53 PM, Jamie Johnson <jej2...@gmail.com> wrote:
> Further digging leads me to believe this is not the case. The Synonym
> Filter supports this, but the Stemming Filter does not.
>
> Ahmet,
>
> Would you be willing to provide your filter as well? I wonder if we
> can make it aware of the preserveOriginal attribute on
> WordDelimiterFilterFactory?
>
>
> On Fri, Mar 9, 2012 at 2:27 PM, Jamie Johnson <jej2...@gmail.com> wrote:
>> Ok, so I'm digging through the code and I noticed in
>> org.apache.lucene.analysis.synonym.SynonymFilter there are mentions of
>> a keepOrig attribute. Doing some googling led me to
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters which
>> speaks of an attribute preserveOriginal="1" on
>> solr.WordDelimiterFilterFactory. So it seems like I can get the
>> functionality I am looking for by setting preserveOriginal, is that
>> correct?
>>
>>
>> On Fri, Mar 9, 2012 at 9:53 AM, Ahmet Arslan <iori...@yahoo.com> wrote:
>>>> I'd be very interested to see how you
>>>> did this if it is available. Does
>>>> this seem like something useful to the community at large?
>>>
>>> I PMed it to you. Filter is not a big deal. Just modified from {@link
>>> org.apache.lucene.wordnet.SynonymTokenFilter}. If requested, I can provide
>>> it publicly too.