Yes, that's exactly what I needed. I don't know how I missed that.
Thank you!
--
Steve
On Jun 18, 2009, at 4:49 PM, Brendan Grainger wrote:
Are you using Porter stemming? If so, I think you can just list
your word in the protwords.txt file (or whatever you've called it).
Check out http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
and the example config for the Porter Stemmer:
<fieldtype name="myfieldtype" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
  </analyzer>
</fieldtype>
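For illustration, a minimal protwords.txt for this case might look
like the sketch below. It assumes the file sits in the same conf
directory as schema.xml, and that the token reaches the stemmer with
its case unchanged, as it would with the whitespace tokenizer alone.
If a lowercase filter runs before the stemmer, the entry would
presumably need to be lowercase too.

    # protwords.txt: one protected word per line; listed words
    # are passed through the stemmer unchanged
    Stylesightings

Documents indexed before the change still hold the stemmed token, so
they would likely need reindexing before searches reflect it.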
HTH
Brendan
On Jun 18, 2009, at 4:38 PM, Stephen Weiss wrote:
Hi,
I've hit a bit of a problem with stemming and could use some
advice.
Right now there is a word in the index called "Stylesight" and
another word "Stylesightings", which was just added. When users
search for "Stylesightings", the client really only wants them to
get results that match "Stylesightings" and not "Stylesight", as
they are two [relatively] unrelated things. However, I'm guessing
that because of the stemmer, "Stylesightings" becomes "Stylesight"
internally... which results in the "wrong" behavior.
I really don't want to turn off the stemmer; that's like killing
an ant with a nuke. I was thinking, perhaps, since we use both
index- and query-time synonyms, I could make a synonym like this:
"Stylesightings" => "xlkje0r923jjfsdf"
or some other random string of unstemmable junk. That might
work, but I'm not sure, and reindexing all the affected documents
will take quite some time, so it would be good to know in advance
whether this is even a good idea.
Of course, if there's another, better idea, I'd be very open to
that too.
Thanks for any suggestions!
--
Steve