There's nothing really built into Solr to allow this. Are you absolutely sure you can't just use copyField? Have you actually tried it?
But I don't think you need to store the contents twice. Just store it once and always highlight on that field, whether you search it or not. Since it's the raw text, you should be fine. You'll have two tokenized versions of the field, of course, but that should take less space than you might think. You probably want to store the version with stemming turned on...

That said, storing twice only uses up some disk space; it doesn't require additional memory for searching. So unless you're running out of disk space, you can just keep two stored versions around.

But if none of that works, you might write a custom filter that emits two tokens for each input token at indexing time, similar to what synonyms do. The original should have some special character appended, say $, and the second should be the result of stemming (note, there will be two tokens even if no stemming is done). So indexing "running" would index "running$" and "run". Now, when you need to search for an exact match on running, you search for running$.

This works for the reverse too. Since the rule is "append $ to all original tokens", "run" gets indexed as "run$" and "run". Now, searching for "run" matches, as does "run$". But "run$" does not match the doc that had "running", since the two tokens emitted in that case are "run" and "running$".

But look at what's happened here. You're indexing two tokens for every one token in the input. Furthermore, you're adding a bunch of unique tokens to the index. It's hard to see how this results in any savings over just using copyField. You have to index the two tokens since you have to distinguish between the stemmed and un-stemmed versions.

You might be able to do something really exotic with payloads. This is _really_ out of left field, but it just occurred to me. You'd have to define a transformation from the original word into the stemmed word that created a unique value. Something like
no stemming -> 0
removing ing -> 1
removing s -> 2
etc.
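Going back to the two-token filter for a moment, here is a rough sketch of the matching behaviour in plain Python. It is not Solr or Lucene code: the names index_tokens and matches are made up for illustration, and toy_stem is a hypothetical lookup table standing in for whatever stemmer your field type really uses.

```python
EXACT_MARKER = "$"  # marker appended to every original (un-stemmed) token

# Hypothetical stand-in for a real stemmer: a tiny lookup table.
TOY_STEMS = {"running": "run", "runs": "run"}


def toy_stem(token):
    """Return the stem if we know one, otherwise the token unchanged."""
    return TOY_STEMS.get(token, token)


def index_tokens(text):
    """Emit two tokens per input token: the marked original and the stem."""
    tokens = set()
    for token in text.lower().split():
        tokens.add(token + EXACT_MARKER)  # exact form, e.g. "running$"
        tokens.add(toy_stem(token))       # stemmed form, e.g. "run"
    return tokens


def matches(doc_text, query, exact=False):
    """Exact queries look up the marked token; others look up the stem."""
    query = query.lower()
    term = query + EXACT_MARKER if exact else toy_stem(query)
    return term in index_tokens(doc_text)


if __name__ == "__main__":
    print(sorted(index_tokens("running")))
    print(matches("running", "run"))              # stemmed search
    print(matches("running", "run", exact=True))  # exact search
```

Even in this toy form you can see the cost: every input token contributes two index entries, and the marked one is always a new unique term.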
Actually, this would have to be some kind of function on the letters removed, so that each removed letter contributes its ordinal position in the alphabet multiplied by a power of 100 that depends on where it sits in the removed suffix. So "ing" would map to
'i' - 'a' + ('n' - 'a') * 100 + ('g' - 'a') * 10000
etc... (you'd have to take considerable care to get this right for any code sets that had more than 100 possible code points)... Now you've included the information about what the original word was, and could use the payload to fail to match in the exact-match case.

Of course, the other issue would be figuring out the syntax to get the fact that you wanted an exact match down into your custom scorer.

But as you can see, any scheme is harder than just flipping a switch, so I'd _really_ verify that you can't just use copyField....

Best
Erick

On Wed, Oct 10, 2012 at 7:38 AM, meghana <meghana.rav...@amultek.com> wrote:
> We are using solr 3.6.
>
> We have field named Description. We want searching feature with stemming and
> also without stemming (exact word/phrase search), with highlighting in both.
>
> For that , we had made lot of research and come to conclusion, to use the
> copy field with data type which doesn't have stemming factory. it is working
> fine at now.
>
> (main field has stemming and copy field has not.)
>
> The data for that field is very large and we are having millions of
> documents; and as we want, both searching and highlighting on them; we need
> to keep this copy field stored and indexed both. which will increase index
> size a lot.
>
> we need to eliminate this duplication if possible any how.
>
> From the recent research, we read that combining fuzzy search with dismax
> will fulfill our requirement. (we have tried a bit but not getting success.)
>
> Please let me know , if this is possible, or any other solutions to make
> this happen.
>
> Thanks in Advance
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Make-Exact-Search-on-Field-with-Fuzzy-Query-tp4012888.html
> Sent from the Solr - User mailing list archive at Nabble.com.