Hi Andrew,

This would not necessarily increase the size of your index that much - you don't need to store both fields, just one of them, and only if you really need it for highlighting or displaying. If not, just index them without storing.

Otis
----
Performance Monitoring for Solr - http://sematext.com/spm/solr-performance-monitoring

>________________________________
> From: Andrew Wagner <wagner.and...@gmail.com>
>To: solr-user@lucene.apache.org
>Sent: Tuesday, April 24, 2012 7:21 AM
>Subject: Re: Deciding whether to stem at query time
>
>Ah, this is a really good point. Still seems like it has the downsides of
>#2, though, much bigger space requirements and possibly some time lost on
>queries.
>
>On Mon, Apr 23, 2012 at 3:35 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>
>> There is a third approach. Create two fields and always query both of
>> them, with the exact field given a higher weight. This works great and
>> performs well.
>>
>> It is what we did at Netflix and what I'm doing at Chegg.
>>
>> wunder
>>
>> On Apr 23, 2012, at 12:21 PM, Andrew Wagner wrote:
>>
>> > So I just realized the other day that stemming basically happens at index
>> > time. If I'm understanding correctly, there's no way to allow a user to
>> > specify, at run time, whether to stem particular words or not based on a
>> > single index. I think there are two options, but I'd love to hear that
>> > I'm wrong:
>> >
>> > 1.) Incrementally build up a white list of words that don't stem very
>> > well. To pick a random example out of the blue, "light" isn't super
>> > closely related to, "lighter", so I might choose not to stem that. If I
>> > wanted to do this, I think (if I understand correctly),
>> > stemmerOverrideFilter would help me out with this. I'm not a big fan of
>> > this approach.
>> >
>> > 2.) Index all the text in two fields, once with stemming and once
>> > without. Then build some kind of option into the UI for specifying
>> > whether to stem the words or not, and search the appropriate field.
>> > Unfortunately, this would roughly double the size of my index, and
>> > probably affect query times too. Plus, the UI would probably suck.
>> >
>> > Am I missing an option? Has anyone tried one of these approaches?
>> >
>> > Thanks!
>> > Andrew
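
For illustration, here is a minimal sketch of the two-field approach Walter describes: query both fields on every request and give the exact (unstemmed) field a higher weight. It only builds the request parameters for Solr's edismax query parser, where `qf` lists the fields to search and `^N` sets a per-field boost. The field names `text_exact` and `text_stemmed`, the helper `build_query_params`, and the boost value of 2.0 are assumptions made up for this example; nobody in the thread gave their actual schema or weights.

```python
from urllib.parse import urlencode

# Hypothetical field names, not taken from the thread.
EXACT_FIELD = "text_exact"      # assumed to be indexed without stemming
STEMMED_FIELD = "text_stemmed"  # assumed to be indexed with stemming


def build_query_params(user_query, exact_boost=2.0):
    """Return Solr request parameters that search both fields at once."""
    return {
        "defType": "edismax",
        "q": user_query,
        # qf lists the fields to search; ^N boosts matches in that field,
        # so exact matches score higher than stemmed-only matches.
        "qf": f"{EXACT_FIELD}^{exact_boost} {STEMMED_FIELD}",
    }


params = build_query_params("lighter")
query_string = urlencode(params)  # append to the core's /select URL
print(query_string)
```

With this shape there is no stem/no-stem switch in the UI: a query for "lighter" still matches documents through the stemmed field, but documents containing the exact word rank first. The right boost depends on your data, so treat 2.0 as a starting point to tune.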