I can think of a way to avoid storing stems in the index while still getting the benefit of stemming, i.e. improved recall: expand the query to include all index terms that share a stem with one of the original query terms.

Here's one way to achieve this:

- When indexing, run all terms through the stemmer, and maintain a map of [stem -> terms]. Save this map. (Don't index the stems.)
- When querying, stem the original query terms, then augment the query with all results from the [stem -> terms] map.

The resulting expanded query will not contain stemmed terms, just original query terms, along with index terms that share stems with them.

Note that the above process will result in a) slower query response times; and b) potentially lower precision (the standard precision/recall tradeoff applies here).

Steve

On 12/20/2007 at 10:59 AM, Otis Gospodnetic wrote:
> Kamran,
>
> I think Bertrand's suggestion is the only possible solution.
> I can't think of a way you can not stem at index time and
> make it an option at search time. If you look at and
> understand low-level/basic indexing and term matching
> process, I think you'll see why this seems impossible. But
> maybe somebody will come up with a clever suggestion. :)
>
> Otis
>
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> From: Kamran Shadkhast <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, December 19, 2007 9:46:10 AM
> Subject: Re: Making stemming dynamic at query time
>
>
> The easiest is probably to have two copies of your field, using
> <copyField>, one stemmed and one not, and search in one or the other.
>
> -Bertrand
>
> Yes, I knew this, but it costs me too much, in my case having
> more than 65M records and saving most of the fields inside the
> index for highlighting purpose does not work. I am looking for
> making it as option. it is good to have index everything stemmed
> but at query time make it optional to filter query stemmed or not
>
> Thanks,
> -Kamran
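
P.S. For what it's worth, here is a rough Python sketch of the [stem -> terms] expansion I described at the top of this message. It is only an illustration, not Solr or Lucene code: `toy_stem` is a made-up suffix stripper standing in for a real stemmer (e.g. Porter), and the names `build_stem_map`, `expand_query` and `use_stemming` are hypothetical.

```python
from collections import defaultdict


def toy_stem(term):
    """Crude suffix stripper; a hypothetical stand-in for a real stemmer."""
    term = term.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters would remain.
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def build_stem_map(index_terms, stem=toy_stem):
    """Index time: record [stem -> original terms]; the stems are not indexed."""
    stem_map = defaultdict(set)
    for term in index_terms:
        stem_map[stem(term)].add(term)
    return stem_map


def expand_query(query_terms, stem_map, stem=toy_stem, use_stemming=True):
    """Query time: add every index term that shares a stem with a query term."""
    expanded = set(query_terms)
    if use_stemming:
        for term in query_terms:
            expanded |= stem_map.get(stem(term), set())
    return sorted(expanded)


index_terms = ["walk", "walks", "walked", "walking", "talk", "talking"]
stem_map = build_stem_map(index_terms)

print(expand_query(["walking"], stem_map))
# ['walk', 'walked', 'walking', 'walks']
print(expand_query(["walking"], stem_map, use_stemming=False))
# ['walking']
```

The expanded terms would then be OR'ed together in the actual query against the unstemmed field. Passing `use_stemming=False` leaves the query untouched, which is what makes stemming optional at search time.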