Hi,

the KeepWordFilter is not a solution for this problem, because it would mean maintaining a word dictionary. As explained, that would be too much effort.
You can simply set outputUnigrams=true and look at analysis.jsp for this field to see how much larger a single field becomes with this option. However, I am fairly sure that the difference in index size between using outputUnigrams=true and indexing into a separate field is negligible.

I would suggest the additional-field approach, since it gives you more flexibility in boosting the different fields.

Unfortunately, I did not understand your explanation of the use case. It sounds a little like tagging?

Kind regards,
- Mitch


iorixxx wrote:
>
>> Isn't set outputUnigrams="true" will
>> make index size about twice than when it's set to false?
>
> Sure index will be bigger. I didn't know that this is problem for you. But
> if you have a list of special single words that you want to keep,
> keepwordfilter can eliminate other tokens. So index size will be okey.
>
>>
>> Scott
>>
>> ----- Original Message -----
>> From: "Ahmet Arslan" <iori...@yahoo.com>
>> To: <solr-user@lucene.apache.org>
>> Sent: Saturday, August 21, 2010 1:15 AM
>> Subject: Re: Doing Shingle but also keep special single word
>>
>> >> I am building index with Shingle filter. We know it's minimum
>> >> 2-gram but I also want keep some special single word, e.g. IBM,
>> >> Microsoft, etc. i.e. I want to do a minimum 2-gram but also want
>> >> to have these single word in my index, Is it possible?
>> >
>> > outputUnigrams="true" parameter does not work for you?
>> >
>> > After that you can cast <filter class="solr.KeepWordFilterFactory"
>> > words="keepwords.txt" ignoreCase="true"/> with
>> > keepwords.txt=IBM, Microsoft.
>> >
>>
>

--
View this message in context: http://lucene.472066.n3.nabble.com/Doing-Shingle-but-also-keep-special-single-word-tp1241204p1276506.html
Sent from the Solr - User mailing list archive at Nabble.com.
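
As a rough illustration of the additional-field approach recommended in the reply above, the fragment below sketches two analyzers for schema.xml: one that emits only 2-word shingles and one that keeps plain single tokens. All field and type names (text_shingle, text_single, content, content_shingle, content_single) are hypothetical and do not come from the thread, and the choice of tokenizer and lowercase filter is an assumption as well.

```xml
<!-- Sketch only: type and field names are hypothetical, not from the thread. -->

<!-- Shingle-only analysis: 2-word shingles, no single tokens. -->
<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="false"/>
  </analyzer>
</fieldType>

<!-- Plain analysis: every single word is indexed as its own token. -->
<fieldType name="text_single" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Both fields receive the same text from an assumed source field "content". -->
<field name="content_shingle" type="text_shingle" indexed="true" stored="false"/>
<field name="content_single" type="text_single" indexed="true" stored="false"/>

<copyField source="content" dest="content_shingle"/>
<copyField source="content" dest="content_single"/>
```

With a setup along these lines, a dismax query could weight the two fields separately, for example qf=content_shingle^2 content_single. The boost values are placeholders, and no word dictionary has to be maintained.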