> You've got CommonGramsFilterFactory and StopFilterFactory both using
> stopwords.txt, which is a confusing configuration. Normally you'd want one
> or the other, not both ... but if you did legitimately have both, you'd want
> them to each use a different wordlist.
Maybe I am wrong, but my intention in using both of them is this: first, I want to use phrase queries, so I used CommonGramsFilterFactory. Second, I don't want those stopwords in my index, so I used StopFilterFactory to remove them.

> The commongrams filter turns each found occurrence of a word in the file
> into two tokens - one prepended with the token before it, one appended with
> the token after it. If it's the first or last term in a field, it only
> produces one token. When it gets to the stopfilter, the combined terms no
> longer match what's in stopwords.txt, so no action is taken.
>
> If I had to guess, what you are seeing in the top 10 terms is the
> concatenation of your most common stopword with another word. If it were
> English, I would guess that to be "of_the" or something similar. If my
> guess is wrong, then I'm not sure what's going on, and some cut/paste of
> what you're actually seeing might be in order.

term    frequency
to      26164
and     25804
the     25566
of      25022
a       24918
in      24590
for     23646
n       23588
with    23055
is      22510

> Did you delete and do a full reindex after you changed your schema?

Yup, I did that a couple of times.

> Thanks,
> Shawn

*Pranav Prakash*
"temet nosce"
Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com/> | Google <http://www.google.com/profiles/pranny>
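
For anyone following the thread, here is a rough sketch of how the two filters discussed above might interact. It is a plain-Python model, not Solr code. It assumes, going by my reading of the Lucene documentation, that the commongrams filter keeps the original single terms and adds a combined term joined with "_" wherever either word of an adjacent pair is in the word list. The function names, the word list, and the sample sentence are all made up for illustration.

```python
def common_grams(tokens, common_words):
    """Model of a common-grams step (assumed behaviour, see note above):
    keep every token, and add a "left_right" token for each adjacent pair
    in which at least one word is in common_words."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if tok in common_words or nxt in common_words:
                out.append(tok + "_" + nxt)
    return out


def stop_filter(tokens, stopwords):
    """Model of a stop filter: drop tokens that exactly match a stopword."""
    return [t for t in tokens if t not in stopwords]


# Hypothetical word list and input, shared by both filters as in the thread.
words = {"the", "of"}
tokens = "the quick fox of the north".split()

grams = common_grams(tokens, words)
kept = stop_filter(grams, words)

print(grams)
# ['the', 'the_quick', 'quick', 'fox', 'fox_of', 'of', 'of_the', 'the', 'the_north', 'north']
print(kept)
# ['the_quick', 'quick', 'fox', 'fox_of', 'of_the', 'the_north', 'north']
```

In this model the stop filter removes the bare stopwords but passes every combined term through untouched, because a term like "of_the" does not match any entry in the word list.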