Thank you, Chris and Erick, for the answers,

it was new to me that "the*" is expanded to all known the* words in the index. Good to know.

And yes, the AND operation between the query terms is certainly the problem. (I would like to switch to OR instead. The result set would then grow with the number of words you search for, but since the results are ordered by hit quality this would be OK. The customer does not like this behaviour, though, because he thinks that the more words you search for, the smaller the result set should become. So this is not an option.)

On 28.05.2010 22:06, Chris Hostetter wrote:
word2*) ..." in the client, that you instead consider using multiple
fields -- one "text" defined as you have it now, and one "text_prefix"
defined similarly but with an additional EdgeNGramTokenFilter used when
indexing to generate "prefix" tokens. then search those fields using
dismax...

q=word1 word2 word3&qf=text text_prefix&mm=100%&tie=0
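
A minimal sketch of how a client might assemble the query string quoted above, instead of building "word OR word*" clauses itself. The function name build_dismax_query is hypothetical, and the defType=dismax parameter is an assumption about how the dismax parser is selected in this setup; the q, qf, mm and tie values are taken from the quoted suggestion.

```python
from urllib.parse import urlencode


def build_dismax_query(words, fields=("text", "text_prefix")):
    """Build a URL-encoded dismax query string (hypothetical helper)."""
    params = {
        "defType": "dismax",      # assumption: selects the dismax parser
        "q": " ".join(words),     # raw user words, no manual "word*" clauses
        "qf": " ".join(fields),   # search both the plain and the prefix field
        "mm": "100%",             # every word must match in some field
        "tie": "0",               # score by the best-matching field only
    }
    return urlencode(params)


query_string = build_dismax_query(["word1", "word2", "word3"])
print(query_string)
```

Because mm=100% requires every word to match, adding words still narrows the result set, which is the behaviour the customer expects.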

Ok, I will think about this. But I wonder if this will be more efficient than just not filtering stopwords? (But I have to study the EdgeNGram thing first. AFAIK it indexes all WORDS as WORDS, WORD, WOR, WO. So the index will be blown up, too?)
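
To make the index-growth question concrete, here is a rough sketch of what edge n-gram expansion does to a single token. It is plain Python, not the Lucene EdgeNGramTokenFilter itself; the function name edge_ngrams and the min_gram=2 / max_gram=15 defaults are assumptions chosen to match the WORDS, WORD, WOR, WO example.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return the leading prefixes of token, shortest first.

    min_gram and max_gram are illustrative values, not the filter's defaults.
    """
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


prefixes = edge_ngrams("WORDS")
print(prefixes)  # ['WO', 'WOR', 'WORD', 'WORDS']
```

Under these assumptions a token of length n yields n - 1 prefix tokens, so the prefix field does hold more terms than the plain field. In exchange, a query term is matched against those prefixes as an ordinary term, with no wildcard expansion at search time.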

What I do not understand in your idea is why I should use a second text_prefix field. Wouldn't it also work with just the text_prefix field, without the normal text field, since I always search for "word" and "word*" together and never without the prefix?

Thanks,
Gert
