escaped characters

Andrzej Bialecki Mon, 03 Aug 2009 14:03:14 -0700

Peter Keane wrote:

I've used Luke to figure out what is going on, and I see in the fields that
fail to match, a "null_1".  Could someone tell me what that is?  I see some
null_100s there as well, which seem to separate field values.  Clearly the
null_1s are causing the search to fail.

You used the "Reconstruct" function to obtain the field values for
unstored fields, right? null_NNN is Luke's way of telling you that the
tokens that should be at these positions are absent, because they were
removed by the analyzer during indexing, and there is no stored value of
this field from which you could recover the original text. In other
words, they are holes in the token stream, of length NNN.

Such holes may also be produced by artificially increasing the token
positions, hence the null_100 that serves to separate multiple field
values so that e.g. phrase queries don't match unrelated text.
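
To make the two kinds of holes concrete, here is a rough sketch in Python. It is a toy model, not Lucene's or Luke's actual code: the stop word list, the gap of 100, and the function names index_positions and reconstruct are all assumptions made up for illustration, and the exact hole lengths a real index reports may differ by one depending on how the gap is applied.

```python
# Toy model only: NOT Lucene's or Luke's implementation.
STOPWORDS = {"the", "of", "a"}   # assumed stop list, for illustration
VALUE_GAP = 100                  # assumed position gap between field values


def index_positions(values, stopwords=STOPWORDS, gap=VALUE_GAP):
    """Return {position: token} for a multi-valued, unstored field."""
    positions = {}
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap               # artificial jump between field values
        for word in value.lower().split():
            if word not in stopwords:
                positions[pos] = word
            pos += 1                 # a removed token still uses up a position
    return positions


def reconstruct(positions):
    """Rebuild the token stream, showing each run of N empty positions as null_N."""
    out = []
    expected = 0
    for pos in sorted(positions):
        hole = pos - expected
        if hole > 0:
            out.append(f"null_{hole}")
        out.append(positions[pos])
        expected = pos + 1
    return " ".join(out)


if __name__ == "__main__":
    print(reconstruct(index_positions(["war of worlds", "time machine"])))
    # prints: war null_1 worlds null_100 time machine
```

The null_1 comes from the removed stop word, and the null_100 from the jump between the two values of the field.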

Phrase queries that you can construct using QueryParser can't match two
tokens separated by a hole, unless you set a slop value > 0.
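
The effect of slop on a two-token phrase can be sketched the same way. Again this is a simplified model in Python under stated assumptions, not Lucene's scorer: the function phrase_matches and the EXAMPLE positions are hypothetical, and only in-order matches are considered, whereas real sloppy phrase matching is more permissive.

```python
# Simplified model of phrase matching over token positions.
# NOT Lucene's scorer: only two tokens, in order, are considered.

# Hypothetical positions for a field with two values, "war of worlds" and
# "time machine", where "of" was removed and the values are 100 apart.
EXAMPLE = {0: "war", 2: "worlds", 103: "time", 104: "machine"}


def phrase_matches(positions, first, second, slop=0):
    """True if `second` follows `first` with at most `slop` empty positions between."""
    firsts = [p for p, tok in positions.items() if tok == first]
    seconds = [p for p, tok in positions.items() if tok == second]
    return any(0 <= p2 - p1 - 1 <= slop for p1 in firsts for p2 in seconds)


if __name__ == "__main__":
    print(phrase_matches(EXAMPLE, "time", "machine"))         # adjacent tokens
    print(phrase_matches(EXAMPLE, "war", "worlds"))           # hole of 1, slop 0
    print(phrase_matches(EXAMPLE, "war", "worlds", slop=1))   # hole of 1, slop 1
    print(phrase_matches(EXAMPLE, "worlds", "time", slop=1))  # hole of 100
```

In this model the hole of 1 blocks an exact phrase match but a slop of 1 bridges it, while the hole of 100 between field values stays out of reach of any small slop.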


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

Re: Solr Search probem w/ phrase searches, text type, w/ escaped characters