Mike Klaas wrote:
> On 19-Apr-08, at 3:02 AM, Christian Wittern wrote:
>> So it could be that the match is not part of the fragment? This
>> sounds a bit strange. Is there a way to make sure the fragment
>> contains the match other than returning the whole field and doing the
>> fragmenting myself?
>
> [...]
>
> As you can see, only fragments containing a match are returned (note
> that there are very often multiple matches--you seemed to assume only
> one).

Mike, thank you for the clarification. Now I understand what went wrong
in the example I looked at. I am querying ngram-indexed data (Chinese
text). A user enters two or three characters and expects them to be
matched more or less as a substring. The fragment I looked at
contained only one of the characters (the other was cut off at the end),
which is what made me wonder. From what you say, even adding
quotation marks around the query will not prevent this from happening
(in this case, it would simply obscure the match).

Are there any plans to improve the fragmentation algorithm? Or are
there other workarounds?
All the best,
Christian