(11/07/15 8:23), SBS wrote:
I have come across a somewhat baffling problem. I am indexing HTML documents
and one of them is larger than the rest at about 200K. For some reason when
I search for terms which occur only towards the end of the document (i.e.
after some apparent "cutoff" point in the document), the document itself is
returned as a match but when I call Highlighter#getBestFragments() it
returns an empty array. This same method returns fragments if the terms
occur in the first part of the document.
So, am I running into some size limitation in either documents or fragments?
What else could be causing this behaviour?

There is a limitation. Try setting the following parameter to a higher value
(the default is 50*1024 characters):
http://lucene.apache.org/java/3_3_0/api/all/org/apache/lucene/search/highlight/Highlighter.html#setMaxDocCharsToAnalyze%28int%29
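
To illustrate why the fragments come back empty, here is a minimal sketch in plain Java. It is a toy model and not Lucene's actual code: the class CutoffSketch and the method bestFragments are hypothetical names made up for this example. It assumes only that the highlighter stops looking at the text once it reaches the configured character limit, so a hit beyond that point never produces a fragment even though the search itself matched the document.

```java
import java.util.ArrayList;
import java.util.List;

public class CutoffSketch {

    // Hypothetical stand-in for a highlighter: only hits that start before
    // maxCharsToAnalyze are considered, everything after that is ignored.
    static List<String> bestFragments(String text, String term,
                                      int maxCharsToAnalyze, int fragmentSize) {
        List<String> fragments = new ArrayList<>();
        int limit = Math.min(text.length(), maxCharsToAnalyze);
        int from = 0;
        while (true) {
            int hit = text.indexOf(term, from);
            if (hit < 0 || hit >= limit) {
                break; // no more hits, or the hit lies past the cutoff
            }
            // Cut a window of roughly fragmentSize characters around the hit.
            int start = Math.max(0, hit - fragmentSize / 2);
            int end = Math.min(text.length(), start + fragmentSize);
            fragments.add(text.substring(start, end));
            from = hit + term.length();
        }
        return fragments;
    }

    public static void main(String[] args) {
        // Build a document of about 200K characters with the term only at the end.
        StringBuilder sb = new StringBuilder();
        while (sb.length() < 200 * 1024) {
            sb.append("filler text ");
        }
        sb.append("needle at the end");
        String doc = sb.toString();

        // With a 50*1024 limit the term is never reached: prints 0
        System.out.println(bestFragments(doc, "needle", 50 * 1024, 40).size());
        // With the limit raised to the document length: prints 1
        System.out.println(bestFragments(doc, "needle", doc.length(), 40).size());
    }
}
```

With the real Highlighter, the corresponding step would be to call setMaxDocCharsToAnalyze() with a value at least as large as your biggest document before calling getBestFragments().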
koji
--
http://www.rondhuit.com/en/