Juan,

Pay close attention to the boundary scanner you’re employing:

http://wiki.apache.org/solr/HighlightingParameters#hl.boundaryScanner

You can set the type explicitly (hl.bs.type); the options include 
CHARACTER, WORD, SENTENCE, and LINE.  The default is WORD (as the wiki 
indicates), and I presume that is what you are using.
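
As a rough sketch of how those parameters fit together in a request, 
something like the following could be used to try different scanner types. 
The host, core name (collection1), field name (content), and the helper 
highlight_params are made up for illustration. It also assumes the field is 
indexed with term vectors, positions, and offsets, and that a boundary 
scanner named breakIterator is defined in solrconfig.xml, since these 
settings only apply to the FastVectorHighlighter.

```python
from urllib.parse import urlencode

# Hypothetical host and core; adjust to your own setup.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def highlight_params(query, field, fragsize=70, bs_type="WORD"):
    """Build highlighting parameters that use a breakIterator boundary scanner."""
    return {
        "q": query,
        "hl": "true",
        "hl.fl": field,                          # field to highlight (assumed name)
        "hl.fragsize": str(fragsize),            # target size, not a hard limit
        "hl.useFastVectorHighlighter": "true",   # boundary scanners apply to FVH
        "hl.boundaryScanner": "breakIterator",   # must be defined in solrconfig.xml
        "hl.bs.type": bs_type,                   # CHARACTER, WORD, SENTENCE, or LINE
    }


url = SOLR_SELECT + "?" + urlencode(highlight_params("content:solr", "content"))
print(url)
```

Swapping bs_type between WORD and CHARACTER and comparing the fragment 
lengths you get back should show how much the scanner is moving the edges.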

Be careful about using explicit characters.  I had an interesting case where 
the highlights returned looked like this:

> This is a highlight
> Here is another highlight
> Yes, another one, etc…

It was a bit maddening trying to figure out why “>” was in the highlight. It 
turned out the content was XML, and the character boundary clipped the 
trailing “>” of a tag based on the boundary rules.

In any case, with the right combination of settings you should be able to get 
a pretty flexible result, depending on what you’re really after.

Jason

On Feb 19, 2014, at 7:53 AM, Juan Carlos Serrano <jcserran...@gmail.com> wrote:

> Hello everybody,
> 
> I'm using Solr 4.6.1 and I'd like to know if there's a way to determine
> exactly the number of characters of a fragment used in highlights. If I use
> hl.fragsize=70, the length of the fragments that I get often varies,
> and I get results 90 characters long.
> 
> Regards and thanks in advance,
> 
> Juan Carlos
