On Wed, Aug 14, 2013 at 3:53 AM, ses <stew...@ssims.co.uk> wrote:

> We are trying out the new PostingsHighlighter with Solr 4.2.1 and finding
> that the highlighting section of the response includes self-closing tags
> for all the fields in hl.fl (by default, for edismax, all fields in qf)
> where there are no highlighting matches. In contrast, the same query on
> Solr 4.0.0 without PostingsHighlighter returns only the fields containing
> highlighting matches.
>
> Here is a simplified example of the highlighting response for a document
> with no matches in the fields specified by hl.fl.
>
> With PostingsHighlighter:
> <response>
>   ...
>   <lst name="highlighting">
>     <lst name="Z123456">
>       <arr name="A1"/>
>       <arr name="A2"/>
>       <arr name="A3"/>
>       ...
>     </lst>
>   </lst>
> </response>
>
> Without PostingsHighlighter:
> <response>
>   ...
>   <lst name="highlighting">
>     <lst name="Z123456"/>
>   </lst>
> </response>
>

Do you want to open a JIRA issue to just change the behavior?
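
Until the behavior changes, a client could drop the empty entries after
parsing the response. The following is only a minimal sketch, assuming the
XML response format shown above; the function name strip_empty_highlights
is hypothetical and not part of Solr or any client library. Note that it
tidies the parsed result on the client side only and does nothing to reduce
the number of bytes sent over the wire.

```python
import xml.etree.ElementTree as ET


def strip_empty_highlights(response_xml: str) -> str:
    """Return the response with empty <arr> elements removed from the
    highlighting section (hypothetical helper, client-side only)."""
    root = ET.fromstring(response_xml)
    highlighting = root.find("lst[@name='highlighting']")
    if highlighting is None:
        # No highlighting section: return the input unchanged.
        return response_xml
    for doc in highlighting.findall("lst"):
        # Copy the list first so removal does not disturb iteration.
        for arr in list(doc.findall("arr")):
            if len(arr) == 0:  # no child elements, i.e. <arr name="..."/>
                doc.remove(arr)
    return ET.tostring(root, encoding="unicode")
```

Applied to the PostingsHighlighter response above, this leaves the
per-document lst element for Z123456 with no arr children, which matches
the shape of the Solr 4.0.0 response.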


> This is a big problem for us as we have a large number of fields in a
> dynamic field, and we believe every highlighted response sends us a very
> large number of self-closing tags, which bloats the response to an
> unreasonable size (in some cases 100MB+).
>
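
Since the empty entries are emitted per field in hl.fl, one possible
stopgap is to pass hl.fl explicitly with only the fields you need snippets
from, rather than letting it default to everything in qf. This is a sketch
under assumptions: the base URL, the query, and the field names A1 and A2
are placeholders, and build_highlight_query is a hypothetical helper. It
only helps if a short field list is acceptable for your use case.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, query, highlight_fields):
    """Build a select URL that highlights only the named fields
    (hypothetical helper; the parameter names are standard Solr ones)."""
    params = {
        "q": query,
        "defType": "edismax",
        "hl": "true",
        # Explicit field list instead of the qf-derived default.
        "hl.fl": ",".join(highlight_fields),
        "wt": "xml",
    }
    return base_url + "?" + urlencode(params)


# Placeholder URL and fields, for illustration only.
url = build_highlight_query(
    "http://localhost:8983/solr/collection1/select", "some query", ["A1", "A2"]
)
```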

Unrelated: If your queries actually go against a large number of fields,
I'm not sure how efficient this highlighter will be. That's because beyond
some number of fields N, it will be much more efficient to use a
document-oriented term vector approach (e.g. standard
highlighter/fast-vector-highlighter).
