We are trying out the new PostingsHighlighter with Solr 4.2.1 and finding
that the highlighting section of the response includes a self-closing tag
for every field in hl.fl (which for edismax defaults to all fields in qf),
even where there are no highlighting matches. In contrast, the same query on
Solr 4.0.0 without PostingsHighlighter returns only the fields containing
highlighting matches.

Here is a simplified example of the highlighting response for a document
with no matches in the fields specified by hl.fl.

With PostingsHighlighter:
<response>
  ...
  <lst name="highlighting">
    <lst name="Z123456">
      <arr name="A1"/>
      <arr name="A2"/>
      <arr name="A3"/>
      ...
    </lst>
  </lst>
</response>

Without PostingsHighlighter:
<response>
  ...
  <lst name="highlighting">
    <lst name="Z123456"/>
  </lst>
</response>
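
To make the difference between the two shapes concrete, here is a minimal
Python sketch that turns the first shape into the second by dropping empty
<arr> elements on the client side. It is only an illustration: the function
name and the sample document (including the one non-empty A3 entry) are made
up for this example, and it does nothing about the size of the response on
the wire, which is our actual problem.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: same shape as the PostingsHighlighter response above,
# with one invented non-empty entry (A3) so the filter has something to keep.
SAMPLE = """<response>
  <lst name="highlighting">
    <lst name="Z123456">
      <arr name="A1"/>
      <arr name="A2"/>
      <arr name="A3"><str>some &lt;em&gt;match&lt;/em&gt;</str></arr>
    </lst>
  </lst>
</response>"""


def strip_empty_highlights(response_xml):
    """Return the response XML with empty <arr> highlight entries removed."""
    root = ET.fromstring(response_xml)
    # Collect the highlighting sections first so the tree is not modified
    # while it is being walked.
    sections = [e for e in root.iter("lst") if e.get("name") == "highlighting"]
    for section in sections:
        for doc in section.findall("lst"):
            for arr in doc.findall("arr"):
                has_children = len(arr) > 0
                has_text = bool((arr.text or "").strip())
                if not has_children and not has_text:
                    doc.remove(arr)
    return ET.tostring(root, encoding="unicode")


cleaned = strip_empty_highlights(SAMPLE)
```

After this runs, the Z123456 entry keeps only the field that actually has a
snippet, which is the behaviour we see from 4.0.0 without
PostingsHighlighter.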

This is a big problem for us because we have a large number of fields
covered by a dynamic field, and we believe every highlighted response is
sending us a very large number of self-closing tags, which bloats the
response to an unreasonable size (in some cases 100MB+).

We have tried using hl.requireFieldMatch=true, but this seems to make no
difference.
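
For reference, a request of this kind can be built roughly as in the Python
sketch below. The host, core name, query text and field list are
placeholders rather than our real values, and the helper name is made up;
only the parameter names (q, defType, hl, hl.fl, hl.requireFieldMatch) are
standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, query, fields):
    """Build a Solr select URL with highlighting parameters set."""
    params = {
        "q": query,
        "defType": "edismax",
        "hl": "true",
        # Comma-separated list of fields to highlight.
        "hl.fl": ",".join(fields),
        # The setting we tried; it made no difference to the empty tags.
        "hl.requireFieldMatch": "true",
    }
    return base_url + "?" + urlencode(params)


# Placeholder host and core name, not our real deployment.
url = build_highlight_query(
    "http://localhost:8983/solr/collection1/select",
    "example",
    ["A1", "A2", "A3"],
)
```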

Is there anything we can specify in the query (or solrconfig) to avoid
returning these empty tags? Or could this be a known bug?

We are considering looking at the source and modifying PostingsHighlighter
or associated classes, so any pointers on where to look would also be handy.


