On Wed, Aug 14, 2013 at 3:53 AM, ses <stew...@ssims.co.uk> wrote:

> We are trying out the new PostingsHighlighter with Solr 4.2.1 and finding
> that the highlighting section of the response includes self-closing tags
> for all the fields in hl.fl (by default for edismax it is all fields in qf)
> where there are no highlighting matches. In contrast the same query on Solr
> 4.0.0 without PostingsHighlighter it returns only the fields containing
> highlighting matches.
>
> here is a simplified example of the highlighting response for a document
> with no matches in the fields specified by hl.fl:
>
> with PostingsHighlighter:
>
> <response>
>   ...
>   <lst name="highlighting">
>     <lst name="Z123456">
>       <arr name="A1"/>
>       <arr name="A2"/>
>       <arr name="A3"/>
>       ...
>     </lst>
>   </lst>
> </response>
>
> without PostingsHighlighter:
>
> <response>
>   ...
>   <lst name="highlighting">
>     <lst name="Z123456"/>
>   </lst>
> </response>
>

Do you want to open a JIRA issue to just change the behavior?

> This is a big problem for us as we have a large number of fields in a
> dynamic field and we believe every time a highlighted response comes back
> it is sending us a very large number of self-closing tags which bloats the
> response to an unreasonable size (in some cases 100MB+).
>

Unrelated: If your queries actually go against a large number of fields,
I'm not sure how efficient this highlighter will be. That's because beyond
some number N of fields, it will be much more efficient to use a
document-oriented term vector approach (e.g. standard
highlighter/fast-vector-highlighter).
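
On the empty entries themselves: until the behavior changes, a client can at least ignore them after parsing. The sketch below is only an illustration, not anything Solr ships. It assumes the response was requested as JSON (wt=json) and already parsed into a dict, so that each document id under "highlighting" maps field names to lists of snippets, and the function name drop_empty_highlights is made up for this example. Note that it does nothing about the size of the response on the wire; that part needs the server-side change.

```python
def drop_empty_highlights(highlighting):
    """Return a copy of a parsed 'highlighting' map without empty field entries.

    `highlighting` is assumed to look like:
        {doc_id: {field_name: [snippet, ...], ...}, ...}
    Fields whose snippet list is empty are dropped; document ids are kept
    even if none of their fields matched.
    """
    return {
        doc_id: {field: snippets for field, snippets in fields.items() if snippets}
        for doc_id, fields in highlighting.items()
    }


# Hypothetical parsed response, mirroring the XML example quoted above.
response = {
    "highlighting": {
        "Z123456": {"A1": [], "A2": ["<em>foo</em> bar"], "A3": []},
    }
}
cleaned = drop_empty_highlights(response["highlighting"])
```

With the sample data above, only the A2 entry is kept for document Z123456.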