Hi Andrea, I'm using the original highlighter.
Below is my configuration for the highlighter in solrconfig.xml:

  <requestHandler name="/highlight" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">10</int>
      <str name="wt">json</str>
      <str name="indent">true</str>
      <str name="df">text</str>
      <str name="fl">id, title, content_type, last_modified, url, score </str>
      <str name="hl">on</str>
      <str name="hl.fl">id, title, content, author </str>
      <str name="hl.highlightMultiTerm">true</str>
      <str name="hl.preserveMulti">true</str>
      <str name="hl.encoder">html</str>
      <str name="hl.fragsize">200</str>
      <str name="hl.maxAnalyzedChars">1000000</str>
      <str name="group">true</str>
      <str name="group.field">signature</str>
      <str name="group.main">true</str>
      <str name="group.cache.percent">100</str>
    </lst>
  </requestHandler>

Have you managed to solve the problem?

Regards,
Edwin

On 4 December 2015 at 23:54, Andrea Gazzarini <a.gazzar...@gmail.com> wrote:

> Hi Zheng,
> just out of curiosity, because shortly I will have to deal with a similar
> scenario (Solr 5.3.1 + large documents + highlighting).
> Which highlighter are you using?
>
> Andrea
>
> 2015-12-04 16:51 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
>
> > Hi,
> >
> > I'm using Solr 5.3.0
> >
> > I found that in large documents, I sometimes face a situation where,
> > when I do a highlight query, the result set that is returned does not
> > contain the highlighted query. There are actually matches in the
> > documents, but they are located further back in the documents.
> >
> > I have tried to increase the value of hl.maxAnalyzedChars, as the
> > default value is 51200, and I have documents that are much larger than
> > 51200 characters. Although this method works, when I increase this
> > value, the performance of the search and highlight drops. It can drop
> > from less than 0.5 seconds to more than 10 seconds.
> >
> > I would like to check: is increasing the value of hl.maxAnalyzedChars
> > the best method to use, or are there other ways that serve the same
> > purpose without affecting the performance much?
> >
> > Regards,
> > Edwin