On 9-Aug-07, at 2:10 PM, Benjamin Higgins wrote:

Hi all, I'd like to provide a blurb of documents matching a search in
the case when there is no text highlighted. I assumed that perhaps the highlighter would give me back the first few words in a document if this occurred, but it doesn't. My conundrum is that I'd rather not grab the whole document body field because some of them are large. Is there some
way I can request from Lucene the first N words or lines from a field?

I deal with this by modifying the highlighter's fragment scorer to return a positive (but low) score for the first few fragments of a doc. This works, but it tends not to produce great summaries, and it still fetches and processes the entire doc contents.

The better way is to generate a good general summary yourself and store it in a separate field, which you can fall back on whenever no highlighting is generated. (A capability in Solr to substitute a field automatically when there is no highlighting would be cool; I might even implement this if there is sufficient interest :).)
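
A rough sketch of that idea, in Python for brevity rather than anything Lucene-specific. The field names ("body", "summary") and both helper functions are made up for illustration; the point is only that the summary is computed once at index time and the client falls back to it when the highlighter returns nothing.

```python
def make_summary(body, max_words=30):
    """Return the first max_words whitespace-separated words of body.

    Hypothetical helper: run it once at index time and store the
    result in a separate (assumed) "summary" field.
    """
    words = body.split()
    summary = " ".join(words[:max_words])
    if len(words) > max_words:
        summary += " ..."  # mark that the text was cut short
    return summary


def choose_snippet(highlights, summary):
    """Prefer highlighter fragments; fall back to the stored summary."""
    if highlights:
        return " ... ".join(highlights)
    return summary


# Index time: derive the summary from the (possibly large) body field.
doc = {"id": "1", "body": "Lucene is a search library written in Java."}
doc["summary"] = make_summary(doc["body"], max_words=5)

# Query time: no fragments came back, so the stored summary is shown.
snippet = choose_snippet([], doc["summary"])
```

Only the small summary field needs to be fetched for display this way, not the whole body.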

Unfortunately, the highlighter does not know (and really has no way of knowing) what parts of a doc matched, so it would still have to try highlighting first.

Note that you can limit the CPU usage for long fields by setting hl.maxAnalyzedChars (this will be in the next release).
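
For example, the request parameters might be built something like this. The field name "body" and the 10000-character limit are just assumed values for illustration, and the helper function is hypothetical; hl, hl.fl and hl.maxAnalyzedChars are the highlighting parameters themselves.

```python
from urllib.parse import urlencode


def highlight_params(query, field="body", max_chars=10000):
    """Build a query string that caps how much of a field is analyzed.

    "body" and 10000 are assumed values; adjust them to your schema.
    """
    return urlencode({
        "q": query,
        "hl": "true",                      # turn highlighting on
        "hl.fl": field,                    # field to highlight
        "hl.maxAnalyzedChars": max_chars,  # stop analyzing after this many chars
    })


params = highlight_params("lucene")
```

Append the resulting string to your select URL; a smaller limit means less work per long document, at the cost of missing matches that occur later in the field.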

best,
-Mike
