I tried to post it myself but got the address wrong; thanks for re-posting.

The problem we have with highlighting outside of the indexer is that the systems we use to store co-ords are based on the term string (in one case) and on the specific term offset (in the other). Both break horribly when trying to do interesting things with Solr/Lucene.
The only real way to do it is to store that term-based data with the index. Otherwise we'll have to use the Lucene query parser to reparse the search string and write our own searcher to search our custom XML co-ord files. Most unsatisfactory.

P.S. I noticed that my original email had way too many spelling mistakes, sorry about that.

Best Regards,
Martin Owens

On Mon, 2008-08-11 at 17:43 -0600, Tricia Williams wrote:
> Martin,
>
> I've been over some of the same thoughts you present here in the last
> few years. The path of least resistance ended up being to deal with the
> highlighting portion of OCRed images outside of Solr. That's not to say
> it couldn't or shouldn't be done differently. I briefly even pursued a
> similar course of action evident in
> https://issues.apache.org/jira/browse/SOLR-386. This would make it
> easier if you wanted to write your own highlighter.
>
> I'm interested to see what others think of your suggestions. I've
> forwarded this to the solr-user list.
>
> Tricia