Uwe goes on to say:

> This works, as long as you don't need query highlighting, because the offsets
> from the first field addition cannot be used for highlighting inside the text
> with markup. *In this case, you have to write your own analyzer that removes
> the markup in the tokenizer, but preserves the original offsets.* An example
> of this is the Wikipedia contrib in Lucene, which has a hand-crafted
> analyzer that can handle MediaWiki markup syntax.
>
>
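
To make the offset point concrete, here is a minimal sketch of the idea in Python. It is not Lucene's analyzer API and not the Wikipedia contrib code; the function names and the naive tag regex are my own assumptions for illustration. The tokenizer skips markup but reports each term's start and end position in the original marked-up text, so a highlighter can wrap hits without disturbing the tags.

```python
import re

# Hypothetical illustration only; this does not use or mimic Lucene's API.
TAG = re.compile(r"<[^>]*>")   # naive matcher, assumes simple well-formed tags
WORD = re.compile(r"\w+")


def tokenize_with_offsets(marked_up):
    """Return (term, start, end) tuples; offsets index the ORIGINAL text."""
    tokens = []
    pos = 0
    for tag in TAG.finditer(marked_up):
        # Tokenize only the text that sits between the previous tag and this one.
        for m in WORD.finditer(marked_up, pos, tag.start()):
            tokens.append((m.group().lower(), m.start(), m.end()))
        pos = tag.end()
    # Tokenize whatever text follows the last tag.
    for m in WORD.finditer(marked_up, pos):
        tokens.append((m.group().lower(), m.start(), m.end()))
    return tokens


def highlight(marked_up, query_terms, pre="<em>", post="</em>"):
    """Wrap matching terms in place, leaving the original markup untouched."""
    wanted = {t.lower() for t in query_terms}
    out = []
    last = 0
    for term, start, end in tokenize_with_offsets(marked_up):
        if term in wanted:
            out.append(marked_up[last:start])
            out.append(pre + marked_up[start:end] + post)
            last = end
    out.append(marked_up[last:])
    return "".join(out)


doc = "<p>Solr <b>highlights</b> hit terms.</p>"
print(highlight(doc, ["hit"]))
```

Because every offset points into the stored original, the highlighted output keeps the `<p>` and `<b>` tags exactly where they were. A real analyzer would also need to handle entities, attributes containing `>`, and terms split across tags.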

On Sun, Dec 5, 2010 at 3:35 PM, Jonathan Rochkind <rochk...@jhu.edu> wrote:

> That suggestion says "This works, as long as you don't need query
> highlighting."  Have you found a way around that, or have you decided not to
> use highlighting after all?  Or am I missing something?
> ________________________________________
> From: Rich Cariens [richcari...@gmail.com]
> Sent: Sunday, December 05, 2010 10:58 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Full text hit term highlighting
>
> Thanks Lance.  I'm storing the original document and indexing all its
> extracted content, but I need to be able to highlight the text within its
> original markup.  I'm going to give Uwe's suggestion <http://bit.ly/hCSdYZ>
> a go.
>
> On Sat, Dec 4, 2010 at 7:18 PM, Lance Norskog <goks...@gmail.com> wrote:
>
> > Set the fragment length to 0. This means highlight the entire text
> > body, provided you have stored the text body.
> >
> > Otherwise, you have to get the term vectors somehow and highlight the
> > text yourself.
> >
> > I investigated this problem a while back for PDFs. You can add a
> > starting page and an OR list of search terms to the URL that loads a
> > PDF into the in-browser version of the Adobe PDF reader. This allows
> > you to load the PDF at the first occurrence of any of the search terms,
> > with the terms highlighted. The search button takes you to the next
> > occurrence of any of the terms.
> >
> > On Sat, Dec 4, 2010 at 4:10 PM, Rich Cariens <richcari...@gmail.com>
> > wrote:
> > > Anyone ever use Solr to present a view of a document with hit-terms
> > > highlighted within?  Kind of like Google's cached copies
> > > <http://bit.ly/hgudWq>?
> > >
> >
> >
> >
> > --
> > Lance Norskog
> > goks...@gmail.com
> >
>
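
For reference, Lance's "fragment length 0" suggestion quoted above maps to Solr's `hl.fragsize` request parameter. A rough sketch of building such a request URL in Python follows; the base URL, the `content` field name, and the helper name are assumptions for illustration, and the field has to be stored for this to return anything.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, field):
    """Build a Solr select URL that highlights the whole stored field."""
    params = {
        "q": query,
        "hl": "true",        # turn highlighting on
        "hl.fl": field,      # field to highlight (assumed to be stored)
        "hl.fragsize": 0,    # 0 = no fragmenting, use the entire field value
    }
    return base_url + "?" + urlencode(params)


# Assumed host, handler path, and field name; adjust to your own setup.
url = build_highlight_url("http://localhost:8983/solr/select", "content:solr", "content")
print(url)
```

This only covers the plain-text case. It does not solve highlighting inside the original markup, which is what the offset-preserving analyzer is for.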
