Chris Hostetter wrote:
>
> I'm not sure I'm understanding your question ... is it how to highlight a
> stored field that has HTML in it, or how to index a chunk of HTML text?
>
> the first should be no different than highlighting any other bit of text
> -- the second can be accomplished using the
> HTMLStripStandardTokenizerFactory (or
> HTMLStripWhitespaceTokenizerFactory) in your schema.
>
> -Hoss
>

Neither of the cases you described is quite what I want. Please allow me to explain again.

I have two fields in my doc:

    <field name="html" type="string" indexed="false" stored="true" compressed="true"/>
    <field name="pageContent" type="text" indexed="true" stored="true" compressed="true"/>

In "html" I store the raw HTML grabbed from the internet. It is not indexed, just stored as a string. After removing the tags from "html", I get plain text and store it as "pageContent". This field is indexed and stored.

When a user performs a search, I return a list of links containing highlighted fragments from "pageContent". If a link is clicked, I want to return the associated raw HTML to the user AND have the search keywords highlighted in it, just like Google's cached page.

--
View this message in context: http://www.nabble.com/highlight-search-keywords-on-html-page-tf3240492.html#a9014907
Sent from the Solr - User mailing list archive at Nabble.com.
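
To make the desired behavior concrete, here is a rough sketch of the post-processing step described above: take the stored raw HTML and wrap the search keywords in highlight markup, touching only text nodes so that tags and attribute values stay intact. This is application-side code, not a Solr feature. The names `KeywordHighlighter` and `highlight_html` and the `<em class="hl">` wrapper are made up for illustration, and the keywords are assumed to have been extracted from the user's query already.

```python
import re
from html.parser import HTMLParser


class KeywordHighlighter(HTMLParser):
    """Re-emit an HTML document, wrapping keyword matches in text nodes."""

    SKIP = {"script", "style"}  # text inside these elements is left alone

    def __init__(self, keywords, open_tag='<em class="hl">', close_tag="</em>"):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        alternation = "|".join(re.escape(k) for k in keywords)
        self.pattern = re.compile(r"\b(?:%s)\b" % alternation, re.IGNORECASE)
        self.open_tag = open_tag
        self.close_tag = close_tag
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        # Reuse the original tag text so attributes are never modified.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self.skip_depth:
            self.out.append(data)
            return
        wrap = lambda m: self.open_tag + m.group(0) + self.close_tag
        self.out.append(self.pattern.sub(wrap, data))

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.out.append("<?%s>" % data)


def highlight_html(raw_html, keywords):
    """Return raw_html with each keyword wrapped in highlight markup."""
    if not keywords:
        return raw_html
    parser = KeywordHighlighter(keywords)
    parser.feed(raw_html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    page = '<p class="solr">Solr is <b>fast</b>.</p>'
    print(highlight_html(page, ["solr"]))
```

The sketch has known limits: a phrase split across tags (for example `so<b>lr</b>`) is not matched, end tags are re-emitted in lowercase, and matching is plain case-insensitive word matching rather than the analysis applied to "pageContent" at index time, so stemmed or otherwise analyzed matches may be missed.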