Hi Chris, thank you for replying. My "content" field in the schema is stored="true" and indexed="false" because I copy the "content" field into the "text" field, which is indexed="true" by default.
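
A rough sketch of that setup in schema.xml. The field type "text_general" and the multiValued flags are assumptions borrowed from the stock Solr example schema; only the stored/indexed flags and the copy from "content" into "text" are as described above:

```xml
<!-- "content" keeps what ExtractingRequestHandler extracts; stored but not searchable itself -->
<!-- type and multiValued are assumed, not confirmed -->
<field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>

<!-- "text" is the catch-all search field; indexed so that queries run against it -->
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- copy the extracted content into the searchable catch-all field -->
<copyField source="content" dest="text"/>
```

With a layout like this, searches match against "text", while whatever comes back for display is read from the stored "content" field.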

My question was this: I am able to search the HTML documents I have fed to Solr, but because the content returned by Tika/ExtractingRequestHandler is a stripped-down version of the HTML document, I am not able to present the document in its original format on my site. :(

Based on Jack's reply, I got the idea of writing my own request handler, and I am working on it. I'll post an update if I come up with a solution. Any help is most welcome!

Thank you all for your support!

On Fri, Feb 22, 2013 at 6:42 AM, Chris Hostetter <hossman_luc...@fucit.org> wrote:

>
> : Hi everyone, i am new to solr technology and not getting a way to get back
> : the original HTML document with Hits highlighted into it. what
> : configuration and where i can do to instruct SolrCell/ Tika so that it does
> : not strips down the tags of HTML document in the content field.
>
> I _think_ what you want is simply to ensure that you have a "content"
> field in your schema which is stored="true" (and indexed="true" if you
> want to search on it directly) ... and then ExtractingRequestHandler will
> put the entire XHTML it generates from the documents you index into that
> field.
>
> http://wiki.apache.org/solr/ExtractingRequestHandler
>
> If that isn't what you had in mind, then you need to provide us with more
> details about what you've tried, what results you get, and how exactly
> those results differ from what you want to get.
>
>
> -Hoss
>

--
Regards,
Divyanand Tiwari