RE: Displaying highlights in formatted HTML document

2011-06-09 Thread Ahmet Arslan
> Yes, I asked the wrong question. What I was subconsciously getting at is this: how are you avoiding the possibility of getting hits in the HTML elements? Is that accomplished by putting tag names in your stopwords, or by some other mechanism?

HtmlStripCharFilter removes HTML tags. Af
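
To illustrate the general idea of stripping tags before matching, here is a minimal Python sketch. It is not Solr's implementation: the function names `strip_tags_with_offsets` and `highlight`, the regex-based tag matching, and the `<em>` wrapper are all assumptions made for illustration. The sketch removes tags, remembers where each remaining character sat in the original HTML, searches only the stripped text (so tag and attribute names cannot match), and then maps each hit back onto the original markup.

```python
import re

# Naive tag pattern, assumed for illustration; a real HTML parser is more robust.
TAG = re.compile(r"<[^>]*>")


def strip_tags_with_offsets(html):
    """Return (text, offsets), where offsets[i] is the index in html of text[i]."""
    chars, offsets = [], []
    pos = 0
    for tag in TAG.finditer(html):
        # Copy the characters between the previous tag and this one.
        for i in range(pos, tag.start()):
            chars.append(html[i])
            offsets.append(i)
        pos = tag.end()
    # Copy whatever follows the last tag.
    for i in range(pos, len(html)):
        chars.append(html[i])
        offsets.append(i)
    return "".join(chars), offsets


def highlight(html, term):
    """Wrap matches of term in <em>, matching against tag-free text only.

    Limitations of this sketch: entities are not decoded, and a match that
    spans a tag boundary would produce badly nested markup.
    """
    if not term:
        return html
    text, offsets = strip_tags_with_offsets(html)
    out, last = [], 0
    for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        start = offsets[m.start()]        # first matched char in the original
        end = offsets[m.end() - 1] + 1    # one past the last matched char
        out.append(html[last:start])
        out.append("<em>" + html[start:end] + "</em>")
        last = end
    out.append(html[last:])
    return "".join(out)


print(highlight('<p class="em">An em dash</p>', "em"))
```

In this example the `em` inside the `class` attribute is left alone, because the search only ever sees the stripped text.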

RE: Displaying highlights in formatted HTML document

2011-06-09 Thread Bryan Loofbourrow
> > OK, I think I see what you're up to. Might be pretty viable for me as well. Can you talk about anything in your mappings.txt files that is an important part of the solution?
> It is not important. I just copied it. Plus html strip char filter does not have mappings parameter.

RE: Displaying highlights in formatted HTML document

2011-06-09 Thread Ahmet Arslan
> OK, I think I see what you're up to. Might be pretty viable for me as well. Can you talk about anything in your mappings.txt files that is an important part of the solution?

It is not important. I just copied it. Plus html strip char filter does not have mappings parameter. It was a copy

RE: Displaying highlights in formatted HTML document

2011-06-09 Thread Bryan Loofbourrow
> -Original Message-
> From: Ahmet Arslan [mailto:iori...@yahoo.com]
> Sent: Wednesday, June 08, 2011 11:56 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Displaying highlights in formatted HTML document
>
> --- On Thu, 6/9/11, Bryan Loofbourrow

RE: Displaying highlights in formatted HTML document

2011-06-09 Thread lboutros
I am not (yet) a Tika user; perhaps iorixxx's solution is good for you. We will share the highlighter module and 2 other developments soon (I have to see how to do that). Ludovic. - Jouve France.

RE: Displaying highlights in formatted HTML document

2011-06-09 Thread Bryan Loofbourrow
Ludovic,

>> how do you index your html files? I mean do you create fields for different parts of your document (for different stop words lists, stemming, etc)? with DIH or solrj or something else? <<

We are sending them over http, and using Tika to strip the HTML, at present. We do not split

Re: Displaying highlights in formatted HTML document

2011-06-09 Thread Ahmet Arslan
> iorixxx, could you please explain a bit more your solution, because I don't see how your solution could give an "exact highlighting", I mean with the different fields analysis for each fields.

It does not work with your use case (e.g. different synonyms applied to different parts of the ht

Re: Displaying highlights in formatted HTML document

2011-06-09 Thread lboutros
Hi Bryan, how do you index your html files? I mean do you create fields for different parts of your document (for different stop words lists, stemming, etc)? with DIH or solrj or something else? iorixxx, could you please explain a bit more your solution, because I don't see how your solution

Re: Displaying highlights in formatted HTML document

2011-06-08 Thread Ahmet Arslan
--- On Thu, 6/9/11, Bryan Loofbourrow wrote:
> From: Bryan Loofbourrow
> Subject: Displaying highlights in formatted HTML document
> To: solr-user@lucene.apache.org
> Date: Thursday, June 9, 2011, 2:14 AM
>
> Here is my use case:
>
> I have a large number of HTML documents, sizes in the