> Yes, I asked the wrong question. What I was subconsciously
> getting at is
> this: how are you avoiding the possibility of getting hits
> in the HTML
> elements? Is that accomplished by putting tag names in your
> stopwords, or
> by some other mechanism?
HtmlStripCharFilter removes HTML tags. Af
> > OK, I think I see what you're up to. Might be pretty viable
> > for me as well.
> > Can you talk about anything in your mappings.txt files that
> > is an
> > important part of the solution?
>
> It is not important. I just copied it. Plus, the HTML strip char filter does
> not have a mappings parameter.
> OK, I think I see what you're up to. Might be pretty viable
> for me as well.
> Can you talk about anything in your mappings.txt files that
> is an
> important part of the solution?
It is not important. I just copied it. Plus, the HTML strip char filter does not
have a mappings parameter. It was a copy
> -Original Message-
> From: Ahmet Arslan [mailto:iori...@yahoo.com]
> Sent: Wednesday, June 08, 2011 11:56 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Displaying highlights in formatted HTML document
>
>
>
> --- On Thu, 6/9/11, Bryan Loofbourrow
Displaying-highlights-in-formatted-HTML-document-tp3041909p3045654.html
Sent from the Solr - User mailing list archive at Nabble.com.
Ludovic,
> how do you index your HTML files? I mean, do you create fields for different
> parts of your document (for different stop word lists, stemming, etc.)?
> With DIH or SolrJ or something else?
We are sending them over HTTP, and using Tika to strip the HTML, at present.
We do not split
> iorixxx, could you please explain a bit more your solution,
> because I don't
> see how your solution could give an "exact highlighting", I
> mean with the
> different fields analysis for each fields.
It does not work with your use case (e.g. different synonyms applied to different
parts of the ht
n is not
enough for your particular use case.
Ludovic.
-
Jouve
France.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Displaying-highlights-in-formatted-HTML-document-tp3041909p3042983.html
--- On Thu, 6/9/11, Bryan Loofbourrow wrote:
> From: Bryan Loofbourrow
> Subject: Displaying highlights in formatted HTML document
> To: solr-user@lucene.apache.org
> Date: Thursday, June 9, 2011, 2:14 AM
> Here is my use case:
>
>
>
> I have a large number of H
Here is my use case:
I have a large number of HTML documents, sizes in the 0.5K-50M range, most
around, say, 10M.
I want to be able to present the user with the formatted HTML document, with
the hits tagged, so that he may iterate through them, and see them in the
context of the document, wit
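
For illustration only, here is a minimal Python sketch of the general idea discussed in this thread: strip the tags before matching, but remember where each remaining character sat in the original HTML, so that hits can be tagged in the formatted document and never land inside an element or attribute. This is not Solr's or HtmlStripCharFilter's implementation. The function names (strip_with_offsets, highlight), the regex-based tag stripping, and the <em class="hit"> marker are all assumptions made for the example. It ignores entities, comments and script blocks, and a match that straddles a tag boundary would produce badly nested markup.

```python
import re

# Assumption: a simple regex is good enough to find tags in this sketch.
TAG_RE = re.compile(r"<[^>]*>")


def strip_with_offsets(html):
    """Return (text, offsets); offsets[i] is the index in html of text[i]."""
    chars = []
    offsets = []
    pos = 0
    for m in TAG_RE.finditer(html):
        # Copy the text between the previous tag and this one.
        for i in range(pos, m.start()):
            chars.append(html[i])
            offsets.append(i)
        pos = m.end()
    # Copy whatever follows the last tag.
    for i in range(pos, len(html)):
        chars.append(html[i])
        offsets.append(i)
    return "".join(chars), offsets


def highlight(html, term, open_tag='<em class="hit">', close_tag="</em>"):
    """Wrap case-insensitive matches of term, found in the stripped text,
    with marker tags placed at the corresponding positions in html."""
    if not term:
        return html
    text, offsets = strip_with_offsets(html)
    spans = [
        (offsets[m.start()], offsets[m.end() - 1] + 1)
        for m in re.finditer(re.escape(term), text, re.IGNORECASE)
    ]
    out = html
    # Insert from the end so earlier offsets stay valid.
    for start, end in reversed(spans):
        out = out[:start] + open_tag + out[start:end] + close_tag + out[end:]
    return out


if __name__ == "__main__":
    doc = '<p>Solr <b>highlights</b> hits in <a href="solr.html">solr</a> docs</p>'
    print(highlight(doc, "solr"))
```

Because matching runs against the stripped text only, the "solr" inside the href attribute is never tagged, which is the behaviour asked about at the top of the thread.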