Re: Solr is indexing XML only?

Yonik Seeley Thu, 27 Apr 2006 06:56:11 -0700

On 4/27/06, David Trattnig <[EMAIL PROTECTED]> wrote:
> thank you so much! Could you also explain me how to use these two
> Tokenizers?


Here's the HTMLStrip tokenizer description:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-031d5d370010955fdcc529d208395cd556f4a73e

Read through the Solr example schema.xml and it should hopefully be
apparent how to use it.

> But if there is a Tokenizer which throws away HTML markup it should be also
> possible to extend it and exclude additional content easily?

If the additional content has nothing to do with HTML, it should be
developed as a separate TokenFilter.  Filters are meant to be chained
together to gain more configuration flexibility.
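
As a rough sketch of such a chain, a schema.xml field type might look
something like the fragment below. The field type name "html_text" and
the class com.example.ExcludeContentFilterFactory are hypothetical
placeholders for illustration only, and the element and attribute
spelling should be checked against the example schema.xml shipped with
your Solr version.

```xml
<!-- Sketch only: "html_text" is an assumed name, not from the example schema -->
<fieldtype name="html_text" class="solr.TextField">
  <analyzer>
    <!-- Strips HTML markup, then splits the remaining text on whitespace -->
    <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
    <!-- Filters run in the order listed; each one consumes the previous output -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
    <!-- Hypothetical custom filter that would drop the additional non-HTML content -->
    <filter class="com.example.ExcludeContentFilterFactory"/>
  </analyzer>
</fieldtype>
```

Keeping the extra exclusion logic in its own filter means it can be
reordered or removed in the schema without touching the HTML stripping.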

-Yonik
