On 4/27/06, David Trattnig <[EMAIL PROTECTED]> wrote:
> thank you so much! Could you also explain me how to use these two
> Tokenizers?

Here's the HTMLStrip tokenizer description:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-031d5d370010955fdcc529d208395cd556f4a73e

Read through the Solr example schema.xml and it should hopefully be
apparent how to use it.

> But if there is a Tokenizer which throws away HTML markup it should be also
> possible to extend it and exclude additional content easily?

If the additional content has nothing to do with HTML, it should be
developed as a separate TokenFilter. Filters are meant to be chained
together to gain more configuration flexibility.

-Yonik
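
For illustration, a field type in schema.xml that strips HTML in the tokenizer and then chains filters might look roughly like the sketch below. This is not taken from the example schema. The field type name text_html is made up, com.example.ExcludeContentFilterFactory is a hypothetical placeholder for a custom filter you would write yourself, and the tokenizer class name and the casing of the fieldType element should be checked against the wiki page above and the schema.xml shipped with your Solr version.

```xml
<!-- Sketch only: "text_html" is an invented name for this example. -->
<fieldType name="text_html" class="solr.TextField">
  <analyzer>
    <!-- The tokenizer removes HTML markup and splits on whitespace.
         Class name assumed; verify it against the wiki page. -->
    <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
    <!-- Filters run in the order listed, each consuming the previous
         stage's token stream. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Hypothetical custom filter that drops the additional,
         non-HTML content. It does not exist in Solr. -->
    <filter class="com.example.ExcludeContentFilterFactory"/>
  </analyzer>
</fieldType>
```

Keeping the extra exclusion logic in its own filter means it can be reordered, removed, or reused with a different tokenizer without touching the HTML stripping.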