: I created a field type:
: 
: <fieldType name="htmlTxt" class="solr.TextField" positionIncrementGap="100">

        ...

: Everything works (the div tags, p tags are removed) but some
: <strong>nnn</strong>   or <br/> tags are still in the text after indexing.

i cut/pasted that fieldtype into the example schema.xml, and experimented 
with the analysis tool (http://localhost:8983/solr/admin/analysis.jsp) and 
both of those examples were correctly stripped.

do you have a more specific example of something that doesn't work?

Hmm... it seems like maybe the problem is examples like this...
        blahblah<strong>nnn</strong>
...if the tag is directly adjacent to other text, it may not get stripped 
off ... i'm not sure if that's specific to the HtmlWhitespaceTokenizer.
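
to narrow this down, it may help to try the same tags with and without surrounding whitespace in the analysis tool. the sketch below is only a rough aid: the sample strings and the strip_tags helper are hypothetical, and the regex is a naive stand-in for "what fully stripped output should look like", not how the Solr tokenizer works.

```python
import re

# Hypothetical test inputs: the same tags with and without surrounding whitespace.
SAMPLES = [
    "blahblah <strong>nnn</strong> more",  # tags separated from text by spaces
    "blahblah<strong>nnn</strong>more",    # tags directly adjacent to text
    "line one<br/>line two",               # self-closing tag adjacent to text
]

# Naive tag pattern: anything between '<' and the next '>'.
TAG = re.compile(r"<[^>]+>")


def strip_tags(text):
    """Replace every tag with a space, then collapse runs of whitespace."""
    return " ".join(TAG.sub(" ", text).split())


if __name__ == "__main__":
    # Print each input next to the output a fully stripped field should contain.
    for sample in SAMPLES:
        print(repr(sample), "->", repr(strip_tags(sample)))
```

pasting each sample into analysis.jsp and comparing the resulting tokens against this reference output should show whether only the adjacent-text cases are affected.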




-Hoss
