I found a bug in the HTML Standard Strip filter: it doesn't place word boundaries at HTML tags that should mark the end of a block.
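To make the symptom concrete, here is a minimal Python sketch of the behavior. This is not Solr's code. The helper `strip_html`, the `break_on_blocks` flag, and the `BLOCK_TAGS` set are all hypothetical names made up for illustration. With `break_on_blocks=False` it mimics the tag-stripping that joins adjacent words. With the default it shows what I would expect instead, and it could also serve as a rough pre-indexing workaround.

```python
from html.parser import HTMLParser

# Assumed set of block-level tags; illustrative only, not exhaustive.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6",
              "li", "br", "tr", "td"}


class _Stripper(HTMLParser):
    """Collects text content, optionally adding a space at block tags."""

    def __init__(self, break_on_blocks):
        super().__init__()
        self.break_on_blocks = break_on_blocks
        self.parts = []

    def _maybe_break(self, tag):
        # Emit a space so text on either side of a block tag stays separate.
        if self.break_on_blocks and tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        self._maybe_break(tag)

    def handle_endtag(self, tag):
        self._maybe_break(tag)

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html, break_on_blocks=True):
    """Remove tags from html and collapse runs of whitespace."""
    parser = _Stripper(break_on_blocks)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


sample = "<h2>title</h2><p>some text</p>"
print(strip_html(sample, break_on_blocks=False).split())  # ['titlesome', 'text']
print(strip_html(sample).split())                         # ['title', 'some', 'text']
```

Inline tags such as `<b>` or `<em>` are deliberately left out of the set, so a word like `un<b>believ</b>able` would still come out as a single token.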
I've just discovered that if I index text like this:

<h2>title</h2><p>some text</p>

it is stripped and indexed as "titlesome" and "text". Putting a space or newline between the tags fixes the problem, but our CMS often generates HTML like this, so I don't always have easy control over it.

Where do I file a bug report?

-Matt

--
View this message in context: http://www.nabble.com/HTML-Standard-Strip-filter-word-boundary-bug-tp18865749p18865749.html
Sent from the Solr - User mailing list archive at Nabble.com.