Re: char filter factory and tokeniser issue in admin Analysis form

Lee Carroll Tue, 20 Oct 2015 07:28:33 -0700

B*ll*cks, before posting I spent an hour searching for issues, honest.
As soon as I post, within seconds I find:


https://issues.apache.org/jira/browse/SOLR-5800
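
For anyone hitting the same thing, here is a minimal sketch of how one might
check the chain without going through the admin form, assuming the problem
lies in how the form displays results rather than in the analysis itself. It
builds a request to the field analysis handler (/analysis/field) for the
clean_text field type quoted below. The host, port and core name
(localhost:8983, collection1) and the helper names are assumptions for
illustration, and the handler needs to be enabled in solrconfig.xml.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def field_analysis_url(core_url, field_type, field_value):
    """Build a URL for Solr's field analysis request handler.

    core_url is the base URL of a core, e.g. the assumed
    http://localhost:8983/solr/collection1 (not taken from the thread).
    """
    params = urlencode({
        "analysis.fieldtype": field_type,    # field type to analyse with
        "analysis.fieldvalue": field_value,  # index-time text to analyse
        "wt": "json",                        # ask for a JSON response
    })
    return f"{core_url}/analysis/field?{params}"


def fetch_analysis(core_url, field_type, field_value):
    """Fetch the raw analysis response; requires a running Solr instance."""
    with urlopen(field_analysis_url(core_url, field_type, field_value)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical core URL; adjust to the local setup.
    print(field_analysis_url(
        "http://localhost:8983/solr/collection1",
        "clean_text",
        "Content with mark up <br /> should be cleaned",
    ))
```

Opening the printed URL in a browser, or calling fetch_analysis with the same
arguments, should return the raw per-stage output, so the tokeniser stage can
be inspected directly instead of through the form.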



On 20 October 2015 at 15:21, Lee Carroll <lee.a.carr...@googlemail.com>
wrote:

> Hi,
>
> On Solr 4.7 I've run into a strange issue. Whilst setting up a field I've
> noticed in the analysis form that when I use a char filter factory (for
> example HTMLSCF) with a tokeniser (ST) the analysis chain grinds to a halt.
> The char filter does not seem to pass anything into the tokeniser.
>
> Field type is:
>
> <fieldType name="clean_text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English"/>
>   </analyzer>
> </fieldType>
>
> Output of the analysis screen is:
>
> Field value (index)
> Content with mark up <br /> should be cleaned
>
> HTMLSCF > Content with mark up should be cleaned
> ST > <BLANK>
>
> I know I must be missing something obvious!
>
> Cheers Lee C
> ...
>
