Hi Jay,
text analysis only ever operates on the indexed content. The stored
content of a field is left untouched unless you change it before it
gets indexed (e.g. on the client side or with an
UpdateRequestProcessor).
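A minimal sketch of the UpdateRequestProcessor route, in case it helps.
The chain name, the destination field text_clean and the regex are
placeholders I made up for illustration, not taken from your setup. The
copy is done inside the chain with CloneFieldUpdateProcessorFactory
instead of a schema copyField, because copyField copies the raw incoming
value and would not see the cleaned text.

```xml
<!-- solrconfig.xml: hypothetical chain; names and pattern are assumptions -->
<updateRequestProcessorChain name="strip-specialchars">
  <!-- copy the raw value of text_en into the (hypothetical) text_clean field -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">text_en</str>
    <str name="dest">text_clean</str>
  </processor>
  <!-- rewrite the value itself, so the stored content changes too -->
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">text_clean</str>
    <!-- illustrative pattern: drop anything that is not a letter, digit or whitespace -->
    <str name="pattern">[^\p{L}\p{N}\s]</str>
    <str name="replacement"></str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The chain only runs if it is selected, e.g. with update.chain=strip-specialchars
on the update request, or by marking it as the default chain.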
Cheers,
Andrea
On 14/01/2019 08:46, Jay Potharaju wrote:
Hi,
I have a copy field that copies the contents of the text_en field to
another custom field.
After indexing, I was expecting the special characters in the
paragraph to be removed, but that does not seem to be happening. The
copied content is the same as what is in the source. I ran the analysis
and the pattern matching appears to work as expected: the special
characters are removed.
Any suggestions?
<fieldType name="text_no_specialchars" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="['!#\$%'\(\)\*+,-\./:;=?@\[\]\^_`{|}~!@#$%^*]" />
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SuggestStopFilterFactory" ignoreCase="true"
            words="lang/stopwords_en.txt" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
  </analyzer>
</fieldType>
Thanks
Jay