Hi,

I think this kind of text manipulation should be done before indexing. If you 
have font-size and font-family in your text, you’re very likely indexing HTML 
with CSS.
If I’m right, you’re entering a hell of markup words that should be removed 
from your text before it reaches Solr. 
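
As a rough illustration of cleaning before indexing, here is a minimal sketch 
in Python using only the standard library. It assumes the HTML body has 
already been extracted from the EML file; the names TextOnly and html_to_text 
are made up for this example and are not part of Solr or any library.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping the content of <style> and <script>."""

    SKIP = {"style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Inline style="..." attributes live in attrs and are simply ignored.
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())
```

The cleaned string is what you would then send to Solr as the content field, 
so the CSS never reaches the analyzer at all.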

On the other hand, if you have to do this at index time, a quick and dirty 
solution is to use the pattern-replace filter. 

https://lucene.apache.org/solr/guide/7_5/filter-descriptions.html#pattern-replace-filter
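
To give an idea of the kind of regular expression involved, here is a small 
sketch in Python. The pattern is my own assumption based on the example in 
your mail, not something taken from the Solr guide, and it only handles 
single-word values such as "9pt" or "arial". Keep in mind that the 
pattern-replace filter is applied to each token after tokenization, so 
depending on your tokenizer you may need to match the pieces separately or do 
the replacement on the whole text with solr.PatternReplaceCharFilterFactory 
instead.

```python
import re

# Hypothetical pattern: a font-size or font-family declaration with a
# single-word value and an optional trailing semicolon, case-insensitive.
CSS_NOISE = re.compile(r"(?i)\bfont-(?:size|family)\s*:\s*[^\s;]+;?")


def strip_css_noise(text):
    """Remove matching CSS declarations and collapse leftover whitespace."""
    cleaned = CSS_NOISE.sub("", text)
    return " ".join(cleaned.split())
```

Test the expression on a few real messages first, since a pattern that is too 
greedy will silently eat legitimate content.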

Ciao,
Vincenzo

--
mobile: 3498513251
skype: free.dev

> On 31 Dec 2018, at 02:47, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> 
> Hi,
> 
> I noticed that during the indexing of EML files, words like
> "*FONT-SIZE: 9pt; FONT-FAMILY: arial*" are being indexed into the content
> as well.
> 
> I would like to check how we can remove those words during indexing.
> 
> I am using Solr 7.5.0
> 
> Regards,
> Edwin
