Re: Highlighting words with special characters

2017-07-20 Thread Lasitha Wattaladeniya
Hi Shawn, Yes, I can confirm it works without any errors with multiple tokenizers. Following is my analysis chain: StandardTokenizerFactory (only in index), StopFilterFactory, LowerCaseFilterFactory, ASCIIFoldingFilterFactory, EnglishPossessiveFilterFactory, StemmerOverrideFilterFactory (only in query

Re: Highlighting words with special characters

2017-07-19 Thread Lasitha Wattaladeniya
Hi Ahmet, But I have NgramTokenizerFactory at the end of the indexing analyzer chain, so it should still tokenize the email address. But how does this affect the highlighting? That is what I am confused about. Solr version: 4.10.4 Regards, Lasitha On 20 Jul 2017 08:28, "Ahmet Arslan" wrot

Re: Highlighting words with special characters

2017-07-19 Thread Ahmet Arslan
Hi, Maybe the name of the UAX29URLEmailTokenizer is deceiving you? It does *not* tokenize URLs and emails. It actually recognises them and emits each one as a single token. Ahmet On Wednesday, July 19, 2017, 12:00:05 PM GMT+3, Lasitha Wattaladeniya wrote: Update, I changed the UAX29URLEmailTokenizerF
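To make the distinction above concrete, here is a rough Python sketch. It is not Lucene's tokenizer grammar: the two regular expressions are toy approximations, and the address test.foo@example.com is a hypothetical sample, not the one from the original post. It only illustrates the idea that a word-boundary tokenizer splits an email address at the "@", while an email-aware tokenizer keeps it as one token.

```python
import re

# Toy approximation (assumption) of word-boundary tokenization: runs of
# letters/digits, optionally joined by a single "." or "'" between them.
WORD = r"[A-Za-z0-9]+(?:[.'][A-Za-z0-9]+)*"
# Toy email pattern (assumption): local part, "@", then a dotted domain.
EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"


def standard_like_tokens(text):
    """Split on word boundaries only; "@" always separates tokens."""
    return re.findall(WORD, text)


def email_aware_tokens(text):
    """Try the email pattern first, so a whole address becomes one token."""
    return re.findall(EMAIL + "|" + WORD, text)


sample = "mail test.foo@example.com now"  # hypothetical sample text
print(standard_like_tokens(sample))
print(email_aware_tokens(sample))
```

With the email-aware variant, a query term such as "example" no longer matches any whole token in the field, which may be worth keeping in mind when comparing the two tokenizers.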

Re: Highlighting words with special characters

2017-07-19 Thread Lasitha Wattaladeniya
Update, I changed the UAX29URLEmailTokenizerFactory to StandardTokenizerFactory, and now it shows highlighted text fragments in the indexed email text. But I don't understand this behavior. Can someone shed some light on this, please? On 18 Jul 2017 14:18, "Lasitha Wattaladeniya" wrote: > Further more, n

Re: Highlighting words with special characters

2017-07-17 Thread Lasitha Wattaladeniya
Furthermore, the ngram field has the following tokenizer/filter chain in index and query: UAX29URLEmailTokenizerFactory (only in index), StopFilterFactory, LowerCaseFilterFactory, ASCIIFoldingFilterFactory, EnglishPossessiveFilterFactory, StemmerOverrideFilterFactory (only in query), NgramTokenizerFactory (only
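As a rough illustration of what an n-gram step at the end of such a chain produces, the Python sketch below generates character n-grams together with their start and end offsets, which are what a highlighter would rely on to mark up the original text. The function name `char_ngrams` and the gram sizes are assumptions made for illustration; the thread does not state the configured minimum and maximum gram sizes, and the ordering of grams here need not match Lucene's.

```python
def char_ngrams(text, min_gram, max_gram):
    """Return (gram, start, end) for each substring with min_gram..max_gram chars."""
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end <= len(text):  # skip grams that would run past the input
                grams.append((text[start:end], start, end))
    return grams


# Hypothetical input and gram sizes, chosen only to keep the output short.
for gram, start, end in char_ngrams("solr", 2, 3):
    print(gram, start, end)
```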

Highlighting words with special characters

2017-07-17 Thread Lasitha Wattaladeniya
Hi devs, I have set up Solr highlighting with the default setup (only changed the fragsize to 0 to match any field length). It worked fine, but recently I discovered that it doesn't highlight words with special characters in the middle. For example, let's say I have indexed the email address test.f...@ra