Hi Shawn,
Yes, I can confirm it works without any errors with multiple tokenizers.
The following is my analysis chain:
StandardTokenizerFactory (only in index)
StopFilterFactory
LowerCaseFilterFactory
ASCIIFoldingFilterFactory
EnglishPossessiveFilterFactory
StemmerOverrideFilterFactory (only in query
Hi Ahmet,
But I have NgramTokenizerFactory at the end of the indexing analyzer chain.
Therefore it should still tokenize the email address. But how does this affect
the highlighting? That's what I'm struggling to understand.
Solr version: 4.10.4
Regards,
Lasitha
On 20 Jul 2017 08:28, "Ahmet Arslan" wrot
Hi,
Maybe the name of the UAX29URLEmailTokenizer is deceiving you? It does *not*
tokenize URLs and emails. Instead, it recognises them and emits each one as a
single token.
Ahmet
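
To illustrate the point above, here is a rough Python sketch. It is not Lucene code: the two regular expressions are simplified stand-ins (assumptions made for illustration) for how a standard word tokenizer and an email-aware tokenizer might split text, and the address used is a made-up example, not the one from this thread.

```python
import re

# Rough stand-in for a standard word tokenizer: runs of letters/digits,
# where a "." or "'" between two alphanumeric runs does not split the token.
# This is only an approximation of the real word-break rules.
WORD = r"[A-Za-z0-9]+(?:[.'][A-Za-z0-9]+)*"

# Rough stand-in for an email rule: local part, "@", dotted domain.
EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"


def standard_like_tokens(text):
    """Split text into word tokens; "@" acts as a separator."""
    return re.findall(WORD, text)


def email_aware_tokens(text):
    """Try the email pattern first, so a whole address stays one token."""
    return re.findall(EMAIL + "|" + WORD, text)


sample = "contact john.doe@example.com today"  # hypothetical input
print(standard_like_tokens(sample))
print(email_aware_tokens(sample))
```

In this sketch the first function breaks the address at the "@", while the second keeps the whole address as one token, so a query term that covers only part of the address has no matching token of its own.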
On Wednesday, July 19, 2017, 12:00:05 PM GMT+3, Lasitha Wattaladeniya
wrote:
Update,
I changed the UAX29URLEmailTokenizerFactory to StandardTokenizerFactory and
now it shows highlighted text fragments in the indexed email text.
But I don't understand this behavior. Can someone shed some light on it, please?
On 18 Jul 2017 14:18, "Lasitha Wattaladeniya" wrote:
Furthermore, the ngram field has the following tokenizer/filter chain in index
and query:
UAX29URLEmailTokenizerFactory (only in index)
StopFilterFactory
LowerCaseFilterFactory
ASCIIFoldingFilterFactory
EnglishPossessiveFilterFactory
StemmerOverrideFilterFactory (only in query)
NgramTokenizerFactory (only
Hi devs,
I have set up Solr highlighting with the default configuration (I only changed
the fragsize to 0 to match any field length). It worked fine, but recently I
discovered that it doesn't highlight words with special characters in the
middle.
For example, let's say I have indexed the email address test.f...@ra