Question about StandardTokenizer in Solr 4.9

Jorge Luis Betancourt González Sun, 02 Nov 2014 13:35:28 -0800

Hi all:

From the description of the StandardTokenizer, it should recognize Internet
domain names and email addresses and preserve them as a single token, which
works great, but I've found that in cases like this:


socks25.domain.com it outputs 2 tokens: socks25 | domain.com

If the URL doesn't have any numbers:

socks.domain.com it outputs a single token: socks.domain.com

The same happens if the number is not at the end of a URL part:

so2cks.domain.com it outputs a single token: so2cks.domain.com

Is this intended behavior? The odd part is that without the number at the
end of a URL part it works fine.
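
In case it helps, the three cases above are consistent with a simplified model in which a "." stays inside a token only when it sits between two letters or between two digits. This is how I read the Unicode UAX #29 word-break rules, but it is an assumption on my part about what the tokenizer does, not something I have confirmed. The sketch below is ASCII-only, and the function name `tokenize` is hypothetical; it is not Lucene's or Solr's API.

```python
def tokenize(text):
    """Simplified, ASCII-only model (an assumption, not Lucene code):
    a '.' is kept inside a token only between two letters or two digits."""
    tokens, current = [], ""
    for i, ch in enumerate(text):
        if ch.isalnum():
            current += ch
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        # The dot joins its neighbours only for letter.letter or digit.digit.
        joins = ch == "." and current and (
            (prev.isalpha() and nxt.isalpha())
            or (prev.isdigit() and nxt.isdigit())
        )
        if joins:
            current += ch
        else:
            # Any other separator ends the current token.
            if current:
                tokens.append(current)
            current = ""
    if current:
        tokens.append(current)
    return tokens


for host in ("socks25.domain.com", "socks.domain.com", "so2cks.domain.com"):
    print(host, "->", " | ".join(tokenize(host)))
```

Under this model the split in the first case happens because the "." follows a digit and precedes a letter, which would explain why only a number at the end of a URL part triggers it.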

Regards,
