Hi all:

From the description of the StandardTokenizer, it should recognize Internet
domain names and email addresses and preserve them as a single token. This
works great, but I've found that in cases like this:

socks25.domain.com it outputs 2 tokens: socks25 | domain.com

If the URL doesn't contain any numbers:

socks.domain.com it outputs a single token: socks.domain.com

The same happens if the number is not at the end of a URL part:

so2cks.domain.com it outputs a single token: so2cks.domain.com

Is this intended behavior? The odd part is that it works fine when the number
is not at the end of a URL part.

Regards,
