Yes, because « Hello. How are you? » is a sentence that can be broken into « hello », « how », « are », « you ». But in « I paid it 2.50 euros », I would most likely keep « 2.50 » as a single token.
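
To make the distinction concrete, here is a minimal sketch in plain Python (not Elasticsearch itself) of a separator regex that treats a dot as a break only when it is not followed by a word character. The `SEPARATOR` pattern and the `tokenize` helper are made up for illustration, and the lowercasing stands in for a lowercase token filter. A similar regex might work as the `pattern` setting of the Pattern Tokenizer, but I have not verified it there.

```python
import re

# Hypothetical separator pattern (an assumption, not the Elasticsearch default):
# a run of characters where each one is either
#   - neither a word character nor a dot, or
#   - a dot that is NOT followed by a word character.
# A dot between word characters ("2.50", "swarmvars.json") is therefore kept.
SEPARATOR = re.compile(r"(?:[^\w.]|\.(?!\w))+")


def tokenize(text):
    """Split text on SEPARATOR, drop empty pieces, and lowercase each token."""
    return [piece.lower() for piece in SEPARATOR.split(text) if piece]


if __name__ == "__main__":
    for sample in ("Hello. How are you?", "I paid it 2.50 euros", "swarmvars.json"):
        print(sample, "->", tokenize(sample))
```

With this pattern the dot in « Hello. » acts as a break, while « 2.50 » and « swarmvars.json » each stay whole.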

--
David Pilato - Developer | Evangelist
elastic.co
@dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr <https://twitter.com/elasticsearchfr> | @scrutmydocs <https://twitter.com/scrutmydocs>

> On 29 May 2015 at 10:59, Marian Steinbach <[email protected]> wrote:
>
> Thanks for the reply! However, it doesn't make sense to me directly.
>
> If I use the dot as an additional separator, I will end up with the tokens "swarmvars" and "json", but not "swarmvars.json". Right?
>
> On Friday, 29 May 2015 at 10:47:56 UTC+2, David Pilato wrote:
> > I would probably go with a Pattern Tokenizer and define whatever regex you need.
> > https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html
> >
> > The standard one is more for English text, which means that a dot needs to have a space after it in order to be considered a break between two tokens.
> >
> > Make sense?

--
Please update your bookmarks! We have moved to https://discuss.elastic.co/
---
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/DD60ACAE-9659-43F1-AF10-6517D0D79DEF%40pilato.fr. For more options, visit https://groups.google.com/d/optout.
