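A dot is treated differently depending on whether it has a digit on neither side, one side, or both sides. Judging by the output below, a dot between two letters or between two digits stays inside the word, while a dot between a digit and a letter splits it.

> stri_extract_all_words("me.com", simplify = TRUE)
     [,1]
[1,] "me.com"
> stri_extract_all_words("me1.com", simplify = TRUE)
     [,1]  [,2]
[1,] "me1" "com"
> stri_extract_all_words("me1.2com", simplify = TRUE)
     [,1]
[1,] "me1.2com"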

?stri_extract_all_words sent me to ?"stringi-search-boundaries", which suggests that you should spend some time with the user guide:

_Boundary Analysis_ - ICU User Guide,
<URL: http://userguide.icu-project.org/boundaryanalysis>

Depending on your objective, you might be better off with strsplit(), separating on whitespace.

Sarah

On Wed, Nov 30, 2016 at 3:51 PM, Dimitri Liakhovitski <dimitri.liakhovit...@gmail.com> wrote:
> Hello!
>
> library(stringi)
>
> stri_extract_all_words("me.com", simplify = TRUE) # returns with a dot
> stri_extract_all_words("watch32.com", simplify = TRUE) # removes the dot
>
> Why is the dot removed only in the second case?
> How is it possible to ask it NOT to remove the dot in the second case?
>
> Thanks a lot!
>

--
Sarah Goslee
http://www.functionaldiversity.org
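
The strsplit() suggestion above could look roughly like this. It is only a minimal sketch, assuming the tokens of interest are separated by whitespace; the sample sentence is made up for illustration and is not from the original question.

```r
# Hypothetical input; any whitespace-separated text would do.
x <- "visit me.com or watch32.com today"

# Split on runs of whitespace instead of ICU word boundaries,
# so dots inside tokens are left untouched.
tokens <- strsplit(x, "\\s+")[[1]]
print(tokens)
```

Unlike word-boundary extraction, this leaves any trailing punctuation attached to its token, and a string that starts with whitespace yields an empty first element, so some trimming may still be needed depending on the data.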