You searched, but did not tell us what you found, nor why it was unsuitable for your undescribed use case. So all we can do is guess. My guess is http://docs.rexamine.com/R-man/stringi/stringi-search-boundaries.html
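For illustration, a minimal sketch of what that help page covers, assuming the stringi package is installed and that ICU's word break iterator (which stringi wraps) is close enough to the UAX #29 rules for your purposes. The sample string is made up; only the stringi function names are real.

```r
# Sketch only: assumes stringi is installed, e.g. via install.packages("stringi").
library(stringi)

# Hypothetical sample text, not taken from the original question.
x <- "Hello, world! How are you?"

# Extract the word-like segments found by the word break iterator.
words <- stri_extract_all_words(x)[[1]]
print(words)

# Count the words without materialising them.
n_words <- stri_count_words(x)
print(n_words)

# Split at every word boundary; whitespace and punctuation
# come back as segments of their own here.
segments <- stri_split_boundaries(x, type = "word")[[1]]
print(segments)
```

If the results differ from what Stata's ustrword gives on your data, the locale and the break iterator options described on the same help page are the first things to check.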
Best,
Ista

On Mar 3, 2016 8:14 AM, "Sascha Wolfer" <wol...@ids-mannheim.de> wrote:

> Hello list members,
>
> I am looking for an implementation of Unicode text segmentation (word
> boundary detection) algorithms in R. You can find information about the
> algorithms here: http://www.unicode.org/reports/tr29/#Word_Boundaries
>
> The help page for the function 'casefuns' from the excellent 'Unicode'
> package says: "Other methods will be added eventually (once the Unicode
> text segmentation algorithm is implemented for detecting word boundaries)."
> My simple question is: Are these algorithms already implemented in an R
> package? I didn’t find anything on the web, but I am counting on the power
> of this list. My Stata-using colleague is already picking at me… (in Stata,
> the function 'ustrword' does exactly what I want to do in R).
>
> Thanks for your help, have a good day, you all!
> Sascha W.
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.