Hello, I'm having trouble using the tm package with non-English languages.
Are there any extensions that enable the package to work with Hebrew (and other languages written in non-Roman scripts, for that matter)? For example, although I can construct a Corpus that displays the Hebrew documents correctly, I cannot create a term-document matrix: it does not identify any of the Hebrew words, so it reports 0 terms in the corpus.

Thanks,
Eitan