Hi Paulo. My development version has that warning turned off. However, the Rstem package predates R's support for marking string encodings, AFAIR. So when I call wordStem() with a string whose Encoding() is "UTF-8", the resulting string has Encoding() "unknown".
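To illustrate what the encoding mark is, here is a minimal sketch in base R. It does not call Rstem at all: the loss of the mark is simulated by hand, and the variable name `stemmed` is purely hypothetical. It only shows that Encoding() is a declaration attached to the string and that re-declaring it leaves the bytes unchanged.

```r
# Base R only; Rstem is not called, the loss of the mark is simulated.
# \u escapes produce strings that R marks as UTF-8.
x <- c("aberra\u00e7\u00e3o", "aberra\u00e7\u00f5es")
print(Encoding(x))

# Simulate a result that came back from C code without its encoding mark.
stemmed <- x
Encoding(stemmed) <- "unknown"
print(Encoding(stemmed))

# Re-declare the encoding; the underlying bytes are left untouched.
Encoding(stemmed) <- "UTF-8"
print(Encoding(stemmed))

# Check that re-marking did not change the bytes of the first element.
same_bytes <- identical(charToRaw(stemmed[1]), charToRaw(x[1]))
print(same_bytes)
```

Re-marking only restores the declaration. It would not repair a string whose bytes were cut in the middle of a multibyte character, such as the lone "\xc3" in the output quoted below, so this is not a substitute for proper UTF-8 support in the stemmer itself.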
I'll take a look and see if I can add support for it. I am traveling at present, so I am not certain precisely when.

Thanks,
D.

Paulo Cortez wrote:
> Greetings,
>
> I have R 2.7.1 in MacOs and I believe UTF encoding is already installed.
> At least:
>
> > Sys.getenv()
>
> shows several variables, including:
> LANG "pt_PT.UTF-8"
>
> I installed the Rstem and tm packages and when I try the following code:
>
> > wordStem(c("aberração","aberrações"), language="portuguese")
> [1] "aberraç\xc3" "aberraçõ"
> Warning message:
> In wordStem(c("aberração", "aberrações"), language = "portuguese") :
> Currently, only 'english' is tested. You will need support for UTF
> characters.
>
> So my question is. Am I using Rstem wrong or I do not really have UTF
> support? What do I need to do?
>
> Best regards,
> --
> Paulo Alexandre Ribeiro Cortez (PhD, MSc)
> Lecturer (Prof. Auxiliar) at the Department of Information Systems (DSI)
> University of Minho, Campus de Azurém, 4800-058 Guimaraes, Portugal
> http://www.dsi.uminho.pt/~pcortez +351253510313 Fax:+351253510300
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
"There are men who can think no deeper than a fact" - Voltaire

Duncan Temple Lang [EMAIL PROTECTED]
Department of Statistics work: (530) 752-4782
4210 Mathematical Sciences Bldg. fax: (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA