Hello,
I want to use the SnowballStemmer on a collection of plain text documents. However, when I apply it to my corpus using the tm_map function it only stems the last word of each document (The problem is the for wordStem and stemDocument does not work at all). An example: > path <- c("c:\path\to\directory") # collection of plain text documents > corp <- Corpus(DirSource(path), readerControl = list(reader = readPlain, > language = "en_US" , load = T)) > inspect(corp) A corpus with 2 text documents The metadata consists of 2 tag-value pairs and a data frame Available tags are: create_date creator Available variables in the data frame are: MetaID $`1.txt` running runs runners $`2.txt` happyness happies > corp2<-tm_map(corp, SnowballStemmer) > inspect(corp2) A corpus with 2 text documents The metadata consists of 2 tag-value pairs and a data frame Available tags are: create_date creator Available variables in the data frame are: MetaID $`1.txt` [1] running runs runn $`2.txt` [1] happyness happi How can I get the stemming function to work? [[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.