Hi list

Although this query is specific to the tm package, others may be able to help with it too.

I am using tm for some initial text mining, and I want to supply a dictionary of words, generated outside R, that should be removed from the corpus.

I have created a comma-separated list of double-quoted terms in a plain UTF-8 text file, stopList.txt. To read it into R, I do:

> stopDict <- read.table('~/path/to/file/stopList.txt', sep=',')

When I then pass it to the removeWords transformation in tm, I do:

> docs <- tm_map(docs, removeWords, stopDict)

which has no effect. Neither does:

> docs <- tm_map(docs, removeWords, c(stopDict))

What am I not seeing or doing?

How do I pass a text file with pre-defined terms to the removeWords transform of tm?
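
One possible explanation, offered only as a sketch: read.table() returns a data frame, whereas removeWords expects a plain character vector of words. The snippet below uses base R only and a made-up three-term file standing in for stopList.txt (the file contents and the names stop_file, stop_df and stop_words are assumptions for illustration). It reads the same kind of file with scan(), which gives a character vector directly.

```r
# Hypothetical stand-in for stopList.txt: quoted, comma-separated terms
stop_file <- tempfile(fileext = ".txt")
writeLines('"foo","bar","baz"', stop_file)

# read.table() yields a data.frame (one column per term), not a character vector
stop_df <- read.table(stop_file, sep = ",")
class(stop_df)

# scan() reads the same file into a character vector and strips the quotes
stop_words <- scan(stop_file, what = "character", sep = ",", quiet = TRUE)
stop_words <- trimws(stop_words)  # drop stray spaces around terms, if any
class(stop_words)

# With tm loaded and docs an existing corpus, the vector could then be passed on:
# docs <- tm_map(docs, removeWords, stop_words)
```

If the file really does hold all terms on one comma-separated line, the same scan() call should work with the real path in place of stop_file.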

Thanks for any ideas.

Cheers

Sun
