Re: [R] Using a text file as a removeWord dictionary in tm_map

2015-03-03 Thread Jim Holtman
Hi again I've now had the chance to try this out, and using scan() doesn't seem to work either. This is what I used: 1) I generated a plain text file called stopDict.txt. This file is of the format: "a, bunch, of, words, to, use" 2) I

Re: [R] Using a text file as a removeWord dictionary in tm_map

2015-03-03 Thread Jeff Newmiller
You seem to be conflating the data input operation with your data processing. You need to stop and examine the in-memory representation of your data ("userStopList"), and compare it with the expectations of your data processing operation ("tm_map"). Then adjust your input data by choosing a diff
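A minimal sketch of the kind of inspection suggested here, using base R only. The contents of userStopList below are hypothetical (the real object is not shown in the thread); the assumption is that it ended up as a list while the word-removal step wants a plain character vector.

```r
# Hypothetical stop-word object that was accidentally stored as a list
userStopList <- list(V1 = "a", V2 = "bunch", V3 = "of", V4 = "words")

# Inspect the in-memory representation before passing it to any processing step
str(userStopList)
class(userStopList)         # "list"
is.character(userStopList)  # FALSE

# Flatten the list into an unnamed character vector
stopWords <- unname(unlist(userStopList))
is.character(stopWords)     # TRUE
```

Checking with str() or class() right after reading the data keeps input problems separate from problems in the processing step itself.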

Re: [R] Using a text file as a removeWord dictionary in tm_map

2015-03-03 Thread Sun Shine
Hi again I've now had the chance to try this out, and using scan() doesn't seem to work either. This is what I used: 1) I generated a plain text file called stopDict.txt. This file is of the format: "a, bunch, of, words, to, use" 2) I invoked scan(), like this: > userStopList <- scan(text
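The snippet is cut off before the full scan() call, so the actual cause is not shown. As a sketch only, assuming the file literally contains the surrounding double quotes and a space after each comma, one way to get a clean character vector is to turn off quote handling, trim whitespace, and then strip the stray quote characters. The string `raw` below stands in for the contents of stopDict.txt.

```r
# Hypothetical file contents, including the literal double quotes (an assumption)
raw <- '"a, bunch, of, words, to, use"'

# quote = "" stops scan() from treating the quoted text as one field;
# strip.white = TRUE drops the space that follows each comma
words <- scan(text = raw, what = "", sep = ",", quote = "",
              strip.white = TRUE, quiet = TRUE)

# Remove the leftover quote characters from the first and last word
words <- gsub('"', "", words, fixed = TRUE)

print(words)
length(words)
```

If the file has no literal quotes, the gsub() step is harmless and the scan() call alone handles the spaces.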

Re: [R] Using a text file as a removeWord dictionary in tm_map

2015-03-01 Thread Sun Shine
Thanks Jim. I thought that I was passing a vector, not realising I had converted this to a list object. I haven't come across the scan() function so far, so this is good to know. Good explanation - I'll give this a go when I can get back to that piece of work later today. Thanks again. Re

Re: [R] Using a text file as a removeWord dictionary in tm_map

2015-03-01 Thread jim holtman
The 'read.table' was creating a data.frame (not a vector) and applying 'c' to it converted it to a list. You should always look at the object you are creating. You probably want to use 'scan'. == > testFile <- "Although,this,query,applies,specifically,to,the,tm,package" > # re
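The difference described here can be seen with a short base R comparison. This is a sketch, not the code from the truncated message: it reuses the testFile string from the post and reads it through the text= argument rather than from a file, and the other variable names are made up for illustration.

```r
# Comma-separated words, as in the post above
testFile <- "Although,this,query,applies,specifically,to,the,tm,package"

# read.table() builds a data.frame with one column per word;
# c() on a data.frame gives a list of columns, not a character vector
asTable <- read.table(text = testFile, sep = ",", stringsAsFactors = FALSE)
asList <- c(asTable)
class(asTable)  # "data.frame"
class(asList)   # "list"

# scan() with what = "" returns a plain character vector
asVector <- scan(text = testFile, what = "", sep = ",", quiet = TRUE)
is.character(asVector)  # TRUE
length(asVector)        # 9
```

To read from an actual file, the text= argument would be replaced with the file path in either call.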

[R] Using a text file as a removeWord dictionary in tm_map

2015-02-28 Thread Sun Shine
Hi list Although this query applies specifically to the tm package, perhaps it's something that others might be able to lend a thought to. Using tm to do some initial text mining, I want to include an external (to R) generated dictionary of words that I want removed from the corpus. I have