Richard Liu wrote:
>
>
> I tried the other two suggestions, but paste seemed not to "glue" the
> separate lines together into one character string. Perhaps I missed
> something (collapse?). Perhaps I'll have another look.
>
>
Yes, that is exactly what 'collapse' is for! When you read text with
readLines, R turns every line of the original document into one element of a
character vector, so a text with 30 lines ends up as a vector with 30
elements. To get one vector element per document, you need to collapse
these, say, 30 elements into a single one, and that is what the collapse
argument of paste does. The value you assign to collapse is the character
(or character sequence) R puts between the individual elements. If you do
not need to preserve the paragraph structure, a single space is the logical
choice (collapse = " "). (Without collapse, paste merely converts its
argument to character and returns a vector of the same length, so calling
paste alone on the vector produced by readLines changes nothing; collapse is
the whole point here.)
It worked fine for me. Did you get an error message, or did it just not
yield the result you expected?
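
To make the difference concrete, here is a minimal sketch. The temporary
file and its three lines are made up for illustration; substitute your own
file name.

```r
# Hypothetical input: a small three-line text file in a temporary location
tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line", "third line"), tmp)

# readLines returns one vector element per line of the file
lines <- readLines(tmp)
length(lines)    # 3

# paste without collapse keeps one element per line
pasted <- paste(lines)
length(pasted)   # 3

# paste with collapse joins all elements into a single string,
# separated by the value given to collapse
doc <- paste(lines, collapse = " ")
length(doc)      # 1
doc              # "first line second line third line"

unlink(tmp)      # remove the temporary file
```

Use collapse = "\n" instead if you want to keep the line breaks inside the
single string.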
Dieter Menne wrote:
>
>
> library(tm)
> filenames = list.files(path=".",pattern="\\.txt")
> docs = ""
> for (filename in filenames){
> docs = c(docs,paste(readLines(file(filename)),collapse="\n"))
> }
> docs
> ## continue as in example
> vs = VectorSource(docs)
>
>
If at all possible, I would recommend doing the whole procedure with lists
rather than appending to a vector inside a loop. Since readLines produces a
vector and a list is, in this case, a vector of vectors, this should be no
problem.
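
A rough sketch of what I mean, assuming a directory of .txt files. The two
files created here are only placeholders so the example runs on its own;
the names line_list and docs are my own choice.

```r
# Hypothetical setup: two small text files in a temporary directory
dir <- tempfile("docs")
dir.create(dir)
writeLines(c("alpha", "beta"), file.path(dir, "a.txt"))
writeLines(c("gamma", "delta", "epsilon"), file.path(dir, "b.txt"))

filenames <- list.files(path = dir, pattern = "\\.txt$", full.names = TRUE)

# One list element per file, each element a character vector of lines
line_list <- lapply(filenames, readLines)

# Collapse each list element into a single string;
# vapply returns a character vector with one element per document
docs <- vapply(line_list, paste, character(1), collapse = "\n")
names(docs) <- basename(filenames)

length(docs)     # 2, one element per file
```

The resulting docs vector has no empty first element and can be passed on to
VectorSource(docs) as in the quoted example.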
Ken