Gabor,
Thanks for the suggestion, I'll try it out tonight or tomorrow.
Regards,
Richard
--
Richard R. Liu
Dittingerstr. 33
CH-4053 Basel
Switzerland
Tel. +41 79 708 67 66
Sent from my iPhone 3GS
On Apr 29, 2010, at 13:06, Gabor Grothendieck wrote:
Using charmatch part
In developing a machine learner to classify sentences in plain text
sources of scientific documents I have been using the caret package and
following the procedures described in the vignettes. What I miss in the
package -- but quite possibly I am overlooking it! -- is functions t
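For context, a minimal caret workflow of the kind the vignettes describe
might look as follows; x (a feature matrix, one row per sentence), y (a
factor of sentence classes), and x_new are placeholders, not objects from
the original post:

    library(caret)
    ## 5-fold cross-validated linear SVM on sentence features
    ctrl <- trainControl(method = "cv", number = 5)
    fit  <- train(x, y, method = "svmLinear", trControl = ctrl)
    predict(fit, newdata = x_new)   # class predictions for new sentences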
I'm running R 2.10.0 under Mac OS X 10.5.8; however, I don't think this
is a Mac-specific problem. I have a very large (158,908 possible
sentences, ca. 58 MB) plain text document d which I am trying to
tokenize: t <- strapply(d, "\\w+", perl = T). I am encountering the
following error:

Error in
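Independent of the exact error, on a document this size it can be worth
cross-checking with a base-R tokenizer, which avoids gsubfn's per-match
overhead; this is a sketch, not a confirmed fix:

    ## base-R equivalent of the strapply() call above;
    ## d is the whole document as one character string
    tokens <- regmatches(d, gregexpr("\\w+", d, perl = TRUE))[[1]]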
When I run sentDetect in the openNLP package I receive a Java heap
space exception. How can I increase the heap space?

I am running the 64-bit "Leopard" version of R 2.9.2 and R.app on a Mac
with OS X 10.5.8.
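Since openNLP runs on rJava and the JVM heap can only be sized before
the JVM starts, one common approach is to set java.parameters at the
top of the session, before loading any rJava-based package; the 1 GB
figure below is only an example:

    ## must run before library(rJava) / library(openNLP) -- the setting
    ## has no effect once the JVM has been initialised
    options(java.parameters = "-Xmx1024m")  # request a 1 GB heap, for example
    library(openNLP)
    sentences <- sentDetect(d)              # d: the document as a character string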
Thanks,
Richard
kenhorvath wrote:
>
> Paul Hiemstra wrote:
>>
>> ## full.names = TRUE so each element is a full path that can be read
>> file_list <- list.files("/where/are/the/files", full.names = TRUE)
>> obj_list  <- lapply(file_list, FUN = yourfunction)
>>
>> yourfunction is probably either read.table or some read function from
>> the tm package, so obj_list will become a list of either data frames
>> or tm documents.
I'm new to R. I'm working with the text mining package tm. I have
several plain text documents in a directory, and I would like to read
all the files with extension .txt in that directory into a vector, one
text document per vector element. That is, v[1] would be the first
document, v[2] the second, and so on.
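Building on Paul's suggestion above, a minimal sketch that yields a
character vector with one element per file; the directory path is a
placeholder, and this assumes the files fit comfortably in memory:

    files <- list.files("/where/are/the/files", pattern = "\\.txt$",
                        full.names = TRUE)
    ## one string per document: v[1] is the first file, v[2] the second, ...
    v <- vapply(files, function(f) paste(readLines(f), collapse = "\n"),
                character(1))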