Note that you don't need perl = T, since by default strapply uses Tcl regular expressions, and they support \w. What happens if you omit perl = T?
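
Something along these lines is what I have in mind. It is only a minimal sketch: the short string is a made-up stand-in for your 58 MB document, and the variable name tokens is just for illustration. It assumes gsubfn is installed, and I have not checked it against your data.

```r
library(gsubfn)

# Report the installed gsubfn version (useful to include in a follow-up post).
packageDescription("gsubfn")$Version

# Hypothetical stand-in for the large document d.
d <- "The quick brown fox"

# No perl = T here: strapply's default regex engine handles \w directly.
tokens <- strapply(d, "\\w+")

# strapply returns a list with one component per input string.
tokens[[1]]
```

If this works on a small sample but still fails on the full document, that would suggest the input size rather than the pattern is the issue.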
Also, please specify the version of gsubfn you are using and, if it's not the latest, try it with the latest version.

On Tue, Nov 3, 2009 at 11:01 AM, <richard....@pueo-owl.ch> wrote:
> I'm running R 2.10.0 under Mac OS X 10.5.8; however, I don't think this
> is a Mac-specific problem.
>
> I have a very large (158,908 possible sentences, ca. 58 MB) plain text
> document d which I am trying to tokenize:
> t <- strapply(d, "\\w+", perl = T). I am encountering the following error:
>
> Error in base::gsub(pattern, rs, x, ...) :
>   Calloc could not allocate (-1398215180 of 1) memory
>
> This happens regardless of whether I run in 32- or 64-bit mode. The
> machine has 8 GB of RAM, so I can hardly believe that RAM is a problem.
>
> Thanks,
> Richard