You can always convert to lower case afterwards, probably on a shorter
vector. You did not indicate that you needed that conversion; it looked
like you only did it for the regular expression.
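
One way to read that suggestion, as a rough sketch only: match case-insensitively, then lowercase just the distinct results and map them back. The function body is cut off in this thread, so 'strip.region', its pattern, and the sample data are hypothetical stand-ins, not the code that was actually timed.

```r
# Hypothetical stand-in for the truncated function: drop a trailing region
# suffix such as "-US" or "_gb" without lowercasing the input first.
strip.region <- function(s) {
  sub("[-_][a-z]+$", "", s, ignore.case = TRUE)
}

s <- c("en-US", "EN_gb", "fr-FR", "en-US", "De")   # made-up sample data

out <- strip.region(s)
u <- unique(out)                   # usually much shorter than 'out'
res <- tolower(u)[match(out, u)]   # lowercase the short vector, then map back
res
```

Whether this beats a plain 'tolower' on the full vector depends on how many distinct values there are, so it is worth timing on the real data.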
On Fri, Sep 14, 2012 at 3:13 PM, Sam Steingold wrote:
>> * jim holtman [2012-09-14 13:10:37 -0400]:
>>
> * jim holtman [2012-09-14 13:10:37 -0400]:
>
> more than half the time is in 'tolower' and 'nchar', so it is not all
> 'sub's problem.
aha, thanks!
> This version runs a little faster since it does not need the 'tolower':
>
> canonicalize.language <- function (s) {
> # s <- tolower(s)
> lo
First thing to do is to run Rprof and see where the time is going;
here it is from my computer:
        self.time self.pct total.time total.pct
tolower      4.42    39.46       4.42     39.46
sub          3.56    31.79       3.56     31.79
nchar
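
A profile like this can be collected roughly as follows. This is a minimal sketch: 'canon' and the test vector are hypothetical stand-ins for the truncated function and the original data, so the timings will not match the figures above.

```r
# Hypothetical workload; the real function body is truncated in this thread.
canon <- function(s) {
  s <- tolower(s)
  sub("[-_][a-z]+$", "", s)
}
x <- rep(c("en-US", "EN_gb", "fr-FR", "De"), 250000)   # made-up input

prof.file <- tempfile()
Rprof(prof.file)                    # start the sampling profiler
for (i in 1:3) res <- canon(x)      # repeat so the profiler gets enough samples
Rprof(NULL)                         # stop profiling

prof <- summaryRprof(prof.file)
head(prof$by.self)   # columns: self.time self.pct total.time total.pct
```

The 'by.self' table is sorted by self time, so the functions worth looking at first are at the top.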