Thank you, Andy, for your reply. I can have optional classes such as (B1|B2|B3)?, and some keywords are multiword expressions that can have other words between their parts. Examples: "associate … with", "protect … from". Can Aho-Corasick string matching be used for this task? If I understood correctly, Aho-Corasick string matching only works for fixed keywords.
On Tuesday, December 13, 2016 at 17:31:58 UTC+1, Andy Balholm wrote:
>
> If it’s actually just a list of keywords (no wildcards, character ranges,
> etc.), I would recommend using Aho-Corasick string matching rather than
> regular expressions.
>
> Andy
>
> On Dec 13, 2016, at 7:53 AM, David Sofo <[email protected]> wrote:
>
> Hi,
>
> I have a set of rules expressed as regular expressions (around 1000 rules
> expected) to find some keywords in text files (~50 KB each). How can I
> speed up the execution time? Currently I compile the regex rules at
> initialization time in an init function and put them in a map at package
> level, then run the regex rules in a loop. The regexes have this form:
>
> \b(?:( (A1|A2|A3) | (B1|B2|B3) ) )\b
>
> Spaces are added for readability. A and B are classes of keywords.
>
> How can I speed up the execution: at the regular expression level or at
> other levels (such as execution priority)? I am using Ubuntu 14.04. Any
> suggestion is welcome. Thank you.
>
> Here is the code
>
> Regards
> David
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
