Please explain collections.defaultdict(lambda: 1)
I'm reading http://norvig.com/spell-correct.html and do not understand the expression in the subject line, which is part of this function:

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

Per http://docs.python.org/lib/defaultdict-examples.html it seems that there is a default factory which initializes each key to 1. So by the end of train(), each member of the dictionary model will have a value >= 1.

But why wouldn't he set the value to zero and then increment it each time a "feature" (actually a word) is encountered? It seems that each model value would be 1 more than it should be.
--
http://mail.python.org/mailman/listinfo/python-list
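A small sketch of what the default factory does in train(), assuming the intent is smoothing (Norvig's article treats a word never seen in training as if it had been seen once). The sample word list is made up for illustration:

```python
import collections

def train(features):
    # Missing keys start at 1, so a word seen n times ends up with n + 1.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

model = train(["the", "the", "cat"])  # hypothetical sample data
print(model["the"])  # 3: seen twice, plus the default of 1
print(model["cat"])  # 2: seen once, plus the default of 1
print(model["dog"])  # 1: never seen, the factory supplies the default
```

So every count is indeed one higher than the raw frequency, and looking up an unknown word gives 1 rather than 0. Note that the lookup of a missing key also adds that key to the dictionary.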
creating an (inefficient) alternating regular expression from a list of options
Pyparsing has a really nice feature that I want in PLY. I want to
specify a list of strings and have them converted to a regular
expression.
A Perl module which does an aggressively optimizing job of this is
Regexp::List -
http://search.cpan.org/~dankogai/Regexp-Optimizer-0.15/lib/Regexp/List.pm
I really don't care if the expression is optimal. So the goal is
something like:
vowel_regexp = oneOf("a aa i ii u uu".split()) # yielding r'(aa|a|uu|u|ii|i)'
Is there a public module available for this purpose?
Re: creating an (inefficient) alternating regular expression from a list of options
On Sep 9, 9:23 am, [EMAIL PROTECTED] wrote:
> >> I really don't care if the expression is optimal. So the goal is
> >> something like:
>
> >> vowel_regexp = oneOf("a aa i ii u uu".split()) # yielding r'(aa|a|uu|u|ii|i)'
>
> >> Is there a public module available for this purpose?
>
> Check Ka-Ping Yee's rxb module:
>
> http://lfw.org/python/
OK, but its either() function can put a shorter match before a longer
one:
def either(*alternatives):
    options = []
    for option in alternatives:
        options.append(makepat(option).regex)
    return Pattern('\(' + string.join(options, '|') + '\)')
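To illustrate why the order matters: Python's re module tries alternatives left to right and takes the first one that matches, not the longest. A quick check with two of the vowels from the original post (the variable names are only for illustration):

```python
import re

short_first = re.compile(r"(a|aa)")  # shorter option listed first
long_first = re.compile(r"(aa|a)")   # longer option listed first

print(short_first.match("aa").group())  # a
print(long_first.match("aa").group())   # aa
```

With the shorter option first, the second "a" is never consumed by the same match, which is why the options need to be ordered longest first.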
> Also, check PyPI to see if
> someone has already updated rxb for use with re.
No one has. This search returns no results:
http://pypi.python.org/pypi?%3Aaction=search&term=rxb&submit=search
Re: creating an (inefficient) alternating regular expression from a list of options
On Sep 9, 12:42 pm, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
>
> you may also want to do re.escape on all the words, to avoid surprises
> when the choices contain special characters.
Yes, thank you very much:
import re
def oneOf(s):
    # Reverse lexicographic order puts each string before any of its prefixes.
    alts = sorted(s.split(), reverse=True)
    # Escape each option so special characters are matched literally.
    alts = [re.escape(alt) for alt in alts]
    return "|".join(alts)
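A quick check of this function on the vowel list from the original post. The definition is repeated so the snippet runs on its own; the test string "aaiuu" is made up, and wrapping the result in a group is left to the caller:

```python
import re

def oneOf(s):
    # Reverse lexicographic order puts each string before any of its prefixes.
    alts = sorted(s.split(), reverse=True)
    alts = [re.escape(alt) for alt in alts]
    return "|".join(alts)

pattern = oneOf("a aa i ii u uu")
print(pattern)                       # uu|u|ii|i|aa|a
print(re.findall(pattern, "aaiuu"))  # ['aa', 'i', 'uu']
```

Reverse lexicographic order is enough here because a string always sorts after its own prefixes, so after reversing, the longer option is tried first.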
