Re: [Tutor] Working with some sort of Timer

2011-02-17 Thread Alan Gauld


"Ryan Strunk"  wrote


def heartbeat(self):
    if self.round_clock_counter > self.round_clock_max:
        #todo, call end_round
        return
    if global_vars.player.fatigue < 100:
        global_vars.player.fatigue += 1
    self.round_clock = delay(1, heartbeat)

after the bell rings. At the same time, however, the code carries the burden
of being infinitely recursive without the proper checks in place.


It's not really recursive.
It passes itself as an argument to the timer, which will eventually call it
again. But that is not the same as recursion, where the function makes the
call inside its own body. So this should not encounter the usual problems
with recursion limits being reached.
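
To illustrate the point, here is a minimal sketch, assuming threading.Timer
from the standard library stands in for the delay() helper in the quoted code.
The RoundClock class and its attribute names are made up for the example, and
the code is written for Python 3. Each heartbeat only schedules the next one
and then returns, so the stack depth recorded inside the timer threads should
stay the same however many ticks run.

```python
import threading
import traceback


class RoundClock:
    """Hypothetical stand-in for the class that owns heartbeat()."""

    def __init__(self, max_ticks, interval=0.01):
        self.max_ticks = max_ticks
        self.interval = interval
        self.ticks = 0
        self.depths = []                 # stack depth seen on each tick
        self.done = threading.Event()

    def heartbeat(self):
        self.ticks += 1
        self.depths.append(len(traceback.extract_stack()))
        if self.ticks >= self.max_ticks:
            self.done.set()              # round over: do not reschedule
            return
        # Schedule the next tick and return at once. The timer thread
        # calls heartbeat() later; this frame is gone by then.
        timer = threading.Timer(self.interval, self.heartbeat)
        timer.daemon = True
        timer.start()


clock = RoundClock(max_ticks=5)
clock.heartbeat()                        # first tick, from the main thread
clock.done.wait(timeout=5)
timer_depths = clock.depths[1:]          # ticks made by timer threads
print(clock.ticks, len(set(timer_depths)))
```

If heartbeat() called itself directly instead, the recorded depths would grow
by one frame per tick.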

I'll leave others to comment on your other issues regarding the use
of threading.Timer because I have no real experience of its use.

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] regex questions

2011-02-17 Thread Albert-Jan Roskam
Hello,

I have a couple of regex questions:

1 -- In the code below, how can I match the connecting words 'van de', 'van
der', etc. (all quite common in Dutch family names)?
2 -- It is quite hard to make a regex for all surnames, but easier to make
regexes for the initials and the connecting words. How could I 'subtract'
those two regexes to end up with something that matches the surnames? (I used
two .replace() calls in my code, which roughly work, but I'm thinking there's
a re way to do it, perhaps with carets (^).)
3 -- Suppose I want to yank up my nerd rating by adding a re.NONDIACRITIC flag
to the re module (matches letters independent of their accents), how would I go
about it? Should I subclass from re and implement the method, using the other
existing methods as an example? I would find this a very useful addition.

Thanks in advance for your thoughts!

Python 2.7.0+ (r27:82500, Sep 15 2010, 18:04:55) 
[GCC 4.4.5] on linux2

>>> import re
>>> names = ["J. van der Meer", "J. van den Meer", "J. van Meer",
...          "Meer, J. van der", "Meer, J. van den", "Meer, J. van de",
...          "Meer, J. van"]
>>> for name in names:
...     print re.search("(van? de[nr]?)\b? ?", name, re.IGNORECASE).group(1)

van der
van den
Traceback (most recent call last):
  File "", line 2, in 
    print re.search("(van? de[nr]?)\b? ?", name, re.IGNORECASE).group(1)
AttributeError: 'NoneType' object has no attribute 'group'

 Cheers!!
Albert-Jan


~~
All right, but apart from the sanitation, the medicine, education, wine, public
order, irrigation, roads, a fresh water system, and public health, what have the
Romans ever done for us?
~~





Re: [Tutor] regex questions

2011-02-17 Thread Steven D'Aprano

Albert-Jan Roskam wrote:

Hello,

I have a couple of regex questions:

1 -- In the code below, how can I match the connecting words 'van de', 'van
der', etc. (all quite common in Dutch family names)?


You need to step back a little bit and ask, what is this regex supposed 
to accomplish? What is your input data? Do you expect this to tell the 
difference between van used as a connecting word in a name, and van used 
otherwise?


In other words, do you want:

re.search(???, "J. van Meer")  # matches
re.search(???, "The van stopped")  # doesn't match

You might say, "Don't be silly, of course not!" *but* if you expect this 
regex to detect names in arbitrary pieces of text, that is exactly what 
you are hoping for. It is beyond the powers of a regex alone to 
distinguish between arbitrary text containing a name:


"... and to my nephew Johann van Meer I leave my collection of books..."

and arbitrary text without a name:

"... and the white van merely touched the side of the building..."

You need a proper parser for that.
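
A quick check of that claim, as a sketch in Python 3 with a deliberately
simple pattern chosen only for this example: the regex sees characters, not
meaning, so it finds "van" in both sentences.

```python
import re

# Illustrative pattern only: "van" as a whole word, any case.
pattern = r"\bvan\b"

with_name = "... and to my nephew Johann van Meer I leave my collection of books..."
without_name = "... and the white van merely touched the side of the building..."

# Both searches succeed, so the regex alone cannot tell the two apart.
found_in_name = re.search(pattern, with_name, re.IGNORECASE) is not None
found_in_other = re.search(pattern, without_name, re.IGNORECASE) is not None
print(found_in_name, found_in_other)
```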

I will assume that your input data will already be names, and you just 
want to determine the connecting words:


van der
van den
van de
van

wherever they appear. That's easy: the only compulsory part is "van":

pattern = r"\bvan\b( de[rn]?)?"

Note the use of a raw string. Otherwise, \b doesn't mean "backslash b", 
but instead means "ASCII backspace".
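
The difference is easy to see in a short sketch (Python 3 syntax; the variable
names are chosen for the example). Without the r prefix, Python converts \b to
a single backspace character before the re module ever sees the pattern.

```python
import re

plain = "\bvan\b"     # \b becomes one backspace character (0x08)
raw = r"\bvan\b"      # backslash and "b" are kept as two characters

print(len(plain), len(raw))

# The plain version looks for literal backspaces and finds nothing.
plain_result = re.search(plain, "J. van Meer")
# The raw version reaches re intact, where \b means "word boundary".
raw_result = re.search(raw, "J. van Meer")
print(plain_result, raw_result.group(0))
```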


Here's a helper function for testing:

def search(pattern, string):
    mo = re.search(pattern, string, re.IGNORECASE)
    if mo:
        return mo.group(0)
    return "--no match--"


And the result is:

>>> names = ["J. van der Meer", "J. van den Meer", "J. van Meer",
... "Meer, J. van der", "Meer, J. van den", "Meer, J. van de",
... "Meer, J. van"]
>>>
>>> for name in names:
...     print search(pattern, name)
...
van der
van den
van
van der
van den
van de
van

Don't forget to test things which should fail:

>>> search(pattern, "Fred Smith")
'--no match--'
>>> search(pattern, "Fred Vanderbilt")
'--no match--'



2 -- It is quite hard to make a regex for all surnames, but easier to make 


r"\b[a-z]+[-']?[a-z]*\b" should pretty much match all surnames using only 
English letters, apostrophes and hyphens. You can add in accented 
letters as needed.


(I'm lazy, so I haven't tested that.)
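
Since that pattern is untested, here is a small check written for Python 3.
The sample names are chosen for illustration, the pattern is written as a raw
string, and re.fullmatch is used so that the whole string has to qualify as a
surname.

```python
import re

surname = re.compile(r"\b[a-z]+[-']?[a-z]*\b", re.IGNORECASE)

samples = ["Meer", "O'Brien", "Smith-Jones", "Müller", "Smith-Jones-Brown"]

# True where the entire sample is accepted as a surname.
results = {name: surname.fullmatch(name) is not None for name in samples}
for name, accepted in results.items():
    print(name, accepted)
```

The last two samples are rejected: accented letters are not in [a-z], and the
pattern allows only one hyphen or apostrophe, so both would need the pattern
widened.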


regexes for the initials and the connecting words. How could I 'subtract'
those two regexes to end up with something that matches the surnames? (I used
two .replace() calls in my code, which roughly work, but I'm thinking there's
a re way to do it, perhaps with carets (^).)


Don't try to use regexes to do too much. Regexes are a programming 
language, but the syntax is crap and there's a lot they can't do. They 
make a good tool for *parts* of your program, not the whole thing!


The best approach, I think, is something like this:


def detect_dutch_name(phrase):
    """Return (Initial, Connecting-words, Surname) from a potential
    Dutch name in the form "Initial [Connecting-words] Surname" or
    "Surname, Initial Connecting-words".
    """
    pattern = (  r"(?P<surname>.*?), "
                 r"(?P<initial>[a-z]\.) ?(?P<connect>van (de[rn]?))?"  )
    mo = re.match(pattern, phrase, re.IGNORECASE)
    if mo:
        return (mo.group('initial'), mo.group('connect') or '',
                mo.group('surname'))
    # Try again.
    pattern = (  r"(?P<initial>[a-z]\.) "
                 r"(?P<connect>van (de[rn]? ))?(?P<surname>.*)"  )
    # Note: due to laziness, I accept any character at all in surnames.
    mo = re.match(pattern, phrase, re.IGNORECASE)
    if mo:
        return (mo.group('initial'), mo.group('connect') or '',
                mo.group('surname'))
    return ('', '', '')

Note that this is BUGGY -- it doesn't do exactly what you want, although 
it is close:


>>> detect_dutch_name("Meer, J. van der")  # Works fine
('J.', 'van der', 'Meer')

but:

>>> detect_dutch_name("J. van der Meer")  # almost, except for the space
('J.', 'van der ', 'Meer')
>>> detect_dutch_name("J. van Meer")  # not so good
('J.', '', 'van Meer')

Debugging regexes is a PITA and I'm going to handball the hard part to 
you :)
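
For readers who want a head start, one possible repair is sketched below in
Python 3. It is only one way to do it, not necessarily the fix the function's
author had in mind, and the name split_dutch_name is made up here. The idea is
to make "van" the only compulsory part of the connecting words and to keep the
separating space outside the captured group.

```python
import re

# "van", optionally followed by "de", "der" or "den".
CONNECT = r"(?P<connect>van(?: de[rn]?)?)"

# "Surname, Initial [Connecting-words]"
SURNAME_FIRST = re.compile(
    r"(?P<surname>.*?), (?P<initial>[a-z]\.)(?: " + CONNECT + r")?$",
    re.IGNORECASE)

# "Initial [Connecting-words] Surname"
INITIAL_FIRST = re.compile(
    r"(?P<initial>[a-z]\.) (?:" + CONNECT + r" )?(?P<surname>.+)$",
    re.IGNORECASE)


def split_dutch_name(phrase):
    """Return (initial, connecting words, surname), or three empty strings."""
    for regex in (SURNAME_FIRST, INITIAL_FIRST):
        mo = regex.match(phrase)
        if mo:
            return (mo.group('initial'), mo.group('connect') or '',
                    mo.group('surname'))
    return ('', '', '')


for name in ["J. van der Meer", "J. van Meer", "Meer, J. van", "J. Vanderbilt"]:
    print(split_dutch_name(name))
```

Surnames are still accepted with any characters at all, as in the original
function, so this should only be fed strings that are already known to be
names.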



3 -- Suppose I want to yank up my nerd rating by adding a re.NONDIACRITIC flag 
to the re module (matches letters independent of their accents), how would I go 
about it? Should I subclass from re and implement the method, using the other 
existing methods as an example? I would find this a very useful addition.


As would lots of people, but it's not easy, and it's not even certain to 
me that it is always meaningful.


In English, and (I believe) Dutch, diacritics are modifiers, and so 
accented letters like é are considered just a variant of e. (This is not 
surprising, for English is closely related to Dutch.) But this is not a 
general rule -- many languages which use diacritics consider the 
accented letters to be distinct letters in their own right.


See, for example, http://en.wikipedia.org/wiki/Diacritic for a 
discussion of which diacritics are considered different letters and 
which are not.
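
One common workaround, sketched here for Python 3, is to strip the accents
from the text before matching instead of changing the re module. This is a
different technique from the proposed flag, and the helper name strip_accents
is made up for the example. It relies on unicodedata from the standard library
and, as noted above, it is only appropriate where accented letters really are
variants of the base letter.

```python
import re
import unicodedata


def strip_accents(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


name = "J. van der Müller"
plain = strip_accents(name)

# Match against the accent-free copy of the text.
mo = re.search(r"\bmuller\b", plain, re.IGNORECASE)
print(plain, mo is not None)

# Letters with no decomposition, such as "ø", pass through unchanged.
print(strip_accents("Søren"))
```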


You might li