On Thu, 12 Apr 2012 09:06:53 -0700 Michael Lewis <mjole...@gmail.com> wrote:
> Here's the "pattern" portion that I don't understand:
>
> re.findall("[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+"
>

You have 5 different parts here:

1) [^A-Z]+ - this matches one or more non-uppercase characters.
   The brackets [] describe a set of wanted characters. A-Z would match
   any uppercase character, but the caret ^ at the first position inside
   the brackets inverts the set (i.e., match any character not in the
   set). + means to match one or more of whatever is described directly
   before it.

2) [A-Z]{3} - this matches exactly three uppercase characters.
   With the braces {} you can define how many characters should match:
   {3} matches exactly 3, {3,} matches at least 3, {,3} matches up to 3
   and {3,6} matches 3 to 6.

3) ([a-z]) - this matches exactly one lowercase character.
   The parens () are used to save the character for later use (using the
   group()/groups() methods, see the docs). Because the pattern contains
   a group, re.findall returns only the saved character for each match,
   not the whole matched text.

4) [A-Z]{3} - again matches exactly three uppercase characters.

5) [^A-Z]+ - again matches at least one non-uppercase character.

HTH, Andreas
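
P.S. In case it helps to see the five parts working together, here is a small sketch you can paste into the interpreter. The sample strings are made up by me for illustration (they are not from your data), and I have written the pattern as a raw string and used re.fullmatch, which needs Python 3.4 or later.

```python
import re

# The pattern from the question, written as a raw string.
pattern = r"[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+"

# Hypothetical sample strings, made up for this illustration.
good = "abcXYZdEFGhij"   # 'd' has exactly three capitals on each side
bad = "aABCDeFGHi"       # 'e' has four capitals on its left

# With one group in the pattern, findall() returns only the captured text.
found_good = re.findall(pattern, good)   # ['d']
found_bad = re.findall(pattern, bad)     # []

# search() returns a match object: group(0) is the whole match,
# group(1) is what the parentheses captured.
m = re.search(pattern, good)
whole = m.group(0)    # 'abcXYZdEFGhij'
letter = m.group(1)   # 'd'

# Quantifier variants from point 2, checked against made-up input.
at_least_3 = bool(re.fullmatch(r"[A-Z]{3,}", "ABCDE"))    # True
up_to_3 = bool(re.fullmatch(r"[A-Z]{,3}", "ABCD"))        # False, four is too many
three_to_six = bool(re.fullmatch(r"[A-Z]{3,6}", "ABCD"))  # True

print(found_good, found_bad, whole, letter)
```

Note that the second string finds nothing: the leading [^A-Z]+ must be followed by exactly three capitals and then the lowercase letter, so a fourth capital in front breaks the match.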