On Mar 29, 7:22 am, "aspineux" <[EMAIL PROTECTED]> wrote:
> I want to parse
>
> '[EMAIL PROTECTED]' or '<[EMAIL PROTECTED]>' and get the email address
> [EMAIL PROTECTED]
>
> the regex is
>
> r'<[EMAIL PROTECTED]>|[EMAIL PROTECTED]'
>
> now, I want to give it a name
>
> r'<(?P<email>[EMAIL PROTECTED])>|(?P<email>[EMAIL PROTECTED])'
>
> sre_constants.error: redefinition of group name 'email' as group 2;
> was group 1
>
> BUT because I use a | , I will get only one group named 'email' !
>
> Any comment ?
>
> PS: I know the solution for this case is to use
> r'(?P<lt><)?(?P<email>[EMAIL PROTECTED])(?(lt)>)'
Regular expressions, alternation, named groups ... oh my!
It tends to get quite complex, especially if you need
to reject cases where the string contains a left bracket
but not the right, or vice versa.
>>> import re
>>> pattern = re.compile(r'(?P<email><[EMAIL PROTECTED]>|(?<!<)[EMAIL PROTECTED](?!>))')
>>> for email in ('[EMAIL PROTECTED]', '<[EMAIL PROTECTED]>', '<[EMAIL PROTECTED]'):
...     matched = pattern.search(email)
...     if matched is not None:
...         print(matched.group('email'))
...
[EMAIL PROTECTED]
<[EMAIL PROTECTED]>
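The addresses in the session above are redacted in the archive, so it cannot be pasted and run as-is. Here is a minimal runnable sketch of the unbalanced-bracket check, built on the conditional pattern from the quoted PS. Two things in it are assumptions and not from this thread: ADDR is a deliberately simple, hypothetical stand-in for the address sub-pattern, and the match is anchored with fullmatch (Python 3.4+) rather than with lookarounds.

```python
import re

# ASSUMPTION: hypothetical stand-in for the redacted address sub-pattern.
# It is intentionally simple and is not a full RFC 5322 address matcher.
ADDR = r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+'

# Optional '<' captured as group 'lt'; the conditional (?(lt)>) requires
# a closing '>' only when the opening '<' actually matched.
CONDITIONAL = re.compile(r'(?P<lt><)?(?P<email>' + ADDR + r')(?(lt)>)')


def extract_email_conditional(text):
    """Return the bare address, or None if the brackets are unbalanced."""
    # fullmatch forces the pattern to cover the whole string, so a stray
    # '<' or '>' on either side makes the match fail.
    matched = CONDITIONAL.fullmatch(text)
    return matched.group('email') if matched else None


if __name__ == '__main__':
    for candidate in ('user@example.com', '<user@example.com>',
                      '<user@example.com', 'user@example.com>'):
        print(repr(candidate), '->', extract_email_conditional(candidate))
```

Because the whole string must match, no lookbehind or lookahead is needed to catch a missing bracket; that only holds if each input is a single candidate address rather than free text to be searched.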
I suggest you try some other solution (maybe pyparsing).
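If staying with the re module is acceptable, one simple alternative is to give each branch of the alternation its own group name and pick whichever one matched, which sidesteps the "redefinition of group name" error. This is only a sketch: ADDR is a hypothetical stand-in for the redacted address sub-pattern, and the group names 'bracketed' and 'bare' are made up for illustration.

```python
import re

# ASSUMPTION: hypothetical stand-in for the redacted address sub-pattern.
ADDR = r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+'

# Each alternative has a distinct group name, so no name is defined twice.
TWO_GROUPS = re.compile(
    r'<(?P<bracketed>' + ADDR + r')>|(?P<bare>' + ADDR + r')'
)


def extract_email_two_groups(text):
    """Return the address without brackets, or None if text is not an address."""
    matched = TWO_GROUPS.fullmatch(text)
    if matched is None:
        return None
    # Only one branch of the alternation participates in a match;
    # the group from the other branch is None.
    return matched.group('bracketed') or matched.group('bare')


if __name__ == '__main__':
    for candidate in ('user@example.com', '<user@example.com>',
                      '<user@example.com'):
        print(repr(candidate), '->', extract_email_two_groups(candidate))
```

The cost is one extra line to merge the two groups after matching; the benefit is that the pattern stays a plain alternation with no conditionals or lookarounds.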
--
Hope this helps,
Steven