Hi,

I have a file that is a long list of records (roughly) in the format

    [EMAIL PROTECTED]

So, for example:

    [EMAIL PROTECTED]
    [EMAIL PROTECTED]
    [EMAIL PROTECTED]
    [EMAIL PROTECTED]
    [EMAIL PROTECTED]
    ....

What I would like to do is run a regular expression against this and wind up with:

    [EMAIL PROTECTED]@[EMAIL PROTECTED]@data4
    [EMAIL PROTECTED]

So I ran the following regex against the string:

    re.compile(r'([EMAIL PROTECTED])@(.*)\n\1@(.*)').sub(r'\1\2\3', string)

and I wound up with:

    [EMAIL PROTECTED]@data2
    [EMAIL PROTECTED]@data4
    [EMAIL PROTECTED]

So, my questions are:

(1) Is there any way to get a single regular expression to handle overlapping matches, so that I get what I want in one call?

(2) Is there any way (without comparing the before and after strings) to know whether a re.sub(...) call did anything?

I suppose I could do something like:

    pattern = re.compile(r'([EMAIL PROTECTED])@(.*)\n\1@(.*)')
    while pattern.search(string):
        string = pattern.sub(r'\1\2\3', string)

but I would like to avoid the explicit loop if possible... Actually, should I be able to do something like that? If I execute it in my debugger, my string gets really funky, as if the re is losing track of what the groups are, and I end up with a single really long string rather than what I expect.

Any help on this would be appreciated.

-jdc

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
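
A note on question (2): the re module provides subn(), which behaves like sub() but returns a (new_string, count) tuple, so the count tells you whether anything was replaced. The sketch below uses it to drive the loop. The real pattern and data in the post are elided, so everything here is an assumption for illustration only: the "key@value" record shape, the PATTERN regex, the r'\1@\2@\3' replacement, the sample data, and the helper name merge_records are all hypothetical. Because re finds non-overlapping matches, a run of three lines with the same key needs more than one pass in this sketch.

```python
import re

# Hypothetical stand-in for the elided pattern: a key, "@", the rest of the
# line, then a following line that starts with the same key and "@".
PATTERN = re.compile(r'^([^@\n]+)@(.*)\n\1@(.*)$', re.MULTILINE)


def merge_records(text):
    """Repeat the substitution until subn() reports zero replacements.

    Returns the merged text and the number of passes that changed it.
    """
    passes = 0
    while True:
        # subn() returns (new_string, number_of_substitutions_made).
        text, count = PATTERN.subn(r'\1@\2@\3', text)
        if count == 0:
            return text, passes
        passes += 1


# Made-up sample data in the assumed "key@value" shape.
sample = "k1@a\nk1@b\nk1@c\nk2@d\nk3@e\nk3@f"
merged, passes = merge_records(sample)
print(merged)
print("passes that changed the string:", passes)
```

The loop is still explicit, but it no longer needs a separate search() call or a before/after string comparison: the count returned by subn() is the stopping condition.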