On Sun, Feb 22, 2009 at 10:49 PM, ish_ling <ish_l...@yahoo.com> wrote:
> I have a string:
>
> 'a b c<H d e f gH> h<H i j kH>'
>
> I would like a regex to recursively match all alpha letters that are between
> <H and [a-z]H>. That is, I would like the following list of matches:
>
> ['d', 'e', 'f', 'i', 'j']
>
> I do not want the 'g' or the 'k' matched.
>
> I have figured out how to do this in a multiple-step process, but I would
> like to do it in one step using only one regex (if possible). My multiple
> step process is first to use the regex
>
> '(?<=H )[a-z][^H]+(?!H)'

I would use a slightly different regex, it seems more explicit to me.
r'<H([a-z ]+?)[a-z]H>'

> with re.findall() in order to find two strings
>
> ['d e f ', 'i j ']
>
> I can then use another regex to extract the letters out of the strings.

str.split() will pull out the individual strings. You can still write
it as a one-liner if you want:

In [1]: import re

In [2]: s = 'a b c<H d e f gH> h<H i j kH>'

In [3]: regex = r'<H([a-z ]+?)[a-z]H>'

In [5]: [m.split() for m in re.findall(regex, s)]
Out[5]: [['d', 'e', 'f'], ['i', 'j']]

Kent
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
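
The one-liner above returns one sub-list per <H ...H> group, whereas the
original question asked for a single flat list. A minimal sketch of one
way to flatten it, reusing the same regex with a nested comprehension
(the function name letters_between is made up for illustration and is
not from the thread):

```python
import re

# Same pattern as in the reply: capture the run of letters and spaces
# after '<H', stopping before the final letter that precedes 'H>'.
REGEX = re.compile(r'<H([a-z ]+?)[a-z]H>')


def letters_between(text):
    """Return a flat list of the letters inside each <H ...H> group,
    excluding the last letter before 'H>'.

    Hypothetical helper; assumes the letters are separated by whitespace,
    as in the sample string.
    """
    return [letter
            for group in REGEX.findall(text)
            for letter in group.split()]


s = 'a b c<H d e f gH> h<H i j kH>'
print(letters_between(s))
```

This is still a regex followed by str.split(), so it is one expression
rather than strictly one regex.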