I have a string:

    'a b c<H d e f gH> h<H i j kH>'

I would like a regex that matches each of the lowercase letters that lie between
<H and [a-z]H>. That is, I would like the following list of matches:

    ['d', 'e', 'f', 'i', 'j']

I do not want the 'g' or the 'k' matched. 

I have figured out how to do this in a multiple-step process, but I would like 
to do it in one step using only one regex (if possible). My multiple-step 
process is first to use the regex

    '(?<=H )[a-z][^H]+(?!H)'

with re.findall() in order to find two strings 

    ['d e f ', 'i j ']

I can then use another regex to extract the letters out of the strings. But, as 
I said above, I would prefer to do this in one fell swoop.
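
For concreteness, here is a minimal sketch of that two-step process. The first pattern is the one quoted above. The helper name letters_between and the second pattern [a-z] are only illustrative assumptions for the "another regex" step, and other choices would work just as well.

```python
import re

# Step 1 pattern, exactly as quoted above: start right after "H ", take a
# letter plus a run of non-H characters, and backtrack until the next
# character is not H. That drops the final letter before the closing H.
SPAN_RE = re.compile(r'(?<=H )[a-z][^H]+(?!H)')

# Step 2 pattern (an assumption for illustration): any single lowercase letter.
LETTER_RE = re.compile(r'[a-z]')

def letters_between(s):
    """Return the letters found in each span matched by SPAN_RE."""
    letters = []
    for span in SPAN_RE.findall(s):
        letters.extend(LETTER_RE.findall(span))
    return letters

print(letters_between('a b c<H d e f gH> h<H i j kH>'))
```

This is the behaviour I am hoping to get from a single pattern instead of two.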

 

Another example:

    'a b c<H dH>'

There should be no matches.

 

Last example:

    'a b c<H d eH>'

There should be one match:

    ['d'] 


(For background, although it's probably irrelevant, the string is a possible 
representation of a syllable (a, b, c, etc.) to tone (H) mapping in tonal 
languages.)

 
If anyone has ideas, then I would greatly appreciate it. 





      

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
