Regular Expression question

2006-08-21 Thread stevebread
Hi, I am having some difficulty trying to create a regular expression.

Consider:

   




Whenever a tag1 is followed by a tag2, I want to retrieve the values
of the tag1:name and tag2:value attributes. So my end result here
should be
john, tall
jack, short

My low quality regexp
re.compile('tag1.+?name="(.+?)".*?(?!tag1).*?="adj__(.*?)__',
re.DOTALL)

cannot handle the case where there is a tag1 that is not followed by a
tag2. findall returns
john, tall
joe, short

Ideas?

Thanks.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Regular Expression question

2006-08-21 Thread stevebread
Thanks, I just tried it, but I got the same result.

I've been thinking about it for a few hours now, and the problem with
this approach is that the .*? before the (?=tag2) may have matched a
tag1, and I don't know how to detect that.

And even if I could, how would I make the search reset its start
position to the second tag1 it found?
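
One way to address both points is to forbid the gap itself from crossing
another tag1, using a negative lookahead that is checked before every
character the gap consumes. The attempt starting at the first tag1 then
fails, and findall moves on to the next tag1 by itself, so no manual reset
is needed. The sketch below is only an illustration: the sample markup is
hypothetical (the original sample was not preserved), and the names sample,
pattern and pairs are made up for this example.

```python
import re

# Hypothetical sample; the markup from the original post was not preserved.
sample = '''
<tag1 name="john"/> <br/> <tag2 value="adj__tall__"/>
<tag1 name="joe"/>
<tag1 name="jack"/> <tag2 value="adj__short__"/>
'''

pattern = re.compile(
    r'<tag1\b[^>]*?\sname="([^"]*)"'  # a tag1 and its name attribute
    r'(?:(?!<tag1\b).)*?'             # any text that does not begin another tag1
    r'="adj__(.*?)__',                # nearest attribute value starting with adj__
    re.DOTALL,
)

pairs = pattern.findall(sample)
print(pairs)  # [('john', 'tall'), ('jack', 'short')]
```

The (?!<tag1\b) check sits inside the repeated group rather than after a
plain .*?, which is what stops the gap from silently swallowing "joe".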



Re: Regular Expression question

2006-08-21 Thread stevebread
I got zero results on this one :)



Re: Regular Expression question

2006-08-21 Thread stevebread
Hi, thanks everyone for the information! Still going through it :)

The reason I did not match on tag2 in my original expression (and I
apologize, because I should have mentioned this before) is that other
tags could also have an attribute whose value begins with "adj__", and
the attribute name may not be the same for those tags. The only thing I
can be sure of is that the value will begin with "adj__".

I need to match the "adj__" value with the closest preceding tag1,
irrespective of what tag the "adj__" is in, what the attribute holding
it is called, or the order of the attributes (there may be others). This
data will be inside an HTML page, so there will be plenty of HTML tags
in the middle, all of which I need to ignore.
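
Since the data lives in an HTML page, the requirement above could also be
met without a regular expression, by walking the tags with the standard
library's html.parser and remembering the most recent tag1. This is a
minimal sketch under assumptions: the sample page is hypothetical, the
class name AdjCollector is invented here, and an "adj__" attribute on the
tag1 itself is ignored.

```python
from html.parser import HTMLParser

class AdjCollector(HTMLParser):
    """Pair each adj__ attribute value with the closest preceding tag1 name."""

    def __init__(self):
        super().__init__()
        self.current_name = None  # name of the latest tag1 still waiting for a value
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag == "tag1":
            # A new tag1 replaces any earlier one that never received a value.
            self.current_name = dict(attrs).get("name")
            return
        if self.current_name is None:
            return
        for _attr_name, value in attrs:
            # The attribute name and tag name are deliberately not checked.
            if value and value.startswith("adj__"):
                word = value[len("adj__"):].split("__", 1)[0]
                self.pairs.append((self.current_name, word))
                self.current_name = None
                break

# Hypothetical page fragment, not the original poster's data.
page = '''
<p><tag1 name="john"/> is <b><tag2 value="adj__tall__"/></b></p>
<p><tag1 name="joe"/></p>
<p><tag1 name="jack"/> is <span class="adj__short__">short</span></p>
'''

collector = AdjCollector()
collector.feed(page)
print(collector.pairs)  # [('john', 'tall'), ('jack', 'short')]
```

The parser hands over attributes as (name, value) pairs in any order, so
attribute order and unrelated tags in between do not need special handling.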

Thanks very much!
Steve
