Regular Expression question
Hi, I am having some difficulty trying to create a regular expression.
Consider:
Whenever a tag1 is followed by a tag2, I want to retrieve the values
of the tag1:name and tag2:value attributes. So my end result here
should be
john, tall
jack, short
My low quality regexp
re.compile('tag1.+?name="(.+?)".*?(?!tag1).*?="adj__(.*?)__',
re.DOTALL)
cannot handle the case where there is a tag1 that is not followed by a
tag2. findall returns
john, tall
joe, short
Ideas?
Thanks.
--
http://mail.python.org/mailman/listinfo/python-list
Re: Regular Expression question
Thanks, I just tried it but I got the same result. I've been thinking about it for a few hours now, and the problem with this approach is that the .*? before the (?=tag2) may have matched a tag1, and I don't know how to detect that. And even if I could, how would I make the search reset its start position to the second tag1 it found?
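One way to stop the gap from swallowing a second tag1 is to put the negative lookahead inside the repeated group, so it is checked before every character the gap consumes. The sketch below is only an illustration: the sample markup and the names SAMPLE and PAIR are hypothetical, made up to reproduce the john/joe/jack case described in the thread. No explicit "reset" is needed, because when an attempt starting at one tag1 fails, findall simply retries from later positions and picks up the next tag1.

```python
import re

# Hypothetical sample markup, invented for illustration only.
SAMPLE = '''
<tag1 name="john"/><br/><tag2 value="adj__tall__"/>
<tag1 name="joe"/>
<tag1 name="jack"/><tag2 value="adj__short__"/>
'''

# (?:(?!<tag1\b).)*? consumes one character at a time, but only while
# the text at that point is not the start of another tag1.
PAIR = re.compile(
    r'<tag1\b[^>]*?\bname="([^"]*)"'   # a tag1 and its name attribute
    r'(?:(?!<tag1\b).)*?'              # gap that may not contain a tag1
    r'="adj__(.*?)__',                 # any attribute value adj__...__
    re.DOTALL,
)

pairs = PAIR.findall(SAMPLE)
print(pairs)   # [('john', 'tall'), ('jack', 'short')]
```

The attempt that starts at joe's tag1 fails because the gap cannot cross jack's tag1, so joe is dropped and the scan continues from jack.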
Re: Regular Expression question
I got zero results on this one :)
Re: Regular Expression question
Hi, thanks everyone for the information! Still going through it :)

The reason I did not match on tag2 in my original expression (and I apologize, because I should have mentioned this before) is that other tags could also have an attribute whose value begins with "adj__", and the attribute name may not be the same for those other tags. The only thing I can be sure of is that the value will begin with "adj__".

I need to match the "adj__" value with the closest preceding tag1, irrespective of which tag the "adj__" is in, what the attribute holding it is called, or the order of the attributes (there may be others). This data will be inside an HTML page, so there will be plenty of HTML tags in the middle, all of which I need to ignore.

Thanks very much!
Steve
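The "closest preceding tag1" requirement can also be handled without lookaheads, by scanning once for either kind of token and remembering the last tag1 name seen. This is a minimal sketch, not a definitive answer: the sample markup, the `other` tag, and the names SAMPLE, TOKEN and pair_names_with_adjectives are all hypothetical, and it assumes each tag1 should be paired with at most one "adj__" value.

```python
import re

# Hypothetical sample markup, invented for illustration only.
SAMPLE = '''
<p><tag1 name="john" id="1"/></p>
<div><tag2 value="adj__tall__"/></div>
<tag1 id="2" name="joe"/>
<tag1 name="jack"/><span class="x"><other note="adj__short__"/></span>
'''

# Match either a tag1 with its name, or any attribute value adj__...__.
TOKEN = re.compile(
    r'<tag1\b[^>]*?\bname="(?P<name>[^"]*)"'
    r'|="adj__(?P<adj>.*?)__',
    re.DOTALL,
)

def pair_names_with_adjectives(text):
    """Pair each adj__ value with the closest preceding tag1 name."""
    pairs = []
    current = None              # most recent tag1 name not yet paired
    for m in TOKEN.finditer(text):
        if m.lastgroup == 'name':
            current = m.group('name')   # a later tag1 replaces an earlier one
        elif current is not None:
            pairs.append((current, m.group('adj')))
            current = None      # assumption: one adj__ value per tag1
    return pairs

print(pair_names_with_adjectives(SAMPLE))   # [('john', 'tall'), ('jack', 'short')]
```

Because the attribute name and the enclosing tag are never inspected for the "adj__" value, intervening HTML tags and attribute order do not matter. For real HTML pages a proper parser may be more robust than a regular expression.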
