Hi friends,
Please see the attachment and examine the code I have provided. The
problem is that I want to fetch the data from <H2>Comments</H2> up to the
first </TD> occurrence, but my code fetches data up to the last </TD>
in the htmlData variable, which is not what I want. So my question is:
what is my mistake?
Thanks in advance.
import re
import string
htmlData = """
<h2>Instructions</h2>
<p>Preparation</p>
<dl>
<dd>Lie supine on bench. Dismount barbell from rack over the upper chest
using a wide oblique overhand grip.
</dd></dl>
<p>Execution</p>
<dl>
<dd>Lower weight to upper chest. Press bar until arms are extended. Repeat.
</dd></dl>
<h2>Comments</h2>
<dl>
<dd>None
</dd></dl>
</td>
<td valign="top" width="50%"><h2>Classification</h2>
<h2><table border="1" cellpadding="1" cellspacing="0" height="60" width="100%">
<tbody><tr>
<td width="50%"><b> Utility:</b></td>
<td width="50%"><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1270300">Basic</a></td></tr>
<tr>
<td width="50%"><b> Mechanics:</b></td>
<td width="50%"><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1271511">Compound</a></td></tr>
<tr>
<td width="50%"><b> Force:</b></td>
<td width="50%"><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor400787">Push</a></td></tr>
</tbody></table>
</h2>
<h2>Muscles</h2>
<p><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1274950">Target</a></p>
<ul>
<li><a href="http://www.exrx.net/Muscles/PectoralisSternal.html">Pectoralis Major, Sternal</a>
</li></ul>
<p><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1275394">Synergists</a></p>
<ul>
<li><a href="http://www.exrx.net/Muscles/PectoralisClavicular.html">Pectoralis Major,
Clavicular</a>
</li><li><a href="http://www.exrx.net/Muscles/DeltoidAnterior.html">Deltoid, Anterior</a>
</li><li><a href="http://www.exrx.net/Muscles/TricepsBrachii.html">Triceps Brachii</a>
</li></ul>
<p><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1276508">Dynamic Stabilizers</a></p>
<ul>
<li><a href="http://www.exrx.net/Muscles/BicepsBrachii.html">Biceps Brachii, Short Head</a>
</li></ul>
</td></tr>
</tbody></table>
</h1>
"""
if __name__ == '__main__':
    # Extract comments
    p = re.search('<H2>Comments</H2>(.+)</TD>', htmlData,
                  re.I | re.S | re.M)
    commentsHTML = string.strip(p.group(1))
    print commentsHTML
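
For comparison, here is a minimal sketch of how the two quantifiers behave on this kind of input. With re.S the dot also matches newlines, so a greedy (.+) runs as far as it can and backtracks only to the last </TD>, while a non-greedy (.+?) stops at the first one. The names sample_html and extract_comments are hypothetical and not part of the code above, and sample_html is a shortened stand-in for htmlData. The sketch uses print() and str.strip() so that it also runs on Python 3.

```python
import re

# Hypothetical, shortened stand-in for the htmlData string in the post.
sample_html = """
<h2>Comments</h2>
<dl>
<dd>None
</dd></dl>
</td>
<td valign="top" width="50%"><h2>Classification</h2>
<td width="50%"><b> Utility:</b></td>
</td></tr>
"""


def extract_comments(html):
    """Return the text between <H2>Comments</H2> and the first </TD>, or None."""
    # (.+?) is non-greedy: it expands only until the first </TD> that follows.
    match = re.search(r'<H2>Comments</H2>(.+?)</TD>', html, re.I | re.S)
    if match is None:
        return None
    return match.group(1).strip()


# Greedy version from the post, kept only to show the difference.
greedy = re.search(r'<H2>Comments</H2>(.+)</TD>', sample_html, re.I | re.S)

comments_html = extract_comments(sample_html)
print(comments_html)
```

The greedy match still contains the Classification cell, which is the behaviour described in the question. For anything beyond a quick extraction like this, an HTML parser is usually more robust than a regular expression.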
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor