Hi friends,

Please see the attachment and examine the code I have provided. I want to fetch the data from <H2>Comments</H2> up to the first </TD> occurrence, but my code fetches everything up to the last </TD> in the htmlData variable, which is not what I want. So the question is: what is my mistake?

Thanks in advance.
import re
import string

htmlData = """
<h2>Instructions</h2>

<p>Preparation</p>

<dl>
  <dd>Lie supine on bench. Dismount barbell from rack over the upper chest
  using a wide oblique overhand grip.
</dd></dl>

<p>Execution</p>

<dl>
  <dd>Lower weight to upper chest. Press bar until arms are extended. Repeat.
</dd></dl>

<h2>Comments</h2>

<dl>
  <dd>None
</dd></dl>
</td>
<td valign="top" width="50%"><h2>Classification</h2>

<h2><table border="1" cellpadding="1" cellspacing="0" height="60" width="100%">
<tbody><tr>
<td width="50%"><b>&nbsp;Utility:</b></td>
<td width="50%"><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1270300";>Basic</a></td></tr>
<tr>
<td width="50%"><b>&nbsp;Mechanics:</b></td>
<td width="50%"><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1271511";>Compound</a></td></tr>
<tr>
<td width="50%"><b>&nbsp;Force:</b></td>
<td width="50%"><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor400787";>Push</a></td></tr>
</tbody></table>

</h2>

<h2>Muscles</h2>

<p><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1274950";>Target</a></p>

<ul>
  <li><a href="http://www.exrx.net/Muscles/PectoralisSternal.html";>Pectoralis Major, Sternal</a>
</li></ul>

<p><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1275394";>Synergists</a></p>

<ul>
  <li><a href="http://www.exrx.net/Muscles/PectoralisClavicular.html";>Pectoralis Major,
  Clavicular</a>
  </li><li><a href="http://www.exrx.net/Muscles/DeltoidAnterior.html";>Deltoid, Anterior</a>
  </li><li><a href="http://www.exrx.net/Muscles/TricepsBrachii.html";>Triceps Brachii</a>
</li></ul>

<p><a href="http://www.exrx.net/WeightTraining/Glossary.html#anchor1276508";>Dynamic Stabilizers</a></p>

<ul>
  <li><a href="http://www.exrx.net/Muscles/BicepsBrachii.html";>Biceps Brachii, Short Head</a>

</li></ul>
</td></tr>
</tbody></table>
</h1>
"""

if __name__ == '__main__':
    # Extract comments
    p = re.search('<H2>Comments</H2>(.+)</TD>', htmlData,
                  re.I | re.S | re.M)
    commentsHTML = string.strip(p.group(1))
    print commentsHTML
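
For comparison, here is a minimal sketch of what the difference might look like, assuming the cause is that `.+` is greedy and so runs to the last </TD> it can find. The `sample` string is a shortened stand-in for htmlData (an assumption, not the real page), and the names `greedy`, `lazy`, `greedy_text` and `lazy_text` are made up for illustration. It uses the `str.strip()` method and `print(...)` with parentheses so that it also runs on newer Python versions.

```python
import re

# Shortened stand-in for htmlData (hypothetical, not the full page).
sample = """
<h2>Comments</h2>

<dl>
  <dd>None
</dd></dl>
</td>
<td valign="top" width="50%"><h2>Classification</h2>
<td width="50%"><b>Utility:</b></td>
</td></tr>
"""

# Greedy quantifier: .+ consumes as much as possible, so the match
# ends at the LAST </td> in the string.
greedy = re.search('<H2>Comments</H2>(.+)</TD>', sample, re.I | re.S)

# Non-greedy quantifier: .+? consumes as little as possible, so the
# match ends at the FIRST </td> after the Comments heading.
lazy = re.search('<H2>Comments</H2>(.+?)</TD>', sample, re.I | re.S)

greedy_text = greedy.group(1).strip()
lazy_text = lazy.group(1).strip()

print(lazy_text)
```

In this sketch `lazy_text` holds only the `<dl>` block under Comments, while `greedy_text` also contains the Classification cell. The re.M flag is left out because the pattern has no `^` or `$` anchors for it to affect.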
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
