That is one seriously manly regex, but I'd recommend using the Tag Soup
parser instead:

  http://ccil.org/~cowan/XML/tagsoup/

wunder

On 10/4/07 10:11 PM, "J.J. Larrea" <[EMAIL PROTECTED]> wrote:

> It uses a PatternTokenizerFactory with a RegEx that swallows runs of HTML- or
> XML-like tags:
> 
>   (?:\s*</?\w+((\s+\w+(\s*=\s*(?:"?&"'.?'|[^'">\s]+))?)\s*|\s*)/?>\s*)|\s
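
For anyone who wants to try the tokenizing idea outside Solr, here is a rough sketch in Python. The quoted pattern looks like it was mangled somewhere along the way (the quoted-attribute alternatives in particular), so the pattern below is my own simplified stand-in, not a reconstruction of the original. The names TAG_RUN_OR_SPACE and tokens are made up for illustration. It splits on runs of tag-like markup or whitespace and keeps whatever is left.

```python
import re

# Assumed, simplified pattern (not the one from the quoted message):
# a run of one or more HTML/XML-like tags with optional attributes and
# surrounding whitespace, or else a plain run of whitespace.
TAG_RUN_OR_SPACE = re.compile(
    r"""(?:\s*</?\w+(?:\s+\w+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^'">\s]+))?)*\s*/?>\s*)+|\s+"""
)


def tokens(text):
    """Split text on tag runs and whitespace, dropping empty pieces."""
    return [piece for piece in TAG_RUN_OR_SPACE.split(text) if piece]


if __name__ == "__main__":
    print(tokens('<p>Hello <b>big</b> world</p>'))
    print(tokens('<a href="x.html" class=foo>link</a> text<br/>'))
```

A pattern like this only handles reasonably well-formed tags. Comments, CDATA sections, entities, and script or style bodies will all leak through as tokens, which is the usual argument for a real parser such as Tag Soup.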
