Re: unable to figure out nutch type highlighting in solr....

Walter Underwood Fri, 05 Oct 2007 07:37:11 -0700

That is one seriously manly regex, but I'd recommend using the Tag Soup
parser instead:


  http://ccil.org/~cowan/XML/tagsoup/
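
A rough sketch of the parser-based approach, for comparison with the regex. This is not Tag Soup itself (Tag Soup is a Java SAX parser); Python's standard-library html.parser is used here purely as a stand-in, and the names TextExtractor and strip_tags are made up for the illustration:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: keeps character data, discards tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags; tags themselves are ignored.
        self.chunks.append(data)


def strip_tags(markup):
    """Return the text of `markup` with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    # Joining with a space treats every tag as a word break, which is an
    # assumption of this sketch: "wor<b>ld</b>" would become "wor ld".
    return " ".join(" ".join(parser.chunks).split())
```

Because the parser tokenizes the markup itself, quoted attribute values and unclosed tags need no special-casing in the caller, which is the main thing the regex has to work hard for.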

wunder

On 10/4/07 10:11 PM, "J.J. Larrea" <[EMAIL PROTECTED]> wrote:

> It uses a PatternTokenizerFactory with a RegEx that swallows runs of HTML- or
> XML-like tags:
> 
>   (?:\s*</?\w+((\s+\w+(\s*=\s*(?:"?&"'.?'|[^'">\s]+))?)\s*|\s*)/?>\s*)|\s
