Re: Restricting HTML search?

Paul Libbrecht Tue, 24 Aug 2010 22:55:49 -0700

Wouldn't using NekoHTML (as an XML parser) together with XPath be safer?

I guess it all depends on the "quality" of the source document.
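
To illustrate the parser-based route, here is a minimal sketch. It is an assumption-laden stand-in: it uses Python's standard-library html.parser instead of NekoHTML and XPath, and the names HeadingTextExtractor and extract_heading_text are made up for this example. The idea is the same, though: let a tolerant HTML parser find the heading elements rather than matching raw text, then index the collected strings separately.

```python
from html.parser import HTMLParser

# Tags whose inner text we want to collect (html.parser lowercases tag names).
HEADING_TAGS = {"h1", "h2", "h3"}


class HeadingTextExtractor(HTMLParser):
    """Hypothetical helper: collects the inner text of h1/h2/h3 elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.headings = []   # finished heading strings, in document order
        self._depth = 0      # greater than 0 while inside a heading element
        self._buffer = []    # text fragments of the heading being read

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            if self._depth == 0:
                self._buffer = []
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join fragments and collapse runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.headings.append(text)

    def handle_data(self, data):
        # Text inside nested inline tags (e.g. <b>) is still heading text.
        if self._depth > 0:
            self._buffer.append(data)


def extract_heading_text(html):
    """Return the inner text of every h1, h2, and h3 element in html."""
    parser = HeadingTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.headings
```

Called on a page, extract_heading_text returns a list of heading strings that could be sent to a separate field at index time. A heading that is never closed is dropped by this sketch, which is one place where the "quality" of the source matters.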


paul


On 25 Aug 2010, at 02:09, Lance Norskog wrote:

I would do this with regular expressions. There is a Pattern Analyzer
and a Tokenizer which do regular-expression-based text chopping,
although I'm not sure how to make them do what you want. A more
precise tool is the RegexTransformer in the DataImportHandler.
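
As a rough sketch of the regular-expression idea, the snippet below pulls heading text out of an HTML string. It is plain Python, not RegexTransformer or DataImportHandler configuration, and the names HEADING_RE, TAG_RE, and heading_text_by_regex are invented for the example. It assumes reasonably well-formed markup where each heading has a matching close tag.

```python
import re

# Matches <h1>..</h1>, <h2>..</h2>, <h3>..</h3>, with optional attributes.
# The backreference \1 requires the closing tag to match the opening one.
HEADING_RE = re.compile(
    r"<(h[1-3])\b[^>]*>(.*?)</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)

# Matches any remaining tag inside a heading, e.g. <em> or <a href=...>.
TAG_RE = re.compile(r"<[^>]+>")


def heading_text_by_regex(html):
    """Return the inner text of h1/h2/h3 elements found by regex."""
    results = []
    for match in HEADING_RE.finditer(html):
        inner = TAG_RE.sub("", match.group(2))   # drop nested tags
        text = " ".join(inner.split())           # collapse whitespace
        if text:
            results.append(text)
    return results
```

This works on tidy pages but silently skips headings with missing or mismatched close tags, and it does not decode character entities, so it is only as reliable as the markup it is given.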

Lance

On Tue, Aug 24, 2010 at 7:08 AM, Andrew Cogan
<[email protected]> wrote:

I'm quite new to SOLR and wondering if the following is possible: in
addition to normal full-text search, my users want to have the option
to search only HTML heading inner text, i.e. content inside <H1>,
<H2>, or <H3> tags.
