processing chain, but
>> that may be too much effort compared to the HTML strip filter.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: okayndc
>> Sent: Monday, April 30, 2012 10:07 AM
>> To: solr-user@lucene.apache.org
>> Subject: Solr: e
CopyField to a text field that has
>> the
>> HTMLStripCharFilter to strip the HTML tags and index only the text
>> (indexed, but not stored.)
>>
>> -- Jack Krupansky
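
A minimal schema.xml sketch of the copyField approach described above. The field and type names (content_html, content_text, text_html_stripped) are hypothetical, and the tokenizer and lowercase filter are assumptions chosen for illustration, not something specified in this thread:

```xml
<!-- Hypothetical field type: strips HTML markup before tokenizing -->
<fieldType name="text_html_stripped" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Char filter runs first, removing tags from the character stream -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Raw HTML kept as-is for retrieval (stored, not indexed) -->
<field name="content_html" type="string" indexed="false" stored="true"/>

<!-- Searchable text with tags stripped (indexed, but not stored) -->
<field name="content_text" type="text_html_stripped" indexed="true" stored="false"/>

<!-- Copy the raw HTML into the stripped field at index time -->
<copyField source="content_html" dest="content_text"/>
```

With a setup along these lines, queries would go against content_text while the original markup stays available in content_html.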
>>
>> -Original Message- From: okayndc
>> Sent: Monday, April 30, 2012 5:06 PM
will
not see "".
-- Jack Krupansky
-Original Message-
From: okayndc
Sent: Tuesday, May 01, 2012 10:08 AM
To: solr-user@lucene.apache.org
Subject: Re: extracting/indexing HTML via cURL
Thank you Jack.
So, it's not doable/possible to search and highlight keywords with
-- Jack Krupansky
>
> -Original Message- From: okayndc
> Sent: Monday, April 30, 2012 5:06 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr: extracting/indexing HTML via cURL
>
> Great, thank you for the input. My understanding of HTMLStripCharFilter is
> that it stri
Sent: Monday, April 30, 2012 5:06 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr: extracting/indexing HTML via cURL
Great, thank you for the input. My understanding of HTMLStripCharFilter is
that it strips HTML tags, which is not what I want. Is this correct? I
want to keep the HTML tags i
iginal Message- From: okayndc
> Sent: Monday, April 30, 2012 10:07 AM
> To: solr-user@lucene.apache.org
> Subject: Solr: extracting/indexing HTML via cURL
>
>
> Hello,
>
> Over the weekend I experimented with extracting HTML content via cURL and
> am just
> wondering why the e
nday, April 30, 2012 10:07 AM
To: solr-user@lucene.apache.org
Subject: Solr: extracting/indexing HTML via cURL
Hello,
Over the weekend I experimented with extracting HTML content via cURL and
am just wondering why the extraction/indexing process does not include the
HTML tags.
It seems as though the HTML tags are either being ignored or stripped
somewhere in the pipeline.
If this is the case, is it possible to in