Re: How to index correctly a text save with tinyMCE

Marek Tichy Thu, 23 Jun 2011 10:08:05 -0700

Or fix the problem at its source. I think you need to google for

entity_encoding : "raw"

on tinyMCE.


> Hi Ariel,
>
> On 6/23/2011 at 12:34 PM, Ariel wrote:
>   
>> But it still doesn't convert the code to the correct character, for
>> instance: Espa&amp;ntilde;a must be converted to España but it still
>> remains as Espa&amp;ntilde;a.
>>     
>
> So it looks like your text processing tool(s) escape markup meta-characters 
> (e.g. "&" -> "&amp;") after escaping above-ASCII characters to their named 
> entity equivalents (e.g. "n" with a tilde to "&ntilde;").  This two-level 
> escaping appears to be the problem.
>
> According to the analysis.jsp output you sent, your original text 
> "Espa&amp;ntilde;a" was converted to "Espa&ntilde;a" - the first level of 
> escaping was reversed.
>
> I suspect you could fix the problem by including HTMLStripCharFilter twice, 
> e.g.:
>
>    <charFilter class="solr.HTMLStripCharFilterFactory"/>
>    <charFilter class="solr.HTMLStripCharFilterFactory"/>
>    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>    ...
>
> Good luck,
> Steve
>
>
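
The two-level escaping described in the quoted message can be reproduced outside Solr. The sketch below is only an illustration: Python's standard html.unescape stands in for one entity-decoding pass, it is not Solr's HTMLStripCharFilter, and the helper name unescape_passes is made up for this example.

```python
import html


def unescape_passes(text, passes):
    """Apply HTML entity decoding `passes` times (hypothetical helper)."""
    for _ in range(passes):
        text = html.unescape(text)
    return text


# The value as stored: "&" was escaped after "ñ" had become "&ntilde;".
stored = "Espa&amp;ntilde;a"

once = unescape_passes(stored, 1)   # only the outer "&amp;" is reversed
twice = unescape_passes(stored, 2)  # the named entity is decoded as well

print(once)
print(twice)
```

One pass leaves the named entity in place, which matches the analysis.jsp result quoted above; the second pass yields the plain character. Setting entity_encoding to "raw" in tinyMCE avoids the need for the second pass, because characters such as "ñ" are then stored as-is.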
