When I use HTML::Entities to encode my text, I get this error:

SEVERE: org.xmlpull.v1.XmlPullParserException: could not resolve entity
named 'para'

It's complaining about finding the &para; entity (¶) in my text. Does anyone
know why this is a problem?
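
A possible explanation, stated as an assumption rather than a confirmed diagnosis: XML predefines only five entities (&lt;, &gt;, &amp;, &quot; and &apos;), while HTML::Entities by default also emits HTML named entities such as &para;, which an XML parser cannot resolve unless a DTD declares them. Below is a minimal sketch of escaping only the XML-special characters, written in Python rather than Perl purely for illustration. The helper name xml_escape_field and the sample string are hypothetical; the field name storyFullText comes from the quoted message.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def xml_escape_field(text: str) -> str:
    """Escape only the characters that are special in XML.

    escape() already handles &, < and >; the extra mapping adds the
    double quote. Other characters (such as the pilcrow) are left as-is.
    """
    return escape(text, {'"': "&quot;"})


# Hypothetical sample value containing markup, a pilcrow and an ampersand.
sample = '<div class="paragraph">a ¶ b & c</div>'
escaped = xml_escape_field(sample)
print(escaped)

# Wrap the escaped value in a field element and parse it back to check
# that the result is well-formed XML and round-trips to the original text.
field = ET.fromstring('<field name="storyFullText">' + escaped + "</field>")
print(field.text == sample)
```

With this approach the ¶ character is sent as a literal character, so the posted file has to be encoded consistently (for example as UTF-8); alternatively the character can be written as the numeric reference &#182;, which needs no entity declaration.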

Jérôme Etévé-2 wrote:
> 
> If I understand, you want to keep the raw html code in solr like that
> (in your posting xml file):
> 
> <field name="storyFullText">
>   <html></html>
> </field>
> 
> I think you should encode your content to protect these xml entities:
> < -> &lt;
> > -> &gt;
> " -> &quot;
> & -> &amp;
> 
> If you use perl, have a look at HTML::Entities.
> 
> 
> On 9/25/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>> Hello,
>>
>> I've got a problem with HTML code that is embedded in an XML file.
>>
>> Sample source:
>>
>> <content>
>>         <stories>
>>                 <div class="storyTitle">
>>                          Les débats
>>                 </div>
>>                 <div class="storyIntroductionText">
>>                         Le premier tour des élections fédérales se
>> déroulera le 21
>> octobre prochain. D'ici là, La 1ère vous propose plusieurs rendez-
>> vous, dont plusieurs grands débats à l'enseigne de Forums.
>>                 </div>
>>                 <div class="paragraph">
>>                         <div class="paragraphTitle"/>
>>                         <div class="paragraphText">
>>                                 my para textehere
>>                                 <br/>
>>                                 <br/>
>>                                 Vous trouverez sur cette page toutes les
>> dates et les heures de
>> ces différents rendez-vous ainsi que le nom et les partis des
>> débatteurs. De plus, vous pourrez également écouter ou réécouter
>> l'ensemble de ces émissions.
>>                         </div>
>>                 </div>
>> ....
>> ---------
>> When I make a query on Solr, I get something like this in the
>> source code of the XML result:
>>
>> <td xmlns="http://www.w3.org/1999/xhtml">
>> &lt;
>> div
>> class
>> =
>> "paragraph"
>> &gt;<div class="expander-content">
>> <div class="indent">&lt;
>> div
>> class
>> =
>> "paragraphTitle"
>> /&gt;</div><table><tr>
>> <td class="expander">−<div class="spacer"/>
>> </td><td>&lt;
>> ...
>>
>> That is not exactly what I want. I want to keep the HTML tags as they
>> are, without any formatting.
>>
>> The br and a tags are well formed in the XML and JSON results, but
>> the div tags are not kept.
>> ---------
>> In schema.xml I have this for the HTML content:
>>
>> <fieldType name="html" class="solr.TextField" />
>>
>>   <field name="storyFullText" type="html" indexed="true"
>> stored="true" multiValued="true"/>
>>
>> ---------
>>
>> Any help would be appreciated.
>>
>> Thanks in advance.
>>
>> S. Christin
>>
> 
> 
> -- 
> Jerome Eteve.
> [EMAIL PROTECTED]
> http://jerome.eteve.free.fr/
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Problem-with-html-code-inside-xml-tp12877194p15907551.html
Sent from the Solr - User mailing list archive at Nabble.com.
