for what it's worth, i wrote a recursive template in xsl that replaces the
escaped characters with actual elements. here, the parameter $val is the
escaped string to process; the tag ("em") is hard-coded, which is where the
offsets 5 and 6 come from. this has been working okay for me so far.

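<xsl:template name="unescapeEm">
   <!-- $val: text that still contains <em>...</em> as plain characters -->
   <xsl:param name="val" select="''"/>
   <!-- everything before the first '<' -->
   <xsl:variable name="preEm" select="substring-before($val, '&lt;')"/>
   <xsl:choose>
       <xsl:when test="$preEm or starts-with($val, '&lt;')">
           <!-- everything before the first '</': $preEm, '<em>', highlighted text -->
           <xsl:variable name="insideEm" select="substring-before($val, '&lt;/')"/>
           <xsl:value-of select="$preEm"/>
           <!-- +5 skips the 4 characters of '<em>' (XPath positions are 1-based) -->
           <em><xsl:value-of select="substring($insideEm, string-length($preEm) + 5)"/></em>
           <!-- +6 skips the 5 characters of '</em>' -->
           <xsl:variable name="leftover" select="substring($val, string-length($insideEm) + 6)"/>
           <!-- recurse on whatever follows the closing tag -->
           <xsl:if test="$leftover">
               <xsl:call-template name="unescapeEm">
                   <xsl:with-param name="val" select="$leftover"/>
               </xsl:call-template>
           </xsl:if>
       </xsl:when>
       <xsl:otherwise>
           <!-- no markup left: copy the rest through unchanged -->
           <xsl:value-of select="$val"/>
       </xsl:otherwise>
   </xsl:choose>
</xsl:template>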
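
in case it helps, here is a rough sketch of how the template might be
called. it assumes the highlighted snippets arrive as <str> elements
somewhere under <lst name="highlighting"> in the solr xml response; that
layout and the match pattern are my assumption, so adjust them to whatever
your response actually looks like.

```xml
<!-- hypothetical caller; goes in the same stylesheet as unescapeEm -->
<xsl:template match="lst[@name='highlighting']//str">
    <xsl:call-template name="unescapeEm">
        <xsl:with-param name="val" select="string(.)"/>
    </xsl:call-template>
</xsl:template>
```

note that the template only understands the literal tag "em", so it relies
on hl.simple.pre and hl.simple.post being <em> and </em>.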

On 1/3/07, Thorsten Scherler <[EMAIL PROTECTED]> wrote:

On Wed, 2007-01-03 at 02:16 +0000, Edward Garrett wrote:
> thorsten,
>
> see the following for discussion. your case is indeed an annoyance--the
> thread below discusses motivations for it and ways of working around it. (i
> too confess that i wish it were not so.)
>
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg01483.html

Thanks Edward. The problem with the suggestion in the above thread is that
this:
"just create an XSL that
generates XML and unescapes the fields you know will contain wellformed
XML data -- then apply your second transform client side"

is not possible with xsl. See e.g.
http://www.biglist.com/lists/xsl-list/archives/200109/msg00318.html
"> How can I match the Cdata Section?!?
>
You can't, the XPath data model regards CDATA as merely an input shortcut,
not as an information-bearing part of the XML content. In other words,
"<![CDATA[x]]>" and "x" look exactly the same to the XSLT processor.

Mike Kay"

Michael Kay is the xsl guru, and I can confirm from my own experience that
one would need to write a custom parser, since <![CDATA[<em>TERM</em>]]>
is equal to &lt;em&gt;TERM&lt;/em&gt; and in xsl both are just a string
(XPath would match it as text()).
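
To make that concrete, here is a small sketch. The element names r, a and b
are made up for the illustration. Given this input:

<r><a><![CDATA[<em>TERM</em>]]></a><b>&lt;em&gt;TERM&lt;/em&gt;</b></r>

the stylesheet below should print "true 0": the two children have the same
string value, and neither of them contains an em element.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/r">
        <!-- CDATA and entity-escaped text yield the same string value -->
        <xsl:value-of select="a = b"/>
        <xsl:text> </xsl:text>
        <!-- no em element exists in either child, only a text node -->
        <xsl:value-of select="count(a/em | b/em)"/>
    </xsl:template>
</xsl:stylesheet>
```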

IMO the highlighter should really return pure xml and not escape it.
I will have a look at the XmlResponseWriter; maybe I can find a way to
change this.
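
One stylesheet-side workaround that may be enough in some setups is
disable-output-escaping, a standard but optional XSLT 1.0 feature. A sketch,
assuming the snippets are <str> elements under <lst name="highlighting">
(an assumption about the response layout) and that the processor serializes
the result itself; it has no effect when the result is handed on as a tree
or as SAX events:

```xml
<!-- hypothetical match pattern; adjust to the actual response layout -->
<xsl:template match="lst[@name='highlighting']//str">
    <!-- write the string out without re-escaping '<' and '>' -->
    <xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>
```

This only changes how the text is written out. Inside the transform the
value is still a plain string, so a second transform is still needed to see
<em> as an element.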

salu2


>
> -edward
>
> On 1/2/07, Mike Klaas <[EMAIL PROTECTED]> wrote:
> >
> > Hi Thorsten,
> >
> > The highlighter does not escape anything itself: you are seeing the
> > results of solr's automatic escaping of xml data within its xml
> > response.  This should be transparent (your xml decoder should
> > un-escape the values on the way out).  I'm not really familiar with
> > xslt so I'm unsure why that isn't so (perhaps it is automatically
> > html-escaping the values after un-xml-escaping them?)
> >
> > Be careful of documents containing html fragments natively.
> >
> > cheers,
> > -MIke
> >
> > On 1/2/07, Thorsten Scherler <[EMAIL PROTECTED]> wrote:
> > > Hi all,
> > >
> > > I am playing around with the highlighter and found that all highlight
> > > terms get escaped.
> > >
> > > I mean solr will return
> > >  &lt;em&gt;TERM&lt;/em&gt; and not
> > > <em> TERM </em>
> > >
> > > I am not sure where this escaping is happening but I would need the
> > > highlighting to NOT escape the hl.simple.pre and hl.simple.post tags
> > > since it is a horror to work with cdata sections in xsl.
> > >
> > > I had a look in the lucene highlighter and it seems that it does not
> > > escape the tags.
> > >
> > > Can somebody point me to the code which is responsible for the escaping
> > > and maybe give me a tip on how I can patch it to make it configurable?
> > >
> > > TIA
> > >
> > > salu2
> > >
> > >
> >
>
>
>
--
thorsten

"Together we stand, divided we fall!"
Hey you (Pink Floyd)





--
Edward Garrett

Visiting Fellow (2006-07)
Endangered Languages Academic Programme
School of Oriental and African Studies
London, UK
0207 898 4536

Assistant Professor, Linguistics Program
Eastern Michigan University
612 Pray-Harrold Building
Ypsilanti, MI, USA
