I’ve isolated the problem to a template definition that is trying to replace
space characters with non-breaking spaces. Evidently it clobbers some surrogate
pairs. FWIW, here are the offending lines:
<xsl:template name="zero_width_space_1">
<xsl:param name="data"/>
<xsl:param name="counter" select="0"/>
<xsl:choose>
<xsl:when test="$counter &lt; string-length($data)+1">
<xsl:value-of select='concat(substring($data,$counter,1),"​")'/>
<xsl:call-template name="zero_width_space_2">
<xsl:with-param name="data" select="$data"/>
<xsl:with-param name="counter" select="$counter+1"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="zero_width_space_2">
<xsl:param name="data"/>
<xsl:param name="counter"/>
<xsl:value-of select='concat(substring($data,$counter,1),"​")'/>
<xsl:call-template name="zero_width_space_1">
<xsl:with-param name="data" select="$data"/>
<xsl:with-param name="counter" select="$counter+1"/>
</xsl:call-template>
</xsl:template>
So, not an FOP problem.
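For anyone hitting the same thing, here is a minimal Java sketch of what I believe is going on. It assumes the XSLT processor counts UTF-16 code units in substring() and string-length(), the way Java strings do; a processor that counts whole Unicode characters would not split a pair. The class and method names are made up for illustration, and this is neither the stylesheet nor FOP code.

```java
class SurrogateSplitSketch {
    static final char ZWSP = '\u200B'; // zero-width space

    // Mimics the template under the stated assumption: one UTF-16 code unit
    // at a time, so a separator lands between the two halves of a pair.
    static String interleaveByCodeUnit(String data) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < data.length(); i++) {
            out.append(data.charAt(i)).append(ZWSP);
        }
        return out.toString();
    }

    // Surrogate-safe variant: step by code point, so a pair stays together.
    static String interleaveByCodePoint(String data) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < data.length(); ) {
            int cp = data.codePointAt(i);
            out.appendCodePoint(cp).append(ZWSP);
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // Renders each UTF-16 code unit as four hex digits, for inspection.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String data = "A\uD840\uDC0B"; // 'A' followed by U+2000B (one surrogate pair)
        System.out.println(hex(interleaveByCodeUnit(data)));  // 0041 200B D840 200B DC0B 200B
        System.out.println(hex(interleaveByCodePoint(data))); // 0041 200B D840 DC0B 200B
    }
}
```

In the first output the high surrogate D840 is followed by 200B instead of its low half, which is exactly an isolated high surrogate. Stepping by code point keeps the pair intact.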
Marc
From: Marc Kaufman
Sent: Thursday, July 14, 2016 12:22 PM
To: [email protected]
Subject: RE: isolated high surrogate
I tried that. Doesn’t work. I understand that non-BMP is not supported, and I’m
prepared to live with two .notdef characters in the result, but I’m not sure
why I’m getting the fatal error from the parser.
From: Glenn Adams
Sent: Thursday, July 14, 2016 12:01 PM
To: FOP Users
Subject: Re: isolated high surrogate
Non-BMP characters are not presently supported by FOP, see [1]. When they are
supported, you would best encode them in a file using a single numeric
character reference (not two), e.g., &#x10001;.
[1] https://issues.apache.org/jira/browse/FOP-1969
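To illustrate the single-reference form, here is a minimal Java sketch that rewrites each non-BMP character as one hexadecimal numeric character reference for the whole code point. The class and method names are hypothetical, and this is only a preprocessing idea, not something FOP provides. Writing one reference per surrogate half would not be well-formed XML, because surrogate code points fall outside the XML Char range.

```java
class NonBmpEscaper {
    // Replaces every supplementary (non-BMP) code point with a single
    // reference of the form &#xHHHHH; and copies BMP characters unchanged.
    static String escapeNonBmp(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The surrogate pair D800 DC01 encodes the code point U+10001.
        System.out.println(escapeNonBmp("x\uD800\uDC01y")); // x&#x10001;y
    }
}
```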
On Thu, Jul 14, 2016 at 12:51 PM, Marc Kaufman wrote:
I’m stumped by this error:
org.xml.sax.SAXParseException; lineNumber: 92; columnNumber: 51;
java.lang.IllegalArgumentException: isolated high surrogate
I have text with surrogate pairs throughout the file, but this only occurs in
this context:
<fo:block padding-top="2em" padding-bottom=".5em" text-align="left"
font-family="Kozuka Gothic PR6N" font-size="18pt" color="black">
<xsl:call-template name="zero_width_space_1">
<xsl:with-param name="data" select="@documentName"/>
</xsl:call-template>
</fo:block>
I’ve checked the input stream, and all the surrogates are correctly paired.
I’ve tried escaping the surrogate pairs (e.g. “&#-integer-;”), but that doesn’t
change the error.
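One way to double-check the pairing is to scan a value for a surrogate that lacks its partner. A minimal Java sketch follows; the class and method names are hypothetical, and it inspects a Java String (for example the attribute value after parsing) rather than the raw byte stream.

```java
class SurrogatePairCheck {
    // Returns the index of the first unpaired surrogate code unit,
    // or -1 if every surrogate in s belongs to a valid high/low pair.
    static int firstIsolatedSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i++; // valid pair: skip the low half
                } else {
                    return i; // high surrogate without a following low surrogate
                }
            } else if (Character.isLowSurrogate(c)) {
                return i; // low surrogate without a preceding high surrogate
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIsolatedSurrogate("A\uD840\uDC0B"));       // -1
        System.out.println(firstIsolatedSurrogate("A\uD840\u200B\uDC0B")); // 1
    }
}
```

Running the same check on the transformed output, not just the input, would show whether a stylesheet step is what separates the halves.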