On Sat, Nov 13, 2010 at 1:50 PM, Steven A Rowe <sar...@syr.edu> wrote:
> Looks to me like the returned value is in a Solr-internal form of XML 
> character escaping: \u0000 is represented as "#0;" and \u0008 is represented 
> as "#8;".  (The escaping code is in 
> solr/src/java/org/apache/common/util/XML.java.)

Yep, there is no legal way to represent some Unicode code points in XML.
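
For reference, here is a minimal sketch of the XML 1.0 "Char" production as a Java predicate. The class and method names are made up for illustration; this is not the code in Solr's XML.java. Both \u0000 and \u0008 fall outside the allowed ranges, and XML 1.0 does not permit them as numeric character references (such as "&#8;") either.

```java
// Hypothetical helper, not part of Solr: checks a code point against
// the Char production of the XML 1.0 specification (section 2.2).
final class Xml10Chars {
    private Xml10Chars() {}

    /** Returns true if the code point may appear in an XML 1.0 document. */
    static boolean isLegal(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD      // tab, LF, CR
            || (cp >= 0x20 && cp <= 0xD7FF)             // BMP below the surrogates
            || (cp >= 0xE000 && cp <= 0xFFFD)           // BMP above the surrogates
            || (cp >= 0x10000 && cp <= 0x10FFFF);       // supplementary planes
    }
}
```

With this, Xml10Chars.isLegal(0x0) and Xml10Chars.isLegal(0x8) are both false, which is why some out-of-band escaping is needed at all.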

> You can get the value back in its original binary form by unescaping the 
> /#[0-9]+;/ format.  Here is a test illustrating this fix that I added to 
> SolrExampleTests, then ran from SolrExampleEmbeddedTest:

The problem here is that one might then unescape what was meant to be
a literal "#8;".
One could come up with a full escaping mechanism on top of XML, I suppose,
but I'm not sure it would be worth it.
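
To make the ambiguity concrete, here is a rough sketch in Java. Everything in it (the class name, escape(), unescape()) is hypothetical and only models the "#N;" format described above; it is not what Solr does. The unescape() method is the naive /#[0-9]+;/ replacement. The escape() method shows one possible shape of a "full" mechanism, under the assumption that the writer also escapes '#' itself (as "#35;") so that unescaping becomes unambiguous.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not Solr code.
final class HashEscapeSketch {
    private HashEscapeSketch() {}

    // At most 7 digits, so Integer.parseInt cannot overflow.
    private static final Pattern ESCAPED = Pattern.compile("#([0-9]{1,7});");

    /** Same ranges as the XML 1.0 Char production. */
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Naive unescape: every "#N;" becomes the character with code point N. */
    static String unescape(String s) {
        Matcher m = ESCAPED.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp = Integer.parseInt(m.group(1));
            // Leave the text untouched if N is not a valid code point.
            String repl = Character.isValidCodePoint(cp)
                ? new String(Character.toChars(cp))
                : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(repl));
        }
        m.appendTail(out);
        return out.toString();
    }

    /**
     * Assumed "full" scheme: escape XML-illegal code points AND '#' itself,
     * so a literal "#8;" is written as "#35;8;" and survives unescape().
     */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '#' || !isLegalXml10(cp)) {
                out.append('#').append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }
}
```

If the writer does not escape '#', then unescape("see bug #8; for details") silently turns the literal "#8;" into a backspace character. With the assumed escape() above, unescape(escape(s)) returns s for both a literal "#8;" and a real \u0008, at the cost of every '#' in stored text being rewritten on the wire.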

-Yonik
http://www.lucidimagination.com
