Hello Cam,

Are you writing your XML by hand, i.e. without an XML writer? That can cause
problems. Your exception shows "latitude 59&"; the & should have been
escaped as '&amp;' (I think). If you can use Java 6, there is an
XMLStreamWriter in javax.xml.stream that escapes special characters
automatically. This can simplify writing simple XML.
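
As a rough sketch of what that could look like (the class name, the method
name and the "id"/"text" field names are made up for illustration; use
whatever fields your Solr schema defines), writeCharacters() and
writeAttribute() take care of the escaping:

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: builds an <add><doc>...</doc></add> message as a string.
class SolrDocWriter {

    static String toAddXml(String id, String text) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("add");
            w.writeStartElement("doc");
            writeField(w, "id", id);      // assumed field name
            writeField(w, "text", text);  // assumed field name
            w.writeEndElement(); // doc
            w.writeEndElement(); // add
            w.flush();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
        return out.toString();
    }

    private static void writeField(XMLStreamWriter w, String name, String value)
            throws XMLStreamException {
        w.writeStartElement("field");
        w.writeAttribute("name", name);
        // '&', '<' and '>' in the value are escaped by the writer here.
        w.writeCharacters(value);
        w.writeEndElement();
    }

    public static void main(String[] args) {
        System.out.println(toAddXml("doc1", "latitude 59& longitude 18"));
    }
}
```

With this, the raw text from the PDF can be passed in unchanged; the
ampersand from your exception would be written out as an entity instead of
a bare '&'.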

Unfortunately, the stream writer does not filter out characters that are
invalid in XML, so I will point you to a helpful website:
http://cse-mjmcl.cse.bris.ac.uk/blog/2007/02/14/1171465494443.html
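
In case it is useful, here is a minimal sketch of such a filter. It is my
own illustration and not necessarily the code from that page; the class and
method names are made up. It keeps only the code points that the XML 1.0
specification allows (tab, line feed, carriage return, U+0020 to U+D7FF,
U+E000 to U+FFFD and U+10000 to U+10FFFF) and drops everything else:

```java
// Hypothetical helper: removes characters that XML 1.0 does not allow.
class XmlCharFilter {

    // True if the code point matches the Char production of XML 1.0.
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static String stripInvalidXmlChars(String in) {
        if (in == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(in.length());
        // Walk by code point so surrogate pairs are handled as one character;
        // unpaired surrogates fall in the invalid range and are dropped.
        for (int i = 0; i < in.length(); ) {
            int cp = in.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

You would run the extracted PDF text through stripInvalidXmlChars() first
and then hand the result to the XML writer, since the two steps solve
different problems (invalid characters vs. unescaped markup characters).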


Hope this helps.

Brian

On Wednesday, 14.05.2008 at 19:23 +0300, Cam Bazz wrote:
> Hello,
> 
> I made a simple java program to convert my pdfs to text, and then to xml
> file.
> I am getting a strange exception. I think the converted files have some
> errors. should I encode the txt string that I extract from the pdfs in a
> special way?
> 
> Best,
> -C.B.
> 
> SEVERE: org.xmlpull.v1.XmlPullParserException: entity reference names can not
> start with character ' ' (position: START_TAG seen
> ...ay\n                                                  latitude 59& ...
> @80:64)
>         at org.xmlpull.mxp1.MXParser.parseEntityRef(MXParser.java:2212)
>         at org.xmlpull.mxp1.MXParser.nextImpl(MXParser.java:1275)
>         at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)
>         at org.xmlpull.mxp1.MXParser.nextText(MXParser.java:1058)
>         at
> org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:332)
>         at
> org.apache.solr.handler.XmlUpdateRequestHandler.update(XmlUpdateRequestHandler.java:162)
>         at
> org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:84)
>         at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
>         at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:191)
>         at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:159)
>         at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>         at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>         at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>         at
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>         at
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>         at
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>         at
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>         at
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>         at
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>         at org.mortbay.jetty.Server.handle(Server.java:285)
>         at
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>         at
> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
>         at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
>         at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
>         at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>         at
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>         at
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
