I have a valid XML document that begins:

<add><doc><field name="id">mdp.39015052775379</field>
<field name="rights">2</field>
<field name="title">Technology transfer and in-house R&amp;D in Indian industry : in the later 1990s / edited and with an introduction by Binay Kumar Pattnaik. v.1</field>
<field name="author">Not found</field>
<field name="ocr"> TECHNOLOGY
TRANSFER AND
IN.HOUSE R&amp;D
IN
INDIAN
INDUSTRY

I believe Solr is throwing an exception when it sees the line:

IN.HOUSE R&amp;D

The error message is:

SEVERE: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32); expected a semi-colon after the reference for entity 'D'

This seems wrong. It is as though the parser has already converted &amp;D to &D and is then complaining about the missing semi-colon.
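
One way to test that theory outside Solr is sketched below. It is only a rough check under stated assumptions: it uses Python's expat-based parser rather than the Woodstox parser Solr uses, the sample string is a cut-down stand-in for the real document, and the "decode once" step is a hypothetical stand-in for whatever might be un-escaping the text before it reaches Solr. The helper name first_parse_error is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import unescape


def first_parse_error(xml_text):
    """Return None if xml_text is well-formed, else (line, column) of the first error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position
    return None


# Cut-down stand-in for the real document (not the actual payload).
escaped = '<add><doc>\n<field name="ocr">IN.HOUSE R&amp;D\nIN</field>\n</doc></add>'

# Hypothetical step that decodes entities once before the document is posted.
decoded_once = unescape(escaped)

print(first_parse_error(escaped))       # properly escaped text
print(first_parse_error(decoded_once))  # text containing a bare "&D"
```

If the escaped form parses cleanly and only the decoded form fails, that would suggest the bare ampersand is introduced somewhere before Solr's parser sees the text, rather than by the parser itself.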

Can anyone make sense of this?

Full traceback follows.

Thanks!!

Phil

----
        
Solr Specification Version: 1.3.0.2008.12.04.08.06.02
Solr Implementation Version: nightly exported - yonik - 2008-12-04 08:06:02
Lucene Specification Version: 2.9-dev
Lucene Implementation Version: 2.9-dev 719313 - 2008-11-20 23:51:24
Current Time: Tue Aug 25 17:51:57 EDT 2009
Server Start Time: Tue Aug 25 17:10:44 EDT 2009

Aug 25, 2009 12:42:16 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/mbooks-ls-shard-2 path=/update params={} status=500 QTime=4
Aug 25, 2009 12:42:16 PM org.apache.solr.common.SolrException log
SEVERE: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32); expected a semi-colon after the reference for entity 'D'
 at [row,col {unknown-source}]: [4,57]
        at com.ctc.wstx.exc.WstxLazyException.throwLazily(WstxLazyException.java:45)
        at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:729)
        at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3659)
        at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
        at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:276)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1313)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874)
        at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
        at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
        at java.lang.Thread.run(Thread.java:619)
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32); expected a semi-colon after the reference for entity 'D'
 at [row,col {unknown-source}]: [4,57]
        at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:648)
        at com.ctc.wstx.sr.StreamScanner.parseEntityName(StreamScanner.java:1994)
        at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1496)
        at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4681)
        at com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4126)
        at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3701)
        at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3649)
        ... 24 more
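
The trace reports the failure at [row,col] = [4,57]. The fourth line of the excerpt at the top (the author field) is much shorter than 57 columns, so the text Solr actually received may not line up with the excerpt. A small helper like the one below could show what sits at a reported position in the exact bytes that were posted. This is a sketch only: context_at is a made-up name, the payload string is a toy stand-in, and it assumes the reported row and column are 1-based.

```python
def context_at(text, row, col, width=20):
    """Return the characters around a 1-based (row, col) position in text."""
    lines = text.splitlines()
    if not 1 <= row <= len(lines):
        raise ValueError(f"row {row} out of range (1..{len(lines)})")
    line = lines[row - 1]
    # Convert the 1-based column to a 0-based index and clamp the window to the line.
    start = max(0, col - 1 - width)
    end = min(len(line), col - 1 + width)
    return line[start:end]


# Toy stand-in for the posted payload (not the real document).
payload = 'one\ntwo\nthree\n<field name="title">R&D in</field>'

# Show a few characters either side of row 4, column 22 of the toy payload.
print(repr(context_at(payload, 4, 22, width=3)))
```

Run against the real request body with row 4 and column 57, this would show whether an unescaped ampersand is already present at that spot.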

