This is an issue with "extractOnly=true" on Solr 3.6.1. We upgraded to 4.0 Beta 
2 and the problem went away. Posting this in case anyone else runs into it.

Sincerely,
Alex 


-----Original Message-----
From: Alexander Cougarman [mailto:acoug...@bwc.org] 
Sent: 23 August 2012 12:27 PM
To: solr-user@lucene.apache.org
Subject: Can't extract Outlook message files

Hi. We're trying to use the following curl command to perform an "extract only" 
of a *.MSG file, but it fails with an HTTP 500 error:

   curl "http://localhost:8983/solr/update/extract?extractOnly=true" -F "myfile=@900002.msg"

If we do this, it works fine:

   curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" -F "myfile=@900002.msg"

We've tried a variety of MSG files and they all produce the same error; they 
all have content in them. What are we doing wrong?

Here's the exception the extractOnly=true command generates:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/> 
<title>Error 500 null

org.apache.solr.common.SolrException
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:233)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:244)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@aaf063
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:227)
        ... 23 more
Caused by: java.lang.IllegalStateException: Internal: Internal error: element state is zero.
        at org.apache.xml.serialize.BaseMarkupSerializer.leaveElementState(Unknown Source)
        at org.apache.xml.serialize.XMLSerializer.endElementIO(Unknown Source)
        at org.apache.xml.serialize.XMLSerializer.endElement(Unknown Source)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.SecureContentHandler.endElement(SecureContentHandler.java:256)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.SafeContentHandler.endElement(SafeContentHandler.java:273)
        at org.apache.tika.sax.XHTMLContentHandler.endDocument(XHTMLContentHandler.java:213)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:178)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        ... 26 more
</title>
</head>
<body><h2>HTTP ERROR 500</h2>
<p>Problem accessing /solr/update/extract. Reason:
<pre>    null

org.apache.solr.common.SolrException
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:233)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:244)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@aaf063
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:227)
        ... 23 more
Caused by: java.lang.IllegalStateException: Internal: Internal error: element state is zero.
        at org.apache.xml.serialize.BaseMarkupSerializer.leaveElementState(Unknown Source)
        at org.apache.xml.serialize.XMLSerializer.endElementIO(Unknown Source)
        at org.apache.xml.serialize.XMLSerializer.endElement(Unknown Source)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.SecureContentHandler.endElement(SecureContentHandler.java:256)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.SafeContentHandler.endElement(SafeContentHandler.java:273)
        at org.apache.tika.sax.XHTMLContentHandler.endDocument(XHTMLContentHandler.java:213)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:178)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        ... 26 more
</pre></p><hr /><i><small>Powered by Jetty://</small></i><br/>


</body>
</html>
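
The error page above is long, and the part that matters is the chain of 
"Caused by:" entries, the last of which is the root cause. For anyone triaging 
a similar page, here is a minimal Python sketch that pulls out that chain. It 
assumes the page text has been saved to a string; the function name 
caused_by_chain is hypothetical and not part of Solr or Tika. It also tolerates 
messages that the console wrapped onto a second line.

```python
def caused_by_chain(trace_text):
    """Return the message of each 'Caused by:' entry, outermost first."""
    causes = []
    current = None  # message being assembled, or None when not inside one
    for line in trace_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Caused by:"):
            if current is not None:
                causes.append(current)
            current = stripped[len("Caused by:"):].strip()
        elif current is not None:
            # A stack frame, an elision marker, or a blank line ends the message.
            if not stripped or stripped.startswith("at ") or stripped.startswith("..."):
                causes.append(current)
                current = None
            else:
                # Otherwise treat the line as a wrapped continuation of the message.
                current += " " + stripped
    if current is not None:
        causes.append(current)
    return causes
```

Applied to the page above, the last element of the returned list would be the 
java.lang.IllegalStateException entry raised from the XML serializer.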


Sincerely,
Alex 

