Alexandru Corghencea created SOLR-14337:
-------------------------------------------

             Summary: SOLR 8.0 NoClassDefFoundError when change schema and 
index MSOffice and PDF files
                 Key: SOLR-14337
                 URL: https://issues.apache.org/jira/browse/SOLR-14337
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: contrib - Solr Cell (Tika extraction)
    Affects Versions: 8.1, 8.0
            Reporter: Alexandru Corghencea


After a schema change (only a new field was added), indexing throws 
*NoClassDefFoundError* for practically every MS Office or PDF document that 
indexed without problems before the schema change. The error appears as:
 
{{ERROR d.t.d.d.b.d.SolrRequestExecutor - Failed to index file
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://localhost:8983/solr: Expected mime type 
application/octet-stream but got text/html.}}

with {{"Error 500 Server Error"}}.

Environment:
 * Linux Ubuntu 18.04
 * Solr 8.0.0
 * Oracle Java 1.8

This happens only with the release version of our project :)

Already tried:
 * reload core
 * reboot server
 * delete and re-index all

The server has enough disk space, and CPU and memory still have headroom during 
indexing.

Maybe some temporary junk files are left in the "data" folder or elsewhere. 
Or maybe the core gets corrupted after the schema change, although it should 
be cleaned up when the core is reloaded.
 
{{2019-11-08 20:05:41 ERROR d.t.d.d.b.d.SolrRequestExecutor - Failed to index file
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr: Expected mime type application/octet-stream but got text/html.
<html><head><meta http-equiv="Content-Type" content="text/html;charset=utf-8"/><title>Error 500 Server Error</title></head>
<body><h2>HTTP ERROR 500</h2><p>Problem accessing /solr/document/update/extract. Reason:<pre>    Server Error</pre></p>
<h3>Caused by:</h3><pre>java.lang.NoClassDefFoundError: Could not initialize class javax.imageio.ImageIO
        at org.apache.tika.parser.image.ImageParser.parse(ImageParser.java:177)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
        at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
        at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
        at org.apache.tika.extractor.EmbeddedDocumentUtil.parseEmbedded(EmbeddedDocumentUtil.java:220)
        at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedResource(AbstractPOIFSExtractor.java:124)
        at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedResource(AbstractPOIFSExtractor.java:100)
        at org.apache.tika.parser.microsoft.WordExtractor.handlePictureCharacterRun(WordExtractor.java:640)
        at org.apache.tika.parser.microsoft.WordExtractor.handleParagraph(WordExtractor.java:367)
        at org.apache.tika.parser.microsoft.WordExtractor.handleHeaderFooter(WordExtractor.java:259)
        at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:182)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:175)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2559)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:394)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:340)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.Server.handle(Server.java:502)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
        at java.base/java.lang.Thread.run(Thread.java:834)</pre></body></html>
        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:607)
        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
        at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
        at de.tsystems.dep.document.base.dao.SolrRequestExecutor.doSolrUpdateRequest(SolrRequestExecutor.java:42)
        at de.tsystems.dep.document.base.dao.SolrRequestExecutor.indexAuditFile(SolrRequestExecutor.java:35)
        at de.tsystems.dep.document.base.business.handler.WordIndexingHandler.createAuditedFile(WordIndexingHandler.java:31)
        at de.tsystems.dep.document.base.business.handler.DocumentIndexingHandler.handleAuditedFile(DocumentIndexingHandler.java:52)
        at de.tsystems.dep.document.base.business.handler.DocumentIndexingHandler.receiveAuditedFile(DocumentIndexingHandler.java:36)
        at de.tsystems.dep.document.base.DepDocumentBaseApplication.runSvnIndexing(DepDocumentBaseApplication.java:65)
        at de.tsystems.dep.document.base.DepDocumentBaseApplication.scheduledRun(DepDocumentBaseApplication.java:50)
        at de.tsystems.dep.document.base.DepDocumentBaseApplication.main(DepDocumentBaseApplication.java:42)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
        at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
        at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
        at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)}}
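
A possible way to narrow this down: the JVM reports "Could not initialize 
class javax.imageio.ImageIO" when an earlier static initialization of that 
class has already failed in the same JVM, so the first failure usually carries 
the real cause. The java.base/ frames in the trace also suggest the runtime is 
Java 9 or later, so it may be worth confirming which JVM is actually in use. 
Below is a minimal sketch of a standalone check, assuming it is run with the 
same Java binary, user, and JVM options as the Solr process. ClassInitProbe 
and probe are hypothetical names for illustration, not part of Solr or Tika.

```java
/** Hypothetical diagnostic helper: forces class initialization and reports why it fails. */
final class ClassInitProbe {

    /** Loads and initializes the named class, returning a one-line description of the outcome. */
    static String probe(String className) {
        try {
            // "true" forces static initialization, which is the step that fails in the trace.
            Class.forName(className, true, ClassInitProbe.class.getClassLoader());
            return "OK: " + className + " initialized";
        } catch (ClassNotFoundException e) {
            return "MISSING: " + className + " is not visible to this class loader";
        } catch (ExceptionInInitializerError e) {
            // First failure in this JVM: the cause is the exception thrown by the static initializer.
            return "INIT FAILED: " + className + " caused by " + e.getCause();
        } catch (NoClassDefFoundError e) {
            // Later failures only repeat that initialization did not succeed.
            return "ALREADY FAILED: " + e.getMessage();
        } catch (LinkageError e) {
            // Errors thrown directly by the initializer, e.g. UnsatisfiedLinkError for a native library.
            return "LINKAGE ERROR: " + className + " caused by " + e;
        }
    }

    public static void main(String[] args) {
        // Print the settings that commonly differ between environments.
        System.out.println("java.version=" + System.getProperty("java.version"));
        System.out.println("java.home=" + System.getProperty("java.home"));
        System.out.println("java.awt.headless=" + System.getProperty("java.awt.headless"));
        System.out.println(probe("javax.imageio.ImageIO"));
    }
}
```

If the probe reports an initialization failure, the printed cause is what to 
look for in the Solr log at the first failing request after startup.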



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
