: I'm trying to get anything to index. Starting with the simplest file
: possible. As it stands no extraction is working. I'm just trying to get any
: extraction working. I've followed that guide, I'll try again.

let's back up for a minute.

You have a plain text file, and you want to index it.

you are attempting to index it using the /update/extract handler.

so far so good: for arbitrary, unstructured files that's the correct 
approach.

the stack trace you are getting is a very low level java error -- it has 
*NOTHING* to do with the names of fields in your schema (or in your 
document) as some previous people in this thread have stated...


> java.lang.Thread.run(Thread.java:745)\nCaused by:
> java.lang.NoSuchFieldError: LFH_SIG\n\tat
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.<clinit>(ZipArchiveInputStream.java:766)\n\tat
> org.apache.commons.compress.archivers.ArchiveStreamFactory.createArchiveInputStream(ArchiveStreamFactory.java:280)\n\tat
> org.apache.tika.parser.pkg.ZipContainerDetector.detectArchiveFormat(ZipContainerDetector.java:113)\n\tat
> org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:77)\n\tat
> org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)\n\tat
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:113)\n\tat

...what that says is that when the "ArchiveStreamFactory" class caused the 
JVM to load the "ZipArchiveInputStream" class, there was a mismatch between 
the fields it expected to find in a class it depends on (ie: what fields 
that class had at compilation) and what fields were actually found at run 
time.

This *USUALLY* means that your classpath has conflicting versions of 
classes in it.
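
One way to check for that kind of conflict is to look for every jar under 
your install that contains the class named in the stack trace. The 
following is only a rough sketch, not anything that ships with solr: the 
class name "JarConflictFinder" and its method are hypothetical, and the 
default entry is simply the ZipArchiveInputStream class from the trace 
above. It assumes you point it at the top of your servlet container / solr 
directory tree.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipFile;

// Hypothetical helper: lists every jar under a directory that contains a given class file.
public class JarConflictFinder {

    /** Returns every *.jar under root that has an entry named classEntry. */
    public static List<Path> findJarsContaining(Path root, String classEntry) throws IOException {
        List<Path> jars;
        try (Stream<Path> walk = Files.walk(root)) {
            jars = walk.filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".jar"))
                       .sorted()
                       .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path jar : jars) {
            try (ZipFile zf = new ZipFile(jar.toFile())) {
                if (zf.getEntry(classEntry) != null) {
                    hits.add(jar);
                }
            } catch (IOException e) {
                // Unreadable or corrupt jar: skip it rather than abort the scan.
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // Default: the class from the stack trace, written as a jar entry path.
        String entry = args.length > 1
                ? args[1]
                : "org/apache/commons/compress/archivers/zip/ZipArchiveInputStream.class";
        List<Path> hits = findJarsContaining(root, entry);
        for (Path hit : hits) {
            System.out.println(hit);
        }
        if (hits.size() > 1) {
            System.out.println("More than one jar provides " + entry + " (possible version conflict)");
        }
    }
}
```

If more than one jar shows up, that does not prove both are on the same 
classpath, but it tells you which files to look at first.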

> I've tried new version of common compress - to no avail - any ideas
> kind regards,Joel

do not do that.  do not try to change classes out from under solr.  this 
will not end well.

As far as why you might see errors related to class files used for 
parsing "ZIP" files coming from the extracting request handler when you 
upload a text file -- that's easy to explain:  if you haven't explicitly 
*TOLD* the extracting request handler that you are sending it plain text, 
then Tika uses "detectors" to try and figure out the file type.  this error 
is coming from the "ZipContainerDetector" which Tika was trying to use to 
see if the file *might* be a ZIP-based file -- and it encountered such a 
low level java error that Tika couldn't proceed.
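
For illustration, this is roughly what "telling" the handler the type 
looks like: the extracting request handler accepts a "stream.type" request 
param naming the MIME type, alongside params like "literal.id" and 
"commit". The sketch below only builds such a URL; the class name, the 
host/port, the core name "collection1" and the id "doc1" are all 
assumptions made up for the example. Treat it as a diagnostic aid only: 
skipping detection does not fix a classpath with conflicting jars.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds an /update/extract URL that states the content type up front.
public class ExtractUrlBuilder {

    public static String buildExtractUrl(String coreUrl, String docId, String mimeType)
            throws UnsupportedEncodingException {
        // LinkedHashMap keeps the params in insertion order so the URL is predictable.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("literal.id", docId);     // value for the "id" field of the new document
        params.put("stream.type", mimeType); // declared MIME type, so no auto-detection is needed
        params.put("commit", "true");

        StringBuilder url = new StringBuilder(coreUrl).append("/update/extract");
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            url.append(sep)
               .append(URLEncoder.encode(e.getKey(), "UTF-8"))
               .append('=')
               .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            sep = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Assumed local install and core name; adjust to match your own setup.
        System.out.println(buildExtractUrl("http://localhost:8983/solr/collection1", "doc1", "text/plain"));
    }
}
```

You would then POST the file body to that URL with whatever HTTP client 
you normally use.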

to get to the bottom of *WHY* you are getting such a low level error, you 
have to provide us a *LOT* more details than you have...

 - which version of solr?

 - how did you install it? what servlet container?

 - what does your installation directory structure look like?

 - did you move/copy/add any jars to your servlet container (before you 
got this error and said you tried to upgrade compress)?

 - what do your full servlet container logs & solr logs look like? ... 
both on startup and when this error happened (they might give us clues 
as to other jars, and other versions of jars, being loaded that should not 
be).

details matter...

https://wiki.apache.org/solr/UsingMailingLists






-Hoss
http://www.lucidworks.com/
