Re: Managing ZIP files inside ZIP files

2015-11-04 Thread Alexandre Rafalovitch
How are you ingesting them now? I'd probably use Java 8 with SolrJ and the new virtual file system approach to read right out of the zip and gzip. http://docs.oracle.com/javase/8/docs/api/java/nio/file/FileSystems.html#newFileSystem-java.nio.file.Path-java.lang.ClassLoader- Tar is a bit harder, t
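
A minimal sketch of the virtual file system approach mentioned above, assuming a plain ZIP archive on local disk. The class name ZipWalker and the method listEntries are hypothetical names chosen for illustration. Only the ZIP layer is covered here; the gzip and tar layers discussed in the thread, and the SolrJ indexing step, are not handled.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: mounts a ZIP file as a java.nio FileSystem and lists its entries.
class ZipWalker {

    /**
     * Returns the paths of regular files inside the archive, descending at most
     * maxDepth directory levels below the archive root (1 = top-level entries only).
     */
    static List<String> listEntries(Path zip, int maxDepth) throws IOException {
        List<String> names = new ArrayList<>();
        // The cast picks the (Path, ClassLoader) overload linked in the message above.
        try (FileSystem fs = FileSystems.newFileSystem(zip, (ClassLoader) null)) {
            for (Path root : fs.getRootDirectories()) {
                try (Stream<Path> walk = Files.walk(root, maxDepth)) {
                    walk.filter(Files::isRegularFile)
                        .forEach(p -> names.add(p.toString()));
                }
            }
        }
        Collections.sort(names); // walk order is not guaranteed, so sort for stable output
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ZipWalker <archive.zip>
        for (String name : listEntries(Paths.get(args[0]), 1)) {
            System.out.println(name);
        }
    }
}
```

Limiting maxDepth to 1 lists only the top-level entries without visiting nested directories. Each entry could then be opened with Files.newInputStream on a path from the same mounted file system (while it is still open) and passed on to whatever indexing code is in use.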

Managing ZIP files inside ZIP files

2015-11-04 Thread Frédéric Olier
Hi, I have a ZIP (tar.gz) that contains many (> 100) other tar.gz files inside. Solr takes ages to ingest the document. I'd like to know whether other users have experience with such a configuration and what solution they found. Is there a way to tell Solr to go '1 level deep' while analysing the