Re: ExtractingRequestHandler indexing zip files

2014-09-11 Thread keeblerh
Working now. FYI: posting to "update/extract" works for extracting from a kmz (zip), but I am still having trouble with the dataimport. I'll move to another thread for that. THANKS all. -- View this message in context: http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip

Re: ExtractingRequestHandler indexing zip files

2014-09-10 Thread keeblerh
Thanks for the info Sergio. I updated my 4.8.1 version with that patch and SOLR 4216 (which was really the same thing). It took a day to get it to compile on my network and it still doesn't work. Did my config file look correct? I'm wondering if I need another param somewhere. "Patch has to be

Re: ExtractingRequestHandler indexing zip files

2014-09-09 Thread marotosg
Hi keeblerh, the patch has to be applied to the source code and Solr.war compiled again. Once you do that, it extracts the content of the documents. Regards, Sergio -- View this message in context: http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip-files-tp4138172p415767

Re: ExtractingRequestHandler indexing zip files

2014-09-09 Thread keeblerh
I am also having the issue where my zip contents (or kmz contents) are not being processed; only the file names are processed. It seems to recognize the kmz extension and open the file, but it doesn't recurse into the contents. The patch you mention has been around for a while. I am ru

Re: ExtractingRequestHandler indexing zip files

2014-05-28 Thread marotosg
I extended ExtractingDocumentLoader with this patch and it works. https://issues.apache.org/jira/secure/attachment/12473188/SOLR-2416_ExtractingDocumentLoader.patch It iterates through all documents and extracts the name and the content of every document inside the file. Regards, Sergio -- View thi
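
For readers who want to see the idea without reading the Java patch, here is a minimal Python sketch of "iterate through every document in the archive and extract its name and content". It is not the SOLR-2416 code; the function names `iter_archive_entries` and `make_zip` and the `!/` path separator for nested archives are assumptions made for illustration only.

```python
import io
import zipfile


def iter_archive_entries(data, prefix=""):
    """Yield (path, content_bytes) for every file in a zip archive.

    Nested zip archives (for example a .kmz inside a .zip) are opened
    recursively; directory entries are skipped.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            content = archive.read(info)
            path = prefix + info.filename
            if zipfile.is_zipfile(io.BytesIO(content)):
                # Hypothetical convention: treat a nested archive like a folder.
                yield from iter_archive_entries(content, path + "!/")
            else:
                yield path, content


def make_zip(files):
    """Build an in-memory zip from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    inner = make_zip({"doc.kml": b"<kml/>"})
    outer = make_zip({"readme.txt": b"hello", "places.kmz": inner})
    for path, content in iter_archive_entries(outer):
        print(path, len(content))
```

Each yielded pair corresponds to one inner document, which is the unit that would then be handed to content extraction.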

Re: ExtractingRequestHandler indexing zip files

2014-05-27 Thread Siegfried Goeschl
Hi Sergio, you either do the work on the caller side (which is probably a good idea since you offload the Solr server) or extend the ExtractingRequestHandler. Cheers, Siegfried Goeschl On 27 May 2014, at 10:37, marotosg wrote: > Hi, > > Thanks for your answer Alexandre. > I have zip f
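
A rough sketch of the caller-side option, assuming Python on the client: unpack the zip locally and prepare one request to the "update/extract" handler per inner file, so Solr only ever sees the individual documents. The function name `build_extract_requests`, the base URL, and the id scheme (`archive name/entry name`) are hypothetical; `literal.id` and `commit` are the usual request parameters, but check them against your own handler configuration. The sketch only builds the requests; actually POSTing each body is left to whatever HTTP client you use.

```python
import io
import zipfile
from urllib.parse import urlencode


def build_extract_requests(zip_bytes, archive_id, base_url):
    """Return one (url, body) pair per file inside the zip.

    Each pair is meant to be POSTed to the extract handler as its own
    document, so the server never has to open the archive itself.
    """
    requests = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            params = urlencode({
                # Hypothetical id scheme: archive name plus entry name.
                "literal.id": f"{archive_id}/{info.filename}",
                "commit": "true",
            })
            url = f"{base_url}/update/extract?{params}"
            requests.append((url, archive.read(info)))
    return requests


if __name__ == "__main__":
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("a.pdf", b"%PDF")
        archive.writestr("dir/b.html", b"<html/>")
    # Assumed local Solr URL and core name, for illustration only.
    base = "http://localhost:8983/solr/collection1"
    for url, body in build_extract_requests(buffer.getvalue(), "demo.zip", base):
        print(url, len(body))
```

For the one-document-per-zip case described in the thread, this produces exactly one request per archive.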

Re: ExtractingRequestHandler indexing zip files

2014-05-27 Thread marotosg
Hi, Thanks for your answer Alexandre. I have zip files with only one document inside per zip file. These documents are mainly pdf,xml,html. I tried to index "tini.txt.gz" file which is located in the trunk to be used by extraction tests \trunk\solr\contrib\extraction\src\test-files\extraction\tin

Re: ExtractingRequestHandler indexing zip files

2014-05-26 Thread Alexandre Rafalovitch
A zip file can contain many files and directories in a nested structure, with files of any type and size. What would you expect Solr to do when facing a generic zip file? And what would you like it to do for _your_ - one assumes more restricted - scenario? Regards, Alex. Personal website: http://