Working now. FYI: posting to "update/extract" now extracts the contents of a
kmz (zip), but I am still having trouble with the dataimport. I'll move to
another thread for that. THANKS all.
--
View this message in context:
http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip
Thanks for the info Sergio. I updated my 4.8.1 version with that patch and
SOLR 4216 (which was really the same thing). It took a day to get it to
compile on my network and it still doesn't work. Did my config file look
correct? I'm wondering if I need another param somewhere.
Hi keeblerh,
The patch has to be applied to the source code, and then Solr.war has to be
compiled again. If you do that, it extracts the content of the documents.
Regards,
Sergio
--
View this message in context:
http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip-files-tp4138172p415767
I am also having the issue where my zip contents (or kmz contents) are not
being processed; only the file names are processed. It seems to recognize
the kmz extension and open the file, but it doesn't recurse into the
contents.
The patch you mention has been around for a while. I am ru
I extended ExtractingDocumentLoader with this patch and it works.
https://issues.apache.org/jira/secure/attachment/12473188/SOLR-2416_ExtractingDocumentLoader.patch
It iterates through all documents inside the file and extracts the name and
the content of each one.
Regards,
Sergio
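To illustrate the idea of iterating through every document inside an archive, here is a minimal Python sketch. It is not the SOLR-2416 patch (that is Java code inside Solr); the function name iter_archive_documents and the "!/" path separator are made up for illustration, and it only handles zip-style containers such as zip and kmz.

```python
import io
import zipfile


def iter_archive_documents(data, prefix=""):
    """Yield (path, content) for every file in a zip, descending into nested zips.

    Hypothetical helper for illustration only; not part of Solr or Tika.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            content = archive.read(info)
            path = prefix + info.filename
            if zipfile.is_zipfile(io.BytesIO(content)):
                # Nested archive: recurse and record where the entry came from.
                # Note that zip-based formats (kmz, docx, ...) are also
                # detected as zips and would be descended into here.
                yield from iter_archive_documents(content, prefix=path + "!/")
            else:
                yield path, content
```

Each yielded pair holds the entry name and its raw bytes, which is roughly the "name and content" pair mentioned above.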
Hi Sergio,
you either do the work on the caller side (which is probably a good idea,
since you offload the Solr server) or extend the ExtractingRequestHandler.
Cheers,
Siegfried Goeschl
On 27 May 2014, at 10:37, marotosg wrote:
> Hi,
>
> Thanks for your answer Alexandre.
> I have zip f
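As a rough sketch of the caller-side option: unzip locally and send each entry to "update/extract" as its own document. The URL, the core name "collection1", and the use of "id" as the unique key field are assumptions about the setup; build_extract_requests is a made-up helper name. The parameters literal.id and resource.name are standard ExtractingRequestHandler parameters.

```python
import io
import urllib.parse
import urllib.request
import zipfile

# Assumed Solr endpoint; adjust host, port and core name to your installation.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/collection1/update/extract"


def build_extract_requests(zip_bytes, zip_name, base_url=SOLR_EXTRACT_URL):
    """Build one POST request per file in the zip (hypothetical helper).

    Nothing is sent here; the caller decides when to submit the requests.
    """
    requests = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            params = urllib.parse.urlencode({
                # Assumes the schema's unique key field is called "id".
                "literal.id": f"{zip_name}!/{info.filename}",
                # File-name hint that Tika can use for type detection.
                "resource.name": info.filename,
            })
            requests.append(urllib.request.Request(
                f"{base_url}?{params}",
                data=archive.read(info),
                headers={"Content-Type": "application/octet-stream"},
                method="POST",
            ))
    return requests
```

Sending would then be a loop such as `for req in build_extract_requests(data, "docs.zip"): urllib.request.urlopen(req)`, followed by a commit. This keeps the unzipping work off the Solr server, at the cost of one HTTP request per entry.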
Hi,
Thanks for your answer Alexandre.
I have zip files with only one document inside per zip file. These documents
are mainly PDF, XML, and HTML.
I tried to index the "tini.txt.gz" file, which is located in the trunk and
used by the extraction tests:
\trunk\solr\contrib\extraction\src\test-files\extraction\tin
A zip file can contain many files and directories in a nested
structure, with files of any type and size.
What would you expect Solr to do facing a generic Zip file?
And what would you like it to do for _your_ - one assumes more
restricted - scenario?
Regards,
Alex.
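For the more restricted scenario described in this thread (exactly one document per zip, mainly PDF, XML, or HTML), one possible answer is to be strict on the caller side and reject anything else. The sketch below is only an illustration of that policy; extract_single_document and ALLOWED_EXTENSIONS are made-up names, and checking by file extension is an assumption, not something Solr does.

```python
import io
import os
import zipfile

# Assumption: the document types mentioned in the thread, matched by extension.
ALLOWED_EXTENSIONS = {".pdf", ".xml", ".html"}


def extract_single_document(zip_bytes):
    """Return (filename, content) for a zip that holds exactly one document.

    Hypothetical helper; raises ValueError for anything outside the
    restricted scenario instead of guessing what to index.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        files = [info for info in archive.infolist() if not info.is_dir()]
        if len(files) != 1:
            raise ValueError(f"expected exactly one file, found {len(files)}")
        info = files[0]
        extension = os.path.splitext(info.filename)[1].lower()
        if extension not in ALLOWED_EXTENSIONS:
            raise ValueError(f"unsupported document type: {extension or '(none)'}")
        return info.filename, archive.read(info)
```

The returned bytes can then be posted to "update/extract" as an ordinary single document, so the generic nested-archive question never reaches Solr.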