Thank you, Erik, for your precious advice.
2016-01-14 17:24 GMT+00:00 Erik Hatcher :
And also, bin/post can be your friend when it comes to troubleshooting or
introspecting Tika parsing via /update/extract. Like this:
$ bin/post -c test -params "extractOnly=true&wt=ruby&indent=yes" -out yes docs/SYSTEM_REQUIREMENTS.html
java -classpath /Users/erikhatcher/solr-5.3.0/dist/solr-co
No good way except to try them. For getting details on Tika parsing
failures, I much prefer the SolrJ process that the link I sent you
outlines.
Best,
Erick
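
The "try them" approach can be scripted. What follows is only a minimal Python sketch, not the SolrJ process recommended above: it posts each file to /update/extract with extractOnly=true, so nothing is indexed, and records the files for which Solr answers with an error. The base URL http://localhost:8983/solr is an assumption, the collection name test is borrowed from the bin/post example in this thread, and the function names are hypothetical.

```python
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path

# Assumed defaults: adjust to your own Solr URL and collection.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "test"


def extract_url(base=SOLR_BASE, collection=COLLECTION):
    """Build the /update/extract URL; extractOnly=true parses without indexing."""
    query = urllib.parse.urlencode({"extractOnly": "true", "wt": "json"})
    return f"{base}/{collection}/update/extract?{query}"


def check_with_solr(path, url=None, timeout=120):
    """POST one file to Solr. Return None on success, else a short error string."""
    request = urllib.request.Request(
        url or extract_url(),
        data=Path(path).read_bytes(),
        # Generic content type (an assumption); Tika detects the real format.
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()
        return None
    except urllib.error.HTTPError as err:
        # A parse failure typically comes back as an HTTP error whose body
        # carries the exception text.
        body = err.read().decode("utf-8", "replace")
        return f"HTTP {err.code}: {body[:300]}"
    except urllib.error.URLError as err:
        return f"connection problem: {err.reason}"


def find_bad_files(paths, check=check_with_solr):
    """Run `check` on every path and collect the ones that report an error."""
    bad = {}
    for path in paths:
        error = check(path)
        if error is not None:
            bad[str(path)] = error
    return bad
```

Calling find_bad_files(p for p in Path("docs").rglob("*") if p.is_file()) would return a dictionary that maps each failing file to the start of the error Solr sent back. The check argument is kept separate so the same loop can be reused with a different per-file test.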
On Thu, Jan 14, 2016 at 7:52 AM, kostali hassan
wrote:
Thank you, Erick. I have a problem with these files. One last question: how
can I find or list the files that cannot be indexed (the bad files)?
Then you probably have a corrupt file or have
discovered a Tika bug.
Next I'd try running the file through stand-alone Tika,
perhaps trying different versions of Tika. If this latter
is the case, you can always use a more recent version
of Tika with Solr and/or process the file on a SolrJ client
(
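
Running a file through stand-alone Tika, as suggested above, can be sketched in Python as follows. This assumes a tika-app jar has been downloaded separately (the jar path is hypothetical), that java is on the PATH, and that a non-zero exit code signals a parse failure; the function names are illustrative only.

```python
import subprocess

# Hypothetical jar location: point this at the tika-app jar you downloaded.
TIKA_JAR = "tika-app.jar"


def tika_command(path, jar=TIKA_JAR):
    """Command line that asks tika-app to print the extracted plain text."""
    return ["java", "-jar", jar, "--text", str(path)]


def parse_with_tika(path, jar=TIKA_JAR, timeout=120):
    """Run stand-alone Tika on one file and return (ok, text_or_error)."""
    result = subprocess.run(
        tika_command(path, jar),
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        timeout=timeout,
    )
    if result.returncode == 0:
        return True, result.stdout
    # Assumption: a failed parse exits non-zero and prints the stack trace
    # to stderr.
    return False, result.stderr
```

Passing a different jar argument to parse_with_tika for the same file is one way to compare Tika versions: if the file parses with a newer jar but fails with the older one, a Tika bug is more likely than a corrupt file.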
Yes, I am indexing other files successfully with DIH. Now, when I try to index
these files with ExtractingRequestHandler, I get this error:
null:org.apache.solr.common.SolrException:
org.apache.tika.exception.TikaException: Error creating OOXML
extractor
at
org.apache.solr.handler.extraction.Extrac
Looks like a bad file. Do you have any success using DIH on any files?
What happens if you just send that particular file through the
ExtractingRequestHandler?
Best,
Erick
On Mon, Jan 11, 2016 at 3:51 PM, kostali hassan
wrote:
Files such as MS Word and PDF are not being indexed using dataimport; I get
this error:
Full Import failed:java.lang.RuntimeException:
java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable
to read content Processing Document # 2
at
org.apache.solr.handl