Re: indexing rich data with solr 5.3

2016-01-15 Thread kostali hassan
Thank you, Erik, for your valuable advice. 2016-01-14 17:24 GMT+00:00 Erik Hatcher: > And also, bin/post can be your friend when it comes to troubleshooting or > introspecting Tika parsing via /update/extract. Like this: > > $ bin/post -c test -params "extractOnly=true&wt=ruby&indent=yes" -out ye

Re: indexing rich data with solr 5.3

2016-01-14 Thread Erik Hatcher
And also, bin/post can be your friend when it comes to troubleshooting or introspecting Tika parsing via /update/extract. Like this:
$ bin/post -c test -params "extractOnly=true&wt=ruby&indent=yes" -out yes docs/SYSTEM_REQUIREMENTS.html
java -classpath /Users/erikhatcher/solr-5.3.0/dist/solr-co
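For anyone who would rather script the same extract-only check than run bin/post by hand, here is a minimal Python sketch. It assumes Solr is listening at http://localhost:8983/solr, that the core is named test as in the command above, and that /update/extract accepts the raw file bytes as the POST body. The function names are made up for illustration, and wt=json is used instead of wt=ruby only because it is easier to read from a script.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def extract_only_url(base_url, core, wt="json"):
    """Build the /update/extract URL for an extract-only (no indexing) request."""
    params = urlencode({"extractOnly": "true", "wt": wt, "indent": "true"})
    return f"{base_url.rstrip('/')}/{core}/update/extract?{params}"


def extract_only(path, base_url="http://localhost:8983/solr", core="test"):
    """POST one file's raw bytes to Solr and return the response body as text.

    Assumption: a Solr instance is running at base_url with the extraction
    handler enabled. Nothing is indexed because extractOnly=true is set.
    """
    with open(path, "rb") as fh:
        body = fh.read()
    req = Request(
        extract_only_url(base_url, core),
        data=body,
        # Generic content type; Tika is left to detect the real format.
        headers={"Content-Type": "application/octet-stream"},
    )
    with urlopen(req) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling extract_only("docs/SYSTEM_REQUIREMENTS.html") should return what Tika extracted from that file. A parsing failure would surface as a urllib.error.HTTPError carrying Solr's error response.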

Re: indexing rich data with solr 5.3

2016-01-14 Thread Erick Erickson
No good way except to try them. For getting details on Tika parsing failures, I much prefer the SolrJ process that the link I sent you outlines. Best, Erick On Thu, Jan 14, 2016 at 7:52 AM, kostali hassan wrote: > Thank you, Eric. I have a problem with these files. Last question: how to define or > get th

Fwd: indexing rich data with solr 5.3

2016-01-14 Thread kostali hassan
Thank you, Eric. I have a problem with these files. Last question: how can I define or get the list of files that cannot be indexed, i.e. the bad files?

Re: indexing rich data with solr 5.3

2016-01-12 Thread Erick Erickson
Then you probably have a corrupt file or have discovered a Tika bug. Next I'd try running the file through stand-alone Tika, perhaps trying different versions of Tika. If the latter is the case, you can always use a more recent version of Tika with Solr and/or process the file on a SolrJ client (

Re: indexing rich data with solr 5.3

2016-01-12 Thread kostali hassan
Yes, I am successfully indexing other files with DIH. Now, when I try to index these files with ExtractingRequestHandler, I get this error:
null:org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Error creating OOXML extractor
at org.apache.solr.handler.extraction.Extrac

Re: indexing rich data with solr 5.3

2016-01-11 Thread Erick Erickson
Looks like a bad file. Do you have any success using DIH on any files? What happens if you just send that particular file through the ExtractingRequestHandler? Best, Erick On Mon, Jan 11, 2016 at 3:51 PM, kostali hassan wrote: > Files such as MS Word and PDF do not get indexed using dataimport. I have

indexing rich data with solr 5.3

2016-01-11 Thread kostali hassan
Files such as MS Word and PDF do not get indexed using dataimport. I have this error:
Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 2
at org.apache.solr.handl