Hi, I am trying to set up Solr for our project so that it can return full-text search results on PDF documents. I am able to run the sample Tika DIH example locally on my Windows server machine. It indexes all PDF documents recursively under the "baseDir" set in the config XML. At present, "baseDir" points to a local folder on the same machine that holds around 10K PDF files. This whole setup works as expected.
The next step is to import PDF documents located on a network share. I created another core with very similar configuration files, except that this time baseDir points to the network share ("\\myserver\pdfshare"). I have had no success indexing these documents on the newly created core. I also tried mapping the network share to a local drive letter and updating the config accordingly, but still no success. When I copy all the PDF files from the network share to the local folder that the example core with the sample Tika DIH points to, I am able to index all of them. So I am not sure why the Tika config with the network path is not able to index the files.
Looking into the log, I can see the following entries, but they do not explain anything. Can someone guide me to resolve the issue?
2019-03-26 13:58:37.250 DEBUG (Scheduler-1147580192) [ ] o.e.j.i.FillInterest onFail FillInterest@419eacc8{AC.ReadCB@1ad637ed{HttpConnection@1ad637ed::SocketChannelEndPoint@6190d407{/10.206.11.68:51486<->/10.205.53.163:8983,OPEN,fill=FI,flush=-,to=120010/120000}{io=1/1,kio=1,kro=1}->HttpConnection@1ad637ed[p=HttpParser{s=START,0 of -1},g=HttpGenerator@7d81e85c{s=START}]=>HttpChannelOverHttp@10e588cc{r=2,c=false,a=IDLE,uri=null,age=0}}}
java.util.concurrent.TimeoutException: Idle timeout expired: 120010/120000 ms
    at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166) [jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50) [jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:1.8.0_201]
    at java.util.concurrent.FutureTask.run(Unknown Source) [?:1.8.0_201]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) [?:1.8.0_201]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [?:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.8.0_201]
    at java.lang.Thread.run(Unknown Source) [?:1.8.0_201]
Is it possible that Solr is not able to access the network share? Is there any way that I can run Solr.cmd under a different user (one who has access to the network share) in a Windows environment?
Please let me know if you wish to know any more details about the issue.
Thanks in advance
-- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
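P.S. One way to test the access hypothesis, independent of Solr and Tika, might be a small Java check run under the same account and in the same way that Solr is started. The following is only a sketch: the class name ShareCheck and the method countPdfs are made up for illustration, and the default path is simply the share named above. It reports which user the JVM runs as and how many PDF files that user can see under the given folder.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Solr or Tika.
public class ShareCheck {

    // Counts *.pdf files under dir, recursing into subfolders.
    // Returns -1 if dir is not a directory this process can see and read.
    public static long countPdfs(Path dir) throws IOException {
        if (!Files.isDirectory(dir) || !Files.isReadable(dir)) {
            return -1;
        }
        // Files.walk may still fail partway if a subfolder denies access.
        try (Stream<Path> files = Files.walk(dir)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase().endsWith(".pdf"))
                    .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Default is the UNC path from this post; pass another path as args[0].
        String raw = args.length > 0 ? args[0] : "\\\\myserver\\pdfshare";
        System.out.println("Running as user: " + System.getProperty("user.name"));
        long n = countPdfs(Paths.get(raw));
        if (n < 0) {
            System.out.println("Cannot see or read directory: " + raw);
        } else {
            System.out.println("Readable PDF files found: " + n);
        }
    }
}
```

If this prints "Cannot see or read directory" for the UNC path while the local folder works, the problem is likely share permissions for that account rather than the DIH configuration. Note also that mapped drive letters on Windows belong to a user's logon session, so a drive mapped interactively may not exist for a process started under another account or as a service.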