Re: DIH for TikaEntityProcessor

2018-10-12 Thread Kamuela Lau
Glad to help :) On Fri, Oct 12, 2018 at 21:10, Martin Frank Hansen (MHQ) : > You sir just made my day!!! > > It worked!!! Thanks a million! > > > Martin Frank Hansen, > > -Original message- > From: Kamuela Lau > Sent: October 12, 2018 11:41 > To: solr-user@

Re: DIH for TikaEntityProcessor

2018-10-12 Thread Alexandre Rafalovitch
Solr ships with a DIH Tika example that seems 90% identical to yours. Can you get that to run? If it works, then you can focus on the 10% difference. Perhaps it is the explicit dataSource=null in the outer entity? Or maybe format=text on the inner one. Regards, Alex On Fri, Oct 12, 2018, 3:11 AM
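
For reference, a minimal sketch of the kind of data-config.xml the message above refers to, showing where dataSource="null" (outer entity) and format="text" (inner entity) would sit. It is modeled loosely on a typical DIH Tika setup and is not the poster's actual file: the entity names, the baseDir path, the fileName pattern, and the target field names "id" and "content" are all assumptions for illustration.

```xml
<dataConfig>
  <!-- Binary data source used by the Tika entity to read file contents -->
  <dataSource type="BinFileDataSource"/>
  <document>
    <!-- Outer entity only lists files on disk, so it is given dataSource="null".
         baseDir and fileName are placeholder values. -->
    <entity name="files" processor="FileListEntityProcessor" dataSource="null"
            baseDir="/path/to/docs" fileName=".*\.pdf" rootEntity="false">
      <!-- fileAbsolutePath is emitted by FileListEntityProcessor; "id" is an assumed schema field -->
      <field column="fileAbsolutePath" name="id"/>
      <!-- Inner entity extracts plain text from each listed file -->
      <entity name="tika" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" format="text">
        <!-- "text" is the extracted body; "content" is an assumed schema field -->
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

Comparing a file like this attribute by attribute against the shipped example is one way to isolate the 10% difference.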

Re: DIH for TikaEntityProcessor

2018-10-12 Thread Kamuela Lau
Also, just wondering, have you tried to specify dataSource="bin" for read_file? On Fri, Oct 12, 2018 at 6:38 PM Kamuela Lau wrote: > Hi, > > I was unable to reproduce the error that you got with the information > provided. > Below are the data-config.xml and managed-schema fields I used; th

Re: DIH for TikaEntityProcessor

2018-10-12 Thread Kamuela Lau
Hi, I was unable to reproduce the error that you got with the information provided. Below are the data-config.xml and managed-schema fields I used; the data-config is mostly the same (I think that BinFileDataSource doesn't actually require a dataSource, so I think it's safe to put dataSource="null