Hi Charlie,

Thanks for your suggestion, but I will have thousands of these files coming from different sources. It would become very tedious if I had to first convert each one to CSV and then process it line by line.
I was hoping there might be a simpler way to achieve this using DIH, which I understand can be configured to read and ingest MS Excel (xlsx) files. I am just not sure what the configuration file would look like. Any pointers are welcome (I have put a rough sketch of what I imagine the configuration might be at the end of this message). Thanks!

On Fri, 26 Jul, 2019, 1:56 PM Charlie Hull, <char...@flax.co.uk> wrote:

> Convert the Excel file to a CSV and then write a teeny script to go
> through it line by line and submit to Solr over HTTP? Tika would
> probably work, but it's a lot of heavy lifting for what seems to me like
> a simple problem.
>
> Cheers
>
> Charlie
>
> On 26/07/2019 09:19, Vipul Bahuguna wrote:
> > Hi guys - can anyone suggest how to achieve this?
> > I have understood how to insert JSON documents. So one alternative that
> > comes to mind is to convert the rows of my Excel file to JSON format,
> > with the header of my Excel file becoming the JSON keys (corresponding
> > to the fields I have defined in my managed-schema.xml), and each cell
> > in the Excel file becoming the value of that field.
> >
> > However, I am sure there must be a better way that ingests the Excel
> > file directly to achieve the same. I was trying to read about DIH and
> > Apache Tika, but I am not very sure of how the configuration works.
> >
> > My sample Excel file has 4 columns, namely:
> > 1. First Name
> > 2. Last Name
> > 3. Phone
> > 4. Website Link
> >
> > I want to index these fields into Solr in a way that all these columns
> > become my Solr schema fields, and later I can search based on these
> > fields.
> >
> > Any suggestions please.
> >
> > Thanks!
>
> --
> Charlie Hull
> Flax - Open Source Enterprise Search
>
> tel/fax: +44 (0)8700 118334
> mobile: +44 (0)7767 825828
> web: www.flax.co.uk
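For reference, here is the kind of data-config.xml I have in mind, pieced together from the DIH and Tika documentation. This is an untested sketch: baseDir, the file pattern, and the "content" field name are placeholders for my setup, and as far as I can tell TikaEntityProcessor extracts each sheet as one blob of text rather than mapping my four columns to separate Solr fields:

<!-- Untested DIH sketch: baseDir and the "content" field are placeholders -->
<dataConfig>
  <dataSource type="BinFileDataSource" name="bin"/>
  <document>
    <!-- FileListEntityProcessor walks the directory, one row per .xlsx file -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="/data/excel" fileName=".*\.xlsx"
            rootEntity="false" dataSource="null">
      <!-- TikaEntityProcessor parses each file; the extracted text lands in
           the "text" column as a single blob, not one field per column -->
      <entity name="tika" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" dataSource="bin" format="text">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>

If that is roughly right, I would still need some way to split the spreadsheet columns out into separate fields, which may be the heavy lifting Charlie mentioned.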
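And in case DIH turns out to be a dead end, here is how I understand Charlie's line-by-line suggestion, as a rough Python sketch. The core name, the file name, and the assumption that my CSV header row already matches my schema field names (e.g. first_name rather than "First Name") are all made up for illustration:

# Rough sketch of the suggested approach: read a CSV export of the
# spreadsheet and post the rows to Solr over HTTP as JSON documents.
# Core name and file name are placeholders.
import csv
import json
import urllib.request

SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update?commit=true"

# The CSV header row becomes the JSON keys, i.e. the Solr field names;
# this assumes the headers already match managed-schema.xml.
with open("contacts.csv", newline="", encoding="utf-8") as f:
    docs = list(csv.DictReader(f))

req = urllib.request.Request(
    SOLR_UPDATE_URL,
    data=json.dumps(docs).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)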