Can you use Tika?
https://tika.apache.org/0.9/formats.html
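
If a format parser is not an option, here is a minimal sketch of the "flattened" route, assuming the XML can be pre-processed before it is indexed. It uses only the Python standard library. The function name `flatten`, the dotted field naming, and the sample `<order>` record are all made up for illustration and are not anything Solr or Tika prescribes.

```python
import json
import xml.etree.ElementTree as ET


def flatten(elem, prefix=""):
    """Flatten a nested XML element into {dotted.path: [text values]}."""
    path = f"{prefix}.{elem.tag}" if prefix else elem.tag
    fields = {}
    text = (elem.text or "").strip()
    if text:
        # Leaf text becomes one value of a multi-valued field.
        fields.setdefault(path, []).append(text)
    for child in elem:
        # Merge each child's fields, keeping repeated values in document order.
        for key, values in flatten(child, path).items():
            fields.setdefault(key, []).extend(values)
    return fields


# Hypothetical sample record, not taken from the original data.
sample = "<order><id>42</id><item><sku>A1</sku></item><item><sku>B2</sku></item></order>"
doc = flatten(ET.fromstring(sample))
print(json.dumps(doc, sort_keys=True))
```

Note that flattening merges repeated children into multi-valued fields, so you lose track of which values belonged to the same child node. If your queries need that grouping, that is an argument for keeping the data nested.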

On Wed, 2016-06-08 at 10:06 -0400, Aniruddh Sharma wrote:

> Hi
> 
> I am new to Solr.
> 
> I am running Solr 4.10.3 on CDH 5.5.
> 
> My use case: I have real-time data ingestion in Hadoop, and I want
> to implement search on that data.
> 
> My input data format is XML and it has nested child nodes, so my question
> is about schema creation for Solr.
> 
> I notice that in JSON format it is possible to handle nested data.
> 
> a) JSON can technically handle nested child data. Is this also
> doable in XML format? If not, are there any guidelines for converting XML
> data to JSON, or what is the best way to deal with this?
> 
> b) Even if it can be done technically, from a functional point of
> view, when does it make sense to store data in Solr as nested vs. flattened?
> What functional use case drives this choice?
> 
> 
> Thanks and Regards

