Hi,

 

I want to perform scoped searches in XML documents using Solr. I am
using Solr-Cell to index my document files. I've noticed that when I
index an XML file into Solr (via Solr-Cell), the element tags get
stripped off and only the values are sent to Solr.

For example, say I have an XML document that contains the following data:

<test>
    <node1>
        <inner_node1>XYZ</inner_node1>
        <inner_node2>ABC</inner_node2>
        <sometag>PPPP</sometag>
    </node1>
    <node1>
        ....
    </node1>
</test>

 

When I index this XML file, only the element values (XYZ, ABC and PPPP)
seem to reach Solr; the tags themselves are stripped off. (Probing a
bit further into the cause suggests that this is what Apache Tika
does.)

 

Is there any setting or feature which would enable me to preserve the
field/tag information and hence allow me to perform scoped searches
using Solr?

 

Just to clear up any confusion about the term "scoped search":

By scoped search I mean that, once the above XML document is indexed,
I would be able to find all occurrences of ABC within the
<inner_node2> XML tag.
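
To make the behaviour I am after concrete, here is a minimal sketch in
plain Python. It does not use Solr or Tika at all; it only shows the
result I would like a scoped query to return. The function name
scoped_search and the SAMPLE string are made up for this illustration,
and the sample holds just the first <node1> from the document above.

```python
import xml.etree.ElementTree as ET

# First <node1> of the example document; the elided second node is left out.
SAMPLE = """<test>
    <node1>
        <inner_node1>XYZ</inner_node1>
        <inner_node2>ABC</inner_node2>
        <sometag>PPPP</sometag>
    </node1>
</test>"""


def scoped_search(xml_text, tag, term):
    """Return the text of every <tag> element whose text contains term.

    Hypothetical helper: it stands in for the kind of query I would
    like to run against the index, restricted to a single tag.
    """
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(tag) if el.text and term in el.text]


# ABC occurs inside <inner_node2>, so the scoped search finds it there.
print(scoped_search(SAMPLE, "inner_node2", "ABC"))   # ['ABC']

# The same term searched within <inner_node1> should give no hits.
print(scoped_search(SAMPLE, "inner_node1", "ABC"))   # []
```

With the tags stripped at index time, both searches would look the same
to Solr, which is exactly the distinction I am trying to keep.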

 

 

-Kumar
