It sounds like you haven't yet looked at the way Solr handles fields. I assume that Solr-Cell (which I haven't looked at yet but hope to soon) indexes everything into a single field. When using Solr on its own, the first thing you do is create a schema that specifies the fields you want in your index; you then massage your XML into the form Solr expects. In your example you would end up with input documents something like:
<doc>
  <field name="inner_node1">XYZ</field>
  <field name="inner_node2">ABC</field>
  <field name="sometag">PPPP</field>
</doc>

(That applies to updating the index by posting XML to Solr; there are many other mechanisms for populating the index now, but the basic idea of specifying fields remains the same.)

The wiki page on Solr schemas (http://wiki.apache.org/solr/SchemaXml) and the sample schema linked there will make it clear how to specify your fields. You will then be able to specify fields in your queries like "sometag:PPPP".

Now you'll need to figure out how this underlying Solr functionality is exposed by Solr-Cell, but I hope this is a start.

Peter

> -----Original Message-----
> From: Jana, Kumar Raja [mailto:kj...@ptc.com]
> Sent: Monday, December 22, 2008 6:30 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Scoped searches in XML documents
>
> Hi Shalin,
>
> Thanks for the quick response. I've found my mistake. It was
> actually a silly setting in my application before sending the
> documents to Solr-Cell which was stripping off the xml tags.
> I was able to index the document with the xml tags. Sorry for
> being so hasty.
>
> So the only question left is, will I be able to perform
> scoped searches using Solr? Is this already implemented in
> Solr or is there a workaround?
>
> Thanks
> Kumar
>
>
> -----Original Message-----
> From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
> Sent: Monday, December 22, 2008 6:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Scoped searches in XML documents
>
> If your XML documents are of a fixed schema, you may want to
> look at DataImportHandler with XPathEntityProcessor
>
> http://wiki.apache.org/solr/DataImportHandler
>
> On Mon, Dec 22, 2008 at 5:49 PM, Jana, Kumar Raja
> <kj...@ptc.com> wrote:
>
> > Hi,
> >
> > I want to perform scoped searches in XML documents using Solr. I am
> > using Solr-Cell to index my document files.
> > I've noticed that when I index an xml file to Solr (via Solr-Cell)
> > the field tags get stripped off and only the values are sent to Solr.
> >
> > i.e. Say I have an XML document which contains the following data:
> >
> > <test>
> >   <node1>
> >     <inner_node1>XYZ</inner_node1>
> >     <inner_node2>ABC</inner_node2>
> >     <sometag>PPPP</sometag>
> >   </node1>
> >   <node1>
> >     ....
> >   </node1>
> > </test>
> >
> > When I index this xml file, only the field values(XYZ, ABC and PPPP)
> > seem to go to Solr and the tag elements are stripped off!!! (Although
> > probing a bit more into the cause seems to point out that this is what
> > Apache Tika does).
> >
> > Is there any setting or feature which would enable me to preserve the
> > field/tag information and hence allow me to perform scoped searches
> > using Solr?
> >
> > Just to clear any confusion by the term "scoped search":
> >
> > What I mean by scoped search is when I index the above xml document,
> > Scoped search would allow me to find all occurrences of ABC within the
> > <inner_node2> XML tag.
> >
> > -Kumar
>
> --
> Regards,
> Shalin Shekhar Mangar.
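
As a rough illustration of the "massage your XML into the form Solr expects" step from the reply at the top of this message, here is a minimal Python sketch. It is not part of Solr or Solr-Cell: the function name to_solr_add_xml and the choice of <node1> as the element that becomes one Solr document are assumptions made for this example. It uses only the standard library and wraps the documents in the <add> element used by Solr's XML update messages.

```python
import xml.etree.ElementTree as ET

# Sample input taken from the question in this thread.
SOURCE = """<test>
  <node1>
    <inner_node1>XYZ</inner_node1>
    <inner_node2>ABC</inner_node2>
    <sometag>PPPP</sometag>
  </node1>
</test>"""


def to_solr_add_xml(source_xml, record_tag="node1"):
    """Turn each <record_tag> element into one Solr <doc>.

    Hypothetical helper for illustration: every direct child of a record
    becomes <field name="child tag">child text</field>.
    """
    root = ET.fromstring(source_xml)
    add = ET.Element("add")
    for record in root.iter(record_tag):
        doc = ET.SubElement(add, "doc")
        for child in record:
            text = (child.text or "").strip()
            if not text:
                continue  # skip empty elements rather than index blanks
            field = ET.SubElement(doc, "field", name=child.tag)
            field.text = text
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(to_solr_add_xml(SOURCE))
```

The field names produced this way (inner_node1, inner_node2, sometag) would still have to be declared in the schema before Solr accepts the documents, and a real schema may also require a unique key field that this sketch does not generate. Once they are indexed as separate fields, a query such as "inner_node2:ABC" gives the scoped search asked about in the thread.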