My bad, it looks like XPathEntityProcessor doesn't support relative xpaths.
However, I took a quick look at the Slashdot example (which is actually pretty good) at http://wiki.apache.org/solr/DataImportHandler.
From that I infer that you use only one entity per XML document, and within that entity use multiple field declarations with xpath attributes to extract the values you want. So even though your XML document is nested (like most XML is), your field declarations are not.

I think your best bet is to read the Slashdot example and go from there.

For now, I'm not entirely sure what you want a Solr document to be in your example, i.e.:
- 1 Solr document per XML document (as supplied)
- or 1 Solr document per CHAP, per PARA, or per SUB?

Once you know that, coming up with a decent pointer should be easier.

HTH,
Geert-Jan

2010/6/8 Tor Henning Ueland <tor.henn...@gmail.com>

> I have tried both to change the datasource per child node to use the
> parent nodes name, and tried to making the Xpath`s relative, all
> causing either exceptions telling that Xpath must start with /, or
> nullpointer exceptions ( nsfgrantsdir document : null).
>
> Best regards
>
> On Mon, Jun 7, 2010 at 4:12 PM, Geert-Jan Brits <gbr...@gmail.com> wrote:
>
> > I'm guessing (I'm not familiar with the xml dataimport handler, but I am
> > pretty familiar with Xpath)
> > that your problem lies in having absolute xpath-queries, instead of
> > relative xpath queries to your parent node.
> >
> > e.g: /DOK/TEKST/KAP is absolute ( the prefixed '/' tells it to be). Try
> > 'KAP' instead.
> > The same for all xpaths deeper in the tree.
> >
> > Geert-Jan
> >
> > 2010/6/7 Tor Henning Ueland <tor.henn...@gmail.com>
> >
> >> Hi,
> >>
> >> I am doing some testing of dataimport to Solr from XML-documents with
> >> many children in the children. To parse the children i some levels
> >> down using Xpath goes fine, but the speed is very slow. (~1 minute per
> >> document, on a quad Xeon server). When i do the same using the format
> >> solr wants it, the parsing time is 0.02 seconds per document.
> >>
> >> I have published a quick example here:
> >> http://pastebin.com/adhcEvRx
> >>
> >> My question is:
> >>
> >> I hope that i have done something wrong in the child-parsing (as you
> >> can see, it goes down quite a few levels). Can anybody point me in the
> >> right direction so i can speed up the process? I have been looking
> >> around for some examples, but nobody gives examples of such deep data
> >> indexing.
> >>
> >> PS: I know there are some bugs in the Xpath naming etc, but it is just
> >> a rough example :)
> >>
> >> --
> >> Best regars
> >> Tor Henning Ueland
> >>
> >
> >
>
> --
> Mvh
> Tor Henning Ueland
>
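
PS: A rough sketch of the single-entity layout described at the top of this message, modelled on the Slashdot example and assuming you want one Solr document per XML document. Only /DOK/TEKST/KAP comes from this thread; the file path, the column names and the element names below KAP (TITTEL, PARA) are placeholders I made up, so substitute your own. I have not run this against your data.

```xml
<dataConfig>
  <!-- Assumption: the XML is read from local files. -->
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <!-- One entity per XML document; forEach marks where a new Solr document starts. -->
    <entity name="dok"
            processor="XPathEntityProcessor"
            url="/path/to/document.xml"
            forEach="/DOK">
      <!-- Every xpath is absolute (starts with /), even for deeply nested nodes. -->
      <!-- Hypothetical columns and child elements; adjust to your schema. -->
      <field column="chapter_title"  xpath="/DOK/TEKST/KAP/TITTEL" />
      <field column="paragraph_text" xpath="/DOK/TEKST/KAP/PARA" />
    </entity>
  </document>
</dataConfig>
```

As far as I understand it, nodes that repeat inside one document (several KAP or PARA elements) end up as multiple values of the same field, so those fields would need to be multiValued in schema.xml. If you would rather have one Solr document per chapter, pointing forEach at /DOK/TEKST/KAP instead of /DOK should be the place to start.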