The requirements changed, and I am no longer using those XML files at all. I ended up using some other data files as sources, which had everything flat, so no recursion was needed after all. But thanks for the input! :)
Best regards

On Tue, Jun 8, 2010 at 11:08 AM, Geert-Jan Brits <gbr...@gmail.com> wrote:
> My bad, it looks like XPathEntityProcessor doesn't support relative XPaths.
>
> However, I quickly looked at the Slashdot example (which is pretty good
> actually) at http://wiki.apache.org/solr/DataImportHandler.
> From that I infer that you use only one entity per XML document, and within
> that entity use multiple field declarations with xpath attributes to
> extract the values you want.
> So even though your XML document is nested (like most XML is), your
> field declarations are not.
>
> I think your best bet is to read the Slashdot example and go from there.
>
> For now, I'm not entirely sure what you want a Solr document to be in your
> example, i.e.:
> - one Solr document per XML document (as supplied)
> - or one Solr document per CHAP, per PARA, or per SUB?
>
> Once you know that, perhaps coming up with a decent pointer is easier.
>
> HTH,
> Geert-Jan
>
> 2010/6/8 Tor Henning Ueland <tor.henn...@gmail.com>
>
>> I have tried both changing the data source per child node to use the
>> parent node's name and making the XPaths relative. Both attempts cause
>> either exceptions saying that the XPath must start with /, or
>> null pointer exceptions (nsfgrantsdir document : null).
>>
>> Best regards
>>
>> On Mon, Jun 7, 2010 at 4:12 PM, Geert-Jan Brits <gbr...@gmail.com> wrote:
>> > I'm guessing (I'm not familiar with the XML data import handler, but I
>> > am pretty familiar with XPath) that your problem lies in having
>> > absolute XPath queries instead of XPath queries relative to your
>> > parent node.
>> >
>> > E.g. /DOK/TEKST/KAP is absolute (the leading '/' makes it so). Try
>> > 'KAP' instead.
>> > The same goes for all XPaths deeper in the tree.
>> >
>> > Geert-Jan
>> >
>> > 2010/6/7 Tor Henning Ueland <tor.henn...@gmail.com>
>> >
>> >> Hi,
>> >>
>> >> I am doing some testing of data import to Solr from XML documents
>> >> with many levels of nested children. Parsing the children some levels
>> >> down using XPath works, but it is very slow (~1 minute per
>> >> document, on a quad Xeon server). When I do the same using the format
>> >> Solr expects, the parsing time is 0.02 seconds per document.
>> >>
>> >> I have published a quick example here:
>> >> http://pastebin.com/adhcEvRx
>> >>
>> >> My question is:
>> >>
>> >> I hope that I have done something wrong in the child parsing (as you
>> >> can see, it goes down quite a few levels). Can anybody point me in the
>> >> right direction so I can speed up the process? I have been looking
>> >> around for examples, but nobody gives examples of such deep data
>> >> indexing.
>> >>
>> >> PS: I know there are some bugs in the XPath naming etc., but it is
>> >> just a rough example :)
>> >>
>> >> --
>> >> Best regards
>> >> Tor Henning Ueland
>> >>
>> >
>>
>> --
>> Best regards
>> Tor Henning Ueland
>>
>
--
Best regards
Tor Henning Ueland
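
For anyone finding this thread later: the single-entity layout Geert-Jan describes (one entity per XML document, with flat field declarations that each carry an absolute xpath) might look roughly like the sketch below. This is not the config from the pastebin. The file location, the column names and the AVSNITT element are assumptions made up for illustration; only /DOK/TEKST/KAP comes from the thread.

```xml
<!-- Sketch only: url, column names and AVSNITT are hypothetical. -->
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- One entity per XML document; forEach names the element that
         becomes one Solr document. -->
    <entity name="dok"
            processor="XPathEntityProcessor"
            url="/path/to/document.xml"
            forEach="/DOK">
      <!-- Flat field declarations, each with an absolute xpath,
           even though the source XML itself is nested. -->
      <field column="chapter" xpath="/DOK/TEKST/KAP"/>
      <!-- Hypothetical deeper element, shown only to illustrate the pattern. -->
      <field column="paragraph" xpath="/DOK/TEKST/KAP/AVSNITT"/>
    </entity>
  </document>
</dataConfig>
```

With this layout there are no nested child entities. A field whose xpath matches several nodes in one document would presumably need to be declared multiValued in schema.xml. If one Solr document per chapter is wanted instead, forEach would point at the chapter element rather than at the root.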