Thanks, Alex, for getting back to me. As per your suggestion, I've gone down the XSL route. I've created a transformation that works fine in various test tools, but Solr is throwing errors such as:

Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed for xml, url:null rows processed:0 Processing Document # 1
.....
Caused by: java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).

Looks like I need to dig a bit deeper.

Regards,

Alan.

-----Original Message-----
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: 23 October 2015 12:00
To: solr-user
Subject: Re: Select sibling data via XPathEntityProcessor

If you are stuck with DIH, it looks like you can specify the xsl attribute on the XPathEntityProcessor and it will be used as a pre-processor. I would probably use it to convert the outer NamedAuthority tag into a corresponding Author or Subject tag. Looks easiest.

If you are not sure how to generate good XSL, have a look at something like http://xmlstar.sourceforge.net/overview.php - it is sort of a command line processor but can also emit XSL to show you what it should look like. I wrote about this tool many many moons ago at:
http://www.freesoftwaremagazine.com/articles/xml_starlet

Regards,
   Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/

On 23 October 2015 at 02:25, Routley, Alan <alan.rout...@bl.uk> wrote:
> Hi Alex
>
> Thanks for the reply.
>
> I think I'm stuck with using the DIH as I'm initially using the
> SqlEntityProcessor to extract records from SQL server, indexing some of the
> standard relational fields before handing the xml piece over to the
> XPathEntityProcessor.
> I'll look into adding an XSLT processor into the mix, but I've not used one
> before, so if you could possibly point me at an example that could get me
> started that would be a great help.
>
> Thanks
>
> Alan.
>
> -----Original Message-----
> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
> Sent: 22 October 2015 15:43
> To: solr-user
> Subject: Re: Select sibling data via XPathEntityProcessor
>
> I don't think DIH supports siblings. Have you thought of using an XSLT processor
> before sending the XML to Solr? Or using it instead of DIH during the update (not
> a well-known part of Solr):
> https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-UsingXSLTtoTransformXMLIndexUpdates
>
> With XSLT, you could just convert your format directly into Solr XML Update
> format and not bother with field mapping.
>
> Regards,
>    Alex.
> ----
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
>
>
> On 22 October 2015 at 10:17, Routley, Alan <alan.rout...@bl.uk> wrote:
>> Hi,
>>
>> Given an xml structure:
>>
>> <Person>
>>   <Relationships>
>>     <NamedAuthority>
>>       <Type>Subject</Type>
>>       <Id>032-001946363</Id>
>>     </NamedAuthority>
>>     <NamedAuthority>
>>       <Type>Subject</Type>
>>       <Id>037-001946370</Id>
>>     </NamedAuthority>
>>     <NamedAuthority>
>>       <Type>Author</Type>
>>       <Id>040-001959713</Id>
>>     </NamedAuthority>
>>     <NamedAuthority>
>>       <Type>Author</Type>
>>       <Id>040-001959829</Id>
>>     </NamedAuthority>
>>     <NamedAuthority>
>>       <Type>Subject</Type>
>>       <Id>032-001961797</Id>
>>     </NamedAuthority>
>>     <NamedAuthority>
>>       <Type>Author</Type>
>>       <Id>040-001961798</Id>
>>     </NamedAuthority>
>>   </Relationships>
>> </Person>
>>
>> I’m trying to use the XPathEntityProcessor to put all the Subject Ids into
>> one multiValued field and the Author Ids into another.
>>
>> I was hoping I could use fields with the following, but the XPath does not
>> seem to be supported.
>>
>> <field column="SubjectRelationships"
>>        xpath="/Person/Relationships/NamedAuthority/Type[.='Subject']/following-sibling::Id" />
>> <field column="AuthorRelationships"
>>        xpath="/Person/Relationships/NamedAuthority/Type[.='Author']/following-sibling::Id" />
>>
>> Could anyone suggest a way for me to achieve this?
>>
>> Many Thanks.
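
For reference, a minimal sketch of the kind of stylesheet Alex's suggestion describes: it renames each NamedAuthority element to the value of its Type child, so that plain paths can be used in place of following-sibling. It assumes the input looks exactly like the sample in the original question and that every Type holds a valid element name (Subject or Author); it is not the stylesheet that produced the error above. The Woodstox message "Illegal to have multiple roots" generally means the parser met a second top-level element, so it may be worth checking that the transformed output still has a single root. The identity template here keeps Person as that root.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy anything not matched by a more specific rule.
       This preserves Person as the single root element of the output. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace each NamedAuthority with an element named after its Type
       (assumed to be Subject or Author), keeping only the Id child. -->
  <xsl:template match="NamedAuthority">
    <xsl:element name="{normalize-space(Type)}">
      <xsl:copy-of select="Id"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

With output shaped this way, the two fields from the original question could presumably use simple paths such as /Person/Relationships/Subject/Id for SubjectRelationships and /Person/Relationships/Author/Id for AuthorRelationships. Those paths are an assumption that follows from this sketch and have not been tested against DIH.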