Hi Alex,

Thanks again for the reply. See my responses inline below.

> On 22.03.2015 at 20:14, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>
> I am not entirely sure your problem is at the XSL level yet?
>
> *) I see problems with quotes in two places (in datasource, and in
> outer entity). Did you paste definitions from MSWord by any chance?

The file was created in a text editor. I am not sure which quotes you are referring to. They look fine to me and the XML file validates all right. Could you perhaps be more specific?

> *) I see that you declare outer entity to be rootEntity=true, so you
> will not get anything from inner documents

That's correct. I have set the value to "false" now.

> *) I don't see any XPath definitions in the inner entity, so the
> processor does not know how to actually map to the fields (that's
> different for SQLEntityProcessor which auto-maps).

As far as I know, explicit mappings are not required when the result of the transformation is in the Solr default import format. The documentation says:

useSolrAddSchema - Set this to true if the content is in the form of the standard Solr update XML schema.
(https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler)

But maybe my interpretation here is incorrect. I was assuming that setting this attribute to "true" would allow the DIH to process the resulting XML file directly, as if I were importing it with the command-line Java tool.

>
> I would step back from inner DIH entity and make sure your outer
> entity actually captures something. Maybe by enabling dynamicField "*"
> with stored=true. See what you get into the schema. Then, add XPath
> against original XML, just to make sure you capture _something_. Then,
> XSLT and XPath.

OK, I will try to debug the DIH like this. Thanks again.

Cheers,

Martin

>
> Regards,
> Alex.
> ----
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
>
>
> On 22 March 2015 at 12:36, Martin Wunderlich <martin...@gmx.net> wrote:
>> Hi Alex,
>>
>> Thanks a lot for the reply and apologies for being unclear. The
>> XPathEntityProcessor provides an option to specify an XSLT file that should
>> be applied to the XML input prior to the actual data import. I am including
>> my current configuration below, with the respective attribute highlighted.
>>
>> I have checked various forums and documentation bits, but the config XML
>> seems ok to me. And yet, nothing gets imported.
>>
>> Cheers,
>>
>> Martin
>>
>>
>> <dataConfig>
>>   <dataSource encoding="UTF-8"
>>     type=„FileDataSource />
>>   <entity
>>     name="pickupdir"
>>     processor="FileListEntityProcessor"
>>     rootEntity="true"
>>     fileName=".*xml"
>>     baseDir=„/abs/path/to/source/dir/for/import/"
>>     recursive="true"
>>     newerThan="${dataimporter.last_index_time}"
>>     dataSource="null">
>>
>>     <entity
>>       name="xml"
>>       processor="XPathEntityProcessor"
>>       stream="false"
>>       useSolrAddSchema="true"
>>       url="${pickupdir.fileAbsolutePath}"
>>       xsl="/abs/path/to/xslt/file/in/myCore/conf/transform.xsl">
>>     </entity>
>>   </entity>
>> </document>
>> </dataConfig>
>>
>>
>>> On 22.03.2015 at 01:18, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>>>
>>> What do you mean using DIH with XSLT together? DIH uses a basic XPath
>>> parser, but not full XSLT.
>>>
>>> So, it's not very clear what the question actually means. How did you
>>> configure it all?
>>>
>>> Regards,
>>> Alex.
>>> ----
>>> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
>>> http://www.solr-start.com/
>>>
>>>
>>> On 21 March 2015 at 14:14, Martin Wunderlich <martin...@gmx.net> wrote:
>>>> Hi all,
>>>>
>>>> I am trying to create a data import handler (DIH) to import XML files.
>>>> The source XML should be transformed using XSLT into the standard Solr import
>>>> format. I have tested the XSLT and successfully imported data using the
>>>> Java-based simple import tool. However, when I try to import the same XML
>>>> files with the same XSLT pre-processing using a DIH configured in
>>>> solrconfig.xml, it doesn’t work. I can execute the DIH from the admin
>>>> interface, but no documents get imported. The logging console doesn’t give
>>>> any errors.
>>>>
>>>> Could someone who has managed to successfully set up a similar
>>>> configuration (XML import via DIH with XSL pre-processing) provide me with
>>>> the basic configuration, so that I can check what might be wrong in mine?
>>>>
>>>> Thanks a lot.
>>>>
>>>> Cheers,
>>>>
>>>> Martin
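
For reference, here is a sketch of how the quoted configuration might look with the points raised in this thread applied. It uses plain ASCII double quotes around every attribute value (in the pasted version, type and baseDir open with a typographic low quote, and type has no closing quote), sets rootEntity="false" on the outer entity, and adds an opening <document> element, which the pasted version appears to lack. This is untested and only an assumption about what a working setup could look like, not a confirmed fix. The paths are the placeholders from the original message.

```xml
<dataConfig>
  <!-- ASCII quotes on both sides of every attribute value -->
  <dataSource encoding="UTF-8"
              type="FileDataSource" />
  <!-- Opening tag added to match the closing </document> in the pasted config -->
  <document>
    <!-- Outer entity only lists files, so it is not the root entity -->
    <entity
        name="pickupdir"
        processor="FileListEntityProcessor"
        rootEntity="false"
        fileName=".*xml"
        baseDir="/abs/path/to/source/dir/for/import/"
        recursive="true"
        newerThan="${dataimporter.last_index_time}"
        dataSource="null">

      <!-- Inner entity applies the XSLT, then reads the result as Solr update XML -->
      <entity
          name="xml"
          processor="XPathEntityProcessor"
          stream="false"
          useSolrAddSchema="true"
          url="${pickupdir.fileAbsolutePath}"
          xsl="/abs/path/to/xslt/file/in/myCore/conf/transform.xsl">
      </entity>
    </entity>
  </document>
</dataConfig>
```

Whether the typographic quotes were in the file itself or were introduced by the mail client when the config was pasted is not clear from the thread, so the file on disk may already be fine in that respect.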