Hi Alex, 

Thanks again for the reply. See my responses inline below. 
> On 22.03.2015 at 20:14, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> 
> I am not entirely sure your problem is at the XSL level yet?
> 
> *) I see problems with quotes in two places (in datasource, and in
> outer entity). Did you paste definitions from MSWord by any chance?

The file was created in a text editor. I am not sure which quotes you are 
referring to. They look fine to me and the XML file validates all right. Could 
you perhaps be more specific?
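
For what it is worth, in the copy of the configuration quoted further down in 
this mail, two attribute values do start with a low typographic quote („) 
instead of a plain ASCII double quote: the type attribute of the dataSource 
element (which also has no closing quote there) and the baseDir attribute of 
the outer entity. That may simply be my mail client substituting characters. 
A minimal sketch of how those two spots would look with plain quotes, assuming 
nothing else changes:

```xml
<!-- Sketch only: the same two elements as in the quoted config, with plain ASCII quotes. -->
<dataSource encoding="UTF-8"
    type="FileDataSource" />

<entity
    name="pickupdir"
    processor="FileListEntityProcessor"
    baseDir="/abs/path/to/source/dir/for/import/">
    <!-- other attributes omitted here; inner entity as before -->
</entity>
```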

> *) I see that you declare outer entity to be rootEntity=true, so you
> will not get anything from inner documents

That’s correct. I have set the value to "false" now. 
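
For reference, a sketch of the configuration as it stands after that change. 
Assumptions on my side: the paths are the same placeholders as in my earlier 
mail, and I have added an opening <document> element, which is missing from 
the copy quoted below.

```xml
<dataConfig>
    <dataSource encoding="UTF-8" type="FileDataSource" />
    <document>
        <!-- Outer entity only lists files; rootEntity="false" so that the
             rows of the inner entity are what become Solr documents. -->
        <entity
            name="pickupdir"
            processor="FileListEntityProcessor"
            rootEntity="false"
            fileName=".*xml"
            baseDir="/abs/path/to/source/dir/for/import/"
            recursive="true"
            newerThan="${dataimporter.last_index_time}"
            dataSource="null">

            <!-- Inner entity applies the XSLT to each file and reads the result. -->
            <entity
                name="xml"
                processor="XPathEntityProcessor"
                stream="false"
                useSolrAddSchema="true"
                url="${pickupdir.fileAbsolutePath}"
                xsl="/abs/path/to/xslt/file/in/myCore/conf/transform.xsl">
            </entity>
        </entity>
    </document>
</dataConfig>
```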

> *) I don't see any XPath definitions in the inner entity, so the
> processor does not know how to actually map to the fields (that's
> different for SQLEntityProcessor which auto-maps).

As far as I know, explicit mappings are not required when the result of the 
transformation is already in the standard Solr update XML format. The 
documentation says: 

useSolrAddSchema - Set this to true if the content is in the form of the 
standard Solr update XML schema.

(https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler)

But maybe my interpretation here is incorrect. I was assuming that setting this 
attribute to "true" would allow the DIH to process the resulting XML directly, 
as if I were importing it with the command-line Java tool. 
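
To make that assumption concrete: this is roughly the shape I expect the XSLT 
output to have, i.e. the standard Solr update XML. The field names "id" and 
"title" and their values are placeholders for illustration only, not the 
fields of my actual schema.

```xml
<!-- Sketch of the expected XSLT output; field names and values are hypothetical. -->
<add>
    <doc>
        <field name="id">doc-1</field>
        <field name="title">Example title</field>
    </doc>
</add>
```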

> 
> I would step back from inner DIH entity and make sure your outer
> entity actually captures something. Maybe by enabling dynamicField "*"
> with stored=true. See what you get into the schema. Then, add XPath
> against original XML, just to make sure you capture _something_. Then,
> XSLT and XPath.

OK, I will try to debug the DIH like this. Thanks again. 
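
If I understand the suggestion correctly, the first two steps would look 
roughly like the fragments below. This is only a sketch under assumptions: the 
field type "string" is taken to exist in my schema, and the forEach path and 
the field name "title" are made-up placeholders for the structure of my 
source XML.

```xml
<!-- schema.xml: temporary catch-all so that anything captured gets stored -->
<dynamicField name="*" type="string" indexed="true" stored="true" multiValued="true" />

<!-- data-config.xml: inner entity reading the original XML directly,
     with an explicit XPath mapping and no XSLT yet -->
<entity
    name="xml"
    processor="XPathEntityProcessor"
    stream="false"
    forEach="/records/record"
    url="${pickupdir.fileAbsolutePath}">
    <field column="title" xpath="/records/record/title" />
</entity>
```

Once that captures something, the xsl and useSolrAddSchema attributes can go 
back in as the last step.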

Cheers, 

Martin
 
 


> 
> Regards,
>   Alex.
> ----
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
> 
> 
> On 22 March 2015 at 12:36, Martin Wunderlich <martin...@gmx.net> wrote:
>> Hi Alex,
>> 
>> Thanks a lot for the reply and apologies for being unclear. The 
>> XPathEntityProcessor provides an option to specify an XSLT file that should 
>> be applied to the XML input prior to the actual data import. I am including 
>> my current configuration below, with the respective attribute highlighted.
>> 
>> I have checked various forums and documentation bits, but the config XML 
>> seems ok to me. And yet, nothing gets imported.
>> 
>> Cheers,
>> 
>> Martin
>> 
>> 
>> <dataConfig>
>>    <dataSource encoding="UTF-8"
>>        type=„FileDataSource />
>>        <entity
>>            name="pickupdir"
>>            processor="FileListEntityProcessor"
>>            rootEntity="true"
>>            fileName=".*xml"
>>            baseDir=„/abs/path/to/source/dir/for/import/"
>>            recursive="true"
>>            newerThan="${dataimporter.last_index_time}"
>>            dataSource="null">
>> 
>>            <entity
>>                name="xml"
>>                processor="XPathEntityProcessor"
>>                stream="false"
>>                useSolrAddSchema="true"
>>                url="${pickupdir.fileAbsolutePath}"
>>                xsl="/abs/path/to/xslt/file/in/myCore/conf/transform.xsl">
>>            </entity>
>>        </entity>
>>    </document>
>> </dataConfig>
>> 
>> 
>> 
>> 
>>> On 22.03.2015 at 01:18, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>>> 
>>> What do you mean using DIH with XSLT together? DIH uses a basic XPath
>>> parser, but not full XSLT.
>>> 
>>> So, it's not very clear what the question actually means. How did you
>>> configure it all?
>>> 
>>> Regards,
>>>  Alex.
>>> ----
>>> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
>>> http://www.solr-start.com/
>>> 
>>> 
>>> On 21 March 2015 at 14:14, Martin Wunderlich <martin...@gmx.net> wrote:
>>>> Hi all,
>>>> 
>>>> I am trying to create a data import handler (DIH) to import XML files. The 
>>>> source XML should be transformed using XSLT into the standard Solr import 
>>>> format. I have tested the XSLT and successfully imported data using the 
>>>> Java-based simple import tool. However, when I try to import the same XML 
>>>> files with the same XSLT pre-processing using a DIH configured in 
>>>> solrconfig.xml, it doesn’t work. I can execute the DIH from the admin 
>>>> interface, but no documents get imported. The logging console doesn’t give 
>>>> any errors.
>>>> 
>>>> Could someone who has successfully set up a similar configuration (XML 
>>>> import via DIH with XSL pre-processing) provide me with the basic 
>>>> configuration, so that I can check what might be wrong in mine?
>>>> 
>>>> Thanks a lot.
>>>> 
>>>> Cheers,
>>>> 
>>>> Martin
>>>> 
>>>> 
>> 
