Hi Hoss,

thank you for the quick response and the explanations!

> My suggestion would be to modify the XPath expression you are using to 
> pull data out of your original XML files and ignore  "<estimated_hours/>"
> 

I don't think this is possible. It would require text() in the XPath, which 
the XPathRecordReader does not handle. I've checked the code as well, and 
the JavaDoc does not list this possibility either. I've tried these patterns:

/issues/issue/estimated_hours[text()]
/issues/issue/estimated_hours/text()

With either pattern, no value at all is added for that field for any of the 
documents (including those that do have a value in the XML).
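
For reference, here is a minimal sketch of a data-config.xml entity that uses 
the plain element path, which is the form the XPathEntityProcessor 
documentation describes. The file path, entity name, and dataSource settings 
are hypothetical and not taken from my actual setup:

```xml
<!-- data-config.xml: illustrative sketch only; path and names are assumptions -->
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <entity name="issue"
            processor="XPathEntityProcessor"
            url="/path/to/issues.xml"
            forEach="/issues/issue">
      <!-- Plain element path: supported. An empty <estimated_hours/> element
           can still come through as a blank value, which is why it needs to
           be pruned later in the update chain. -->
      <field column="estimated_hours" xpath="/issues/issue/estimated_hours"/>
    </entity>
  </document>
</dataConfig>
```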

> Alternatively: there are some new UpdateProcessors available in 4.0 that 
> let you easily prune field values based on various criteria (update 
> processors happen well before copyField)...
> 
> http://lucene.apache.org/solr/api-4_0_0-ALPHA/org/apache/solr/update/processor/RemoveBlankFieldUpdateProcessorFactory.html

Thanks for pointing me to it. I've switched to 4.0.0-ALPHA (hoping the ALPHA 
doesn't show itself too often ;-) ).

For anyone interested, my DataImportHandler setup in solrconfig.xml now reads:

        <updateRequestProcessorChain name="emptyFieldChain">
                <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
        </updateRequestProcessorChain>

        <requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
                <lst name="defaults">
                        <str name="update.chain">emptyFieldChain</str>
                        <str name="config">data-config.xml</str>
                        <str name="clean">true</str>
                        <str name="commit">true</str>
                        <str name="optimize">true</str>
                </lst>
        </requestHandler>

Works as expected!
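
A possible variant, as a sketch only: the chains in the stock example 
solrconfig.xml end with solr.LogUpdateProcessorFactory and 
solr.RunUpdateProcessorFactory. Whether listing them explicitly is needed 
for this setup is an assumption and has not been verified here:

```xml
<!-- Sketch: same chain with the logging and run processors listed explicitly.
     Their necessity for this particular setup is an assumption. -->
<updateRequestProcessorChain name="emptyFieldChain">
  <!-- Drops field values that are empty strings before the document is indexed -->
  <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The request handler would reference it through update.chain in the same way 
as above.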

And kudos to those working on the admin frontend, as well! The new admin is 
indeed slick!



> But i can certainly understand the confusion, i've opened SOLR-3657 to try 
> and improve on this.  Ideally the error message should make it clear that 
> the "value" from "source" field was copied to "dest" field which then 
> encountered "error"
> 

Thank you! Good exception messages are certainly helpful!

Chantal
