You should declare this
<str name="update.chain">nohtml</str>
in the "defaults" section of the requestHandler that corresponds to your
DataImportHandler. You should end up with something like this:
<requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">dih-config.xml</str>
<str name="update.chain">nohtml</str>
</lst>
</requestHandler>
Otherwise the default update chain is used, and your URPs are not part
of it. SolrJ, behind the scenes, is a client of the /update request
handler, where you have already set update.chain; that is why you see
your URPs working when you index that way.
Best,
Gazza
On 08/22/2013 05:35 PM, Shawn Heisey wrote:
I have an updateProcessor defined. It seems to work perfectly when I
index with SolrJ, but when I use DIH (which I do for a full index
rebuild), it doesn't work. This is the case with both Solr 4.4 and
Solr 4.5-SNAPSHOT, svn revision 1516342.
Here's a solrconfig.xml excerpt:
<updateRequestProcessorChain name="nohtml">
<!-- First pass converts entities and strips html. -->
<processor class="solr.HTMLStripFieldUpdateProcessorFactory">
<str name="fieldName">ft_text</str>
<str name="fieldName">ft_subject</str>
<str name="fieldName">keywords</str>
<str name="fieldName">text_preview</str>
</processor>
<!-- Second pass fixes dually-encoded stuff. -->
<processor class="solr.HTMLStripFieldUpdateProcessorFactory">
<str name="fieldName">ft_text</str>
<str name="fieldName">ft_subject</str>
<str name="fieldName">keywords</str>
<str name="fieldName">text_preview</str>
</processor>
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
<requestHandler name="/update" class="solr.UpdateRequestHandler">
<lst name="defaults">
<str name="update.chain">nohtml</str>
</lst>
</requestHandler>
If I turn on DEBUG logging for FieldMutatingUpdateProcessorFactory, I
see "replace value" debugs, but the contents of the index are only
changed if the update happens with SolrJ, not with DIH.
A side issue. FieldMutatingUpdateProcessorFactory has the following
line in it, at about line 72:
if (destVal != srcVal) {
Shouldn't this be the following?
if (!destVal.equals(srcVal)) {
Thanks,
Shawn
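
On the side issue about the comparison: here is a minimal Java sketch of how a reference check (`!=`) and a value check (`equals`) differ. The class and method names are hypothetical and this is not the Solr source. It only illustrates that a value-based replacement for `destVal != srcVal` would need the negated form to keep the same sense, and that the two checks can disagree when an equal but distinct object is returned.

```java
import java.util.Objects;

public class ValueComparisonSketch {
    // Reference comparison: true unless both variables point at the same object.
    static boolean changedByReference(Object srcVal, Object destVal) {
        return destVal != srcVal;
    }

    // Value comparison: true only when the contents differ; null-safe.
    static boolean changedByValue(Object srcVal, Object destVal) {
        return !Objects.equals(destVal, srcVal);
    }

    public static void main(String[] args) {
        String srcVal = "plain text";
        // Equal content, but a distinct object on the heap.
        String destVal = new String("plain text");

        System.out.println(changedByReference(srcVal, destVal)); // true
        System.out.println(changedByValue(srcVal, destVal));     // false
        System.out.println(changedByValue(srcVal, "other"));     // true
    }
}
```

Whether the reference check is deliberate depends on whether the mutating code hands back the same object when nothing changed; that is an assumption to verify against the actual source.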