Thanks, Alex. I’ll experiment with it.
-R
On 3/22/17, 4:38 PM, "Alexandre Rafalovitch" wrote:
You could provide the URP chain name (or individual URPs) when you
index a particular document type, but that requires you to send all
the document types that need a signature together.
Or you could have a custom URP that skips other ones (they are
chained), though that's messier.
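For illustration, the first option (naming the chain per request) could look roughly like the sketch below. It only builds the update URL and does not contact Solr. It assumes the chain is selected with the `update.chain` request parameter; the host, the collection name "docs" and the chain name "signature" are made-up placeholders.

```python
from urllib.parse import urlencode


def build_update_url(base_url, collection, chain=None, commit=False):
    """Build a Solr update URL, optionally selecting a named URP chain.

    The chain name is whatever is declared in solrconfig.xml; the values
    used in the example below are hypothetical.
    """
    params = {}
    if chain is not None:
        # Ask Solr to run this named update processor chain for the request.
        params["update.chain"] = chain
    if commit:
        params["commit"] = "true"
    url = f"{base_url.rstrip('/')}/{collection}/update"
    return f"{url}?{urlencode(params)}" if params else url


# Documents that need a signature go through the hypothetical "signature"
# chain; everything else is posted without the parameter.
signed_url = build_update_url("http://localhost:8983/solr", "docs", chain="signature")
plain_url = build_update_url("http://localhost:8983/solr", "docs")
```

Batches would then be split by document type on the client side, with each batch posted to the matching URL.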
And I think you want
Thanks. I had seen that page but had passed it over since I don’t want to do
de-duping (text fields with the exact same text are possible and not cause for
de-dupe).
If I want just to store the signature, it looks like I define the
signatureField in the configuration and set overwriteDupes to t
You'd use CloneField URP
http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
Then you do your custom algorithm. Or - as I just remembered - use one
of the hash ones described in dedupe section:
https://cwiki.apache.org/confluence/dis
I suppose it could be, but the flexibility of using copy directives is
appealing for handling multiple fields as defined in the schema.
Since I have rarely looked at the UpdateRequestProcessor, I guess I don’t know
if it could take multiple fields to hash, and if so how that would be expressed.
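As a rough illustration of what hashing several fields into one signature means, here is a client-side sketch. It is not Solr's implementation and will not produce the same values as any Solr signature class; the field names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def field_signature(doc, fields):
    """Hash the listed fields of a document into a single hex digest.

    Fields are processed in the order given. A zero byte separates names
    and values so that ("ab", "c") and ("a", "bc") hash differently.
    """
    digest = hashlib.sha256()
    for name in fields:
        value = doc.get(name, "")
        # Treat multi-valued fields (lists or tuples) like single values.
        values = value if isinstance(value, (list, tuple)) else [value]
        digest.update(name.encode("utf-8"))
        digest.update(b"\x00")
        for item in values:
            digest.update(str(item).encode("utf-8"))
            digest.update(b"\x00")
    return digest.hexdigest()


# Hypothetical document: only "title" and "body" feed the signature.
doc = {"id": "1", "title": "Hello", "body": "Some indexed text"}
signature = field_signature(doc, ["title", "body"])
```

Comparing such a value computed at index time with one recomputed from the source data later is one way to validate fields that are indexed but not stored.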
Can this be done at the UpdateRequestProcessor stage?
Regards,
Alex
On 22 Mar 2017 1:48 PM, "Ronald Wood" wrote:
I have been mulling over the usefulness of a new Hash field type for being
able to validate data that is indexed but not stored. Basically, I’d use
copy directives to copy all f