Chris,

mucho thanks.  The

solr.RegexReplaceProcessorFactory

looks like what I need.  What a fantastic search engine this is!

thanks again,

Scott


On 8/10/2015 5:21 PM, Chris Hostetter wrote:
: <meta name="date" content="Unknown" />
: <meta name="dc.date.created" content="Unknown" />
:
: Most documents have a correctly formatted date string and I would like to keep
: that data available for search on the date field.
        ...
: I realize it is complaining because the date string isn't matching the
: data_driven_schema file. How can I coerce it into allowing the non-standard
: date strings while still using the correctly formatted ones?

If you want to preserve all of the data, and don't care about doing Date
operations (ie: date range queries, date faceting, etc...) on the field,
then you could always just define these fields to use a String based field
type.
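
As a rough sketch of that first option, assuming the field is named "date"
(taken from the meta tag above) and that your schema already defines the
stock "string" type, the declaration might look something like...

   <!-- hypothetical declaration: the field name and attributes are assumptions -->
   <field name="date" type="string" indexed="true" stored="true"/>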

If you want to only preserve the data that can be cleanly parsed as a
Date, then one workaround would probably be to configure something
like this *after* the ParseDateFieldUpdateProcessorFactory...

   <processor class="solr.RegexReplaceProcessorFactory">
    <str name="typeClass">solr.TrieDateField</str>
    <str name="pattern">.*</str>
    <str name="replacement"></str>
    <bool name="literalReplacement">true</bool>
   </processor>
   <processor class="solr.RemoveBlankFieldUpdateProcessorFactory">
    <str name="typeClass">solr.TrieDateField</str>
   </processor>

...that should work because RegexReplaceProcessorFactory only
operates on _string_ values in the incoming docs -- if
ParseDateFieldUpdateProcessorFactory has already been able to parse the
string into a Date object, the regex processor will leave that value alone.

If you want *both* (ie: to do Date specific operations on docs that can be
parsed, but also know when docs provide other non-Date values in those
fields) you'll need to use more than one field --
CloneFieldUpdateProcessorFactory can handle that for you...

https://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html
https://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/RemoveBlankFieldUpdateProcessorFactory.html
https://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/FieldMutatingUpdateProcessorFactory.html

https://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
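
For the "both" case, a minimal sketch might look like the following. The
field names are assumptions, not taken from your config: "date" as the
source, and a hypothetical "date_raw_s" destination that relies on a *_s
string dynamicField being present in your schema. The clone would need to
come *before* ParseDateFieldUpdateProcessorFactory, so the raw string is
copied before anything parses or blanks it...

   <!-- assumed field names: copy the raw value into a string field first -->
   <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">date</str>
    <str name="dest">date_raw_s</str>
   </processor>
   <!-- ...then the existing ParseDateFieldUpdateProcessorFactory, followed by
        the RegexReplace / RemoveBlank pair shown earlier -->

...that way "date" only ever holds real Date values, while "date_raw_s"
keeps whatever string the document originally supplied (including
"Unknown").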



-Hoss
http://www.lucidworks.com/


