Re: Reindexing using dataimporthandler

Emir Arnautović Mon, 27 Apr 2020 04:46:42 -0700

Hi Bjarke,
I don’t see a problem with that approach, provided you have enough resources to
handle both cores at the same time, especially if you are doing this while
serving production queries. The one hard requirement is that all fields must be
stored, since SolrEntityProcessor can only re-feed what the source core returns.
Also note that cursorMark support was added to SolrEntityProcessor in a later
release, so on an older version of Solr you might not have cursors. I found that
out the hard way.
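
A quick way to check the stored-fields requirement before reindexing is to ask
the Schema API which fields are not stored. The following is only a rough
sketch: the helper names are made up for illustration, the core URL is taken
from the example config quoted below, and it assumes a Solr version that serves
/schema/fields with the showDefaults parameter. It does not account for
copyField destinations (which get regenerated anyway) or for docValues fields
that Solr may return as stored.

```python
import json
from urllib.request import urlopen


def unstored_fields(schema_response):
    """Return the sorted names of fields not marked stored=true.

    `schema_response` is the parsed JSON body of GET <core>/schema/fields.
    A field without an explicit "stored" property is flagged as well, to
    stay on the conservative side.
    """
    return sorted(
        field["name"]
        for field in schema_response.get("fields", [])
        if not field.get("stored", False)
    )


def fetch_schema_fields(core_url):
    """Fetch field definitions from the Schema API (hypothetical helper).

    showDefaults=true asks Solr to include properties inherited from the
    field type, so "stored" should be present on every field.
    """
    with urlopen(core_url + "/schema/fields?showDefaults=true&wt=json") as resp:
        return json.load(resp)


# Example usage against a running core (URL is an assumption):
# print(unstored_fields(fetch_schema_fields("http://localhost:8983/solr/mycollection")))
```

Any field it lists would come back empty after a self-import, unless it is a
copyField target or is returned from docValues.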


Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 27 Apr 2020, at 13:11, Bjarke Buur Mortensen <morten...@eluence.com> wrote:
> 
> Hi list,
> 
> Let's say I add a copyField to my solr schema, or change the analysis chain
> of a field or some other change.
> It seems to me to be an alluring choice to use a very simple
> dataimporthandler to reindex all documents, by using a SolrEntityProcessor
> that points to itself. I have just done this for a very small collection,
> but I was wondering what the caveats are, since this is not the recommended
> practice. What can go wrong using this approach?
> 
> <document>
>   <entity name="all_from_self" processor="SolrEntityProcessor"
>           url="http://localhost:8983/solr/mycollection" qt="lucene"
>           query="*:*" wt="javabin" rows="1000" cursorMark="true"
>           sort="id asc" fl="*,orig_version_l:_version_"/>
> </document>
> 
> PS: (It is probably necessary to add a filter on _version_:[* TO
> <current_highest_version>] to ensure it terminates for large imports)
> PPS: (Obviously you shouldn't add the clean parameter)
> 
> /Bjarke
