And, as an additional comment:

You may want to index into a completely separate collection and then
do alias switching to point to it when done. That indexing could even
be on a separate machine.
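
If it helps, here is a rough sketch of that flow in Python (stdlib only).
The host, the collection/alias names and the /dataimport handler path are
just placeholders for illustration, so adjust them to your setup:

import urllib.parse
import urllib.request

SOLR = "http://localhost:8983/solr"      # assumed Solr base URL
ALIAS = "products"                       # alias your clients query (placeholder)
NEW_COLLECTION = "products_20180518"     # freshly built collection (placeholder)

def solr_get(path, **params):
    # Small helper: GET a Solr API endpoint with query parameters.
    url = "%s/%s?%s" % (SOLR, path, urllib.parse.urlencode(params))
    with urllib.request.urlopen(url) as resp:
        return resp.read()

# 1. Kick off the full import into the new, empty collection
#    (assumes DIH is registered as /dataimport in that collection's config).
solr_get("%s/dataimport" % NEW_COLLECTION, command="full-import", commit="true")

# 2. ...poll the same /dataimport handler with command=status until idle...

# 3. Atomically point the alias at the new collection; afterwards the old
#    collection can be deleted to free disk on the original node.
solr_get("admin/collections", action="CREATEALIAS",
         name=ALIAS, collections=NEW_COLLECTION)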

Regards,
   Alex.
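
P.S. For reference, the full vs. delta import Emir mentions below is just a
parameter on the DIH handler (assuming it is registered as /dataimport on
your core; the core name is a placeholder):

  http://localhost:8983/solr/<core>/dataimport?command=full-import&clean=false
  http://localhost:8983/solr/<core>/dataimport?command=delta-import

Note that full-import defaults to clean=true (delete everything first, then
rebuild), while delta-import defaults to clean=false, so it is worth
double-checking that parameter before running either command.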

On 18 May 2018 at 08:47, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> Hi Darko,
> There is no in-place updating of data in Solr. Data is always written into a new 
> segment, and if an existing document has the same ID it is flagged as deleted but 
> will not be removed until that segment is merged. While merging, Solr keeps the 
> old segments until the new one is done and the searcher is updated. So in any 
> case there is a chance that Solr needs more space than the index itself. In some 
> extreme cases it can be up to three times the size of the index.
> I am a bit rusty on DIH, but based on your comment it seems that full-import 
> builds a temporary index and then switches to it. Delta import should update the 
> existing index, so if you can use delta import you should be safe. With a 250GB 
> index and a max segment size of 5GB you should not reach 500GB even if you 
> delta-import all documents.
> Please note that for a full import it is advisable to create a new index, so I 
> would suggest that you start asking for bigger disks.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
>> On 18 May 2018, at 14:28, Darko Todoric <todo...@mdpi.com> wrote:
>>
>> Hi guys,
>>
>> We have about 250GB of Solr data on one server, and when we start a full import 
>> Solr doubles the space used on disk... This is a problem for us because we have 
>> a 500GB SSD in this server and we hit almost 100% disk usage while the full 
>> import is running.
>> Since we don't use the "clean" option, is there a way to tell full/delta import 
>> to update the data immediately, instead of waiting until the whole import has 
>> finished and then updating everything? That way the full import would not need 
>> to create this temporary copy of the 250GB.
>>
>> Kind regards,
>> Darko Todoric
>
