And as an additional comment: you may want to index into a completely separate collection and then switch an alias to point to it when done. That indexing could even be done on a separate machine.
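As a rough sketch of the switch itself (the host, alias, and collection names below are placeholders), once the new collection is built, re-pointing the alias is a single Collections API call:

import urllib.parse
import urllib.request

# Re-point the query alias at the freshly built collection.
# "products" (alias) and "products_new" (collection) are placeholder names.
params = urllib.parse.urlencode({
    "action": "CREATEALIAS",
    "name": "products",
    "collections": "products_new",
})
url = "http://localhost:8983/solr/admin/collections?" + params
with urllib.request.urlopen(url) as resp:
    print(resp.read().decode("utf-8"))

Queries keep hitting the alias name the whole time, so the application never notices the swap, and the old collection can be dropped afterwards to reclaim the disk.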
Regards,
Alex.

On 18 May 2018 at 08:47, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> Hi Darko,
> There is no in-place updating of data in Solr. Data is always written into a
> new segment, and if an existing document has the same ID it is flagged as
> deleted but not removed until that segment is merged. While merging, Solr
> keeps the old segments until the new one is done and the searcher is
> updated. So in any case there is a chance that Solr will need more space
> than the index itself. In some extreme cases it can be up to three times the
> size of the index.
> I am a bit rusty on DIH, but based on your comment it seems that full-import
> builds a temporary index and then switches. Delta import should update
> existing documents, and if you can use delta import you should be safe. With
> a 250GB index and a max segment size of 5GB, you should not reach 500GB even
> if you delta import all documents.
> Please note that for a full import it is advisable to create a new index, so
> I would suggest that you start asking for bigger disks.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
>> On 18 May 2018, at 14:28, Darko Todoric <todo...@mdpi.com> wrote:
>>
>> Hi guys,
>>
>> We have about 250GB of Solr data on one server, and when we start a full
>> import Solr doubles the space used on disk... This is a problem for us
>> because we have a 500GB SSD on this server and we hit almost 100% disk
>> usage while the full import is running.
>> Since we don't use the "clean" option, is there a way to tell full/delta
>> import to update the data immediately instead of waiting until it is
>> finished and then updating everything? That way, full import would not need
>> to create this temporary copy of the 250GB.
>>
>> Kind regards,
>> Darko Todoric
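For completeness, the delta import Emir suggests is just a request to the DIH handler. A minimal sketch, assuming the handler is registered at /dataimport on a core named "mycore" (both the host and the core name are placeholders):

import urllib.parse
import urllib.request

# Trigger a delta import: only documents picked up by the configured
# deltaQuery are re-indexed, so the existing index is updated in place
# instead of being rebuilt next to the old one.
params = urllib.parse.urlencode({
    "command": "delta-import",
    "commit": "true",
})
url = "http://localhost:8983/solr/mycore/dataimport?" + params
with urllib.request.urlopen(url) as resp:
    print(resp.read().decode("utf-8"))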