Thank you all for your comments and help. I kept the last days' worth of files from the /tmp folder and removed the rest, without any problems or difficulties.
Sas

-----Original Message-----
From: Walter Underwood [mailto:wun...@wunderwood.org]
Sent: Saturday, October 29, 2016 1:10 PM
To: solr-user@lucene.apache.org
Subject: [E] Re: Questions about Disk space Usage

If it works the way I think it does, an empty segment should take the same amount of time to read in as a full segment, but zero time to write out.

wunder

> On Oct 29, 2016, at 9:21 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> I would also expect a totally empty segment to be merged very quickly
> as the percent deleted documents weighs heavily when determining
> whether to merge a segment.... but that's based on principle, not deep
> code knowledge.
>
> Best,
> Erick
>
>> On Fri, Oct 28, 2016 at 6:02 PM, Walter Underwood <wun...@wunderwood.org>
>> wrote:
>> After the merge. That is what merges do, clean up segments.
>>
>> I expect it is very rare for a segment to be 100% deleted docs, so it
>> isn’t worth handling that case.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>
>>> On Oct 28, 2016, at 5:54 PM, Alexandre Rafalovitch <arafa...@gmail.com>
>>> wrote:
>>>
>>> Doesn't the segment that only has deleted documents just get dropped?
>>> Or does it get dropped _after_ the merge and therefore still sit
>>> around?
>>>
>>> Regards,
>>> Alex.
>>> ----
>>> Solr Example reading group is starting November 2016, join us at
>>> http://j.mp/SolrERG Newsletter and resources for Solr beginners and
>>> intermediates:
>>> http://www.solr-start.com/
>>>
>>>
>>>> On 29 October 2016 at 08:53, Walter Underwood <wun...@wunderwood.org>
>>>> wrote:
>>>> It is normal for disk usage to double. Under controlled
>>>> circumstances, it can triple, but that probably won’t happen.
>>>>
>>>> This is the second time today that I’ve sent this information to the list.
>>>>
>>>> It can use nearly 2X the space whenever the largest segment(s) are
>>>> merged, especially if there are only a few smaller segments.
>>>>
>>>> In order to use 3X the space, you need to:
>>>>
>>>> 1. Disable merging.
>>>> 2. Delete all the documents.
>>>> 3. Add all the documents.
>>>> 4. Enable merging.
>>>>
>>>> This causes one complete set of segments that are 100% deletes, one
>>>> set that is 0% deletes, then the merge creates another set that is
>>>> 0% deletes. During the merge, the old files remain while the new
>>>> one is created.
>>>>
>>>> wunder
>>>> Walter Underwood
>>>> wun...@wunderwood.org
>>>> http://observer.wunderwood.org/ (my blog)
>>>>
>>>>
>>>>> On Oct 28, 2016, at 2:41 PM, Alexandre Rafalovitch <arafa...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> 2) Is probably a merge operation. Lucene index segments are not
>>>>> rewritable in place, so the merge creates a new file, does
>>>>> everything to it, then switches to it.
>>>>>
>>>>> I remember the number was that the space could temporarily triple
>>>>> (?!?) though that may have been before the tiered merge policy.
>>>>>
>>>>> 3) It should be safe to delete old log files. It is standard log4j stuff.
>>>>>
>>>>> ----
>>>>> Solr Example reading group is starting November 2016, join us at
>>>>> http://j.mp/SolrERG Newsletter and resources for Solr beginners
>>>>> and intermediates:
>>>>> http://www.solr-start.com/
>>>>>
>>>>>
>>>>> On 29 October 2016 at 06:55, Jamal, Sarfaraz
>>>>> <sarfaraz.ja...@verizonwireless.com.invalid> wrote:
>>>>>> Hi Guys,
>>>>>>
>>>>>> I am currently investigating an instance of Solr's disk space usage and
>>>>>> I had a few questions I thought you guys might be able to help answer.
>>>>>>
>>>>>> First Question
>>>>>> * There is 30 GB worth of autosuggest data in the /tmp folder.
>>>>>> Each file is half a gigabyte. Is it safe to delete those files?
>>>>>>
>>>>>> Second Question
>>>>>> Also, we notice that at times the disk runs down to only having a few
>>>>>> gigabytes available, and then goes back to having more space. (the index
>>>>>> file literally grows and then shrinks).
>>>>>>
>>>>>> Third Question
>>>>>> Is it also safe to delete the log files?
>>>>>>
>>>>>> We run a database indexer on a set interval, perhaps that is relevant to
>>>>>> this discussion.
>>>>>>
>>>>>> Sas
>>>>
>>
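
As a rough illustration of the 2X and 3X figures discussed in this thread, the arithmetic can be sketched in Python. This is not Solr or Lucene code. The function name merge_peak_gb and the (size, deleted fraction) segment tuples are made up for the example, and it assumes that a merged segment's size scales linearly with the share of live documents and that old segments stay on disk until the merge finishes.

```python
def merge_peak_gb(segments, merging):
    """Estimate peak disk use (GB) while the chosen segments are merged.

    segments: list of (size_gb, deleted_fraction) tuples (hypothetical model).
    merging:  indices of the segments taking part in the merge.

    Assumption: the old segments remain on disk until the new merged
    segment is fully written, and the new segment holds only live docs.
    """
    on_disk = sum(size for size, _ in segments)
    new_segment = sum(
        segments[i][0] * (1.0 - segments[i][1]) for i in merging
    )
    return on_disk + new_segment


# Normal case: a 10 GB index with no deletes, everything merged at once.
normal = [(8.0, 0.0), (1.0, 0.0), (1.0, 0.0)]
print(merge_peak_gb(normal, [0, 1, 2]))  # 20.0, i.e. about 2X

# Worst case from the thread: one set of segments that is 100% deletes,
# one set that is 0% deletes, then a merge of both.
worst = [(10.0, 1.0), (10.0, 0.0)]
print(merge_peak_gb(worst, [0, 1]))  # 30.0, i.e. about 3X the live data
```

In practice the merge policy usually merges a few segments at a time, so the real peak is often lower than these upper bounds.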