Re: better anti OOM

2011-12-27 Thread Edward Capriolo
I do major compactions and I have run into bloom filters causing OOM. One trick I used was to lower the size of the row/key caches with nodetool before triggering the compaction, then raise them again after the compaction finished. As suggested, running with spare heap is a very good idea; it lowers the chance of a sto

Re: better anti OOM

2011-12-27 Thread Peter Schuller
>> In general, don't expect to be able to run at close to heap capacity; >> there *will* be spikes. > I try to tune for 80% of heap. Just FYI, my guess is that at 80% target heap usage you're likely to see fallbacks to full compacting GCs. If you are doing analytics only and aren't latency critical,

Re: better anti OOM

2011-12-27 Thread Radim Kolar
> How large are the bloom filters in total? I.e., sizes of the *-Filter.db files. On a moderate node about 6.5 GB; index sampling will be about 4 GB; heap is 12 GB. > In general, don't expect to be able to run at close to heap capacity; there *will* be spikes. I try to tune for 80% of heap.
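The arithmetic behind those figures can be checked with a small sketch. It assumes, which the message does not state outright, that both the bloom filters and the index samples are held on the JVM heap; the function and variable names are illustrative only.

```python
# Figures quoted in the message above. Treating both structures as
# heap-resident is an assumption of this sketch.
GB = 1024 ** 3


def heap_fraction(component_sizes, heap_size):
    """Return the fraction of the heap taken up by the given components."""
    return sum(component_sizes) / heap_size


bloom_filters = 6.5 * GB   # total size of the *-Filter.db files
index_samples = 4 * GB     # estimated index sampling
heap = 12 * GB             # configured heap

fixed = heap_fraction([bloom_filters, index_samples], heap)
headroom = 0.80 - fixed    # relative to the 80% tuning target

print(f"fixed structures: {fixed:.1%} of heap")
print(f"headroom under an 80% target: {headroom:.1%}")
```

Under that reading the fixed structures alone come to 10.5 GB of a 12 GB heap, which is already above the 80% target before memtables, caches, or compaction are counted.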

Re: better anti OOM

2011-12-27 Thread Peter Schuller
> I will investigate situation more closely using gc via jconsole, but isn't > bloom filter for new sstable entirely in memory? On disk there are only 2 > files Index and Data. > -rw-r--r--  1 root  wheel   1388969984 Dec 27 09:25 > sipdb-tmp-hc-4634-Index.db > -rw-r--r--  1 root  wheel  1096522137

Re: better anti OOM

2011-12-27 Thread Radim Kolar
I don't know what you are basing that on. It seems unlikely to me that the working set of a compaction is 600 MB. However, it may very well be that the allocation rate is such that it contributes to an additional 600 MB average heap usage after a CMS phase has completed. I will investigate situa

Re: better anti OOM

2011-12-26 Thread Peter Schuller
> I suggest you describe exactly what the problem is you have and why you > think stopping compaction/repair is the appropriate solution. > > compacting 41.7 GB CF with about 200 million rows adds about 600 MB to heap, > node logs messages like: I don't know what you are basing that on. It seems unli

Re: better anti OOM

2011-12-26 Thread Radim Kolar
I suggest you describe exactly what the problem is you have and why you think stopping compaction/repair is the appropriate solution. Compacting a 41.7 GB CF with about 200 million rows adds about 600 MB to heap; the node logs messages like: WARN [ScheduledTasks:1] 2011-12-27 00:20:57,972 GCInspector

Re: better anti OOM

2011-12-26 Thread Peter Schuller
> If node is low on memory 0.95+ heap used it can do: > > 1. stop repair > 2. stop largest compaction > 3. reduce number of compaction slots > 4. switch compaction to single threaded > > flushing largest memtable/ cache reduce is not enough Note that the "emergency" flushing is just a stop-gap. Yo

better anti OOM

2011-12-26 Thread Radim Kolar
If a node is low on memory (0.95+ of heap used), it can:

1. stop repair
2. stop the largest compaction
3. reduce the number of compaction slots
4. switch compaction to single-threaded

Flushing the largest memtable or reducing caches is not enough.
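The proposal above could be sketched roughly as follows. This is only an illustration: it assumes the four steps are tried in the listed order, one per check, and every name in it (`HEAP_THRESHOLD`, `ACTIONS`, `next_action`) is hypothetical rather than part of Cassandra.

```python
# Hypothetical sketch of the proposed policy; none of these names are
# Cassandra APIs, and the strict ordering is an assumption.
HEAP_THRESHOLD = 0.95  # "0.95+ heap used" from the proposal

ACTIONS = [
    "stop repair",
    "stop largest compaction",
    "reduce number of compaction slots",
    "switch compaction to single threaded",
]


def next_action(heap_used_fraction, steps_taken):
    """Return the next relief step, or None if nothing should be done.

    heap_used_fraction: used heap divided by max heap, between 0 and 1.
    steps_taken: how many of the steps have already been applied.
    """
    if heap_used_fraction < HEAP_THRESHOLD:
        return None  # not under memory pressure
    if steps_taken >= len(ACTIONS):
        return None  # every step has already been tried
    return ACTIONS[steps_taken]


# Example: a node at 96% heap that has not taken any step yet.
print(next_action(0.96, 0))
```

A caller would re-check heap usage after each step and only escalate if the node is still above the threshold.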