Hi Alexander, Thanks for responding.
> How many nodes?

We currently have 9 nodes in our cluster.

> How much ram per node?

Each node has 4GB of RAM and 4GB of swap. Memory usage (RAM + swap) on
each node is currently between 4GB and 5.5GB.

> How many objects (files)? What is the average file size?

We currently have >30 million objects. I analyzed the average object size
before we migrated data into the cluster: it was about 4KB/object, with
some objects being much larger (multiple MB). Is there an easy way to get
this information from a running cluster so I can give you more accurate
numbers?

On Tue, Nov 22, 2016 at 2:42 AM, Alexander Sicular <[email protected]> wrote:

> Hi Daniel,
>
> How many nodes?
> -You should be using 5 minimum if you are using the default config. There
> are reasons.
>
> How much ram per node?
> -As you noted, in Riak CS, 1MB file chunks are stored in bitcask.
> Their key names and some overhead consume memory.
>
> How many objects (files)? What is the average file size?
> -If your size distribution significantly skews < 1MB that means you
> will have a bunch of files in bitcask eating up ram.
>
> Kota was a former Basho engineer who worked on CS... That said, Basho
> may not support a non standard deployment.
>
> -Alexander
>
> On Mon, Nov 21, 2016 at 2:45 PM, Daniel Miller <[email protected]> wrote:
> > I found a similar question from over a year ago
> > (http://lists.basho.com/pipermail/riak-users_lists.basho.com/2015-July/017327.html),
> > and it sounds like leveldb is the way to go, although possibly not well
> > tested. Has anything changed with regard to Basho's (or anyone else's)
> > experience with using the leveldb backend instead of the multi backend
> > for CS?
> >
> > On Fri, Nov 4, 2016 at 11:48 AM, Daniel Miller <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> I have a Riak CS cluster up and running, and am anticipating exponential
> >> growth in the number of key/value pairs over the next few years. From
> >> reading the documentation and experience, I've concluded that the
> >> default configuration of CS (with riak_cs_kv_multi_backend) keeps all
> >> keys in RAM. The OOM killer strikes when Riak uses too much RAM, which
> >> is not good for my sanity or sleep. Because of the amount of growth I
> >> am anticipating, it seems unlikely that I can allocate enough RAM to
> >> keep up with the load. Disk, on the other hand, is less constrained.
> >>
> >> A little background on the data set: I have a sparsely accessed key set.
> >> By that I mean after a key is written, the more time passes with that
> >> key not being accessed, the less likely it is to be accessed any time
> >> soon. At any given time, most keys will be dormant. However, any given
> >> key _could_ be accessed at any time, so it should be possible to
> >> retrieve it.
> >>
> >> I am currently running a smaller cluster (with smaller nodes: less RAM,
> >> smaller disks) than I expect to use eventually. I am starting to hit
> >> some growth-related issues that are prompting me to explore more options
> >> before it becomes a dire situation.
> >>
> >> My question: Are there ways to tune Riak (CS) to support this scenario
> >> gracefully? That is, are there ways to make Riak not load all keys into
> >> RAM? It looks like leveldb is just what I want, but I'm a little nervous
> >> switching over to only leveldb when the default/recommended config uses
> >> the multi backend.
> >>
> >> As a stop-gap measure, I enabled swap (with swappiness = 0), which I
> >> anticipated would kill performance, but was pleasantly surprised to see
> >> it return to effectively no-swap performance levels after a short period
> >> of lower performance. I'm guessing this is not a good long-term solution
> >> as my dataset grows. The problem with using large amounts of swap is
> >> that each time Riak starts it needs to read all keys into RAM. Long
> >> term, as our dataset grows, the amount of time needed to read keys into
> >> RAM will cause a very long restart time (and thus period of
> >> unavailability), which could endanger availability for a prolonged
> >> period if multiple nodes go down at once.
> >>
> >> Thanks!
> >> Daniel Miller
> >> Dimagi, Inc.
> >>
> >
> > _______________________________________________
> > riak-users mailing list
> > [email protected]
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
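
For a rough sense of scale on the RAM question discussed in this thread,
here is a back-of-the-envelope sketch. It is only an illustration, not a
measured result: the function names are made up for this example, and the
key size (50 bytes), per-key overhead (50 bytes) and replica count (3) are
placeholder assumptions to be replaced with real values. It also assumes
roughly one bitcask key per object (objects well under 1MB) and keys spread
evenly across nodes. Only the object count (30 million) and node count (9)
come from the thread.

```python
def keydir_ram_bytes(num_keys, avg_key_bytes, overhead_bytes_per_key, n_val):
    """Estimate cluster-wide RAM for an in-memory key index.

    Every replica of every key costs the key's own bytes plus a fixed
    per-key overhead. All inputs are caller-supplied assumptions.
    """
    return num_keys * n_val * (avg_key_bytes + overhead_bytes_per_key)


def per_node_gib(total_bytes, nodes):
    """Split the cluster-wide estimate evenly across nodes, in GiB."""
    return total_bytes / nodes / 2**30


if __name__ == "__main__":
    # 30 million objects and 9 nodes are from the thread; the rest are
    # placeholder assumptions, not measured values.
    total = keydir_ram_bytes(
        num_keys=30_000_000,
        avg_key_bytes=50,           # assumed average key size
        overhead_bytes_per_key=50,  # assumed in-memory overhead per key
        n_val=3,                    # assumed replica count
    )
    print(f"cluster-wide: {total / 2**30:.2f} GiB")
    print(f"per node:     {per_node_gib(total, nodes=9):.2f} GiB")
```

Because the estimate is linear in the number of keys, exponential growth in
key count means exponential growth in RAM for any backend that keeps all
keys in memory, which is the concern raised in the original question.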
