Sounds interesting. You should open a JIRA and attach your code so it can be discussed there.
https://issues.apache.org/jira/browse/CASSANDRA/

-Jeremiah

> On May 13, 2017, at 7:21 AM, Yuji Ito <y...@imagine-orb.com> wrote:
>
> Hi dev,
>
> I propose a new CommitLogService, GroupCommitLogService, to improve
> throughput when many requests are received.
> It improved throughput by up to 94%.
> I'd like to discuss this CommitLogService.
>
> Currently, we can select one of two CommitLog services: Periodic and Batch.
> In Periodic, we might lose commit log entries that haven't been written to
> the disk.
> In Batch, the commit log is written to the disk on every request, so each
> write is very small (< 4KB). Under high concurrency, these writes are
> gathered and persisted to the disk at once. But under insufficient
> concurrency, many small writes are issued and performance decreases due
> to the latency of the disk. Even on an SSD, processing many I/O commands
> decreases performance.
>
> GroupCommitLogService writes several commit log entries to the disk at once.
> The patch adds GroupCommitLogService (it is enabled by setting
> `commitlog_sync` and `commitlog_sync_group_window_in_ms` in cassandra.yaml).
> The only difference from Batch is waiting for the semaphore.
> By waiting for the semaphore, several commit log writes are executed at the
> same time.
> With GroupCommitLogService, latency becomes worse if there is no
> concurrency.
>
> I measured the performance with my microbench (MicroRequestThread.java)
> while increasing the number of threads. The cluster has 3 nodes (replication
> factor: 3). Each node is an AWS EC2 m4.large instance + 200 IOPS io1 volume.
> The results are below. GroupCommitLogService with a 10ms window improved
> update with Paxos by 94% and select with Paxos by 76%.
>
> ==== SELECT / sec ====
> # of threads    Batch 2ms    Group 10ms
>            1          192           103
>            2          163           212
>            4          264           416
>            8          454           800
>           16          744          1311
>           32         1151          1481
>           64         1767          1844
>          128         2949          3011
>          256         4723          5000
>
> ==== UPDATE / sec ====
> # of threads    Batch 2ms    Group 10ms
>            1           45            26
>            2           39            51
>            4           58           102
>            8          102           198
>           16          167           213
>           32          289           295
>           64          544           548
>          128         1046          1058
>          256         2020          2061
>
>
> Thanks,
> Yuji
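
For anyone following the thread without the patch in hand, the group-commit idea in the quoted proposal can be illustrated roughly as below. This is only a minimal sketch under stated assumptions, not the code from the patch and not Cassandra's CommitLog API: `GroupCommitSketch`, `append`, `syncCount`, and `bytesSynced` are hypothetical names, the disk sync is simulated by counters, and a `CompletableFuture` per writer stands in for the semaphore wait mentioned in the proposal. Writers queue a record and block; a single sync thread wakes once per window and covers everything queued so far with one sync.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of group commit; not the patch's implementation.
final class GroupCommitSketch implements AutoCloseable {
    // One queued record plus the future its writer blocks on.
    private static final class Pending {
        final byte[] record;
        final CompletableFuture<Void> synced = new CompletableFuture<>();
        Pending(byte[] record) { this.record = record; }
    }

    private final LinkedBlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    private final AtomicInteger syncCount = new AtomicInteger();
    private final AtomicLong bytesSynced = new AtomicLong();
    private final long windowMillis;
    private final Thread syncer;
    private volatile boolean running = true;

    GroupCommitSketch(long windowMillis) {
        this.windowMillis = windowMillis;
        this.syncer = new Thread(this::syncLoop, "group-commit-sync");
        this.syncer.setDaemon(true);
        this.syncer.start();
    }

    // Queue a record and block until the sync covering it has completed.
    // Must not be called after close(); the sketch does not guard against that.
    void append(byte[] record) {
        Pending p = new Pending(record);
        queue.add(p);
        p.synced.join();
    }

    private void syncLoop() {
        List<Pending> batch = new ArrayList<>();
        while (running || !queue.isEmpty()) {
            try {
                Thread.sleep(windowMillis); // the group window
            } catch (InterruptedException e) {
                // close() interrupts the sleep; fall through and drain what is left
            }
            queue.drainTo(batch);
            if (batch.isEmpty()) continue;
            // A real service would write the records and fsync once here.
            for (Pending p : batch) bytesSynced.addAndGet(p.record.length);
            syncCount.incrementAndGet();
            // Release every writer that was waiting on this sync.
            for (Pending p : batch) p.synced.complete(null);
            batch.clear();
        }
    }

    int syncCount() { return syncCount.get(); }

    long bytesSynced() { return bytesSynced.get(); }

    @Override
    public void close() throws InterruptedException {
        running = false;
        syncer.interrupt();
        syncer.join();
    }

    public static void main(String[] args) throws Exception {
        GroupCommitSketch log = new GroupCommitSketch(10);
        Thread[] writers = new Thread[16];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> log.append(new byte[] {1, 2, 3}));
            writers[i].start();
        }
        for (Thread t : writers) t.join();
        System.out.println("writes=16 syncs=" + log.syncCount());
        log.close();
    }
}
```

The sketch also shows the trade-off the proposal mentions: a lone writer always waits for the window to elapse before its sync, so latency is worse without concurrency, while concurrent writers share a single sync.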