date:20170513

Proposal - GroupCommitLogService

2017-05-13 Thread Yuji Ito

Hi dev,

I propose a new CommitLogService, GroupCommitLogService, to improve
throughput when many requests arrive concurrently.
In my measurements it improved throughput by up to 94%.
I'd like to discuss this CommitLogService.

Currently, we can choose between two CommitLog services: Periodic and Batch.
In Periodic, we might lose commit log entries that haven't been written to
the disk yet.
In Batch, the commit log is written to the disk on every write, but each
write is very small (< 4KB). Under high concurrency, these writes are
gathered and persisted to the disk at once. Under insufficient concurrency,
however, many small writes are issued and performance drops because of disk
latency. Even on an SSD, processing many I/O commands hurts performance.

GroupCommitLogService writes several commit log entries to the disk at once.
The patch adds GroupCommitLogService (it is enabled by setting
`commitlog_sync` and `commitlog_sync_group_window_in_ms` in cassandra.yaml).
The only difference from Batch is that writers wait on the semaphore.
While they wait, several commit log writes accumulate and are then executed
at the same time.
The trade-off is that GroupCommitLogService has worse latency when there is
no concurrency.
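
As a rough illustration of the difference described above (this is not the patch's code), the sketch below counts disk syncs under two simplified models. The class name, the method names, and both models are assumptions made for illustration: the "batch-like" model syncs as soon as a write is pending and the disk is idle, while the "group-like" model syncs only on window boundaries and ignores the sync duration. Arrival times must be sorted in ascending order.

```java
final class CommitLogSyncSketch {

    /**
     * Batch-like model (assumption): a sync starts as soon as a write is pending
     * and the disk is idle. Writes that arrive while a sync is in flight are
     * picked up together by the next sync.
     */
    static int countBatchSyncs(long[] arrivalsMicros, long syncMicros) {
        int syncs = 0;
        int next = 0;          // index of the oldest write not yet synced
        long diskFreeAt = 0;   // time at which the previous sync finishes
        while (next < arrivalsMicros.length) {
            long start = Math.max(arrivalsMicros[next], diskFreeAt);
            // Everything that has arrived by 'start' goes into this sync.
            while (next < arrivalsMicros.length && arrivalsMicros[next] <= start) {
                next++;
            }
            diskFreeAt = start + syncMicros;
            syncs++;
        }
        return syncs;
    }

    /**
     * Group-like model (assumption): the syncer runs only on window boundaries
     * and flushes every write that arrived before that boundary.
     */
    static int countGroupSyncs(long[] arrivalsMicros, long windowMicros) {
        int syncs = 0;
        int next = 0;
        while (next < arrivalsMicros.length) {
            // First window boundary strictly after the oldest pending write.
            long tick = (arrivalsMicros[next] / windowMicros + 1) * windowMicros;
            while (next < arrivalsMicros.length && arrivalsMicros[next] < tick) {
                next++;
            }
            syncs++;
        }
        return syncs;
    }

    public static void main(String[] args) {
        // Ten writes, one per millisecond, starting at 0.5 ms.
        long[] arrivals = new long[10];
        for (int i = 0; i < arrivals.length; i++) {
            arrivals[i] = 500 + 1000L * i;
        }
        System.out.println("batch-like, 0.5 ms sync: " + countBatchSyncs(arrivals, 500));
        System.out.println("group-like, 10 ms window: " + countGroupSyncs(arrivals, 10_000));
    }
}
```

In this model, a fast disk and low concurrency make the batch-like path issue one small sync per write, whereas the group-like path trades waiting time for fewer, larger syncs.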

I measured the performance with my microbenchmark (MicroRequestThread.java)
while increasing the number of threads. The cluster has 3 nodes (replication
factor: 3). Each node is an AWS EC2 m4.large instance with a 200 IOPS io1
volume.
The results are below. GroupCommitLogService with a 10 ms window improved
UPDATE with Paxos by up to 94% and SELECT with Paxos by up to 76%.

SELECT / sec
# of threads  Batch 2ms  Group 10ms
1             192        103
2             163        212
4             264        416
8             454        800
16            744        1311
32            1151       1481
64            1767       1844
128           2949       3011
256           4723       5000

UPDATE / sec
# of threads  Batch 2ms  Group 10ms
1             45         26
2             39         51
4             58         102
8             102        198
16            167        213
32            289        295
64            544        548
128           1046       1058
256           2020       2061
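
The headline percentages can be re-derived from the two tables above. The following is a small sketch (the class and method names are made up for illustration) that computes the relative gain of Group over Batch per thread count and finds where the gain peaks. The numbers are copied from the tables.

```java
final class ThroughputGain {
    static final int[] THREADS      = {1, 2, 4, 8, 16, 32, 64, 128, 256};
    static final int[] SELECT_BATCH = {192, 163, 264, 454, 744, 1151, 1767, 2949, 4723};
    static final int[] SELECT_GROUP = {103, 212, 416, 800, 1311, 1481, 1844, 3011, 5000};
    static final int[] UPDATE_BATCH = {45, 39, 58, 102, 167, 289, 544, 1046, 2020};
    static final int[] UPDATE_GROUP = {26, 51, 102, 198, 213, 295, 548, 1058, 2061};

    /** Relative change of Group over Batch, in percent (unrounded). */
    static double gain(int batch, int group) {
        return (group - batch) * 100.0 / batch;
    }

    /** The same change rounded to the nearest whole percent. */
    static long gainPercent(int batch, int group) {
        return Math.round(gain(batch, group));
    }

    /** Index of the row with the largest relative gain. */
    static int indexOfMaxGain(int[] batch, int[] group) {
        int best = 0;
        for (int i = 1; i < batch.length; i++) {
            if (gain(batch[i], group[i]) > gain(batch[best], group[best])) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int s = indexOfMaxGain(SELECT_BATCH, SELECT_GROUP);
        int u = indexOfMaxGain(UPDATE_BATCH, UPDATE_GROUP);
        System.out.println("SELECT peak gain at " + THREADS[s] + " threads: "
                + gainPercent(SELECT_BATCH[s], SELECT_GROUP[s]) + "%");
        System.out.println("UPDATE peak gain at " + THREADS[u] + " threads: "
                + gainPercent(UPDATE_BATCH[u], UPDATE_GROUP[u]) + "%");
    }
}
```

The same calculation also shows the single-thread regression: with one thread, Group is slower than Batch for both SELECT and UPDATE.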


Thanks,
Yuji

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Re: Proposal - GroupCommitLogService

2017-05-13 Thread J. D. Jordan

Sounds interesting. You should open a JIRA and attach your code for discussion 
of it.

https://issues.apache.org/jira/browse/CASSANDRA/

-Jeremiah


Re: Proposal - GroupCommitLogService

2017-05-13 Thread Yuji Ito

Thanks Jeremiah,

I've opened a ticket on JIRA.

https://issues.apache.org/jira/browse/CASSANDRA-13530

Best,
Yuji



Re: Integrating vendor-specific code and developing plugins

2017-05-13 Thread Jonathan Haddad

In accordance with the idea that the codebase should be better tested, it
seems to me like things shouldn't be added that aren't testable.  If
there's a million unit tests that are insanely comprehensive but for some
reason can never be run, they serve exactly the same value as no tests.

It may be better to figure out how to foster a plugin ecosystem, which is a
bit better than "there's an API go deal with it".  This is what Spark is
doing and it seems like a pretty reasonable approach to me:
https://spark-packages.org/

On Fri, May 12, 2017 at 9:03 PM Jeff Jirsa  wrote:

> I think the status quo is insufficient - even if it doesn't go in tree, we
> should do more than just say "the API exists, ship your own jar"
>
> What's the real risk of having it in tree? We break it because nobody can
> test it? How's that any worse than breaking it outside the tree? Finger
> pointing?
>
> --
> Jeff Jirsa
>
>
> > On May 12, 2017, at 12:25 PM, Jason Brown  wrote:
> >
> > I agree the plugins route is the safest and best. However, we already have
> > platform-specific code in-tree that is semi-unmaintained: the Windows
> > support. To Sylvain's point, I have little to no idea if I'm going to break
> > the Windows builds as I don't have access to a Windows machine, nor are we
> > as a community (as best as I can tell) actively running unit tests or
> > dtests on Windows.
> >
> > Further, we support snitches for clouds I suspect we don't typically
> > run/test on, as well: CloudstackSnitch, GoogleCloudSnitch.
> >
> > This being said, I don't think we should remove support for Windows or
> > those snitches. Instead, what I think would be more beneficial, and
> > certainly more reflecting the Apache Way, is to see if someone in the
> > community would be willing to maintain those components. Looking at another
> > Apache project, Mesos has an interesting breakdown of maintainers for
> > specific components [1]. We might consider adopting a similar idea for
> > platforms/OSes/architectures/whatevers.
> >
> > As for where to put the custom code, there's a few different options:
> >
> > - bare minimum: we should have docs pointing to all known third party
> >   implementations of pluggable interfaces
> > - slightly more involved: contrib/ section of third-party contributed plugins
> > - even more involved: in tree like gcp / aws snitches
> >
> > I'm not really thrilled on the contribs repo, and in-tree certainly has
> > drawbacks, as well. As I initially stated, it can be on a case-by-case
> > basis.
> >
> > Honestly, I don't want to push away contributors if they can add something
> > to the project - as long as it is maintainable.
> >
> > Thanks,
> >
> > -Jason
> >
> >
> > [1] https://mesos.apache.org/documentation/latest/committers/
> >
> > On Fri, May 12, 2017 at 4:35 AM, Sylvain Lebresne 
> > wrote:
> >
> >> On Fri, May 12, 2017 at 12:29 AM, Jason Brown 
> >> wrote:
> >>
> >>> Hey all,
> >>>
> >>> I'm on-board with what Rei is saying. I think we should be open to, and
> >>> encourage, other platforms/architectures for integration. Of course, it
> >>> will come down to specific maintainers/committers to do the testing and
> >>> verification on non-typical platforms. Hopefully those maintainers will
> >>> also contribute to other parts of the code base, as well, so I see this as
> >>> another way to bring more folks into the project.
> >>>
> >>
> >> Without going so far as to say we shouldn't merge any
> >> platform/architecture/vendor specific code ever and for no reason, I
> >> personally think we should avoid doing so as much as practical and
> >> encourage the "plugins" route instead. It's just much cleaner imo on
> >> principle and amounts to good software development hygiene.
> >>
> >> I don't want to be unable to commit some changes because they break the
> >> build due to code for X number of super specific
> >> platforms/architectures I don't have access to/don't know anything about
> >> while the maintainers are on vacation or haven't been reachable in a while.
> >> And what if such a maintainer does go away? Sure we can have some "process"
> >> to remove the code in such cases, but why add that burden on us? Plus
> >> we, the Apache Cassandra project, would still be seen as the ones that
> >> drop support for said platform/architecture even though we really have
> >> no choice if it's something we don't have access to anyway.
> >>
> >> And sure, I'm painting a bleak picture here, and we would probably have
> >> none of those problems in most cases. But if we do start encouraging the
> >> actual merge of such code, you can be sure we'll have some of those
> >> problems at some point.
> >>
> >> Encouraging plugins has imo pretty much all the benefits with none of
> >> the risks. In particular, I'm unconvinced that someone will be
> >> much more likely to meaningfully contribute to other parts of the code
> >> if their "plugin" is in-tree versus out of it.
> >>
> >> *But* I can certainly agree with the part about u

Re: Proposal - GroupCommitLogService

2017-05-13 Thread Jonathan Ellis

Can we replace Batch entirely with this, or are there situations where
Batch would outperform (in latency, for instance)?




-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced

Re: Proposal - GroupCommitLogService

2017-05-13 Thread Yuji Ito

Batch outperforms Group when there is no concurrency.
Because GroupCommit has to wait for the window time, both throughput and
latency are worse for a single request.
GroupCommit gathers the commit log writes that are requested within the
window time.
In fact, the throughput of a single thread was bounded by the window time.
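
One way to read the single-thread rows from the first post (103 SELECT/s and 26 UPDATE/s with a 10 ms window) is to convert each throughput into time per operation and express it in windows. The sketch below does only that arithmetic. The class and method names are hypothetical, and the idea that an operation's time is made up of whole windows is an assumption, not something measured in this thread.

```java
final class WindowBoundSketch {

    /** Average wall-clock time per operation, in milliseconds, for one sequential client. */
    static double millisPerOp(double opsPerSecond) {
        return 1000.0 / opsPerSecond;
    }

    /** The same time expressed as a multiple of the group window. */
    static double windowsPerOp(double opsPerSecond, double windowMillis) {
        return millisPerOp(opsPerSecond) / windowMillis;
    }

    public static void main(String[] args) {
        double windowMillis = 10.0; // Group window used in the benchmark
        System.out.printf("SELECT: %.2f windows per op%n", windowsPerOp(103, windowMillis));
        System.out.printf("UPDATE: %.2f windows per op%n", windowsPerOp(26, windowMillis));
    }
}
```

With these inputs a single-threaded SELECT takes roughly one window and an UPDATE a little under four, which fits the observation that the window dominates latency when there is no concurrency.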

Yuji


Re: Integrating vendor-specific code and developing plugins

2017-05-13 Thread Nate McCall

> It may be better to figure out how to foster a plugin ecosystem, which is a
> bit better than "there's an API go deal with it".  This is what Spark is
> doing and it seems like a pretty reasonable approach to me:
> https://spark-packages.org/
>

Thinking about this a bit, we have Mesos, Beam and Spark as
examples of other Apache projects managing "plugins" (maybe there is a
better word?).

Does anybody have other examples from the ASF ecosystem that come to mind?


Re: Proposal - GroupCommitLogService

2017-05-13 Thread Jonathan Ellis

Does that mean that Batch is not working as designed?  If there are other
pending writes, Batch should also group them together.  (Did you test with
giving Batch the same window size as Group?)



-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced

Re: Proposal - GroupCommitLogService

2017-05-13 Thread Yuji Ito

That's exactly the motivation of this proposal.

Batch can group only the writes that are still unpersisted at the moment a
sync runs; it does not keep writes waiting so that more can join them.
In Batch, a commit log write isn't kept waiting because the thread lock
(the semaphore in 2.2 and 3.0) for sync is released immediately.

So, the window size means 'the maximum length of time that queries may be
batched together for, not the minimum'.
The Batch window size has almost no effect on performance.
https://issues.apache.org/jira/browse/CASSANDRA-12864

I tested the throughput of SELECT with a Batch window of 10 ms.
As expected, the result was the same as with a Batch window of 2 ms.

SELECT / sec
# of threads  Batch 2ms  Batch 10ms
1             192        192
2             163        169
4             264        263
8             454        454
16            744        744
32            1151       1155
64            1767       1772
128           2949       2962
256           4723       4785
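
To put a number on "the same", the sketch below computes the largest relative difference between the two Batch columns above. The class and method names are made up for illustration, and the data is copied from the table.

```java
final class BatchWindowComparison {
    static final int[] BATCH_2MS  = {192, 163, 264, 454, 744, 1151, 1767, 2949, 4723};
    static final int[] BATCH_10MS = {192, 169, 263, 454, 744, 1155, 1772, 2962, 4785};

    /** Largest absolute relative difference between the two series, in percent. */
    static double maxRelativeDiffPercent(int[] base, int[] other) {
        double max = 0.0;
        for (int i = 0; i < base.length; i++) {
            double diff = Math.abs(other[i] - base[i]) * 100.0 / base[i];
            max = Math.max(max, diff);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.printf("largest difference: %.1f%%%n",
                maxRelativeDiffPercent(BATCH_2MS, BATCH_10MS));
    }
}
```

The largest gap stays below 4% across all thread counts, far smaller than the differences between Batch and Group in the first post.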

Yuji

