Computing ranges takes time

2016-05-31 Thread Cyril Scetbon
Hi C* developers,

While digging into the code to find out why a full repair takes so long on
our ~60-node cluster, I saw that this stage can account for a significant
share of the repair time (up to 60 percent):

https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997

It's simply caused by the fact that
https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189
calls `ss.getLocalRanges(keyspaceName)` every time, and that call accounts
for more than 99% of that time. It takes 600ms when there is no load on the
cluster and more when there is. So for 10k ranges (10,000 x 600ms = 6,000s),
you can imagine that it takes at least 1.5 hours just to compute ranges.
Don't you think that caching this call would make sense?
 
-- 
Cyril SCETBON



Re: Computing ranges takes time

2016-05-31 Thread Paulo Motta
Good catch! It definitely makes sense to cache this call for a single
repair job, as it calls ReplicationStrategy.getAddressRanges underneath,
which can get pretty inefficient:
https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L170
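
To illustrate the idea of caching the call for a single repair job, here is
a minimal sketch. It is not Cassandra code: `LocalRangesCacheSketch`,
`PerJobRangeCache` and the stand-in lookup are hypothetical names, and the
sketch assumes the local ranges do not change while one job is running.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration only; none of these types exist in Cassandra.
public class LocalRangesCacheSketch {

    // Holds the result of an expensive per-keyspace lookup for one repair job.
    // Assumption: a new instance is created per job, so stale entries never
    // outlive the job that created them.
    public static final class PerJobRangeCache<R> {
        private final Map<String, Collection<R>> cache = new ConcurrentHashMap<>();
        private final Function<String, Collection<R>> loader;

        public PerJobRangeCache(Function<String, Collection<R>> loader) {
            this.loader = loader;
        }

        // Runs the expensive loader only on the first request for a keyspace.
        public Collection<R> getLocalRanges(String keyspaceName) {
            return cache.computeIfAbsent(keyspaceName, loader);
        }
    }

    public static void main(String[] args) {
        AtomicInteger expensiveCalls = new AtomicInteger();

        // Stand-in for the slow range computation (about 600ms in the report).
        Function<String, Collection<String>> slowLookup = ks -> {
            expensiveCalls.incrementAndGet();
            return Arrays.asList("(0,100]", "(100,200]");
        };

        PerJobRangeCache<String> cache = new PerJobRangeCache<>(slowLookup);
        for (int i = 0; i < 10_000; i++) {
            cache.getLocalRanges("ks1");
        }
        System.out.println("expensive lookups: " + expensiveCalls.get());
    }
}
```

With a cache like this, the 10k requests in the example trigger the slow
lookup once per keyspace instead of once per range. Scoping the cache to one
job is a design choice that limits how long a stale result could be used if
the ring changes.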

Would you mind creating a ticket and submitting a patch?

Thanks!



Re: Computing ranges takes time

2016-05-31 Thread Cyril Scetbon
Ticket created at https://issues.apache.org/jira/browse/CASSANDRA-11933

Thanks