Re: Pluggable throttling of read and write queries

2017-02-22 Thread Abhishek Verma
We have lots of dedicated Cassandra clusters for large use cases, but we
have a long tail (~100) of internal customers who want to store < 200 GB
of non-critical data at < 5k qps. It does not make sense to create a
3-node dedicated cluster for each of these small use cases, so we have a
shared cluster onto which we onboard these users.

But once in a while, one of the customers will run an ingest job from HDFS
which pounds the shared cluster and breaks our SLA for all the other
customers. Currently, I don't see any way to signal backpressure to the
ingestion jobs or to throttle their requests. Another example is one
customer running a large number of range queries, which has the same
effect.

A simple way to avoid this is to throttle read and write requests based on
quota limits for each keyspace or user.
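As a rough illustration of the idea, a per-keyspace token bucket could gate requests before they hit the shared cluster. This is only a sketch: `QuotaThrottle`, `setQuota`, and `tryAcquire` are made-up names, not an existing Cassandra or driver API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical server-side quota enforcement: one token bucket per keyspace.
public class QuotaThrottle {
    private static final class Bucket {
        final double ratePerSec;   // sustained qps allowed
        final double capacity;     // burst allowance, in tokens
        double tokens;
        long lastRefillNanos;

        Bucket(double ratePerSec, double capacity) {
            this.ratePerSec = ratePerSec;
            this.capacity = capacity;
            this.tokens = capacity;
            this.lastRefillNanos = System.nanoTime();
        }

        synchronized boolean tryAcquire() {
            long now = System.nanoTime();
            // Refill proportionally to elapsed time, capped at burst capacity.
            tokens = Math.min(capacity, tokens + (now - lastRefillNanos) / 1e9 * ratePerSec);
            lastRefillNanos = now;
            if (tokens >= 1.0) {
                tokens -= 1.0;
                return true;
            }
            return false;
        }
    }

    private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

    public void setQuota(String keyspace, double qps, double burst) {
        buckets.put(keyspace, new Bucket(qps, burst));
    }

    // Returns false when the keyspace is over quota and the request should be
    // rejected or delayed instead of hitting the shared cluster.
    public boolean tryAcquire(String keyspace) {
        Bucket b = buckets.get(keyspace);
        return b == null || b.tryAcquire(); // no quota configured => allow
    }
}
```

Sustained load above the configured qps drains the bucket, and further requests get rejected or delayed — which is effectively the backpressure signal the ingest jobs are currently missing.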

Please see replies inlined:

On Mon, Feb 20, 2017 at 11:46 PM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Aren't you using mesos Cassandra framework to manage your multiple
> clusters ? (Seen a presentation in cass summit)
>
Yes, we are using https://github.com/mesosphere/dcos-cassandra-service and
contribute heavily to it. I am aware of the presentation
(https://www.youtube.com/watch?v=4Ap-1VT2ChU) at the Cassandra Summit, as I
was the one who gave it :)
This has helped us automate the creation and management of these clusters.

> What's wrong with your current mesos approach ?
>
Hardware efficiency: spinning up a dedicated cluster for each use case
wastes a lot of hardware resources. One approach we have taken is spinning
up multiple Cassandra nodes belonging to different clusters on the same
physical machine. However, we still have the overhead of managing these
separate clusters.

> I am also thinking it's better to split a large cluster into smaller ones,
> except if you also manage the client layer that queries Cassandra and can
> put some backpressure or rate limiting in it.
>
We have an internal storage API layer that some of the clients use, but
there are many customers who use the vanilla DataStax Java or Python
drivers. Implementing throttling in each of those clients does not seem
like a viable approach.

On Feb 21, 2017 at 2:46 AM, "Edward Capriolo" wrote:
>
>> Older versions had a request scheduler api.
>
I am not aware of the history behind it. Can you please point me to the
JIRA tickets and/or explain why it was removed?

On Monday, February 20, 2017, Ben Slater  wrote:
>>
>>> We’ve actually had several customers where we’ve done the opposite:
>>> split large clusters apart to separate use cases. We found that this
>>> allowed us to better align hardware with use-case requirements (for
>>> example, using AWS c3.2xlarge for very hot data at low latency and
>>> m4.xlarge for more general-purpose data), and we can also tune JVM
>>> settings, etc., to meet those use cases.
>>>
There have been several instances where we have moved customers out of the
shared cluster to their own dedicated clusters because they outgrew our
limitations. But I don't think it makes sense to move all the small use
cases into their own separate clusters.

On Mon, 20 Feb 2017 at 22:21 Oleksandr Shulgin 
>>> wrote:
>>>
 On Sat, Feb 18, 2017 at 3:12 AM, Abhishek Verma  wrote:

> Cassandra is being used on a large scale at Uber. We usually create
> dedicated clusters for each of our internal use cases, however that is
> difficult to scale and manage.
>
> We are investigating the approach of using a single shared cluster
> with 100s of nodes and handle 10s to 100s of different use cases for
> different products in the same cluster. We can define different keyspaces
> for each of them, but that does not help in case of noisy neighbors.
>
> Does anybody in the community have similar large shared clusters
> and/or face noisy neighbor issues?
>

 Hi,

 We've never tried this approach and given my limited experience I would
 find this a terrible idea from the perspective of maintenance (remember the
 old saying about basket and eggs?)

What if you have a limited number of baskets, and the eggs are non-critical
if a few of them occasionally break?


> What potential benefits do you see?

The main benefit of sharing a single cluster among several small use cases
is increased hardware efficiency and reduced management overhead from
running a large number of clusters.

Thanks everyone for your replies and questions.

-Abhishek.


RemoveNode Behavior Question

2017-02-22 Thread Anubhav Kale
Hello,

Recently, I started noticing an interesting pattern. When I execute
"removenode", a subset of the nodes that now own the tokens see a CPU
spike and disk activity, and sometimes the SSTable count on those nodes
shoots up.

After looking through the code, it appears to me that the function below
forces data to be streamed from some of the new replica nodes to the node
from which "removenode" was initiated. Is my understanding correct?

https://github.com/apache/cassandra/blob/d384e781d6f7c028dbe88cfe9dd3e966e72cd046/src/java/org/apache/cassandra/service/StorageService.java#L2548

Our nodes don't run very hot, but it appears this streaming causes them to
have issues. If I understand the code correctly, the node that initiated
removenode may still not get all the data for the moved ranges. So, what is
the rationale behind trying to build a "partial replica"?

Maybe I am not following this correctly, so I'm hoping someone can explain.

Thanks!



Re: RemoveNode Behavior Question

2017-02-22 Thread Brandon Williams
Every topology operation tries to respect/restore the RF except for
assassinate.

On Wed, Feb 22, 2017 at 12:45 PM, Anubhav Kale <
anubhav.k...@microsoft.com.invalid> wrote:

> Hello,
>
> Recently, I started noticing an interesting pattern. When I execute
> "removenode", a subset of the nodes that now own the tokens result it in a
> CPU spike / disk activity, and sometimes SSTables on those nodes shoot up.
>
> After looking through the code, it appears to me that below function
> forces data to be streamed from some of the new nodes to the node from
> where "removenode" is kicked in. Is my understanding correct ?
>
> https://github.com/apache/cassandra/blob/d384e781d6f7c028dbe88cfe9dd3e966e72cd046/src/java/org/apache/cassandra/service/StorageService.java#L2548
>
> Our nodes don't run very hot, but it appears this streaming causes them to
> have issues. If I understand the code correctly, the node that's initiated
> removenode may still not get all the data for moved over ranges. So, what
> is the rationale behind trying to build a "partial replica" ?
>
> Maybe, I am not following this correctly so hoping someone can explain.
>
> Thanks !
>
>


RE: RemoveNode Behavior Question

2017-02-22 Thread Anubhav Kale
But I don't understand how the replica count is getting restored here. The node 
that invoked removenode only owns partial ranges.

-Original Message-
From: Brandon Williams [mailto:dri...@gmail.com] 
Sent: Wednesday, February 22, 2017 10:49 AM
To: dev@cassandra.apache.org
Subject: Re: RemoveNode Behavior Question

Every topology operation tries to respect/restore the RF except for assassinate.

On Wed, Feb 22, 2017 at 12:45 PM, Anubhav Kale < 
anubhav.k...@microsoft.com.invalid> wrote:

> Hello,
>
> Recently, I started noticing an interesting pattern. When I execute 
> "removenode", a subset of the nodes that now own the tokens result it 
> in a CPU spike / disk activity, and sometimes SSTables on those nodes shoot 
> up.
>
> After looking through the code, it appears to me that below function 
> forces data to be streamed from some of the new nodes to the node from 
> where "removenode" is kicked in. Is my understanding correct ?
>
> https://github.com/apache/cassandra/blob/d384e781d6f7c028dbe88cfe9dd3e966e72cd046/src/java/org/apache/cassandra/service/StorageService.java#L2548
>
> Our nodes don't run very hot, but it appears this streaming causes 
> them to have issues. If I understand the code correctly, the node 
> that's initiated removenode may still not get all the data for moved 
> over ranges. So, what is the rationale behind trying to build a "partial 
> replica" ?
>
> Maybe, I am not following this correctly so hoping someone can explain.
>
> Thanks !
>
>


Re: RemoveNode Behavior Question

2017-02-22 Thread Brandon Williams
The node that invoked removenode is entirely irrelevant, any node can
invoke it.

On Wed, Feb 22, 2017 at 12:51 PM, Anubhav Kale <
anubhav.k...@microsoft.com.invalid> wrote:

> But I don't understand how the replica count is getting restored here. The
> node that invoked removenode only owns partial ranges.
>
> -Original Message-
> From: Brandon Williams [mailto:dri...@gmail.com]
> Sent: Wednesday, February 22, 2017 10:49 AM
> To: dev@cassandra.apache.org
> Subject: Re: RemoveNode Behavior Question
>
> Every topology operation tries to respect/restore the RF except for
> assassinate.
>
> On Wed, Feb 22, 2017 at 12:45 PM, Anubhav Kale <
> anubhav.k...@microsoft.com.invalid> wrote:
>
> > Hello,
> >
> > Recently, I started noticing an interesting pattern. When I execute
> > "removenode", a subset of the nodes that now own the tokens result it
> > in a CPU spike / disk activity, and sometimes SSTables on those nodes
> shoot up.
> >
> > After looking through the code, it appears to me that below function
> > forces data to be streamed from some of the new nodes to the node from
> > where "removenode" is kicked in. Is my understanding correct ?
> >
> > https://github.com/apache/cassandra/blob/d384e781d6f7c028dbe88cfe9dd3e966e72cd046/src/java/org/apache/cassandra/service/StorageService.java#L2548
> >
> > Our nodes don't run very hot, but it appears this streaming causes
> > them to have issues. If I understand the code correctly, the node
> > that's initiated removenode may still not get all the data for moved
> > over ranges. So, what is the rationale behind trying to build a "partial
> replica" ?
> >
> > Maybe, I am not following this correctly so hoping someone can explain.
> >
> > Thanks !
> >
> >
>


RE: RemoveNode Behavior Question

2017-02-22 Thread Anubhav Kale
Never mind. I figured it out: this is happening on all the nodes to which
tokens moved, which explains the heavy streaming going on in the cluster.
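For intuition, here is a toy token-ring model of SimpleStrategy-style placement (each range lives on the owning node plus the next RF-1 distinct nodes clockwise). All names are made up and this does not reproduce Cassandra's actual code; it only illustrates why removing one node makes several surviving nodes become new replicas and therefore stream data:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model: why "removenode" causes streaming on many surviving nodes.
public class RemoveNodeDemo {
    // Replicas for the range owned by 'token': the owner plus the next rf-1
    // distinct nodes clockwise. Assumes the ring has at least rf distinct nodes.
    static List<String> replicasFor(TreeMap<Integer, String> ring, int token, int rf) {
        List<String> result = new ArrayList<>();
        SortedMap<Integer, String> tail = ring.tailMap(token);
        Iterator<String> it = tail.values().iterator();
        while (result.size() < rf) {
            if (!it.hasNext())
                it = ring.values().iterator(); // wrap around the ring
            String node = it.next();
            if (!result.contains(node))
                result.add(node);
        }
        return result;
    }

    // Which surviving nodes become NEW replicas of some range once 'removed'
    // leaves the ring? Each of these must have data streamed to it.
    static Set<String> newReplicaHolders(TreeMap<Integer, String> ring, String removed, int rf) {
        TreeMap<Integer, String> after = new TreeMap<>(ring);
        after.values().remove(removed);
        Set<String> gainers = new TreeSet<>();
        for (int token : ring.keySet()) {
            List<String> before = replicasFor(ring, token, rf);
            List<String> now = replicasFor(after, token, rf);
            for (String n : now)
                if (!before.contains(n))
                    gainers.add(n);
        }
        return gainers;
    }
}
```

With a four-node ring A/B/C/D at tokens 0/25/50/75 and RF=3, removing C makes all three survivors (A, B, and D) pick up at least one new range, so each of them receives streamed data — matching the cluster-wide streaming observed above.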

-Original Message-
From: Brandon Williams [mailto:dri...@gmail.com] 
Sent: Wednesday, February 22, 2017 10:53 AM
To: dev@cassandra.apache.org
Subject: Re: RemoveNode Behavior Question

The node that invoked removenode is entirely irrelevant, any node can invoke it.

On Wed, Feb 22, 2017 at 12:51 PM, Anubhav Kale < 
anubhav.k...@microsoft.com.invalid> wrote:

> But I don't understand how the replica count is getting restored here. 
> The node that invoked removenode only owns partial ranges.
>
> -Original Message-
> From: Brandon Williams [mailto:dri...@gmail.com]
> Sent: Wednesday, February 22, 2017 10:49 AM
> To: dev@cassandra.apache.org
> Subject: Re: RemoveNode Behavior Question
>
> Every topology operation tries to respect/restore the RF except for 
> assassinate.
>
> On Wed, Feb 22, 2017 at 12:45 PM, Anubhav Kale < 
> anubhav.k...@microsoft.com.invalid> wrote:
>
> > Hello,
> >
> > Recently, I started noticing an interesting pattern. When I execute 
> > "removenode", a subset of the nodes that now own the tokens result 
> > it in a CPU spike / disk activity, and sometimes SSTables on those 
> > nodes
> shoot up.
> >
> > After looking through the code, it appears to me that below function 
> > forces data to be streamed from some of the new nodes to the node 
> > from where "removenode" is kicked in. Is my understanding correct ?
> >
> > https://github.com/apache/cassandra/blob/d384e781d6f7c028dbe88cfe9dd3e966e72cd046/src/java/org/apache/cassandra/service/StorageService.java#L2548
> >
> > Our nodes don't run very hot, but it appears this streaming causes 
> > them to have issues. If I understand the code correctly, the node 
> > that's initiated removenode may still not get all the data for moved 
> > over ranges. So, what is the rationale behind trying to build a 
> > "partial
> replica" ?
> >
> > Maybe, I am not following this correctly so hoping someone can explain.
> >
> > Thanks !
> >
> >
>


Truncate operation not available in Mutation Object

2017-02-22 Thread Sanal Vasudevan
Hi Folks,

I am trying to read Mutations from commit log files through an
implementation of CommitLogReadHandler interface.

For a truncate CQL operation, I do not see a Mutation object.

Does C* skip writing the truncate operation into the commit log file?


Thanks for your help.

Best regards,
Sanal


Re: Truncate operation not available in Mutation Object

2017-02-22 Thread Jeremy Hanna
Everything in that table is deleted. There's no mutation or anything in the
commit log; a truncate is a deletion of all the SSTables for that table. To
make sure everything is gone, it first does a flush, then a snapshot to
protect against mistakes, then the truncate itself.

> On Feb 22, 2017, at 6:05 PM, Sanal Vasudevan  wrote:
> 
> Hi Folks,
> 
> I am trying to read Mutations from commit log files through an
> implementation of CommitLogReadHandler interface.
> 
> For a truncate CQL operation, I do not see a Mutation object.
> 
> Does C* skip writing the truncate operation into the commit log file?
> 
> 
> Thanks for your help.
> 
> Best regards,
> Sanal


Re: Truncate operation not available in Mutation Object

2017-02-22 Thread Sanal Vasudevan
Thanks Jeremy.
Is there any way I could detect that such a truncate operation was
performed on the table? Does the truncate leave a trace anywhere?


Best regards,
Sanal

On Thu, Feb 23, 2017 at 11:47 AM, Jeremy Hanna 
wrote:

> Everything in that table is deleted. There's no mutation or anything in
> the commitlog. It's a deletion of all the sstables for that table. To make
> sure everything is gone, it first does a flush, then a snapshot to protect
> against a mistake, then the truncate itself.
>
> > On Feb 22, 2017, at 6:05 PM, Sanal Vasudevan 
> wrote:
> >
> > Hi Folks,
> >
> > I am trying to read Mutations from commit log files through an
> > implementation of CommitLogReadHandler interface.
> >
> > For a truncate CQL operation, I do not see a Mutation object.
> >
> > Does C* skip writing the truncate operation into the commit log file?
> >
> >
> > Thanks for your help.
> >
> > Best regards,
> > Sanal
>



-- 
Sanal Vasudevan Nair