Re: Evolving the client protocol

2018-04-19 Thread Ben Bromhead
Re #3:

Yup I was thinking each shard/port would appear as a discrete server to the
client.

If the per-port suggestion is unacceptable due to hardware requirements,
remembering that Cassandra is built with the concept of scaling *commodity*
hardware horizontally, you'll have to spend your time and energy convincing
the community to support a protocol feature it has no (current) use for, or
find another interim solution.

Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the protocol
change and shard concept you are looking for ;P


On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg  wrote:

> Hi,
>
> So at technical level I don't understand this yet.
>
> So you have a database consisting of single threaded shards and an accept
> socket that is generating TCP connections, and in advance you don't know
> which connection is going to send messages to which shard.
>
> What is the mechanism by which you get the packets for a given TCP
> connection delivered to a specific core? I know that a given TCP connection
> will normally have all of its packets delivered to the same queue from the
> NIC because the tuple of source address + port and destination address +
> port is typically hashed to pick one of the queues the NIC presents. I
> might have the contents of the tuple slightly wrong, but it always includes
> a component you don't get to control.
>
> Since it's hashing how do you manipulate which queue packets for a TCP
> connection go to and how is it made worse by having an accept socket per
> shard?
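
As a rough illustration of the mechanism Ariel describes, here is a minimal
Python sketch of how receive-side scaling (RSS) picks a queue from the
connection 4-tuple. The hash is a stand-in for the NIC's keyed Toeplitz hash,
and the names are illustrative, not any real kernel or driver API:

import hashlib

def rss_queue(src_ip, src_port, dst_ip, dst_port, num_queues):
    # NICs hash the connection 4-tuple to pick a receive queue; any
    # stable hash demonstrates the mapping.
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    h = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return h % num_queues

# The server controls its own address and port (e.g. 9042), but the
# client's ephemeral source port is outside its control, so the chosen
# queue is effectively random per connection.
print(rss_queue("10.0.0.5", 54321, "10.0.0.9", 9042, 8))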
>
> You also mention 160 ports as bad, but it doesn't sound like a big number
> resource-wise. Is it an operational headache?
>
> RE tokens distributed amongst shards. The way that would work right now is
> that each port number appears to be a discrete instance of the server. So
> you could have shards be actual shards that are simply colocated on the
> same box, run in the same process, and share resources. I know this pushes
> more of the complexity into the server vs the driver, as the server expects
> all shards to share some client-visible state, like system tables and
> certain identifiers.
>
> Ariel
> On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
> > Port-per-shard is likely the easiest option but it's too ugly to
> > contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
> > IIRC); it would be just horrible to have 160 open ports.
> >
> >
> > It also doesn't fit well with the NIC's ability to automatically
> > distribute packets among cores using multiple queues, so the kernel
> > would have to shuffle those packets around. Much better to have those
> > packets delivered directly to the core that will service them.
> >
> >
> > (also, some protocol changes are needed so the driver knows how tokens
> > are distributed among shards)
> >
> > On 2018-04-19 19:46, Ben Bromhead wrote:
> > > WRT to #3
> > > To fit in the existing protocol, could you have each shard listen on a
> > > different port? Drivers are likely going to support this due to
> > > https://issues.apache.org/jira/browse/CASSANDRA-7544 (
> > > https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
> > > familiar with the ticket so there might be something I'm missing, but it
> > > sounds like a potential approach.
> > >
> > > This would give you a path forward at least for the short term.
> > >
> > >
> > > On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg 
> wrote:
> > >
> > >> Hi,
> > >>
> > >> I think that updating the protocol spec to Cassandra puts the onus on
> the
> > >> party changing the protocol specification to have an implementation
> of the
> > >> spec in Cassandra as well as the Java and Python driver (those are
> both
> > >> used in the Cassandra repo). Until it's implemented in Cassandra we
> haven't
> > >> fully evaluated the specification change. There is no substitute for
> trying
> > >> to make it work.
> > >>
> > >> There are also realities to consider as to what the maintainers of the
> > >> drivers are willing to commit.
> > >>
> > >> RE #1,
> > >>
> > >> I am +1 on the fact that we shouldn't require an extra hop for range
> scans.
> > >>
> > >> In JIRA Jeremiah made the point that you can still do this

Re: Evolving the client protocol

2018-04-23 Thread Ben Bromhead
> >> This doesn't work without additional changes, for RF>1. The token ring
> could place two replicas of the same token range on the same physical
> server, even though those are two separate cores of the same server. You
> could add another element to the hierarchy (cluster -> datacenter -> rack
> -> node -> core/shard), but that generates unneeded range movements when a
> node is added.
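
To make the concern above concrete, a small Python sketch (my own
illustration, with RF=2 for brevity): if every shard joins the ring as an
independent node and replica placement simply walks the ring, adjacent
owners can be shards of the same physical host, so both replicas of a range
land on one box:

import random

hosts, shards_per_host = 4, 8
ring = sorted((random.random(), (h, s))
              for h in range(hosts) for s in range(shards_per_host))
owners = [member for _, member in ring]

# With RF=2, a range's replicas are its owner plus the next ring member;
# count ranges whose two replicas sit on the same physical host.
same_host = sum(1 for i in range(len(owners))
                if owners[i][0] == owners[(i + 1) % len(owners)][0])
print(f"{same_host} of {len(owners)} ranges have both replicas on one host")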
> > I have seen rack awareness used/abused to solve this.
> >
>
> But then you lose real rack awareness. It's fine for a quick hack, but
> not a long-term solution.
>
> (it also creates a lot more tokens, something nobody needs)
>

I'm having trouble understanding how you lose "real" rack awareness, as
these shards are in the same rack anyway, because the address and port are
on the same server in the same rack. So it behaves as expected. Could you
explain a situation where the shards on a single server would be in
different racks (or fault domains)?

If you wanted to support a situation where you have a single rack per DC
for simple deployments, extending NetworkTopologyStrategy to behave the way
it did before https://issues.apache.org/jira/browse/CASSANDRA-7544 with
respect to treating InetAddresses as servers rather than the address and
port would be simple. Both this implementation in Apache Cassandra and the
respective load balancing classes in the drivers are explicitly designed to
be pluggable so that would be an easier integration point for you.

I'm not sure how it creates more tokens? If a server normally owns 256
tokens, each shard on a different port would just advertise ownership of
256/# of cores (e.g. 4 tokens if you had 64 cores).
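
A minimal sketch of that arithmetic (the function is illustrative, not the
actual server or driver API): the node's existing token allocation is simply
partitioned across shards, so the cluster-wide token count stays the same:

def tokens_per_shard(node_tokens, num_shards):
    # Evenly partition a node's vnode tokens across its per-core shards,
    # each of which advertises its slice on its own port.
    per = len(node_tokens) // num_shards
    return {shard: node_tokens[shard * per:(shard + 1) * per]
            for shard in range(num_shards)}

shards = tokens_per_shard(list(range(256)), 64)
print(len(shards[0]))  # 4 tokens per shard, still 256 in total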


>
> > Regards,
> > Ariel
> >
> >> On Apr 22, 2018, at 8:26 AM, Avi Kivity  wrote:
> >>
> >>
> >>
> >>> On 2018-04-19 21:15, Ben Bromhead wrote:
> >>> Re #3:
> >>>
> >>> Yup I was thinking each shard/port would appear as a discrete server
> to the
> >>> client.
> >> This doesn't work without additional changes, for RF>1. The token ring
> could place two replicas of the same token range on the same physical
> server, even though those are two separate cores of the same server. You
> could add another element to the hierarchy (cluster -> datacenter -> rack
> -> node -> core/shard), but that generates unneeded range movements when a
> node is added.
> >>
> >>> If the per-port suggestion is unacceptable due to hardware
> requirements,
> >>> remembering that Cassandra is built with the concept of scaling
> *commodity*
> >>> hardware horizontally, you'll have to spend your time and energy
> convincing
> >>> the community to support a protocol feature it has no (current) use
> for or
> >>> find another interim solution.
> >> Those servers are commodity servers (not x86, but still commodity). In
> any case 60+ logical cores are common now (hello AWS i3.16xlarge or even
> i3.metal), and we can only expect logical core count to continue to
> increase (there are 48-core ARM processors now).
> >>
> >>> Another way, would be to build support and consensus around a clear
> >>> technical need in the Apache Cassandra project as it stands today.
> >>>
> >>> One way to build community support might be to contribute an Apache
> >>> licensed thread per core implementation in Java that matches the
> protocol
> >>> change and shard concept you are looking for ;P
> >> I doubt I'll survive the egregious top-posting that is going on in this
> list.
> >>
> >>>
> >>>> On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg 
> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> So at technical level I don't understand this yet.
> >>>>
> >>>> So you have a database consisting of single threaded shards and a
> socket
> >>>> for accept that is generating TCP connections and in advance you
> don't know
> >>>> which connection is going to send messages to which shard.
> >>>>
> >>>> What is the mechanism by which you get the packets for a given TCP
> >>>> connection delivered to a specific core? I know that a given TCP
> connection
> >>>> will normally have all of its packets delivered to the same queue
> from the
> >>>> NIC because the tuple of source address + port and destination
> address +
> >>>> port is typically hashed to pick one of the queues the NIC presents. I
> >>>> might hav

Re: Testing 4.0 Post-Freeze

2018-07-10 Thread Ben Bromhead
Well put Mick

+1

On Tue, Jul 10, 2018 at 1:06 PM Aleksey Yeshchenko 
wrote:

> +1 from me too.
>
> —
> AY
>
> On 10 July 2018 at 04:17:26, Mick Semb Wever (m...@apache.org) wrote:
>
>
> > We have done all this for previous releases and we know it has not worked
> > well. So how is giving it one more try going to help here? Can someone
> > outline what will change for 4.0 which will make it more successful?
>
>
> I (again) agree with you Sankalp :-)
>
> Why not try something new?
> It's easier to discuss these things more genuinely after trying it out.
>
> One of the differences in the branching approaches (feature-freeze on a 4.0
> branch vs. on trunk) is who it is that then has to merge and work with
> multiple branches.
>
> Where that small but additional effort is placed, I think, becomes a signal
> of what the community values most: new features or stability.
>
> I think most folks would vote for stability, so why not give this approach
> a go and learn from it.
> It also creates an incentive to make the feature-freeze period as short as
> possible, moving us towards an eventual goal of not needing to
> feature-freeze at all.
>
> regards,
> Mick
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: [VOTE] Branching Change for 4.0 Freeze

2018-07-13 Thread Ben Bromhead
+1 nb

On Fri, Jul 13, 2018, 09:40 Jordan West  wrote:

> +1 (non-binding)
>
> On Fri, Jul 13, 2018 at 5:02 AM, J. D. Jordan 
> wrote:
>
> > -0 (non-binding) as well for similar reasons to Gary.
> >
> > > On Jul 12, 2018, at 8:23 AM, Gary Dusbabek 
> wrote:
> > >
> > > -0
> > >
> > > I'm not interested in sparking a discussion on this, because a) it has
> > > already happened and b) it seems I am in a minority. But thought I
> should
> > > at least include the rationale for my vote:
> > > * This proposal goes against the "scratch an itch" philosophy of making
> > > contributions to an Apache project and IMO will discourage
> contributions
> > > that are casual or new.
> > > * It feels dictatorial. IMO the right way to do this would be for
> > > impassioned committers to -1 any patch that goes against elements a, b,
> > or
> > > c of what this vote is for.
> > >
> > > Gary.
> > >
> > >
> > > On Wed, Jul 11, 2018 at 4:46 PM sankalp kohli 
> > > wrote:
> > >
> > >> Hi,
> > >>As discussed in the thread[1], we are proposing that we will not
> > branch
> > >> on 1st September but will only allow the following merges into trunk.
> > >>
> > >> a. Bug and Perf fixes to 4.0.
> > >> b. Critical bugs in any version of C*.
> > >> c. Testing changes to help test 4.0
> > >>
> > >> If someone has a change which does not fall under these three, we can
> > >> always discuss it and have an exception.
> > >>
> > >> Vote will be open for 72 hours.
> > >>
> > >> Thanks,
> > >> Sankalp
> > >>
> > >> [1]
> > >>
> > >> https://lists.apache.org/thread.html/494c3ced9e83ceeb53fa127e44eec6
> > e2588a01b769896b25867fd59f@%3Cdev.cassandra.apache.org%3E
> > >>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


NGCC 2018?

2018-07-23 Thread Ben Bromhead
The year has gotten away from us a little bit, but now is as good a time as
any to put out a general call for interest in an NGCC this year.

Last year Gary and Eric did an awesome job organizing it in San Antonio.
This year it might be a good idea to do it in another city?

We at Instaclustr are happy to sponsor/organize/run it, but ultimately this
is a community event and we only want to do it if there is a strong desire
to attend from the community and it meets the wider needs.

Here are a few thoughts we have had in no particular order:

   - I was thinking it might be worth doing it in SF/Bay Area around the
   dates of distributed data day (14th of September) as I know a number of
   folks will be in town for it.
   - Typically NGCC has focused on being a single day, single track
   conference with scheduled sessions and an unconference set of ad-hoc talks
   at the end. It may make sense to change this up given the pending freeze
   (maybe make this more like a commit/review fest)? Or keep it in the same
   format but focus on the 4.0 work at hand.
   - Any community members who want to get involved again in the more
   organizational side of it (Gary, Eric)?
   - Any other sponsors (doesn't have to be monetary, can be space,
   resource etc) who want to get involved?

If folks are generally happy with the end approach we'll post details as
soon as possible (given it's July right now)!

Ben


-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: NGCC 2018?

2018-07-24 Thread Ben Bromhead
That was a factor in our thinking and my suggested timing/city, but as you
know such an event is more than just space in a conference room :)

On Tue, Jul 24, 2018 at 1:03 PM Patrick McFadin  wrote:

> Ben,
>
> Lynn Bender had offered a space the day before Distributed Data Summit in
> September (http://distributeddatasummit.com/) since we are both platinum
> sponsors. I thought he and Nate had talked about that being a good place
> for NGCC since many of us will be in town already.
>
> Nate, now that I've spoken for you, you can clarify, :D
>
> Patrick
>
>
> On Mon, Jul 23, 2018 at 2:25 PM Ben Bromhead  wrote:
>
> > The year has gotten away from us a little bit, but now is as good a time
> as
> > any to put out a general call for interest in an NGCC this year.
> >
> > Last year Gary and Eric did an awesome job organizing it in San Antonio.
> > This year it might be a good idea to do it in another city?
> >
> > We at Instaclustr are happy to sponsor/organize/run it, but ultimately
> this
> > is a community event and we only want to do it if there is a strong
> desire
> > to attend from the community and it meets the wider needs.
> >
> > Here are a few thoughts we have had in no particular order:
> >
> >- I was thinking it might be worth doing it in SF/Bay Area around the
> >dates of distributed data day (14th of September) as I know a number
> of
> >folks will be in town for it.
> >- Typically NGCC has focused on being a single day, single track
> >conference with scheduled sessions and an unconference set of ad-hoc
> > talks
> >at the end. It may make sense to change this up given the pending
> freeze
> >(maybe make this more like a commit/review fest)? Or keep it in the
> same
> >format but focus on the 4.0 work at hand.
> >- Any community members who want to get involved again in the more
> >organizational side of it (Gary, Eric)?
> >- Any other sponsors (doesn't have to be monetary, can be space,
> >resource etc) who want to get involved?
> >
> > If folks are generally happy with the end approach we'll post details as
> > soon as possible (given it's July right now)!
> >
> > Ben
> >
> >
> > --
> > Ben Bromhead
> > CTO | Instaclustr <https://www.instaclustr.com/>
> > +1 650 284 9692 <(650)%20284-9692>
> > Reliability at Scale
> > Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: NGCC 2018?

2018-07-26 Thread Ben Bromhead
It sounds like there may be an appetite for something, but the NGCC in its
current format is likely to not be that useful?

Is a bay area event focused on C* developers something that is interesting
for the broader dev community? In whatever format that may be?

On Tue, Jul 24, 2018 at 5:02 PM Nate McCall  wrote:

> This was discussed amongst the PMC recently. We did not come to a
> conclusion and there were not terribly strong feelings either way.
>
> I don't feel like we need to hustle to get "NGCC" in place,
> particularly given our decided focus on 4.0. However, that should not
> stop us from doing an additional 'c* developer' event in sept. to
> coincide with distributed data summit.
>
> On Wed, Jul 25, 2018 at 5:03 AM, Patrick McFadin 
> wrote:
> > Ben,
> >
> > Lynn Bender had offered a space the day before Distributed Data Summit in
> > September (http://distributeddatasummit.com/) since we are both platinum
> > sponsors. I thought he and Nate had talked about that being a good place
> > for NGCC since many of us will be in town already.
> >
> > Nate, now that I've spoken for you, you can clarify, :D
> >
> > Patrick
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: NGCC 2018?

2018-09-10 Thread Ben Bromhead
I like Jon's idea, plenty of food for thought in terms of a 4.0
retrospective + what's in the future.

On Wed, Sep 5, 2018 at 1:18 PM sankalp kohli  wrote:

> Another thing to discuss will be how to improve testing further from the
> learnings of finding bugs in C* 4.0.
>
> On Wed, Sep 5, 2018 at 9:57 AM Jason Brown  wrote:
>
> > +1 to Jon's sentiment. Further, perhaps we should use that time after
> > GA'ing 4.0 to poll our users what they need/want from the next major
> > release of the database.
> >
> > On Wed, Sep 5, 2018 at 9:31 AM, Jonathan Haddad 
> wrote:
> >
> > > I'm thinking a month or two after 4.0 would give us time to unwind
> after
> > > the release and start to give real thought to big changes coming in the
> > > next release.  Let's focus on one thing at a time.
> > >
> > > On Wed, Sep 5, 2018 at 12:29 PM sankalp kohli 
> > > wrote:
> > >
> > > > A good time for NGCC will be closer to 4.0 release where we can plan
> > what
> > > > we can put it on 4.0-next. I am not sure doing it now is going to
> help
> > > when
> > > > we are months away from 4.0 release.
> > > >
> > > > On Fri, Aug 31, 2018 at 7:42 AM Jay Zhuang 
> > > wrote:
> > > >
> > > > > Are we going to have a dev event next month? Or anything this year?
> > We
> > > > may
> > > > > also be able to provide space in bay area and help to organize it.
> > > > (Please
> > > > > let us know, so we could get final approval for that).
> > > > >
> > > > > On Fri, Jul 27, 2018 at 10:05 AM Jonathan Haddad <
> j...@jonhaddad.com>
> > > > > wrote:
> > > > >
> > > > > > My interpretation of Nate's statement was that since there would
> > be a
> > > > > bunch
> > > > > > of us at Lynn's event, we might as well do NGCC at the same time.
> > > > > >
> > > > > > On Thu, Jul 26, 2018 at 9:03 PM Ben Bromhead <
> b...@instaclustr.com>
> > > > > wrote:
> > > > > >
> > > > > > > It sounds like there may be an appetite for something, but the
> > NGCC
> > > > in
> > > > > > its
> > > > > > > current format is likely to not be that useful?
> > > > > > >
> > > > > > > Is a bay area event focused on C* developers something that is
> > > > > > interesting
> > > > > > > for the broader dev community? In whatever format that may be?
> > > > > > >
> > > > > > > On Tue, Jul 24, 2018 at 5:02 PM Nate McCall <
> zznat...@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > This was discussed amongst the PMC recently. We did not come
> > to a
> > > > > > > > conclusion and there were not terribly strong feelings either
> > > way.
> > > > > > > >
> > > > > > > > I don't feel like we need to hustle to get "NGCC" in place,
> > > > > > > > particularly given our decided focus on 4.0. However, that
> > should
> > > > not
> > > > > > > > stop us from doing an additional 'c* developer' event in
> sept.
> > to
> > > > > > > > coincide with distributed data summit.
> > > > > > > >
> > > > > > > > On Wed, Jul 25, 2018 at 5:03 AM, Patrick McFadin <
> > > > pmcfa...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > Ben,
> > > > > > > > >
> > > > > > > > > Lynn Bender had offered a space the day before Distributed
> > Data
> > > > > > Summit
> > > > > > > in
> > > > > > > > > September (http://distributeddatasummit.com/) since we are
> > > both
> > > > > > > platinum
> > > > > > > > > sponsors. I thought he and Nate had talked about that
> being a
> > > > good
> > > > > > > place
> > > > > > > > > for NGCC since many of us will be in town already.
> > > > > > > > >
> > > > > > > > > Nate, now that I've spoken for you, you can clarify, :D
> > > > > > > > >
> > > > > > > > > Patrick
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > -
> > > > > > > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > > > > > > For additional commands, e-mail:
> dev-h...@cassandra.apache.org
> > > > > > > >
> > > > > > > > --
> > > > > > > Ben Bromhead
> > > > > > > CTO | Instaclustr <https://www.instaclustr.com/>
> > > > > > > +1 650 284 9692 <(650)%20284-9692>
> > > > > > > Reliability at Scale
> > > > > > > Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and
> Softlayer
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Jon Haddad
> > > > > > http://www.rustyrazorblade.com
> > > > > > twitter: rustyrazorblade
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Jon Haddad
> > > http://www.rustyrazorblade.com
> > > twitter: rustyrazorblade
> > >
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: [VOTE] Accept GoCQL driver donation and begin incubation process

2018-09-12 Thread Ben Bromhead
...through the incubator per the process outlined here:
> > > >>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__incubator.apache.org_guides_ip-5Fclearance.html&d=DwIFAg&c=adz96Xi0w1RHqtPMowiL2g&r=CNZK3RiJDLqhsZDG6FQGnXn8WyPRCQhp4x_uBICNC0g&m=g-MlYFZVJ7j5Dj_ZfPfa0Ik8Nxco7QsJhTG1TnJH7xI&s=rk5T_t1HZY6PAhN5XgflBhfEtNrcZkVTIvQxixDlw9o&e=
> > > >>>>>>
> > > >>>>>> Pending the outcome of this vote, we will create the JIRA issues
> > for
> > > >>>>>> tracking and after we go through the process, and discuss adding
> > > >>>>>> committers in a separate thread (we need to do this atomically
> > > >> anyway
> > > >>>>>> per general ASF committer adding processes).
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>> -Nate
> > > >>>>>>
> > > >>>>>>
> > > >>
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> >
>
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-11 Thread Ben Bromhead
This is something that's bugged me for ages. TBH, the performance gain for
most use cases far outweighs the increase in memory usage, and I would even
be in favor of changing the default now and optimizing the storage cost
later (if it's found to be worth it).

For some anecdotal evidence:
4kb is usually what we end up setting it to; 16kb feels more reasonable
given the memory impact, but what would be the point if, practically, most
folks set it to 4kb anyway?

Note that chunk_length will largely be dependent on your read sizes, but 4k
is the floor for most physical devices in terms of their block size.

+1 for making this change in 4.0 given the small size of the change and the
large improvement to the new-user experience (as long as we are explicit in
the documentation about memory consumption).


On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg  wrote:

> Hi,
>
> This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241
>
> This ticket has languished for a while. IMO it's too late in 4.0 to
> implement a more memory efficient representation for compressed chunk
> offsets. However I don't think we should put out another release with the
> current 64k default as it's pretty unreasonable.
>
> I propose that we lower the value to 16kb. 4k might never be the correct
> default anyway, as there is a cost to compression and 16k will still be a
> large improvement.
>
> Benedict and Jon Haddad are both +1 on making this change for 4.0. In the
> past there has been some consensus about reducing this value although maybe
> with more memory efficiency.
>
> The napkin math for what this costs is:
> "If you have 1TB of uncompressed data, with 64k chunks that's 16M chunks
> at 8 bytes each (128MB).
> With 16k chunks, that's 512MB.
> With 4k chunks, it's 2G.
> Per terabyte of data (pre-compression)."
>
> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
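
For reference, the napkin math quoted above checks out; a quick Python
rendering of it (one 8-byte offset kept per compressed chunk):

TB = 1 << 40
for chunk_kb in (64, 16, 4):
    chunks = TB // (chunk_kb << 10)
    overhead_mib = chunks * 8 / (1 << 20)  # 8 bytes per chunk offset
    print(f"{chunk_kb:>2}k chunks: {chunks / 1e6:.1f}M chunks, "
          f"{overhead_mib:,.0f} MiB of offsets per TiB")
# 64k -> 128 MiB, 16k -> 512 MiB, 4k -> 2,048 MiB (2G)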
>
> By way of comparison, memory mapping the files has a similar cost of 8
> bytes per 4k page. Multiple mappings make this more expensive. With a
> default of 16kb this would be 4x less expensive than memory mapping a file.
> I only mention this to give a sense of the costs we are already paying. I
> am not saying they are directly related.
>
> I'll wait a week for discussion and if there is consensus make the change.
>
> Regards,
> Ariel
>
> -----
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: [VOTE] Change Jira Workflow

2018-12-17 Thread Ben Bromhead
+1 non-binding

On Mon, Dec 17, 2018 at 12:31 PM jay.zhu...@yahoo.com.INVALID
 wrote:

>  +1
>
> On Monday, December 17, 2018, 9:10:55 AM PST, Jason Brown <
> jasedbr...@gmail.com> wrote:
>
>  +1.
>
> On Mon, Dec 17, 2018 at 7:36 AM Michael Shuler 
> wrote:
>
> > +1
> >
> > --
> > Michael
> >
> > On 12/17/18 9:19 AM, Benedict Elliott Smith wrote:
> > > I propose these changes <
> >
> https://cwiki.apache.org/confluence/display/CASSANDRA/JIRA+Workflow+Proposals
> >*
> > to the Jira Workflow for the project.  The vote will be open for 72
> hours**.
> > >
> > > I am, of course, +1.
> > >
> > > * With the addendum of the mailing list discussion <
> >
> https://lists.apache.org/thread.html/e4668093169aa4ef52f2bea779333f04a0afde8640c9a79a8c86ee74@%3Cdev.cassandra.apache.org%3E
> >;
> > in case of any conflict arising from a mistake on my part in the wiki,
> the
> > consensus reached by polling the mailing list will take precedence.
> > > ** I won’t be around to close the vote, as I will be on vacation.
> > Everyone is welcome to ignore the result until I get back in a couple of
> > weeks, or if anybody is eager feel free to close the vote and take some
> > steps towards implementation.
> > >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >



-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692


Re: [VOTE] remove the old wiki

2019-06-04 Thread Ben Bromhead
+1 to removing once appropriate content is moved over

On Tue, Jun 4, 2019 at 5:08 PM Valerie Parham-Thompson <
vale...@tortugatech.com> wrote:

> +1 for cleaning up.
>
> I like the format of these write path and read path images, if they can be
> updated/added to the new docs:
> https://wiki.apache.org/cassandra/WritePathForUsers <
> https://wiki.apache.org/cassandra/WritePathForUsers>
> https://wiki.apache.org/cassandra/ReadPathForUsers <
> https://wiki.apache.org/cassandra/ReadPathForUsers>
>
> Valerie
>
> > On Jun 4, 2019, at 2:46 PM, Jon Haddad  wrote:
> >
> > I assume everyone here knows the old wiki hasn't been maintained, and is
> > years out of date.  I propose we sunset it completely and delete it
> forever
> > from the world.
> >
> > I'm happy to file the INFRA ticket to delete it, I'd just like to give
> > everyone the opportunity to speak up in case there's something I'm not
> > aware of.
> >
> > In favor of removing the wiki?  That's a +1.
> > -1 if you think we're better off migrating the entire thing to cwiki.
> >
> > If you only need couple pages, feel free to move the content to the
> > documentation.  I'm sure we can also export the wiki in its entirety and
> > put it somewhere offline, if there's a concern about maybe needing some
> of
> > the content at some point in the future.
> >
> > I think 72 hours is enough time to leave a vote open on this topic.
> >
> > Jon
>
>

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692


Re: Cassandra image for Kubernetes

2019-09-20 Thread Ben Bromhead
e tree.
> >>>
> >>> There are, however, some caveats:
> >>>
> >>
> https://issues.apache.org/jira/browse/LEGAL-270?focusedCommentId=15524446&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15524446
> >>>
> >>> As long as we could adhere to those guidelines (which I don't see as
> >> hard)
> >>> we could do this.
> >>>
> >>> In looking through your specific image (thanks for posting this, btw),
> I
> >>> would personally prefer something with a lot fewer dependencies
> >> (basically
> >>> from just our source tree) and a lot more replacement properties
> >> available
> >>> for config files.
> >>>
> >>> Curious what other folks think?
> >>>
> >>> Cheers,
> >>> -Nate
> >>>
> >>>> On Wed, Sep 18, 2019 at 6:43 AM Cyril Scetbon 
> >> wrote:
> >>>>
> >>>> Hey guys,
> >>>>
> >>>> I heard that at the last summit there were discussions about providing
> >> an
> >>>> official docker image to run Cassandra on Kubernetes. Is it something
> >> that
> >>>> you’ve started to work on ? We have our own at
> >>>> https://github.com/Orange-OpenSource/cassandra-image <
> >>>> https://github.com/Orange-OpenSource/cassandra-image> but I think
> >>>> providing an official image makes sense. As long as we can easily do
> >>>> everything we do today. We could also collaborate.
> >>>>
> >>>> Thank you
> >>>> —
> >>>> Cyril Scetbon
> >>>>
> >>>>
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692


Re: [VOTE] Cassandra Enhancement Proposal (CEP) documentation

2019-11-04 Thread Ben Bromhead
+1

On Mon, Nov 4, 2019 at 10:56 AM Vinay Chella 
wrote:

> +1
>
> -Vinay Chella
>
> On Sat, Nov 2, 2019 at 5:09 PM Jeff Carpenter 
> wrote:
>
> > FYI the audio of the session with Ben Bromhead / Scott Andreas is
> > available:
> >
> >
> https://feathercast.apache.org/2019/09/12/apache-cassandra-community-health-ben-bromhead-scott-andreas/
> > <
> >
> https://feathercast.apache.org/2019/09/12/apache-cassandra-community-health-ben-bromhead-scott-andreas/
> > >
> >
> > Jeff
> >
> > > On Nov 1, 2019, at 11:19 AM, Scott Andreas 
> wrote:
> > >
> > > Hi Michael,
> > >
> > > Unfortunately not; ASF recorded the keynotes but video/streaming
> > facilities weren't available for the individual tracks.
> > >
> > > A copy of the slides are uploaded here:
> > >
> >
> https://github.com/ngcc/ngcc2019/blob/master/Committing%20to%20our%20Users%20-%20Product%20and%20Release%20Management%20in%20Apache%20Cassandra.pdf
> > >
> > > I also have a version with messy presenters' notes. I'll clean these up
> > and put them on Confluence. The outline and notes cover all of the
> > presentation's content in detail so the delta between "being there" and
> > "reading the notes" should be minimal.
> > >
> > > 
> > > From: Michael Shuler  on behalf of Michael
> > Shuler 
> > > Sent: Friday, November 1, 2019 9:09 AM
> > > To: dev@cassandra.apache.org
> > > Subject: Re: [VOTE] Cassandra Enhancement Proposal (CEP) documentation
> > >
> > > +1
> > >
> > > Scott, was your NGCC talk videoed and uploaded anywhere? I would love
> to
> > > watch, since I missed the event.
> > >
> > > Kind regards,
> > > Michael
> > >
> > > On 11/1/19 9:07 AM, Scott Andreas wrote:
> > >> +1 nb
> > >>
> > >>> On Nov 1, 2019, at 5:36 AM, Benedict Elliott Smith <
> > bened...@apache.org> wrote:
> > >>>
> > >>> +1
> > >>>
> > >>> On 01/11/2019, 12:33, "Mick Semb Wever"  wrote:
> > >>>
> > >>>Please vote on accepting the Cassandra Enhancement Proposal (CEP)
> > document as a starting point and guide towards improving collaboration
> on,
> > and success of, new features and significant improvements. In combination
> > with the recently accepted Cassandra Release Lifecycle documentation this
> > should help us moving forward as a project and community.
> > >>>
> > >>>
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652201
> > >>>
> > >>>Past discussions on the document/process have been touched on in a
> > number of threads in the dev ML.  The most recent thread was
> >
> https://lists.apache.org/thread.html/b5d1b1ca99324f84e4a40b9cba879e8f858f5f6e18447775fcf32155@%3Cdev.cassandra.apache.org%3E
> > >>>
> > >>>regards,
> > >>>Mick
> > >>>
> > >>>
> > -
> > >>>To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >>>For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> -----
> > >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >>>
> > >>
> > >> -
> > >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >>
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> >
> > --
>
> Thanks,
> Vinay Chella
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-29 Thread Ben Bromhead
+1 to reducing the number of tokens as low as possible for availability
issues. 4 lgtm

On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  wrote:

> Thanks for restarting this discussion Jeremy. I personally think 4 is a
> good number as a default. I think whatever we pick, we should have enough
> documentation for operators to make sense of the new defaults in 4.0.
>
> Dinesh
>
> > On Jan 28, 2020, at 9:25 PM, Jeremy Hanna 
> wrote:
> >
> > I wanted to start a discussion about the default for num_tokens that
> we'd like for people starting in Cassandra 4.0.  This is for ticket
> CASSANDRA-13701 <https://issues.apache.org/jira/browse/CASSANDRA-13701>
> (which has been duplicated a number of times, most recently by me).
> >
> > TLDR, based on availability concerns, skew concerns, operational
> concerns, and based on the fact that the new allocation algorithm can be
> configured fairly simply now, this is a proposal to go with 4 as the new
> default and the allocate_tokens_for_local_replication_factor set to 3.
> That gives a good experience out of the box for people and is the most
> conservative.  It does assume that racks and DCs have been configured
> correctly.  We would, of course, go into some detail in the NEWS.txt.
> >
> > Joey Lynch and Josh Snyder did an extensive analysis of availability
> concerns with high num_tokens/virtual nodes in their paper <
> http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E>.
> This worsens as clusters grow larger.  I won't quote the paper here but in
> order to have a conservative default and with the accompanying new
> allocation algorithm, I think it makes sense as a default.
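
As a back-of-the-envelope illustration of the availability argument (a rough
Monte Carlo sketch of my own, not the model from the paper): with naive
random token placement, the chance that losing two random nodes removes
quorum for at least one range grows quickly with num_tokens:

import random

def quorum_loss_probability(nodes=12, vnodes=4, rf=3, trials=500):
    losses = 0
    for _ in range(trials):
        ring = sorted((random.random(), n)
                      for n in range(nodes) for _ in range(vnodes))
        owners = [n for _, n in ring]
        down = set(random.sample(range(nodes), 2))
        for i in range(len(owners)):
            # Replicas of the range ending here: next rf distinct nodes.
            reps, j = [], i
            while len(reps) < rf:
                if owners[j % len(owners)] not in reps:
                    reps.append(owners[j % len(owners)])
                j += 1
            if len(down & set(reps)) >= 2:  # quorum lost for this range
                losses += 1
                break
    return losses / trials

for v in (4, 16, 256):
    print(v, quorum_loss_probability(vnodes=v))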
> >
> > The difficulties have always been that virtual nodes have been
> beneficial for operations, but that 256 is too high for the purposes of
> repair and, as Joey and Josh cover, for availability.  Going lower with the
> original allocation algorithm produced skewed allocations due to its naive
> distribution.  Enter CASSANDRA-7032 <
> https://issues.apache.org/jira/browse/CASSANDRA-7032> and the new token
> allocation algorithm.  CASSANDRA-15260 <
> https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new
> algorithm operationally simpler.
> >
> > One other item of note - since Joey and Josh's analysis, there have been
> improvements in streaming and other considerations that can reduce the
> probability of more than one node representing some token range being
> unavailable, but it would still be good to be conservative.
> >
> > Please chime in with any concerns with having num_tokens=4 and
> allocate_tokens_for_local_replication_factor=3 and the accompanying
> rationale so we can improve the experience for all users.
> >
> > Other resources:
> >
> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
> >
> https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
> >
> https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692


Kubernetes operator unification

2020-03-30 Thread Ben Bromhead
Hi All

With the announcement of a C* Sidecar and K8s operator from Datastax
(congrats btw), Jake and Stefan discussed moving to a more
standardised/unified implementation of an Apache Cassandra operator for
Kubernetes. Based on discussions with other folks either using our
operator, building/running their own, or just getting started, there appears
to be some broader enthusiasm for a more unified approach beyond just that
thread.

The current state of play for folks looking to run Apache Cassandra,
particularly on Kubernetes, is fairly fragmented. There are multiple
projects all doing similar things from large companies operating C* at
scale on kubernetes, individual contributors and commercialising entities.
Each one of these projects also has similar but diverse implementations and
capabilities. From an end-user perspective, it is very hard to figure out
what path to take, and as someone who supports these end users, I'd much
rather support one implementation than three, even if it's not the one we
wrote :)

To that end, I'd like to indicate that we (Instaclustr) are open to working
towards a project-owned, standardized K8s operator/sidecar/etc. How that
looks and how it gets implemented will surely be the subject of debate,
especially amongst those with existing implementations.

Before engaging in CEP process (
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652201)
it might be useful to informally discuss an approach to unifying
implementations.

To that end I'd like to circulate the following ideas to kick off the
discussion of an approach that might be useful:

We should look to start off with a new implementation in a separate repo
(as the sidecar project has done), leveraging the experience and
contributions from existing operator implementations and a framework like
the operator-framework, with the initial scope of just supporting our
distributed testing in our CI pipeline.

Targeting our own distributed test cases (e.g. dtests) brings a number of
benefits:

   - Defines a common environment and goals that minimize each
   organisation's unique Kubernetes challenges.
   - Follows the spirit of the 4.0 release to be more dba/operator aligned,
   more production ready and easier to get right in a production setting OOB
   - Our test environment over time will look more and more like how users
   run Cassandra in production. This will be the biggest win IMHO.
   - The distributed tests will also serve as functional tests for the
   operator itself.

The main drawback I can see with this approach is that it will potentially
be a longer path to getting a usable project-based operator out the door. It
will also involve a ton of reworking dtests, which for some is going to be a
hard blocker. From there we can start to expand and support more and more
real life use cases. Hopefully this is not a huge leap as our testing
should be covering most of those cases!

This is largely my personal gut feel on the approach and I'm looking
forward to folks other suggestions!

Cheers

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692


Re: [DISCUSS] Next steps for Kubernetes operator SIG

2020-09-22 Thread Ben Bromhead
For what it's worth, a quick update from me:

CassKop now has at least two organisations working on it substantially
(Orange and Instaclustr) as well as the numerous other contributors.

Internally, we will also start pointing others towards CassKop once a few
things get merged. While we are not sunsetting our operator yet, it is
certainly looking that way.

I'd love to see the community adopt it as a starting point for working
towards whatever level of functionality is desired.

Cheers

Ben



On Fri, Sep 11, 2020 at 2:37 PM John Sanda  wrote:

> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie 
> wrote:
>
> > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or
> more
> > operators in the ecosystem. Has one of them hit a clear supermajority of
> > adoption that makes it the de facto default and makes sense to pull it
> into
> > the project?
> >
> > We as a project community were pretty slow to move on building a PoV
> around
> > kubernetes so we find ourselves in a situation with a bunch of contenders
> > for inclusion in the project. It's not clear to me what heuristics we'd
> use
> > to gauge which one would be the best fit for inclusion outside letting
> > community adoption speak.
> >
> > ---
> > Josh McKenzie
> >
> >
> >
> We actually talked a good bit on the SIG call earlier today about
> heuristics. We need to document what functionality an operator should
> include at level 0, level 1, etc. We did discuss this a good bit during
> some of the initial SIG meetings, but I guess it wasn't really a focal
> point at the time. I think we should also provide references to existing
> operator projects and possibly other related projects. This would benefit
> both community users as well as people working on these projects.
>
> - John
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692


Re: [DISCUSS] Next steps for Kubernetes operator SIG

2020-09-22 Thread Ben Bromhead
I think there is certainly an appetite to donate and standardise on a given
operator (as mentioned in this thread).

I personally found the SIG hard to participate in due to time zones and the
synchronous nature of it.

So while it was a great forum to dive into certain details for a subset of
participants and a worthwhile endeavour, I wouldn't paint it as an accurate
reflection of community intent.

I don't think that any participants want to continue down the path of  "let
a thousand flowers bloom". That's why we are looking towards CasKop (as
well as a number of technical reasons).

Some of the recorded meetings and outputs can also be found if you are
interested in some primary sources
https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
.

> From what I understand second-hand from talking to people on the SIG calls,
> there was a general inability to agree on an existing operator as a
> starting point and not much engagement on taking best of breed from the
> various to combine them. Seems to leave us in the "let a thousand flowers
> bloom" stage of letting operators grow in the ecosystem and seeing which
> ones meet the needs of end users before talking about adopting one into the
> foundation.
>
> Great to hear that you folks are joining forces though! Bodes well for C*
> users that are wanting to run things on k8s.
>
>
>
> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead  wrote:
>
> > For what it's worth, a quick update from me:
> >
> > CassKop now has at least two organisations working on it substantially
> > (Orange and Instaclustr) as well as the numerous other contributors.
> >
> > Internally we will also start pointing others towards CasKop once a few
> > things get merged. While we are not yet sunsetting our operator yet, it
> is
> > certainly looking that way.
> >
> > I'd love to see the community adopt it as a starting point for working
> > towards whatever level of functionality is desired.
> >
> > Cheers
> >
> > Ben
> >
> > On Fri, Sep 11, 2020 at 2:37 PM John Sanda  wrote:
> >
> > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie 
> > wrote:
> >
> > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or
> >
> > more
> >
> > operators in the ecosystem. Has one of them hit a clear supermajority of
> > adoption that makes it the de facto default and makes sense to pull it
> >
> > into
> >
> > the project?
> >
> > We as a project community were pretty slow to move on building a PoV
> >
> > around
> >
> > kubernetes so we find ourselves in a situation with a bunch of contenders
> > for inclusion in the project. It's not clear to me what heuristics we'd
> >
> > use
> >
> > to gauge which one would be the best fit for inclusion outside letting
> > community adoption speak.
> >
> > ---
> > Josh McKenzie
> >
> > We actually talked a good bit on the SIG call earlier today about
> > heuristics. We need to document what functionality an operator should
> > include at level 0, level 1, etc. We did discuss this a good bit during
> > some of the initial SIG meetings, but I guess it wasn't really a focal
> > point at the time. I think we should also provide references to existing
> > operator projects and possibly other related projects. This would benefit
> > both community users as well as people working on these projects.
> >
> > - John
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692


Re: [DISCUSS] Next steps for Kubernetes operator SIG

2020-09-27 Thread Ben Bromhead
that's not "The Apache
> > > Way"
> > >
> > > If there's a consensus (or even strong majority) amongst invested
> parties,
> > > I don't see why we could not adopt an operator directly into the
> project.
> > >
> > > It's possible a green field approach might lead to fewer hard
> feelings, as
> > > everyone is in the same boat. Perhaps all operators are also suboptimal
> > > and
> > > could be improved with a rewrite? But I think coordinating a lot of
> > > different entities around an empty codebase is particularly
> challenging. I
> > > actually think it could be better for cohesion and collaboration to
> have a
> > > suboptimal but substantive starting point.
> > >
> > > On 23/09/2020, 16:11, "Stefan Miklosovic" < stefan.miklosovic@
> > > instaclustr.com<mailto:stefan.mikloso...@instaclustr.com>> wrote:
> > >
> > > I think that from Instaclustr it was stated quite clearly multiple
> > > times that we are "fine to throw it away" if there is something better
> > > and more wide-spread. Indeed, we have invested a lot of time in the
> > > operator, but it was not useless at all; we gained a lot of quite unique
> > > knowledge of how to put all the pieces together. However, I think that
> > > this space is going to be quite fragmented and "balkanized", which is
> > > not always a bad thing, but in a quite narrow area as Kubernetes
> operator
> > > is, I just do not see how 4 operators are going to be beneficial for
> > > ordinary people ("official" from community, ours, Datastax one and
> CassKop
> > > (without any significant order)). Sure, innovation and healthy
> competition
> > > is important but to what extent ...
> > > One can start a Cassandra cluster on Kubernetes in only so many
> > > different ways, and nobody really likes vendor lock-in. People wanting
> > > to run a cluster on K8S realise that there are three operators, each
> > > backed by a private business entity, and the community operator is not
> > > there ... Huh, interesting ... One may even start to question what is
> > > wrong with these folks that it takes three companies to build their
> > > own solution.
> > >
> > > Having said that, to my perception, Cassandra community just does not
> > > have enough engineers nor contributors to keep 4 operators alive at
> > > the same time (I wish I was wrong) so the idea of selecting the best
> > > one or to merge obvious things and approaches together is
> understandable,
> > > even if it meant we eventually sunset ours. In addition, nobody from
> big
> > > players is going to contribute to the code
> > > base of the other one, for obvious reasons, so channeling and directing
> > > this effort into something common for a community seems to
> > > be the only reasonable way of cooperation.
> > >
> > > It is quite hard to bootstrap this if the donation of the code in big
> > > chunks / whole repo is out of question as it is not the "Apache way"
> > > (there was some thread running here about this in more depth a while
> > > ago) and we basically need to start from scratch which is quite
> > > demotivating, we are just inventing the wheel and nobody is up to it.
> > > It is like people are waiting for that to happen so they can jump in
> > > "once it is the thing" but it will never materialise or at least the
> > > hurdle to kick it off is unnecessarily high. Nobody is going to invest
> > > in this heavily if there is already a working operator from companies
> > > mentioned above. As I understood it, one reason of not choosing the
> > > way of donating it all is that "the learning and community building
> > > should happen in organic manner and we just can not accept the
> donation",
> > > but is not it true that it is easier to build a community
> > > around something which is already there rather than trying to build it
> > > around an idea which is quite hard to dedicate to?
> > >
> > > On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie < jmcken...@apache.org
> > > <mailto:jmcken...@apache.org>> wrote:
> > >
> > > I think there's significant value to the community in trying to
> > > coalesce
> > > on a single approach,
> > > I agree. Unfortunately in this case, the parties with a vested interest
> > > and
> > > written

Re: [DISCUSS] Next steps for Kubernetes operator SIG

2020-10-05 Thread Ben Bromhead
> > ...this requires support at a networking layer for pod-to-pod IP
> > connectivity. This may be accomplished within the cluster with CNIs like
> > Cilium or externally via traditional networking tools.
> >
> > Differentiators
> >
> > - OSS Ecosystem / Components
> >   - Cass Config Builder - OSS project extracted from DataStax OpsCenter
> >     Life Cycle Manager to provide automated configuration file rendering
> >   - Cass Config Definitions - definitions files for cass-config-builder;
> >     defines all configuration files, their parameters, and templates
> >   - Management API for Apache Cassandra (MAAC)
> >   - Metrics Collector for Apache Cassandra (MCAC)
> >   - Reference Prometheus Operator CRDs
> >     - ServiceMonitor
> >     - Instance
> >   - Reference Grafana Operator CRDs
> >     - Instance
> >     - Dashboards
> >     - Datasource
> > - PodTemplateSpec
> >   - Customization of existing pods, including support for adding
> >     containers, volumes, etc.
> > - Advanced Networking
> >   - Node Port
> >   - Host Network
> > - Simple security
> >   - Management API mTLS support
> >   - Automated generation of keystore and truststore for internode and
> >     client-to-node TLS
> >   - Automated superuser account configuration
> >     - The default superuser (cassandra/cassandra) is disabled and never
> >       available to clients
> >     - Cluster administration account may be automatically generated (or
> >       provided) with values stored in a k8s secret
> >   - Automatic application of NetworkTopologyStrategy with appropriate RF
> >     for system keyspaces
> > - Validating webhook
> >   - Invalid changes are rejected with a helpful message
> > - Rolling cluster updates
> >   - Change in binary (C* upgrade)
> >   - Change in configuration
> >   - Canary deployments - single-rack application of changes for
> >     validation before broader deployment
> >   - Rolling restart
> > - Platform Integration / Testing / Certification
> >   - Red Hat OpenShift compatible and certified
> >     - Secure, Universal Base Image (UBI) foundation images with security
> >       scanning performed by Red Hat: cass-operator, cass-config-builder,
> >       and apache-cassandra w/ MCAC and MAAC
> >     - Integration with Red Hat certification pipeline / marketplace
> >     - Presence in Red Hat Operator Hub built into the OpenShift interface
> >   - VMware Tanzu Kubernetes Grid Integrated Edition compatible and
> >     certified
> >     - Security scanning for images performed by VMware
> >   - Amazon EKS
> >   - Google GKE
> >   - Azure AKS
> > - Documentation / Reference Implementations
> >   - Cloud storage classes
> >   - Ingress solutions
> >   - Sample connection validation application with reference
> >     implementations of Java Driver client connection parameters
> > - Cluster-level Stop / Resume - stop all running instances while keeping
> >   persistent storage. Allows for scaling compute down to zero. Bringing
> >   the cluster back up follows expected startup procedures.
> >
> > Road Map / Inflight
> >
> > 1. Repair
> >    1. Reaper integration
> > 2. Backups
> >    1. Velero integration
> >    2. Medusa integration
> > 3. Advanced Networking via sidecar
> >    1. Combination of proxy sidecars (a la Envoy) to allow for persistent
> >       IP addresses despite Kubernetes' best efforts to shuffle them.
> > 4. Single pod canary deployments
> > 5. Platform Certification
> >    1. VMware Project Pacific
> >    2. Rancher Kubernetes Engine (K3s)
> > 6.

Re: New Cassandra website for review

2021-02-28 Thread Ben Bromhead
Awesome stuff, looks great!

On Mon, Mar 1, 2021 at 9:33 AM Nate McCall  wrote:

> Thanks Melissa! This looks really good. Excited to see it happen.
>
> On Sat, Feb 27, 2021 at 10:36 AM Melissa Logan 
> wrote:
>
> > Hi all,
> >
> > We are excited to share the almost-complete Cassandra website design
> > (CASSANDRA-16115). Huge thanks to Lorina Poland, Anthony Grosso, Mick
> Semb
> > Weaver, Josh Levy, Chris Thornett, Diogenese Topper, and a few others who
> > contributed to this effort.
> >
> > Note: There are a few updates to be made prior to launch, but we wanted to
> > share to get initial input and signoff to begin the final port to Antora.
> >
> > To be completed:
> >
> >- *Homepage: *The logos are placeholders -- they're being updated and
> >resized (pulled from case studies page).
> >- *Docs* will be added once 4.0 documentation is complete. Design
> >will match new site.
> >- *Case Studies* logos are being updated and resized, so ignore broken
> >links.
> >
> > If you have case studies or resources -- or community photos --
> > please reply to me and we'll add.
> >
> > Site for review: https://cassandra.staged.apache.org/
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-16115
> >
> > Melissa Logan
> >
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: Proprietary Replication Strategies: Cassandra Driver Support

2016-10-10 Thread Ben Bromhead
> >>
> >> class LocalStrategy(ReplicationStrategy):
> >>     def __init__(self, options_map):
> >>         pass
> >>
> >>     def make_token_replica_map(self, token_to_host_owner, ring):
> >>         return {}
> >>
> >>     def export_for_schema(self):
> >>         """
> >>         Returns a string version of these replication options which are
> >>         suitable for use in a CREATE KEYSPACE statement.
> >>         """
> >>         return "{'class': 'LocalStrategy'}"
> >>
> >>     def __eq__(self, other):
> >>         return isinstance(other, LocalStrategy)
> >>
> >> On Fri, Oct 7, 2016 at 11:56 AM, Jeremiah D Jordan <
> >> jeremiah.jor...@gmail.com> wrote:
> >>
> >> > What kind of support are you thinking of? All drivers should support
> >> > them already, drivers shouldn't care about replication strategy except
> >> > when trying to do token aware routing.
> >> > But since anyone can make a custom replication strategy, drivers that
> >> > do token aware routing just need to handle falling back to not doing
> >> > token aware routing if a replication strategy they don't know about is
> >> > in use.
> >> > All the open source drivers I know of do this, so they should all
> >> > "support" those strategies already.
> >> >
> >> > -Jeremiah
> >> >
> >> > > On Oct 7, 2016, at 1:02 PM, Prasenjit Sarkar <
> >> > > prasenjit.sar...@datos.io> wrote:
> >> > >
> >> > > Hi everyone,
> >> > >
> >> > > To the best of my understanding, Datastax has proprietary
> >> > > replication strategies: Local and Everywhere which are not part of
> >> > > the open source Apache Cassandra project.
> >> > >
> >> > > Do we know of any plans in the open source Cassandra driver
> >> > > community to support these two replication strategies? Would
> >> > > Datastax have a licensing concern if the open source driver
> >> > > community supported these strategies? I'm fairly new here and would
> >> > > like to understand the dynamics.
> >> > >
> >> > > Thanks,
> >> > > Prasenjit
> >> >
> >>
>
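A minimal sketch of how a token-aware driver would consume the class quoted
above (assuming LocalStrategy and its ReplicationStrategy base are
importable); the empty replica map is what lets the driver fall back to
non-token-aware routing:

    # LocalStrategy ignores the ring entirely, so the driver builds an
    # empty token -> replicas map and simply routes requests with its
    # default (non-token-aware) load balancing policy instead.
    strategy = LocalStrategy({})
    assert strategy.make_token_replica_map({}, []) == {}

    # export_for_schema() round-trips into DDL, e.g.
    # CREATE KEYSPACE ... WITH replication = {'class': 'LocalStrategy'}
    assert strategy.export_for_schema() == "{'class': 'LocalStrategy'}"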

-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Proposal - 3.5.1

2016-10-20 Thread Ben Bromhead
For reference we have released https://github.com/instaclustr/cassandra ,
with the end goal that people have a stable target on the 3.x branch while
this is all worked out.

We are likely to continue our releases even with a release cadence change,
but we would track official versions much more closely and our repository
will end up just being a public view of what we do internally rather than
something we advocate over official releases.

For further details on our thoughts around this see:

   - https://www.instaclustr.com/blog/2016/10/19/patched-cassandra-3-7/
   - https://github.com/instaclustr/cassandra#faq


On Thu, 20 Oct 2016 at 09:38 Jeremy Hanna 
wrote:

> Is there consensus on a way forward with this?  Is there going to be a
> three branch plan with “features”, “testing”, and “stable” starting with
> 4.0?  Or is this still in the discussion mode?  External to this thread
> there have been decisions made to create third party LTS releases and hopes
> that the project would decide to address the concerns in this thread.  It
> seems like this is the place to complete the discussion.
>
> > On Sep 26, 2016, at 10:52 AM, Jonathan Haddad  wrote:
> >
> > Not yet. I hadn't seen any Jira before to release a specific version,
> > only discussion on the ML.
> >
> > I'll put up a Jira with my patch that back ports the bug fix.
> > On Mon, Sep 26, 2016 at 8:26 AM Michael Shuler 
> > wrote:
> >
> >> Jon, is there a JIRA ticket for this request? I appreciate everyone's
> >> input, and I think this is a fine proposal.
> >>
> >> --
> >> Kind regards,
> >> Michael
> >>
> >> On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
> >>> Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back ported
> to
> >>> 3.5 as well, and it makes Cassandra effectively unusable if someone is
> >>> using any of the 4 types affected in any of their schema.
> >>>
> >>> I have cherry picked & merged the patch back to here and will put it
> in a
> >>> JIRA as well tonight, I just wanted to get the ball rolling asap on
> this.
> >>>
> >>>
> >>
> https://github.com/rustyrazorblade/cassandra/tree/fix_commitlog_exception
> >>>
> >>> Jon
> >>>
> >>
> >>
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Backports to 2.1.16

2016-10-20 Thread Ben Bromhead
this is awesome

On Thu, 20 Oct 2016 at 14:19 sankalp kohli  wrote:

> Hi,
>  We backport a lot of patches in Cassandra at Apple. We contribute all
> the patches to the community and port them to 2.1 if we think they will
> help. We will soon start focusing on 3.0 and won't back port to 2.1 unless
> critical.
>
> I want to list them in this email in random order and not based on
> importance. They are back ported for various reasons.
>
> *NOTE: This list is an FYI and I dont suggest that we port these to 2.1.*
>
> 1. Writes should be sent to a replacement node while it is streaming in
> data (CASSANDRA-8523)
> 2. Send source sstable level when bootstrapping or replacing
> nodes(CASSANDRA-7460)
> 3. Add an API to request the size of a CQL partition (CASSANDRA-12367. We
> expose this via CQL)
> 4. Add ability to blacklist a CQL partition so all requests are ignored
> (CASSANDRA-12106)
> 5. Allow compaction throttle to be real time(CASSANDRA-10025)
> 6. Add a credentials cache to the PasswordAuthenticator (CASSANDRA-7715)
> 7. SliceQueryFilter warnings should print the partition
> key(CASSANDRA-10211)
> 8. Make LZ4 Compression Level Configurable(CASSANDRA-11051)
> 9. Include info about sstable on "Compacting large row” message
> (CASSANDRA-12384)
> 10. Should be able to override compaction space check (CASSANDRA-12180)
> 11. updateJobs in PendingRangeCalculatorService should be decremented in
> finally block(CASSANDRA-12554)
> 12. Unresolved hostname in replace address (CASSANDRA-11210)
> 13. Range.compareTo() violates the contract of Comparable (CASSANDRA-11216)
> 14. range metrics are not updated for timeout and unavailable in
> StorageProxy (CASSANDRA-9507)
> 15. Range tombstones that are masked by row tombstones should not be
> written out (CASSANDRA-12030)
> 16. Disk failure policy should not be invoked on out of space
> (CASSANDRA-12385)
> 17. Add metrics for authentication failures (CASSANDRA-10635)
> 18. Implement compaction for a specific token range (CASSANDRA-10643)
> 19. RangeStreamer should be smarter when picking endpoints for
> streaming(CASSANDRA-4650)
> 20. Reject empty options and invalid DC names in replication configuration
> while creating or altering a keyspace (CASSANDRA-12681)
> 21. One way targeted repair (CASSANDRA-9876)
> 22. Rebuild from targeted replica (CASSANDRA-9875)
> 23. Add prefixes to the name of snapshots created before a truncate or
> drop(CASSANDRA-12178)
> 24. Improve CAS propose CQL query(CASSANDRA-7929)
> 25. Repair -force CASSANDRA-10446
> 26. Include table name in "Cannot get comparator"
> exception(CASSANDRA-12181)
> 27. Collect metrics on queries by consistency level (CASSANDRA-7384)
> 28. processs restarts are failing becase native port and jmx ports are in
> use(CASSANDRA-11093)
>
> Thanks,
> Sankalp
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: [VOTE] Close client-...@cassandra.apache.org mailing list

2016-11-08 Thread Ben Bromhead
+1 (non-binding)

On Tue, 8 Nov 2016 at 10:05 Jake Farrell  wrote:

> +1
>
> -Jake
>
> On Mon, Nov 7, 2016 at 12:11 AM, Jeff Jirsa  wrote:
>
> > There exists a nearly unused mailing list,
> client-...@cassandra.apache.org
> > [0].
> >
> > This is a summary of the email threads over the past 12 months on that
> > list:
> >
> > 1) ApacheCon Seville CFP Close notice
> > 2) Datastax .NET driver question
> > 3) Datastax Java driver question
> > 4) FOSDEM announce
> > 5) ApacheCon NA CFP Open noticed
> >
> > In order to avoid confusion, and given the lack of relevant and
> > appropriate traffic, I propose we close the client-dev@ list entirely.
> > Any traffic appropriate for the client-dev@ list would likely be better
> > served if it were directed at dev@, which is more active.
> >
> > This vote will remain open for 72 hours.
> >
> > 0: https://lists.apache.org/list.html?client-...@cassandra.apache.org
> >
> >
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Board report and feedback from such.

2016-11-16 Thread Ben Bromhead
ter.
>
> ## JIRA activity:
>
> As discussed above in Issues we would like the number of resolved issues
> to be higher, but over 75% is not a bad place to be and we are actively
> working on getting more community involvement.
>
>  - 480 JIRA tickets created in the last 3 months
>  - 372 JIRA tickets closed/resolved in the last 3 months
>
> ## Trademark Enforcement
>
> Three requests were sent out regarding our trademarks, all of which were
> immediately complied with by the recipients:
> - Clarified appropriate use of trademark and guidelines for a third party
>   vendor providing backports of Apache Cassandra patches to a custom long
> term release version [3]
> - Use of trademark in project name [4]
> - Use of trademark in project name [5]
>
> ## 'dtest' Project Contribution
>
> DataStax recently offered to donate the dtest distributed testing suite to
> the project [6],[7]. This was voted on in the dev mailing list and passed
> [8]. Filing the appropriate forms with the Incubator folks for review will
> be
> done shortly.
>
> ## References
>
> [0]
>
> https://www.apache.org/foundation/records/minutes/2016/board_minutes_2016_08_17.txt
> [1]
>
> https://www.apache.org/foundation/records/minutes/2016/board_minutes_2016_09_21.txt
> [2]
> http://www.datastax.com/2016/11/serving-customers-serving-the-community
> [3]
>
> https://lists.apache.org/thread.html/69d87a0d59a23a4ee7578563785581d5daa5cb248987d67a70c44b86@%3Cprivate.cassandra.apache.org%3E
> [4]
>
> https://lists.apache.org/list.html?priv...@cassandra.apache.org:lte=1M:Apache%20trademark%20and%20Spring%20project%20names
> [5] https://mesosphere.github.io/cassandra-mesos/
> [6] https://github.com/riptano/cassandra-dtest
> [7]
>
> https://lists.apache.org/thread.html/d43300016d3871587c43eea8cd4223221904fddc7916d9d6d858bd29@%3Cprivate.cassandra.apache.org%3E
> [8]
>
> https://lists.apache.org/thread.html/d9e694ba8eaac8e8c70cbfd3f6ee249d43f8c67279882ffc65e56cac@%3Cdev.cassandra.apache.org%3E
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Rough roadmap for 4.0

2016-11-17 Thread Ben Bromhead
t; > >> > m...@thelastpickle.com 
> > > >> > > )
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > > On 4 November 2016 at 13:47, Nate McCall  > > >> > wrote:
> > > >> > > >
> > > >> > > > > Specifically, this should be "new stuff that could/will
> break
> > > >> things"
> > > >> > > > > given we are upping
> > > >> > > > > the major version.
> > > >> > > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > How does this co-ordinate with the tick-tock versioning¹
> leading
> > > up
> > > >> to
> > > >> > > the
> > > >> > > > 4.0 release?
> > > >> > > >
> > > >> > > > To just stop tick-tock and then say yeehaa let's jam in all
> the
> > > >> > breaking
> > > >> > > > changes we really want seems to be throwing away some of the
> > > learnt
> > > >> > > wisdom,
> > > >> > > > and not doing a very sane transition from tick-tock to
> > > >> > > > features/testing/stable². I really hope all this is done in a
> > way
> > > >> that
> > > >> > > > continues us down the path towards a stable-master.
> > > >> > > >
> > > >> > > > For example, are we fixing the release of 4.0 to November? or
> > > >> > continuing
> > > >> > > > tick-tocks until we complete the 4.0 roadmap? or starting the
> > > >> > > > features/testing/stable branching approach with 3.11?
> > > >> > > >
> > > >> > > >
> > > >> > > > Background:
> > > >> > > > ¹) Sylvain wrote in an earlier thread titled "A Home for 4.0"
> > > >> > > >
> > > >> > > > > And as 4.0 was initially supposed to come after 3.11, which
> is
> > > >> > coming,
> > > >> > > > it's probably time to have a home for those tickets.
> > > >> > > >
> > > >> > > > ²) The new versioning scheme slated for 4.0, per the
> "Proposal -
> > > >> 3.5.1"
> > > >> > > > thread
> > > >> > > >
> > > >> > > > > three branch plan with “features”, “testing”, and “stable”
> > > >starting
> > > >> > > with
> > > >> > > > 4.0?
> > > >> > > >
> > > >> > > >
> > > >> > > > Mick
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
>
> --
>
>
> --
>
>
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Rough roadmap for 4.0

2016-11-17 Thread Ben Bromhead
s/materialised views/aggregates/

also we expect to have our first larger production 3.7 LTS cluster in the
next few months.

On Thu, 17 Nov 2016 at 15:38 Ben Bromhead  wrote:

> We have a few small customer clusters running on our 3.7 LTS release...
> though we are not calling it production ready yet.
>
> We also just moved our internal metrics cluster from 2.2 to 3.7 LTS to get
> materialised views and to get some 3.x production experience.
>
> On Thu, 17 Nov 2016 at 14:27 Carlos Rolo  wrote:
>
> No Cluster in tick-tock.
>
> Actually reverted a couple to 3.0.x
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
> *linkedin.com/in/carlosjuzarterolo
> <http://linkedin.com/in/carlosjuzarterolo>*
> Mobile: +351 918 918 100 <+351%20918%20918%20100>
> www.pythian.com
>
> On Thu, Nov 17, 2016 at 10:20 PM, DuyHai Doan 
> wrote:
>
> > Be very careful, there is a serious bug about AND/OR semantics, not
> solved
> > yet and not going to be solved any soon:
> > https://issues.apache.org/jira/browse/CASSANDRA-12674
> >
> > On Thu, Nov 17, 2016 at 7:32 PM, Jeff Jirsa 
> > wrote:
> >
> > >
> > > We’ll be voting in the very near future on timing of major releases and
> > > release strategy. 4.0 won’t happen until that vote takes place.
> > >
> > > But since you asked, I have ONE tick/tock (3.9) cluster being qualified
> > > for production because it needs SASI.
> > >
> > > - Jeff
> > >
> > > On 11/17/16, 9:59 AM, "Jonathan Haddad"  wrote:
> > >
> > > >I think it might be worth considering adopting the release strategy
> > before
> > > >4.0 release.  Are any PMC members putting tick tock in prod? Does
> anyone
> > > >even trust it?  What's the downside of changing the release cycle
> > > >independently from 4.0?
> > > >
> > > >On Thu, Nov 17, 2016 at 9:03 AM Jason Brown 
> > wrote:
> > > >
> > > >Jason,
> > > >
> > > >That's a separate topic, but we will have a different vote on how the
> > > >branching/release strategy should be for the future.
> > > >
> > > >On Thursday, November 17, 2016, jason zhao yang <
> > > zhaoyangsingap...@gmail.com
> > > >>
> > > >wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> Will we still use tick-tock release for 4.x and 4.0.x ?
> > > >>
> > > >> Stefan Podkowinski >于2016年11月16日周三
> > > >> 下午4:52写道:
> > > >>
> > > >> > From my understanding, this will also effect EOL dates of other
> > > >branches.
> > > >> >
> > > >> > "We will maintain the 2.2 stability series until 4.0 is released,
> > and
> > > >3.0
> > > >> > for six months after that.".
> > > >> >
> > > >> >
> > > >> > On Wed, Nov 16, 2016 at 5:34 AM, Nate McCall  > > >> > wrote:
> > > >> >
> > > >> > > Agreed. As long as we have a goal I don't see why we have to
> > adhere
> > > to
> > > >> > > arbitrary date for 4.0.
> > > >> > >
> > > >> > > On Nov 16, 2016 1:45 PM, "Aleksey Yeschenko" <
> > alek...@datastax.com
> > > >> >
> > > >> > wrote:
> > > >> > >
> > > >> > > > I’ll comment on the broader issue, but right now I want to
> > > elaborate
> > > >> on
> > > >> > > > 3.11/January/arbitrary cutoff date.
> > > >> > > >
> > > >> > > > Doesn’t matter what the original plan was. We should continue
> > with
> > > >> 3.X
> > > >> > > > until all the 4.0 blockers have been
> > > >> > > > committed - and there are quite a few of them remaining yet.
> > > >> > > >
> > > >> > > > So given all the holidays, and the tickets remaining, I’ll
> > > >personally
> > > >> > be
> > > >> > > > surprised if 4.0 comes out before
> > > >> > > > February/March and 3.13/3.14. Nor do I think it’s an issue.
> > > >> > > >

Re: Summary of 4.0 Large Features/Breaking Changes (Was: Rough roadmap for 4.0)

2016-11-17 Thread Ben Bromhead
We are happy to start testing against completed features. Ideally once
everything is ready for an RC (to catch interaction bugs), but we can start
sooner for features where it makes sense and they are finished earlier.

On Thu, 17 Nov 2016 at 16:47 Nate McCall  wrote:

> To sum up that other thread (I very much appreciate everyone's input,
> btw), here is an aggregate list of large, breaking 4.0 proposed
> changes:
>
> CASSANDRA-9425 Immutable node-local schema
> CASSANDRA-10699 Strongly consistent schema alterations
> --
> CASSANDRA-12229 NIO streaming
> CASSANDRA-8457 NIO messaging
> CASSANDRA-12345 Gossip 2.0
> CASSANDRA-9754 Birch trees
> CASSANDRA-11559 enhanced node representation
> CASSANDRA-6246 epaxos
> CASSANDRA-7544 storage port configurable per node
> --
> CASSANDRA-5 remove thrift support
> CASSANDRA-10857 dropping compact storage
>
> Again, this is the "big things that will probably break stuff" list
> and thus should happen with a major (did I miss anything?). There
> were/are/will be other smaller issues, but we don't really need to
> keep them in front of us for this discussion as they can/will just
> kind of happen w/o necessarily affecting anything else.
>
> That all said, since we are 'doing a software' we need to start
> thinking about the above in balance with resources and time. However,
> a lot of the above items do have a substantial amount of code written
> against them so it's not as daunting as it seems.
>
> What I would like us to discuss is rough timelines and what is needed
> to get these out the door.
>
> One thing that sticks out to me: that big chunk in the middle there is
> coming out of the same shop in Cupertino. I'm nervous about that. Not
> that that ya'll are not capable, I'm solely looking at it from the
> "that is a big list of some pretty hard shit" perspective.
>
> So what else do we need to discuss to get these completed? How and
> where can other folks pitch in?
>
> -Nate
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Where do I find EverywhereStrategy?

2016-11-30 Thread Ben Bromhead
Also apparently the Everywhere Strategy is a bad idea (tm) according to
comments in https://issues.apache.org/jira/browse/CASSANDRA-12629 but no
reason has been given why...

On Wed, 30 Nov 2016 at 07:07 James Carman 
wrote:

> A, well that stinks.  And, renaming it now would/could break backward
> compatibility with existing clusters.  Lesson learned on package-private
> constructors for abstract classes, especially those used for extension
> points. :(
>
>
> On Wed, Nov 30, 2016 at 10:04 AM J. D. Jordan 
> wrote:
>
> > Prior to https://issues.apache.org/jira/browse/CASSANDRA-12788 that was
> > the only way to implement a new replication strategy.
> >
> > > On Nov 30, 2016, at 8:46 AM, James Carman 
> > wrote:
> > >
> > > Oh, ok, thanks.  Why would a DSE class be in the "org.apache.cassandra"
> > > package structure?  That seems a bit misleading
> > >
> > >
> > >
> > > On Wed, Nov 30, 2016 at 9:44 AM Jacques-Henri Berthemet <
> > > jacques-henri.berthe...@genesys.com> wrote:
> > >
> > >> Hi James,
> > >>
> > >> It looks like it's a DSE class, not OSS Cassandra:
> > >>
> > >>
> >
> https://support.datastax.com/hc/en-us/articles/208026816-DSE-EverywhereStrategy-is-not-understood-by-COSS-nodes-and-can-cause-restart-failures
> > >>
> > >> Regards,
> > >> --
> > >> Jacques-Henri Berthemet
> > >>
> > >> -Original Message-
> > >> From: James Carman [mailto:ja...@carmanconsulting.com]
> > >> Sent: mercredi 30 novembre 2016 15:30
> > >> To: dev@cassandra.apache.org
> > >> Subject: Where do I find EverywhereStrategy?
> > >>
> > >> I came across the class name
> > >> "org.apache.cassandra.locator.EverywhereStrategy" in an error message,
> > so I
> > >> started searching through the code for it. I can't seem to find it.
> Any
> > >> pointers?
> > >>
> > >>
> > >> Thanks,
> > >>
> > >>
> > >> James
> > >>
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Where do I find EverywhereStrategy?

2016-11-30 Thread Ben Bromhead
For sure, but even with auth for another (non-default) user, which reads at
LOCAL_ONE, you still want auth info replicated to all nodes.

The system default of RF=1 means a single node going down can cause no
access at all (even with caching), and for the average user that is a worse
outcome than replicating to all nodes.

Also, given that you should not be using the cassandra user (it's only
there to bootstrap auth setup), and given that I would claim it is best to
have auth details local to each node, short of a complete rewrite of how
system_auth is distributed (e.g. via gossip, which I suspect it should be)
I would propose that an everywhere strategy is helpful in this regard.

On the other end of the spectrum operators with very large clusters are
already customizing auth to suit their needs (and not using the Cassandra
user).  I've seen far more people shoot themselves in the foot with
system_auth as it currently stands than large operators getting this wrong.
I would also claim that an everywhere strategy is about as dangerous as
secondary indexes...

Sorry to keep flogging a dead horse, but keeping auth replicated properly
has been super helpful for us. Of course replication strategies are
pluggable, so easy for us to maintain separately, I'm just trying to figure
out where I'm missing the point and if we need to re-evaluate the way we do
things or if the fear of misuse is the primary concern :)

On Wed, 30 Nov 2016 at 10:32 Jeff Jirsa  wrote:

>
>
> On 2016-11-30 10:02 (-0800), Ben Bromhead  wrote:
> > Also apparently the Everywhere Strategy is a bad idea (tm) according to
> > comments in https://issues.apache.org/jira/browse/CASSANDRA-12629 but no
> > reason has been given why...
> >
>
> It's touched on in that thread, but it's REALLY EASY to misuse, and most
> people who want it are probably going to shoot themselves in the foot with
> it.
>
> The one example Jeremiah gave is admin logins with system_auth - requires
> QUORUM, quorum on a large cluster is almost impossible to satisfy like that
> (imagine a single digest mismatch triggering a blocking read repair on a
> hundred nodes, and what that does to the various thread pools).
>
>
>
> --
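To put rough numbers on the QUORUM concern quoted above, a quick sketch
(quorum = floor(RF/2) + 1, and an everywhere-style strategy pins RF to the
cluster size; the sizes below are hypothetical):

    def quorum(rf):
        return rf // 2 + 1

    for n in (3, 12, 100):
        print(n, "nodes -> quorum of", quorum(n))
    # 3 -> 2, 12 -> 7, 100 -> 51: a single digest mismatch at QUORUM on a
    # 100-node cluster can block on a majority of the whole cluster.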
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Where do I find EverywhereStrategy?

2016-12-03 Thread Ben Bromhead
Yup the core issue is that system_auth is not great from a HA perspective
at the moment for new users.

Given the general opinion that RF=N is not appropriate / too ripe for
abuse, would something similar to min(RF=N, X per DC) be more appropriate?
The only catch is that implementing a replication strategy so that it is
aware of all current DCs sounds tricky (without looking into it), plus I can
imagine some gnarly corner cases / complexity with adding / removing DCs.
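As a rough sketch of the min(RF=N, X per DC) idea (the names and the per-DC
cap are hypothetical):

    # Single-node clusters keep working out of the box at RF=1, while
    # large DCs are capped so QUORUM reads on system_auth stay cheap.
    def auth_rf(nodes_in_dc, cap_per_dc=3):
        return min(nodes_in_dc, cap_per_dc)

    assert auth_rf(1) == 1    # out-of-the-box single node still works
    assert auth_rf(2) == 2
    assert auth_rf(50) == 3   # avoids RF=N quorum pain on big clusters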

On Sat, 3 Dec 2016 at 07:02 Aleksey Yeschenko  wrote:

> It isn’t, but you are supposed to change it.
>
> The reason it cannot be set higher by default is that out of the box
> single-node clusters should still work,
> and setting the default RF to higher than 1 would break this, as it
> performs some queries at quorum CL.
>
> --
> AY
>
> On 3 December 2016 at 06:47:12, sankalp kohli (kohlisank...@gmail.com)
> wrote:
>
> The point Ben is saying is that for auth keyspace, default o​f RF=1 is not
> good for any type of cluster whether it is small or large.
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Wrapping up tick-tock

2017-01-10 Thread Ben Bromhead
+1 on killing tick/tock
+1 on six months

What is the appetite for a longer bug fix period for some releases (e.g.
every second release gets 18-24 months of critical bug fixes)?

Currently only vendors / large users are maintaining long-running releases.
Given this work is already happening, I would rather the effort happen under
the Apache umbrella and be available to all users, if existing long term
release maintainers are happy to do so.

If this question is too far outside the topic and more appropriate for a
different thread, I'm happy to put a hold on it until the release cadence is
agreed.



On Tue, 10 Jan 2017 at 09:27 Nate McCall  wrote:

> > I agreed with you at the time that the yearly cycle was too long to be
> > adding features before cutting a release, and still do now.  Instead of
> > elastic banding all the way back to a process which wasn't working
> before,
> > why not try somewhere in the middle?  A release every 6 months (with
> > monthly bug fixes for a year) gives:
> >
> > 1. long enough time to stabilize (1 year vs 1 month)
> > 2. not so long things sit around untested forever
> > 3. only 2 releases (current and previous) to do bug fix support at any
> > given time.
>
> The third reason is particularly appealing.
>
> +1 on six months.
> +1 on killing tick/tock at 3.10 (with a potential bugfix follow up per
> the other thread).
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Instaclustr Masters scholarship

2017-02-07 Thread Ben Bromhead
As part of our commitment to contributing back to the Apache Cassandra open
source project and the wider community we are always looking for ways we
can foster knowledge sharing and improve usability of Cassandra itself. One
of the ways we have done so previously was to open up our internal builds
and versions of Cassandra (https://github.com/instaclustr/cassandra).

We have also been looking at a few novel or outside the box ways we can
further contribute back to the community. As such, we are sponsoring a
masters project in conjunction with the Australian based University of
Canberra. Instaclustr’s staff will be available to provide advice and
feedback to the successful candidate.

*Scope*
Distributed database systems are relatively new technology compared to
traditional relational databases. Distributed databases provide
significant advantages in terms of reliability and scalability but often at
a cost of increased complexity. This complexity presents challenges for
testing of these systems to prove correct operation across all possible
system states. The scope of this masters scholarship is to use the Apache
Cassandra repair process as an example to consider and improve available
approaches to distributed database systems testing.

The repair process in Cassandra is a scheduled process that runs to ensure
the multiple copies of each piece of data that is maintained by Cassandra
are kept synchronised. Correct operation of repairs has been an ongoing
challenge for the Cassandra project partly due to the difficulty in
designing and developing comprehensive automated tests for this
functionality.

The expected scope of this project is to:

   - survey and understand the existing testing framework available as part
   of the Cassandra project, particularly as it pertains to testing repairs
   - consider, research and develop enhanced approaches to testing of
   repairs
   - submit any successful approaches to the Apache Cassandra project for
   feedback and inclusion in the project code base

Australia is a pretty great place to advance your education and is
welcoming of foreign students.

We are also open to sponsoring a PhD project with a more in depth focus for
the right candidate.

For more details please don't hesitate to get in touch with myself or reach
out to i...@instaclustr.com.

Cheers

Ben
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Instaclustr Masters scholarship

2017-02-09 Thread Ben Bromhead
1975 is totally recent :s

On Wed, Feb 8, 2017, 04:29 Jason Brown  wrote:

> Ben,
>
> This is pretty cool! Almost wish I could apply ;)
>
> >> "Distributed database systems are relatively new technology"
> https://www.ietf.org/rfc/rfc677.txt.pdf - lol, 1975 :)
>
> -Jason
>
> On Tue, Feb 7, 2017 at 7:26 PM, Ben Bromhead  wrote:
>
> > As part of our commitment to contributing back to the Apache Cassandra
> open
> > source project and the wider community we are always looking for ways we
> > can foster knowledge sharing and improve usability of Cassandra itself.
> One
> > of the ways we have done so previously was to open up our internal builds
> > and versions of Cassandra (https://github.com/instaclustr/cassandra).
> >
> > We have also been looking at a few novel or outside the box ways we can
> > further contribute back to the community. As such, we are sponsoring a
> > masters project in conjunction with the Australian based University of
> > Canberra. Instaclustr’s staff will be available to provide advice and
> > feedback to the successful candidate.
> >
> > *Scope*
> > Distributed database systems are relatively new technology compared to
> > traditional relational databases. Distributed databases provide
> > significant advantages in terms of reliability and scalability but often
> at
> > a cost of increased complexity. This complexity presents challenges for
> > testing of these systems to prove correct operation across all possible
> > system states. The scope of this masters scholarship is to use the Apache
> > Cassandra repair process as an example to consider and improve available
> > approaches to distributed database systems testing.
> >
> > The repair process in Cassandra is a scheduled process that runs to
> ensure
> > the multiple copies of each piece of data that is maintained by Cassandra
> > are kept synchronised. Correct operation of repairs has been an ongoing
> > challenge for the Cassandra project partly due to the difficulty in
> > designing and developing comprehensive automated tests for this
> > functionality.
> >
> > The expected scope of this project is to:
> >
> >- survey and understand the existing testing framework available as
> part
> >of the Cassandra project, particularly as it pertains to testing
> repairs
> >- consider, research and develop enhanced approaches to testing of
> >repairs
> >- submit any successful approaches to the Apache Cassandra project for
> >feedback and inclusion in the project code base
> >
> > Australia is a pretty great place to advance your education and is
> > welcoming of foreign students.
> >
> > We are also open to sponsoring a PhD project with a more in depth focus
> for
> > the right candidate.
> >
> > For more details please don't hesitate to get in touch with myself or
> reach
> > out to i...@instaclustr.com.
> >
> > Cheers
> >
> > Ben
> > --
> > Ben Bromhead
> > CTO | Instaclustr <https://www.instaclustr.com/>
> > +1 650 284 9692
> > Managed Cassandra / Spark on AWS, Azure and Softlayer
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: New committers announcement

2017-02-14 Thread Ben Bromhead
Congrats!!

On Tue, 14 Feb 2017 at 13:37 Joaquin Casares 
wrote:

> Congratulations!
>
> +1 John's sentiments. That's a great list of new committers! :)
>
> Joaquin Casares
> Consultant
> Austin, TX
>
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On Tue, Feb 14, 2017 at 3:34 PM, Jonathan Haddad 
> wrote:
>
> > Congratulations! Definitely a lot of great contributions from everyone on
> > the list.
> > On Tue, Feb 14, 2017 at 1:31 PM Jason Brown 
> wrote:
> >
> > > Hello all,
> > >
> > > It's raining new committers here in Apache Cassandra!  I'd like to
> > announce
> > > the following individuals are now committers for the project:
> > >
> > > Branimir Lambov
> > > Paulo Motta
> > > Stefan Pokowinski
> > > Ariel Weisberg
> > > Blake Eggleston
> > > Alex Petrov
> > > Joel Knighton
> > >
> > > Congratulations all! Please keep the excellent contributions coming.
> > >
> > > Thanks,
> > >
> > > -Jason Brown
> > >
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: [VOTE] self-assignment of jira tickets

2017-03-29 Thread Ben Bromhead
+1 nb

On Wed, 29 Mar 2017 at 12:20 Aleksey Yeschenko  wrote:

> +1 super binding.
>
> --
> AY
>
> On 29 March 2017 at 16:22:03, Jason Brown (jasedbr...@gmail.com) wrote:
>
> Hey all,
>
> Following up my thread from a week or two ago (
>
> https://lists.apache.org/thread.html/0665f40c7213654e99817141972c003a2131aba7a1c63d6765db75c5@%3Cdev.cassandra.apache.org%3E
> ),
> I'd like to propose a vote to change to allow any potential contributor to
> assign a jira to themselves without needing to be added to the contributors
> group first.
>
> https://issues.apache.org/jira/browse/INFRA-11950 is an example of how to
> get this done with INFRA.
>
> Vote will be open for 72 hours.
>
> Thanks,
>
> -Jason Brown
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Cassandra on RocksDB experiment result

2017-04-19 Thread Ben Bromhead
This looks super cool, would love to see more details.

On a general note, a pluggable storage layer allows other storage engines
(and possibly datastores) to leverage Cassandra's distributed primitives
(dynamo, gossip, paxos?, drivers, cql etc). This could allow Cassandra to
fill similar use cases as Dynomite from Netflix.

Also as Sankalp mentioned we get some other benefits including better
testability.

In my experience with pluggable storage engines (in the MySQL world), the
> engine manages all storage that it "owns." The higher tiers in the
> architecture don't need to get involved unless multiple storage engines
> have to deal with compaction (or similar) issues over the entire database,
> e.g., every storage engine has read/write access to every piece of data,
> even if that data is owned by another storage engine.
>
> I don't know enough about Cassandra internals to have an opinion as to
> whether or not the above scenario makes sense in the Cassandra context. But
> "sharing" (processes or data) between storage engines gets pretty hairy,
> easily deadlocky (!), even in something as relatively straightforward as
> MySQL.


This would be an implementation detail, but given that tables in Cassandra
don't know about each other (no joins, foreign keys etc... ignore MVs for
the moment), storage engine interactions probably wouldn't be an issue.


> This was a long and old debate we had several times in the past. One of
> the difficulties of a pluggable storage engine is that we need to manage
> the differences between the LSMT of native C* and the RocksDB engine for
> compaction, repair, streaming etc...
>
> Right now all the compaction strategies share the assumption that the data
> structure and layout on disk is fixed. With pluggable storage engine, we
> need to special case each compaction strategy (or at least the Abstract
> class of compaction strategy) for each engine.


> The current approach is one storage engine, many compaction strategies for
> different use-cases (TWCS for time series, LCS for heavy update...).
>
> With pluggable storage engine, we'll have a matrix of storage engine x
> compaction strategies.
>

Compaction is part of the storage engine, and if I understand Dikang's
design spec, it is bypassed?

Cassandra's current storage engine is a log-structured merge tree. RocksDB
does its own thing.

Again, this is an implementation detail about where the storage engine
interface line is drawn, but based on the compaction example above I think
it is a non-issue.
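As a rough sketch (illustrative names only, not an actual Cassandra
interface) of where that line could be drawn, with compaction living behind
it:

    # Reads and writes are the contract; on-disk layout, and therefore
    # compaction, is an engine-internal concern. An LSMT engine would run
    # Cassandra-style compaction strategies behind this seam, while a
    # RocksDB-backed engine would leave it to RocksDB.
    from abc import ABC, abstractmethod
    from typing import Optional

    class StorageEngine(ABC):
        @abstractmethod
        def write(self, partition_key: bytes, row: bytes) -> None: ...

        @abstractmethod
        def read(self, partition_key: bytes) -> Optional[bytes]: ...

        def maybe_compact(self) -> None:
            # Default no-op: engines that manage their own layout
            # (e.g. RocksDB) never expose compaction upward.
            pass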


> And not even mentioning the other operations to handle like streaming and
> repair.
>

Streaming and repair would be harder problems to solve than compaction,
imho.
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: NGCC?

2017-06-02 Thread Ben Bromhead
We are more than happy to donate some resources (both people and materials)
to putting on NGCC.

I would suggest some sort of committee of folks who are willing to do the
groundwork and who act as the executive so it gets done :)

On Fri, 2 Jun 2017 at 11:37 Patrick McFadin  wrote:

> A couple years ago we tried to attach it to an ApacheCon event, but the
> feedback from the ASF was fairly negative on attaching events. I don't
> think making it a part of Apache Big Data would work.
>
> The alternative as an independent event with sponsorship will be really
> hard to coordinate and a potential political minefield.
>
> Tl;DR Cassandra Summit Japan could be an ideal place.
>
> Oh and Russ? You win for most awesome event activity ever.
>
> Patrick
>
> On Fri, Jun 2, 2017 at 12:29 PM, Eric Evans 
> wrote:
>
> > On Thu, Jun 1, 2017 at 11:02 AM, Russell Bradberry  >
> > wrote:
> > >> read: developers *of* Cassandra
> > >
> > > Historically it has not only been developers of Cassandra but also
> > specific power users, influencers, and experts.  If you want to be
> > successful in deciding the direction of a product, you need more than its
> > developers present.  For instance, I am not a Cassandra developer, but I
> > have been invited every year. Same with folks like Peter Bailis, Rick
> > Branson, and others who come in with a wealth of knowledge around varying
> > use cases.  Unfortunately, I missed that last two years, but was hoping
> to
> > make it this year.
> >
> > You are right, of course.  I was (clumsily) trying to create a
> > distinction between this and a regular user conference.
> >
> > --
> > Eric Evans
> > john.eric.ev...@gmail.com
> >
> > ---------
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: NGCC?

2017-06-02 Thread Ben Bromhead
+1

On Fri, 2 Jun 2017 at 13:17 Eric Evans  wrote:

> On Fri, Jun 2, 2017 at 2:34 PM, Ben Bromhead  wrote:
> > We are more than happy to donate some resources (both people and
> materials)
> > to putting on NGCC.
> >
> > I would suggest some sort of committee of folks who are willing to do the
> > groundwork and who act as the executive so it gets done :)
>
> Gary Dusbabek and myself are both willing to shoulder the grunt work
> of organizing this.
>
> Since we're both in San Antonio, that's the location that would be
> most practical for us.  It's also easy to get to, centrally located,
> and the weather should be fantastic that time of year.
>
> We were thinking we'd put together some options for venue, and propose
> some dates, and circle back to the list in search of consensus.
>
> If this seems reasonable for now, then we'll get back to everyone with
> more info in a weeks time.
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: State of Materialized Views

2017-09-22 Thread Ben Bromhead
Just saw that https://issues.apache.org/jira/browse/CASSANDRA-11500 got
committed 4 days ago. Awesome stuff, and a huge thank you to everyone who
worked on it!

Looking forward to what happens in
https://issues.apache.org/jira/browse/CASSANDRA-13826 :)

I don't know if we are waiting on anything other than
https://issues.apache.org/jira/browse/CASSANDRA-13808 for 3.11.1 ?

On Tue, 25 Jul 2017 at 04:58 Josh McKenzie  wrote:

> Status of above is on our collective radars. As always, interleaving
> reviews with other work is a challenge.
>
> On Mon, Jul 24, 2017 at 7:05 PM, Nate McCall  wrote:
>
> > >
> > > We're working on the following MV-related issues in the 4.0 time-frame:
> > > CASSANDRA-13162
> > > CASSANDRA-13547
> > Patch Available
> >
> > > CASSANDRA-13127
> > Patch Available
> >
> > > CASSANDRA-13409
> > Patch Available
> >
> > > CASSANDRA-12952
> > Patch Available
> >
> > > CASSANDRA-13069
> > > CASSANDRA-12888
> > >
> >
> > Josh - want to make sure folks are not duplicating effort here, is the
> > status of the above on your radar? Regardless, I appreciate the
> > communication. Thanks for that!
> >
> > -----
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Proposal to retroactively mark materialized views experimental

2017-09-29 Thread Ben Bromhead
I'm a fan of introducing experimental flags in general as well, +1



On Fri, 29 Sep 2017 at 13:22 Jon Haddad  wrote:

> I’m very much +1 on this, and to new features in general.
>
> I think having a clear line in which we classify something as production
> ready would be nice.  It would be great if committers were using the
> feature in prod and could vouch for it’s stability.
>
> > On Sep 29, 2017, at 1:09 PM, Blake Eggleston 
> wrote:
> >
> > Hi dev@,
> >
> > I’d like to propose that we retroactively classify materialized views as
> an experimental feature, disable them by default, and require users to
> enable them through a config setting before using.
> >
> > Materialized views have several issues that make them (effectively)
> unusable in production. Some of the issues aren’t just implementation
> problems, but problems with the design that aren’t easily fixed. It’s
> unfair of us to make features available to users in this state without
> providing a clear warning that bad or unexpected things are likely to
> happen if they use it.
> >
> > Obviously, this isn’t great news for users that have already adopted
> MVs, and I don’t have a great answer for that. I think that’s sort of a
> sunk cost at this point. If they have any MV related problems, they’ll have
> them whether they’re marked experimental or not. I would expect this to
> reduce the number of users adopting MVs in the future though, and if they
> do, it would be opt-in.
> >
> > Once MVs reach a point where they’re usable in production, we can remove
> the flag. Specifics of how the experimental flag would work can be hammered
> out in a forthcoming JIRA, but I’d imagine it would just prevent users from
> creating new MVs, and maybe log warnings on startup for existing MVs if the
> flag isn’t enabled.
> >
> > Let me know what you think.
> >
> > Thanks,
> >
> > Blake
>
>
> -----
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Proposal to retroactively mark materialized views experimental

2017-10-02 Thread Ben Bromhead
 >>> explicit option because they really aren’t ready for general use.
> > >>>
> > >>> Claiming disabling by default == removal is not helpful to the
> > >>> conversation and is very misleading.
> > >>>
> > >>> Let’s be practical here. The people that are most likely to put MVs
> in
> > >>> production right now are people new to Cassandra that don’t know any
> > >>> better. The people that *should* be using MVs are the contributors to
> > >> the
> > >>> project. People that actually wrote Cassandra code that can do a
> patch
> > >> and
> > >>> push it into prod, and get it submitted upstream when they fix
> > something.
> > >>> Yes, a lot of this stuff requires production usage to shake out the
> > bugs,
> > >>> that’s fine, but we shouldn’t lie to people and say “feature X is
> > ready”
> > >>> when it’s not. That’s a great way to get a reputation as “unstable”
> or
> > >>> “not fit for production."
> > >>>
> > >>> Jon
> > >>>
> > >>>
> > >>>> On Oct 2, 2017, at 11:54 AM, DuyHai Doan 
> > wrote:
> > >>>>
> > >>>> "I would (in a patch release) disable MV CREATE statements, and emit
> > >>>> warnings for ALTER statements and on schema load if they’re not
> > >>> explicitly
> > >>>> enabled"
> > >>>>
> > >>>> --> I find this pretty extreme. Now we have an existing feature
> > sitting
> > >>>> there in the base code but forbidden from version xxx onward.
> > >>>>
> > >>>> Since when do we start removing feature in a patch release ?
> > >> (forbidding
> > >>> to
> > >>>> create new MV == removing the feature, defacto)
> > >>>>
> > >>>> Even the Thrift protocol has gone through a long process of
> > deprecation
> > >>> and
> > >>>> will be removed on 4.0
> > >>>>
> > >>>> And if we start opening the Pandora box like this, what's next ?
> > >>> Forbidding
> > >>>> to create SASI index too ? Removing Vnodes ?
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <
> > >>> jeremiah.jor...@gmail.com
> > >>>>> wrote:
> > >>>>
> > >>>>>> Only emitting a warning really reduces visibility where we need
> it:
> > >> in
> > >>>>> the development process.
> > >>>>>
> > >>>>> How does emitting a native protocol warning reduce visibility
> during
> > >> the
> > >>>>> development process? If you run CREATE MV and cqlsh then prints
> out a
> > >>>>> giant warning statement about how it is an experimental feature I
> > >> think
> > >>>>> that is pretty visible during development?
> > >>>>>
> > >>>>> I guess I can see just blocking new ones without a flag set, but we
> > >> need
> > >>>>> to be careful here. We need to make sure we don’t cause a problem
> for
> > >>>>> someone that is using them currently, even with all the edge cases
> > >>> issues
> > >>>>> they have now.
> > >>>>>
> > >>>>> -Jeremiah
> > >>>>>
> > >>>>>
> > >>>>>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston  >
> > >>>>> wrote:
> > >>>>>>
> > >>>>>> Yeah, I'm not proposing that we disable MVs in existing clusters.
> > >>>>>>
> > >>>>>>
> > >>>>>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (
> > >>> alek...@apple.com)
> > >>>>> wrote:
> > >>>>>>
> > >>>>>> The idea is to check the flag in CreateViewStatement, so creation
> of
> > >>> new
> > >>>>> MVs doesn’t succeed without that flag flipped.
> > >>>>>>
> > >>>>>> Obviously, just disabling existing MVs working in a minor would be
> > >>> silly.
> > >>>>>>
> > >>>>>> As for the warning - yes, that should also be emitted.
> > >> Unconditionally.
> > >>>>>>
> > >>>>>> —
> > >>>>>> AY
> > >>>>>>
> > >>>>>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (
> > >>>>> jeremiah.jor...@gmail.com) wrote:
> > >>>>>>
> > >>>>>> These things are live on clusters right now, and I would not want
> > >>>>> someone to upgrade their cluster to a new *patch* release and
> > suddenly
> > >>>>> something that may have been working for them now does not
> function.
> > >>>>> Anyway, we need to be careful about how this gets put into practice
> > if
> > >>> we
> > >>>>> are going to do it retroactively.
> > >>>>>
> > >>>>>
> > >>>>> 
> > -
> > >>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >>>>>
> > >>>>>
> > >>>
> > >>>
> > >>> -
> > >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >>>
> > >>>
> > >>
> >
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Proposal to retroactively mark materialized views experimental

2017-10-03 Thread Ben Bromhead
Lots of hard work has gone into MVs, and I don't think this proposal is a
commentary or reflection on that. What it is about is signalling to users
that this feature has more edge cases and caveats than other tried and true
features (as all new features do).

MVs are still a feature in a "stable" release, and if they solve the end
user's problem while the user is made more aware of the edge cases by an
explicit opt-in, I think that would be quite beneficial. However, this
argument is more about user behaviour and guiding first-time adopters so
they have a better first-time experience, which is a more nebulous concept
with many approaches.

The other side of the proposal touches on the idea of feature flags that
operators can enable and disable depending on their organisational
requirements and risk appetite. This is strongly related to the first point
about guiding user behaviour, but it allows an organisation or operator to
make that decision independently of their own end users.

Whilst personally I would advocate for off-by-default for experimental /
dangerous features (even retroactively, as suggested in the proposal), I do
see the other side of the argument, and we do need to give the quality
control processes in place a chance to bear fruit. I think the compromise
suggested by Aleksey is fair.

+1 to either A) or B)
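As a sketch of what the opt-in gate could look like (the flag name is
hypothetical), checked only at view creation so existing MVs keep working:

    # Refuse new CREATE MATERIALIZED VIEW statements unless the operator
    # has flipped the yaml flag; existing views are untouched, so changing
    # the default in a minor release doesn't break running clusters.
    def check_create_view_allowed(config):
        if not config.get("enable_materialized_views", False):
            raise RuntimeError(
                "Materialized views are experimental; set "
                "enable_materialized_views: true in cassandra.yaml "
                "before creating new views")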

On Tue, 3 Oct 2017 at 09:29 Blake Eggleston  wrote:

> The remaining issues are:
>
> * There's no way to determine if a view is out of sync with the base table.
> * If you do determine that a view is out of sync, the only way to fix it
> is to drop and rebuild the view.
> * There are liveness issues with updates being reflected in the view.
>
> On October 3, 2017 at 9:00:32 AM, Sylvain Lebresne (sylv...@datastax.com)
> wrote:
>
> On Tue, Oct 3, 2017 at 5:54 PM, Aleksey Yeshchenko 
> wrote:
> > There are a couple compromise options here:
> >
> > a) Introduce the flag (enable_experimental_features, or maybe one per
> experimental feature), set it to ‘false’ in the yaml, but have the default
> be ‘true’. So that if you are upgrading from a previous minor to the next
> without updating the yaml, you notice nothing.
> >
> > b) Introduce the flag in the minor, and set it to ‘true’ in the yaml in
> 3.0 and 3.11, but to ‘false’ in 4.0. So the operators and in general people
> who know better can still disable it with one flip, but nobody would be
> affected by it in a minor otherwise.
> >
> > B might be more correct, and I’m okay with it
>
> Does feel more correct to me as well
>
> > although I do feel that we are behaving irresponsibly as developers by
> allowing MV creation by default in their current state
>
> You're giving little credit to the hard work that people have put into
> getting MV in a usable state. To quote Kurt's email:
>
> > And finally, back onto the original topic. I'm not convinced that MV's
> need
> > this treatment now. Zhao and Paulo (and others+reviewers) have made
> quite a
> > lot of fixes, granted there are still some outstanding bugs but the
> > majority of bad ones have been fixed in 3.11.1 and 3.0.15, the remaining
> > bugs mostly only affect views with a poor data model. Plus we've already
> > required the known broken components require a flag to be turned on.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Roadmap for 4.0

2018-03-31 Thread Ben Bromhead
o aim for a Q3/4
>> 2018 release, and as we go we just review the outstanding improvements and
>> decide whether it's worth pushing it back or if we've got enough to
>> release. I suppose keep this time frame in mind when choosing your tickets.
>>
>> We can aim for an earlier date (midyear?) but I figure the
>> testing/validation/bugfixing period prior to release might drag on a bit so
>> being a bit conservative here.
>> The main goal would be to not let list 1 grow unless we're well ahead,
>> and only cull from it if we're heavily over-committed or we decide the
>> improvement can wait. I assume this all sounds like common sense but
>> figured it's better to spell it out now.
>>
>>
>> NEXT STEPS
>> After 2 weeks/whenever the discussion dies off I'll consolidate all the
>> tickets, relevant comments and follow up with a summary, where we can
>> discuss/nitpick issues and come up with a final list to go ahead with.
>>
>> On a side note, in conjunction with this effort we'll obviously have to
>> do something about validation and testing. I'll keep that out of this email
>> for now, but there will be a follow up so that those of us willing to help
>> validate/test trunk can avoid duplicating effort.
>>
>> REVIEW
>> This is the list of "huge/breaking" tickets that got mentioned in the
>> last roadmap discussion and their statuses. This is not terribly important
>> but just so we can keep in mind what we previously talked about. I think we
>> leave it up to the relevant contributors to decide whether they want to get
>> the still open tickets into 4.0.
>>
>> CASSANDRA-9425 Immutable node-local schema
>> <https://issues.apache.org/jira/browse/CASSANDRA-9425> - Committed
>> CASSANDRA-10699 Strongly consistent schema alterations
>> <https://issues.apache.org/jira/browse/CASSANDRA-10699> - Open, no
>> discussion in quite some time.
>> CASSANDRA-12229 NIO streaming
>> <https://issues.apache.org/jira/browse/CASSANDRA-12229> - Committed
>> CASSANDRA-8457 NIO messaging
>> <https://issues.apache.org/jira/browse/CASSANDRA-8457> - Committed
>> CASSANDRA-12345 Gossip 2.0
>> <https://issues.apache.org/jira/browse/CASSANDRA-12345> - Open, no sign
>> of any action.
>> CASSANDRA-9754 Make index info heap friendly for large CQL partitions
>> <https://issues.apache.org/jira/browse/CASSANDRA-9754> - In progress but
>> no update in a long time.
>> CASSANDRA-11559 enhanced node representation
>> <https://issues.apache.org/jira/browse/CASSANDRA-11559> - Open, no
>> change since early 2016.
>> CASSANDRA-6246 epaxos
>> <https://issues.apache.org/jira/browse/CASSANDRA-6246> - In progress but
>> no update since Feb 2017.
>> CASSANDRA-7544 storage port configurable per node
>> <https://issues.apache.org/jira/browse/CASSANDRA-7544> - Committed
>> CASSANDRA-5 remove thrift support
>> <https://issues.apache.org/jira/browse/CASSANDRA-5> - Committed
>> CASSANDRA-10857 dropping compact storage
>> <https://issues.apache.org/jira/browse/CASSANDRA-10857> - Committed
>>
>> To start us off, here are my lists:
>> 1.
>> CASSANDRA-8460 - Tiered/Cold storage for TWCS
>> <https://issues.apache.org/jira/browse/CASSANDRA-8460>
>> CASSANDRA-12783 - Batchlog redesign
>> <https://issues.apache.org/jira/browse/CASSANDRA-12783>
>> CASSANDRA-11559 - Enhance node representation
>> <https://issues.apache.org/jira/browse/CASSANDRA-11559>
>> CASSANDRA-12344 - Forward writes to replacement node with same
>> address <https://issues.apache.org/jira/browse/CASSANDRA-12344>
>> CASSANDRA-8119 - More expressive Consistency Levels
>> <https://issues.apache.org/jira/browse/CASSANDRA-8119>
>> CASSANDRA-14210 - Optimise SSTables upgrade task scheduling
>> <https://issues.apache.org/jira/browse/CASSANDRA-14210>
>> CASSANDRA-10540 - RangeAwareCompaction
>> <https://issues.apache.org/jira/browse/CASSANDRA-10540>
>>
>>
>> 2:
>> CASSANDRA-10726 - Read repair inserts should not be blocking
>> <https://issues.apache.org/jira/browse/CASSANDRA-10726>
>> CASSANDRA-9754 - Make index info heap friendly for large CQL partitions
>> <https://issues.apache.org/jira/browse/CASSANDRA-9754>
>> CASSANDRA-12294 - LDAP auth
>> <https://issues.apache.org/jira/browse/CASSANDRA-12294>
>> CASSANDRA-12151 - Audit logging
>> <https://issues.apache.org/jira/browse/CASSANDRA-12151>
>> CASSANDRA-10495 - Fix streaming with vnodes
>> <https://issues.apache.org/jira/browse/CASSANDRA-10495>
>>
>> Also, here's some handy JQL to start you off:
>> project = CASSANDRA AND fixVersion in (4.x, 4.0) AND issue in
>> watchedIssues() AND status != Resolved
>>
>>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: A Simple List of New Major Features Desired for Apache Cassandra Version 4.0

2018-03-31 Thread Ben Bromhead
Thank you, Kenneth, for listening to the PMC and for putting the discussion
on the correct list. I also appreciate your enthusiasm.

To address a few of your points from your previous emails in no particular
order:

   - There is already a compiled list of features slated for 4.0; it is
   simply a search using the following JQL on JIRA: `project =
   CASSANDRA AND fixVersion = 4.0 ORDER BY priority DESC, updated DESC`. By
   looking at this list, we can see fixes/patches/features that have been
   committed to trunk, as well as tickets that are still in progress but
   that someone at some point thought were likely to land in 4.0.
   - While I appreciate the desire to create a bucket list of any and all
   desired features for 4.0, the reality is that this is an open source
   project driven by individual contributors, and so what goes into 4.0 is
   largely up to those who do the work and put those features in.
   - Whilst it has been a long time since 3.0 was first released, the 3.x
   tick-tock experiment resulted in a large number of major releases in
   short succession; I think most folks are simply recovering from that
   push, and there is no longer a march to a vendor's drumbeat. Hence the
   slowdown in major feature releases and the focus on getting existing
   features more stable.
   - 3.11 was only released at the end of June 2017 and has had 2 point
   releases in the meantime.
   - Whilst not having a consistent major release schedule can be
   frustrating for end users and product marketers, not having a stable
   database is far more maddening.
   - Having said that, there is a good body of both large and small
   changes in "done/resolved", "patch ready", "awaiting feedback", etc.
   and slated for 4.0 which it would be good to get out the door. While
   this is a matter of opinion, I think it's about striking a nice
   balance: not so many changes that adoption becomes a larger risk, but
   enough time to work on some good things (e.g.
   https://issues.apache.org/jira/browse/CASSANDRA-12229) that it's
   actually worth working toward cutting a major release.
   - I'd respectfully disagree that we have a "basic collaboration
   challenge". This is primarily a community that communicates by gradually
   moving towards consensus (which takes time) and this thread is simply one
   of the many discussions and interactions that move towards consensus about
   4.0. If anything we are resource/people constrained, but that is true of
   all open source communities.

I've decided to respond on the previous thread, "Roadmap to 4.0"
(though just via the dev list, where the discussion should live) with my
suggestions, as I don't want to ignore Kurt's and Jeff's previous
contributions on this subject.

On Fri, Mar 30, 2018 at 6:49 PM Kenneth Brotman
 wrote:

> Just list any desired new major features for 4.0 that you want added.  I
> will maintain a compiled list for all to see.  Don't worry about any steps
> beyond this.  Don't make any judgements about, or any comments at all on,
> what others add.
>
> No judgments at this point.  This is a list of everyone's suggestions.  Add
> your suggestions for new major features you desire to be added for version
> 4.0 only. Keep it simple, not detailed yet.  That comes a few steps from
> now.  What we have is a basic collaboration challenge.  No problem.
>
> Kenneth Brotman
>
>
>
>
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Roadmap for 4.0

2018-04-03 Thread Ben Bromhead
+1

Even though I suggested clearing blockers, I'm equally happy with a
time-boxed event to draw the line in the sand, as long as it's something
clear to work towards with appropriate commitment from folks.

On Tue, Apr 3, 2018 at 8:10 AM Sylvain Lebresne 
wrote:

> For what it's worth (and based on the project's experience), I think the
> strategy of "let's agree on a list of tickets everyone would love to get
> in before we freeze 4.0" doesn't work very well (it's largely useless,
> except for making us feel good about not releasing anything). Those lists
> always end up being too big, especially given we have no control over
> people's ability to contribute (some stuff will always lag for a very
> long time, even when it sounds really cool on paper).
>
> I'm also a bit sad that we seem to be getting back to our old demons of
> trying to shove as much as we possibly can into the next major, as if
> having a feature miss it means it will never happen. The 4.0 changelog is
> big already and we haven't made a release with new features in almost a
> year now, so I personally think we should start being a bit more
> aggressive with it and learn to get comfortable letting features slip if
> they are not ready.
>
> My concrete proposal would be to declare a feature freeze for 4.0 in 2
> months, so say June 1st. That leaves some time for finishing features
> that are in progress, but not too much time to get derailed. And let's be
> strict on that freeze. After that, we'll see how quickly we can get
> things to stabilize, but I'd suggest aiming for an alpha 3-4 weeks after
> that.
>
> Of course, we should probably (re-(re-(re-)))start a discussion on release
> "strategy" in parallel because it doesn't seem we have one right now, but
> that's imo a discussion we should keep separate.
>
> --
> Sylvain
>
>
> On Mon, Apr 2, 2018 at 4:54 PM DuyHai Doan  wrote:
>
> > My wish list:
> >
> > * Add support for arithmetic operators (CASSANDRA-11935)
> > * Allow IN restrictions on column families with collections
> > (CASSANDRA-12654)
> > * Add support for + and - operations on dates (CASSANDRA-11936)
> > * Add the currentTimestamp, currentDate, currentTime and currentTimeUUID
> > functions (CASSANDRA-13132)
> > * Allow selecting Map values and Set elements (CASSANDRA-7396)
> >
> > Those are mostly useful for timeseries data models and I guess have no
> > significant impact on the internals and operations, so the risk of
> > regression is low
> >
> > On Mon, Apr 2, 2018 at 4:33 PM, Jeff Jirsa  wrote:
> >
> > > 9608 (java9)
> > >
> > > --
> > > Jeff Jirsa
> > >
> > >
> > > > On Apr 2, 2018, at 3:45 AM, Jason Brown 
> wrote:
> > > >
> > > > The only additional tickets I'd like to mention are:
> > > >
> > > > https://issues.apache.org/jira/browse/CASSANDRA-13971 - Automatic
> > > > certificate management using Vault
> > > > - Stefan's Vault integration work. A sub-ticket, CASSANDRA-14102,
> > > > addresses encryption at-rest and subsumes CASSANDRA-9633 (SSTable
> > > > encryption) - which I doubt I would be able to get to any time this
> > > > year. It would definitely be nice to have a clarified
> > > > encryption/security story for 4.0.
> > > >
> > > > https://issues.apache.org/jira/browse/CASSANDRA-11990 - Address rows
> > > rather
> > > > than partitions in SASI
> > > > - a nice update for SASI, but not critical.
> > > >
> > > > -Jason
> > > >
> > > >> On Sat, Mar 31, 2018 at 6:53 PM, Ben Bromhead 
> > > wrote:
> > > >>
> > > >> Apologies all, I didn't realize I was responding to this discussion
> > > only on
> > > >> the @user list. One of the perils of responding to a thread that is
> on
> > > both
> > > >> user and dev...
> > > >>
> > > >> For context, I have included my response to Kurt's previous
> discussion
> > > on
> > > >> this topic as it only ended up on the user list.
> > > >>
> > > >> *After some further discussions with folks offline, I'd like to
> revive
> > > this
> > > >> discussion. *
> > > >>
> > > >> *As Kurt mentioned, to keep it simple, if we can simply build
> > > >> consensus around what is in for 4.

Re: Repair scheduling tools

2018-04-04 Thread Ben Bromhead
… wrote:
> >>>>>>> I just want to say I think it would be great for our users if we
> >>>>>>> moved repair scheduling into Cassandra itself. The team here at
> >>>>>>> Netflix has opened the ticket
> >>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-14346> and have
> >>>>>>> written a detailed design document
> >>>>>>> <https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9GbFSEyGzEtM/edit#heading=h.iasguic42ger>
> >>>>>>> that includes problem discussion and prior art if anyone wants to
> >>>>>>> contribute to that. We tried to fairly discuss existing solutions,
> >>>>>>> what their drawbacks are, and a proposed solution.
> >>>>>>>
> >>>>>>> If we were to put this as part of the main Cassandra daemon, I
> >>>>>>> think it should probably be marked experimental and of course be
> >>>>>>> something that users opt into (table by table or cluster by
> >>>>>>> cluster) with the understanding that it might not fully work out
> >>>>>>> of the box the first time we ship it. We have to be willing to
> >>>>>>> take risks but we also have to be honest with our users. It may
> >>>>>>> help build confidence if a few major deployments use it (such as
> >>>>>>> Netflix) and we are happy of course to provide that QA as best we
> >>>>>>> can.
> >>>>>>>
> >>>>>>> -Joey
> >>>>>>>
> >>>>>>> On Tue, Apr 3, 2018 at 10:48 AM, Blake Eggleston wrote:
> >>>>>>>
> >>>>>>>> Hi dev@,
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> The question of the best way to schedule repairs came up on
> >>>>>>>> CASSANDRA-14346, and I thought it would be good to bring up the
> >>>>>>>> idea of an external tool on the dev list.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Cassandra lacks any sort of tools for automating routine tasks
> >>>>>>>> that are required for running clusters, specifically repair.
> >>>>>>>> Regular repair is a must for most clusters, like compaction. This
> >>>>>>>> means that, especially as far as eventual consistency is
> >>>>>>>> concerned, Cassandra isn’t totally functional out of the box.
> >>>>>>>> Operators either need to find a 3rd-party solution or implement
> >>>>>>>> one themselves. Adding this to Cassandra would make it easier to
> >>>>>>>> use.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Is this something we should be doing? If so, what should it look
> >>>>>>>> like?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Personally, I feel like this is a pretty big gap in the project
> >>>>>>>> and would like to see an out-of-process tool offered. Ideally,
> >>>>>>>> Cassandra would just take care of itself, but writing a
> >>>>>>>> distributed repair scheduler that you trust to run in production
> >>>>>>>> is a lot harder than writing a single-process management
> >>>>>>>> application that can fail over.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Any thoughts on this?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Blake
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Thank you & Best Regards,
> >>> --Simon (Qingcun) Zhou
> >>>
>
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Roadmap for 4.0

2018-04-04 Thread Ben Bromhead
> >> …freeze and KISS, but I feel that this really needs a bit more
> >> thought before we just jump in and set another precedent for future
> >> releases. IMO the Cassandra project has had a seriously bad track
> >> record of releasing major versions in the past, and we should probably
> >> work at resolving that properly, rather than just continuing the
> >> current "let's just try something new every time without really
> >> thinking about it".
> >>
> >> Some points:
> >>
> >>   1. This strategy means that we don't care about what improvements
> >>   actually make it into any given major version. This means that we
> >>   will have major releases with nothing/very little desirable for
> >>   users, and thus little reason to upgrade other than to stay on a
> >>   supported version (from experience this isn't terribly important to
> >>   users of a database). I think this inevitably leads to supporting
> >>   more versions than necessary, and in general a pretty poor
> >>   experience for users as we spend more time fighting bugs in
> >>   production rather than before we do a release (purely because of
> >>   increased frequency of releases).
> >>   2. We'll always be driven by feature deadlines, which for the most
> >>   part is fine, as long as we handle verification/quality
> >>   assurance/release candidates appropriately. The main problem here
> >>   though is that we don't really know what's going to be in a certain
> >>   release until we hit the freeze, and what's in it may not really
> >>   make sense at that point in time.
> >>   3. We'll pump out major versions fairly regularly and end up with
> >>   even more users that are on EOL versions with complex upgrade paths
> >>   to get to a supported version or a version with a feature they need
> >>   (think all those people still out there on 1.2).
> >>   4. This strategy has the positive effect of allowing developers to
> >>   see their changes in production faster, but OTOH if no one really
> >>   uses the new versions this doesn't really happen anyway.
> >>
> >> I'd also note that, if people hadn't noticed, users tend to be pretty
> >> reluctant to upgrade their databases (hello everyone still running
> >> 2.1). This tends to be the nature of a database to some extent (if it
> >> works on version x, why upgrade to y?). IMO it would make more sense
> >> to support fewer versions but for a longer period of time. I'm sure
> >> most users would appreciate 2 years of bug fixes for only 2 branches,
> >> with a new major approximately every 2 years. Databases don't move
> >> that fast; there's not much desirable in a feature release every year
> >> for users.
> >>
> >> sidenote: 3.10 was released in January 2017, and while the changes
> >> list for 4.0 is getting quite large, there's not much there that's
> >> going to win over users. It's mostly refactorings and improvements
> >> that affect developers more so than users. I'm really interested in
> >> why people believe there is an actual benefit in pumping out feature
> >> releases on a yearly basis. Who exactly does that benefit? From what I
> >> know, the majority of "major" users are still backporting stuff they
> >> want to 2.1, so why rush releasing more versions?
> >>
> >>> Regardless of whatever plan we do end up following, it would still be
> >>> valuable to have a list of tickets for 4.0, which is the overall goal
> >>> of this email - so let's not get too worked up on the details just
> >>> yet (save that for after I summarise/follow up).
> >>>
> >> lol. dreaming.
> >>
> >> On 4 April 2018 at 10:38, Aleksey Yeshchenko  wrote:
> >>
> >>> 3.0 will be the most popular release for probably at least another
> >>> couple years - I see no good reason to cap its support window. We
> >>> aren’t Oracle.
> >>>
> >>> —
> >>> AY
> >>>
> >>> On 3 April 2018 at 22:29:29, Michael Shuler (mich...@pbandjelly.org)
> >>> wrote:
> >>>
> >>> Apache Cassandra 3.0 is supported until 6 months after the 4.0
> >>> release (date TBD).
> >>>
> >>
>
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Roadmap for 4.0

2018-04-04 Thread Ben Bromhead
+1

On Wed, Apr 4, 2018 at 8:50 PM Michael Shuler 
wrote:

> On 04/04/2018 07:06 PM, Nate McCall wrote:
> >
> > It feels to me like we are coalescing on two points:
> > 1. June 1 as a freeze for alpha
> > 2. "Stable" is the new "Exciting" (and the testing and dogfooding
> > implied by such before a GA)
> >
> > How do folks feel about the above points?
>
> +1
> +1
>
> :)
> Michael
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Roadmap for 4.0

2018-04-09 Thread Ben Bromhead
>
> For those wanting to delay, are we just dancing around inclusion of
> some pet features? This is fine, I just think we need to communicate
> what we are after if so.
>

+1. Some solid examples of tickets that won't make it under the proposed
timeline, plus a proposed alternative, would help.

Otherwise if no one chimes in I would propose sticking with June 1.




> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Roadmap for 4.0

2018-04-11 Thread Ben Bromhead
> …4.0 soon, which doesn't exclude a "short" cycle for the following major
> (where my definition of short here is something like 6-8 months), and
> I'm happy to decide to make 4.0 a non-mandatory upgrade to whatever
> comes next, so that folks who prefer upgrading rarely can simply skip it
> and go to the next one. Likely nobody will die if we wait more though,
> and it's clear it will make a few people here more happy if we do, but I
> believe the project as a whole will be a bit worse off, that's all.
>
> --
> Sylvain
>
>
> [1]: I'll note that I don't deny upgrading is a huge deal for some
> users, but let's not skew arguments too much based on any one user's
> interest. For many users, upgrading even every year to get improvements
> is still considered a good deal, and that's not counting new users, for
> whom it's super frustrating to miss out on improvements because we
> release a major only every 2+ years.
> [2]: I'll be clear: I will simply not buy anyone's argument that "we'll
> do so much better testing this time" at face value. Not anymore. If you
> want to use that argument to sell having bigger releases, then prove it
> first. Let's do a reasonably sized 4.0 and 4.1/5.0 and prove that our
> testing/stability story is ironclad now, and then for 4.2/6.0 I'll be
> willing to agree that making bigger releases may not impact stability
> too much.
> [3]: A conservative estimate: if we do care about stable releases, as we
> all seem to, then even if we were to freeze June 1, we will almost
> surely not release before October/November, which will be ~1.3 years
> since the last major release (again, that's the conservative estimate).
> If we push a few months to get some big complex feature in, not only
> does this push the freeze back by those few months, it will also require
> more testing, so we're looking at 2+ years, with a possibly large '+'.
>
>
>
>
> >
> > Beyond that, I still don't like June 1. Validating releases is hard.
> > It sounds easy to drop a 4.1 and ask people to validate again, but
> > it's a hell of a lot harder than it sounds. I'm not saying I'm a hard
> > -1, but I really think it's too soon. 50-ish days is too short to draw
> > a line in the sand, especially as people balance work obligations with
> > Cassandra feature development.
> >
> >
> >
> >
> > On Tue, Apr 10, 2018 at 3:18 PM, Nate McCall 
> wrote:
> >
> > > A lot of good points and everyone's input is really appreciated.
> > >
> > > So it sounds like we are building consensus towards June 1 for 4.0
> > > branch point/feature freeze and the goal is stability. (No one has
> > > come with a hard NO anyway).
> > >
> > > I want to reiterate Sylvain's point that we can do whatever we want
> > > in terms of dropping a new feature in 4.1/5.0 (or whatev.) whenever
> > > we want.
> > >
> > > In thinking about this, what is stopping us from branching 4.0 a lot
> > > sooner? Like now-ish? This will let folks start hacking on trunk with
> > > new stuff, and things we've gotten close on can still go in 4.0
> > > (Virtual tables). I guess I'm asking here if we want to disambiguate
> > > "feature freeze" from "branch point"? I feel like this makes sense.
> > >
> > >
> > >
> >
>
>
>
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Roadmap for 4.0

2018-04-12 Thread Ben Bromhead
We (Instaclustr) are also happy to get started testing. Including (internal
to Instaclustr) production workloads.

On Thu, Apr 12, 2018 at 3:45 PM Nate McCall  wrote:

> To be clear, it's more about who is willing to commit to testing should
> we go this route.
>
> On Fri, Apr 13, 2018, 7:41 AM Nate McCall  wrote:
>
> > Ok. So who's willing to test 4.0 on June 2nd? Let's start a sign up.
> >
> > We (tlp) will put some resources on this by going through some canned
> > scenarios we have internally. We aren't in a position to test data
> > validity (yet) but we can do a lot around cluster behavior.
> >
> > Who else has specific stuff they are willing to do? Even if it's just
> > tee'ing prod traffic, that would be hugely valuable.
> >
> > On Fri, Apr 13, 2018, 6:15 AM Jeff Jirsa  wrote:
> >
> >> On Thu, Apr 12, 2018 at 9:41 AM, Jonathan Haddad 
> >> wrote:
> >>
> >> > It sounds to me (please correct me if I'm wrong) like Jeff is arguing
> >> that
> >> > releasing 4.0 in 2 months isn't worth the effort of evaluating it,
> >> because
> >> > it's a big task and there's not enough stuff in 4.0 to make it
> >> worthwhile.
> >> >
> >> >
> >> More like "not enough stuff in 4.0 to make it worthwhile for the
> >> people I personally know to be willing and able to find the weird
> >> bugs".
> >>
> >>
> >> > If that is the case, I'm not quite sure how increasing the surface
> >> > area of changed code which needs to be vetted is going to make the
> >> > process any easier.
> >>
> >>
> >> It changes the interest level of at least some of the people able to
> >> properly test it from "not willing" to "willing".
> >>
> >> Totally possible that there exist people who are willing and able to
> >> find and fix those bugs, who just haven't committed to it in this
> >> thread. That's probably why Sankalp keeps asking who's actually
> >> willing to do the testing on June 2 - if nobody's going to commit to
> >> doing real testing on June 2, all we're doing is adding inconvenience
> >> to those of us who'd be willing to do it later in the year.
> >>
> >
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Roadmap for 4.0

2018-04-12 Thread Ben Bromhead
I would also suggest if you can't commit to June 2 due to timing or feature
set. If you could provide the absolute minimum date / features that would
let you commit to testing, that would be useful.

On Thu, Apr 12, 2018 at 3:49 PM Ben Bromhead  wrote:

> We (Instaclustr) are also happy to get started testing. Including
> (internal to Instaclustr) production workloads.
>
> On Thu, Apr 12, 2018 at 3:45 PM Nate McCall  wrote:
>
>> To be clear, it's more about who is willing to commit to testing should
>> we go this route.
>>
>> On Fri, Apr 13, 2018, 7:41 AM Nate McCall  wrote:
>>
>> > Ok. So who's willing to test 4.0 on June 2nd? Let's start a sign up.
>> >
>> > We (tlp) will put some resources on this by going through some canned
>> > scenarios we have internally. We aren't in a position to test data
>> > validity (yet) but we can do a lot around cluster behavior.
>> >
>> > Who else has specific stuff they are willing to do? Even if it's just
>> > tee'ing prod traffic, that would be hugely valuable.
>> >
>> > On Fri, Apr 13, 2018, 6:15 AM Jeff Jirsa  wrote:
>> >
>> >> On Thu, Apr 12, 2018 at 9:41 AM, Jonathan Haddad 
>> >> wrote:
>> >>
>> >> > It sounds to me (please correct me if I'm wrong) like Jeff is arguing
>> >> that
>> >> > releasing 4.0 in 2 months isn't worth the effort of evaluating it,
>> >> because
>> >> > it's a big task and there's not enough stuff in 4.0 to make it
>> >> worthwhile.
>> >> >
>> >> >
>> >> More like "not enough stuff in 4.0 to make it worthwhile for the
>> >> people I personally know to be willing and able to find the weird
>> >> bugs".
>> >>
>> >>
>> >> > If that is the case, I'm not quite sure how increasing the surface
>> area
>> >> of
>> >> > changed code which needs to be vetted is going to make the process
>> any
>> >> > easier.
>> >>
>> >>
>> >> It changes the interest level of at least some of the people able to
>> >> properly test it from "not willing" to "willing".
>> >>
>> >> Totally possible that there exist people who are willing and able to
>> >> find and fix those bugs, who just haven't committed to it in this
>> >> thread. That's probably why Sankalp keeps asking who's actually
>> >> willing to do the testing on June 2 - if nobody's going to commit to
>> >> doing real testing on June 2, all we're doing is adding inconvenience
>> >> to those of us who'd be willing to do it later in the year.
>> >>
>> >
>>
> --
> Ben Bromhead
> CTO | Instaclustr <https://www.instaclustr.com/>
> +1 650 284 9692
> Reliability at Scale
> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Roadmap for 4.0

2018-04-12 Thread Ben Bromhead
While I would prefer earlier, if Sept 1 gets better buy-in and broader
commitment to testing, I'm super happy with that. As Nate said, having a
solid line to work towards is going to help massively.

On Thu, Apr 12, 2018 at 4:07 PM Nate McCall  wrote:

> > If we push it to Sept 1 freeze, I'll personally spend a lot of time
> testing.
> >
> > What can I do to help convince the Jun1 folks that Sept1 is acceptable?
>
> I can come around to that. At this point, I really just want us to
> have a date we can start talking to/planning around.
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Evolving the client protocol

2018-04-19 Thread Ben Bromhead
WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596). I'm not super
familiar with the ticket so there might be something I'm missing, but it
sounds like a potential approach.

This would give you a path forward at least for the short term.
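
For illustration, a minimal sketch of what this could look like from the
application side today, assuming a hypothetical node at 10.0.0.1 exposing
one native-protocol port per shard (9042 upwards); each address/port pair
is simply handed to the existing DataStax Java driver as its own contact
point. Whether a given driver version actually keeps same-address,
different-port hosts distinct is exactly what CASSANDRA-11596 is about, so
treat this as a sketch rather than a recipe:

import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class PortPerShardSketch
{
    public static void main(String[] args)
    {
        // Hypothetical: a 4-shard node exposing ports 9042..9045, one per shard.
        List<InetSocketAddress> shardPorts = new ArrayList<>();
        for (int shard = 0; shard < 4; shard++)
            shardPorts.add(new InetSocketAddress("10.0.0.1", 9042 + shard));

        // Each address/port pair appears to the driver as a discrete server,
        // so no protocol change is needed for this interim approach.
        Cluster cluster = Cluster.builder()
                                 .addContactPointsWithPorts(shardPorts)
                                 .build();
        Session session = cluster.connect();
        System.out.println(session.execute("SELECT release_version FROM system.local").one());
        cluster.close();
    }
}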


On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg  wrote:

> Hi,
>
> I think that updating the protocol spec in Cassandra puts the onus on the
> party changing the protocol specification to have an implementation of
> the spec in Cassandra as well as in the Java and Python drivers (those
> are both used in the Cassandra repo). Until it's implemented in Cassandra
> we haven't fully evaluated the specification change. There is no
> substitute for trying to make it work.
>
> There are also realities to consider as to what the maintainers of the
> drivers are willing to commit.
>
> RE #1,
>
> I am +1 on the fact that we shouldn't require an extra hop for range scans.
>
> In JIRA Jeremiah made the point that you can still do this from the client
> by breaking up the token ranges, but it's a leaky abstraction to have a
> paging interface that isn't a vanilla ResultSet interface. Serial vs.
> parallel is kind of orthogonal as the driver can do either.
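
(As an aside, here is a rough sketch of the client-side token-range
splitting Jeremiah describes, using the Java driver 3.x metadata API; the
table "ks.events" and partition key "id" are made up for illustration.
This is the leaky version that a protocol-level improvement would make
unnecessary.)

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Metadata;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.TokenRange;

public class TokenRangeScan
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        Metadata metadata = cluster.getMetadata();

        // Walk the ring one token range at a time instead of issuing a
        // single full-table scan.
        for (TokenRange range : metadata.getTokenRanges())
        {
            // unwrap() splits a range that wraps around the ring into
            // contiguous pieces, so the two-sided token() predicate is valid.
            for (TokenRange subRange : range.unwrap())
            {
                SimpleStatement stmt = new SimpleStatement(
                    "SELECT id FROM ks.events WHERE token(id) > ? AND token(id) <= ?",
                    subRange.getStart().getValue(), subRange.getEnd().getValue());
                for (Row row : session.execute(stmt))
                    System.out.println(row);
            }
        }
        cluster.close();
    }
}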
>
> I agree it looks like the current specification doesn't make what should
> be simple as simple as it could be for driver implementers.
>
> RE #2,
>
> +1 on this change assuming an implementation in Cassandra and the Java and
> Python drivers.
>
> RE #3,
>
> It's hard to be +1 on this because we don't benefit by boxing ourselves in
> by defining a spec we haven't implemented, tested, and decided we are
> satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
> what if Cassandra decides to go a different direction in some way?
>
> I don't think there is much discussion to be had without an example of
> the changes to the CQL specification to look at, but even then, if it
> looks risky I am not likely to be in favor of it.
>
> Regards,
> Ariel
>
> On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:
> >
> >
> > On 2018/04/19 07:19:27, kurt greaves  wrote:
> > > >
> > > > 1. The protocol change is developed using the Cassandra process in
> > > >    a JIRA ticket, culminating in a patch to
> > > >    doc/native_protocol*.spec when consensus is achieved.
> > >
> > > I don't think forking would be desirable (for anyone) so this seems
> > > the most reasonable to me. For 1 and 2 it certainly makes sense but
> > > can't say I know enough about sharding to comment on 3 - seems to me
> > > like it could be locking in a design before anyone truly knows what
> > > sharding in C* looks like. But hopefully I'm wrong and there are
> > > devs out there that have already thought that through.
> >
> > Thanks. That is our view and is great to hear.
> >
> > About our proposal number 3: In my view, good protocol designs are
> > future-proof and flexible. We certainly don't want to propose a design
> > that works just for Scylla, but one that would support reasonable
> > implementations regardless of what they may look like.
> >
> > >
> > > Do we have driver authors who wish to support both projects?
> > >
> > > Surely, but I imagine it would be a minority.
> > >
> >
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: JNA to activate cassandra row cache

2013-08-21 Thread Ben Bromhead
Under the hood, a process needs the right kernel capabilities in order to
pin memory.

Under Linux, a process needs the CAP_IPC_LOCK capability to call mlockall
(which is what C* uses); 99% of the time you don't have to worry about this
unless you run SELinux or are messing about with your limits.conf.

Under other OSes that use RBAC-style permissions, for example Solaris,
where capabilities are often specified per project, role, service, etc.,
the capability may need to be explicitly set.
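
To make that concrete, here is a minimal JNA sketch of the call involved
(an illustration only, not Cassandra's actual CLibrary class; the MCL_*
values are the Linux ones). mlockall() typically fails with EPERM when
CAP_IPC_LOCK is missing, or with ENOMEM when the RLIMIT_MEMLOCK limit from
limits.conf is too low:

import com.sun.jna.LastErrorException;
import com.sun.jna.Native;

public final class MemoryLock
{
    // Linux values for mlockall(2)'s flags; stated here as an assumption
    // rather than read from the system headers.
    private static final int MCL_CURRENT = 1; // lock pages currently mapped
    private static final int MCL_FUTURE  = 2; // lock pages mapped later

    static
    {
        Native.register("c"); // direct-map the native method below to libc
    }

    private static native int mlockall(int flags) throws LastErrorException;

    public static void main(String[] args)
    {
        try
        {
            mlockall(MCL_CURRENT | MCL_FUTURE);
            System.out.println("memory locked; the kernel won't swap this process");
        }
        catch (LastErrorException e)
        {
            // EPERM usually means CAP_IPC_LOCK is missing; ENOMEM usually
            // means RLIMIT_MEMLOCK (limits.conf) is too low.
            System.err.println("mlockall failed, errno=" + e.getErrorCode());
        }
    }
}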

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr 

On 22/08/2013, at 5:47 AM, Chris Burroughs  wrote:

> On 08/19/2013 02:49 PM, CROCKETT, LEONARD P wrote:
>> Must Cassandra 1.2.5 run as root for the JNA jar to effectively disable swapping?
> 
> It doesn't globally disable swap (which would require root); it tells the
> kernel "don't swap this block of memory".



Best avenue for reporting security issues

2013-12-17 Thread Ben Bromhead
Hi guys

We’ve come across a bug with potential security implications. In the spirit
of responsible disclosure, what's the best path for reporting it /
submitting patches without making the issue public until a fixed version of
Cassandra is released?

As a follow-up, I would propose that the Cassandra project have a
secur...@cassandra.apache.org mailing address, where sensitive issues can
be reported to the core dev team without being made public.

Regards

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359



Re: Best avenue for reporting security issues

2013-12-17 Thread Ben Bromhead
No worries, message sent

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359

On 18 Dec 2013, at 10:35 am, Aleksey Yeschenko  wrote:

> Hi Ben,
> 
> Send it to me, I'll handle it.
> 
> Thanks
> 
> 
> On Wed, Dec 18, 2013 at 2:30 AM, Ben Bromhead  wrote:
> 
>> Hi guys
>> 
>> We’ve come across a bug with potential security implications. In the
>> spirit of responsible disclosure, what's the best path for reporting it /
>> submitting patches without making the issue public until a fixed version
>> of Cassandra is released?
>> 
>> As a follow-up, I would propose that the Cassandra project have a
>> secur...@cassandra.apache.org mailing address, where sensitive issues can
>> be reported to the core dev team without being made public.
>> 
>> Regards
>> 
>> Ben Bromhead
>> Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359
>> 
>> 



Re: Multitanency in Cassandra

2014-08-30 Thread Ben Bromhead
Some thoughts I have had around multi-tenancy from a while back, outside of
the CF/keyspace limit issues… not sure how relevant they are nowadays.

Assuming that you have no control over your tenants, that they have direct
thrift/native access, and that they may be malicious (either intentionally
or not):

- Resource limits on the read / write path. For example,
https://issues.apache.org/jira/browse/CASSANDRA-6117 ensures Cassandra won't
fall over if it reads a whole bunch of tombstones. Not too sure how many
similar issues exist where a single read could consume a disproportionate
amount of resources.
- Operation limits per node, currently configurable via cassandra.yaml… but
you might want a more flexible solution. Not sure how this applies to large
batches etc.
- Sandboxing tenant triggers.
- Resource limits on expensive operations like CAS and CL=ALL.

We would be happy to work on some tickets around this as well, as our
in-progress multi-tenant solution just uses containers, namespaces, et al.
for isolation.

Cheers

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359

On 31/08/2014, at 10:06 AM, Jay Patel  wrote:

> Hi Folks,
> 
> Ideally, it would be awesome if multitenancy were a first-class citizen in
> Cassandra. But as of today, the easiest way to get multitenancy (on-disk
> data isolation, per-tenant recovery & backup, replication strategy) is by
> having one keyspace per tenant. However, it’s not recommended to go beyond
> 300 to 500 tables in one cluster today.
> 
> With this thread, I would like to find out the current blocking issues for
> supporting a high number of tables (10K/50K/?), and contribute the fixes.
> I'm also open to any ideas for making the keyspace itself tenant-aware and
> supporting multitenancy out of the box, but having replication strategy
> (NTS) per tenant & on-disk data isolation are the minimal features to
> have. Not sure, but supporting a high number of tables in a cluster may
> lead us to support multitenancy out of the box in the future.
> 
> As per my quick discussion with Jonathan & a few other folks, I think we
> already know the below issues:
> 
> 1 MB of heap per memtable
> Creating CFs can take a long time (Fixed - CASSANDRA-6977)
> Multiple flushes turn writes from sequential into random (should we worry
> if we use SSDs?)
> Unknowns!
> 
> Regarding '1 MB per memtable', CASSANDRA-5935 adds an option to allow
> disabling slab allocation to pack in more CFs, but at the cost of GC
> pains. Seems like Cassandra 2.1 off-heap memtables will be a better
> option. However, it looks like it also uses region-based memory allocation
> to avoid fragmentation. Does this mean no GC pain but a still-high RAM
> requirement (for 50K tables, we end up with 50GB)?
> 
>>> (pls. correct if this is not the right file I'm looking into)
> 
> public class NativeAllocator extends MemtableAllocator
> {
>     private static final Logger logger =
>         LoggerFactory.getLogger(NativeAllocator.class);
> 
>     private final static int REGION_SIZE = 1024 * 1024;
>     private final static int MAX_CLONED_SIZE = 128 * 1024; // bigger than this don't go in the region
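
As a quick back-of-envelope check on the arithmetic in this thread, here is
a sketch assuming one REGION_SIZE slab per memtable (per the excerpt above)
and ignoring everything else a table allocates:

public class RegionMath
{
    public static void main(String[] args)
    {
        final long regionSize = 1024 * 1024; // REGION_SIZE: 1 MiB per memtable
        for (long tables : new long[] { 500, 10_000, 50_000 })
        {
            long bytes = tables * regionSize;
            System.out.printf("%,6d tables -> at least %,d MiB (~%d GiB) of regions%n",
                              tables, bytes >> 20, bytes >> 30);
        }
    }
}

For 50K tables that lower bound is ~49 GiB, which matches the "50GB" figure
in the question above.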
> 
> I would like to know of any other known issues that I’ve not listed here,
> and/or any recommendations for multitenancy. Also, any thoughts on
> supporting an efficient off-heap allocator option for a high # of tables?
> 
> BTW, having 10K tables brings up many other issues around management,
> tooling, etc., but I'm less worried about those at this point.
> 
> Thanks,
> Jay



Re: [DISCUSS] Releases after 4.0

2021-03-29 Thread Ben Bromhead
+1 good sensible suggestion.

On Tue, Mar 30, 2021 at 7:37 AM Ekaterina Dimitrova 
wrote:

> I also like the latest suggestion, +1, thank you
>
> On Mon, 29 Mar 2021 at 14:16, Yifan Cai  wrote:
>
> > +1
> >
> > On Mon, Mar 29, 2021 at 8:42 AM J. D. Jordan 
> > wrote:
> >
> > > +1 that deprecation schedule seems reasonable and a good thing to move
> > to.
> > >
> > > > On Mar 29, 2021, at 10:23 AM, Benjamin Lerer 
> > wrote:
> > > >
> > > > The proposal sounds good to me too.
> > > >
> > > >> On Mon., March 29, 2021 at 16:48, Brandon Williams wrote:
> > > >>
> > > >>> On Mon, Mar 29, 2021 at 9:41 AM Joseph Lynch <
> joe.e.ly...@gmail.com>
> > > >>> wrote:
> > > >>> I like the idea of the 3-year support cycles, but I think since
> > > >>> 3.0/3.11/4.0 took so long to stabilize to a point folks could
> upgrade
> > > >>> to, we should reset the clock somewhat.
> > > >>
> > > >> I agree, the length of time to release 4.0 and the initialization
> of a
> > > >> new release cycle requires some special consideration for current
> > > >> releases.
> > > >>
> > > >>> 4.0: Fully supported until April 2023 and high severity bugs until
> > > >>> April 2024 (2 year full, 1 year bugfix)
> > > >>> 3.11: Fully supported until April 2022 and high severity bugs until
> > > >>> April 2023 (1 year full, 1 year bugfix).
> > > >>> 3.0: Supported for high severity correctness/performance bugs until
> > > >>> April 2022 (1 year bugfix)
> > > >>> 2.2+2.1: EOL immediately.
> > > >>>
> > > >>> Then going forward we could have this nice pattern when we cut the
> > > >>> yearly release:
> > > >>> Y(n-0): Support for 3 years from now (2 full, 1 bugfix)
> > > >>> Y(n-1): Fully supported for 1 more year and supported for high
> > > >>> severity correctness/perf bugs 1 year after that (1 full, 1 bugfix)
> > > >>> Y(n-2): Supported for high severity correctness/bugs for 1 more
> year
> > (1
> > > >> bugfix)
> > > >>
> > > >> This sounds excellent to me, +1.
> > > >>
> > > >>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: [VOTE] Release Apache Cassandra 4.0-rc1

2021-03-30 Thread Ben Bromhead
-1 (non-binding)

I hate that I need to voice this opinion, as I think this is a wonderful
milestone for the project to reach, and from a technical perspective it is
truly ready.

However, given the issues listed by Mick that need to be resolved, I don't
think this release truly qualifies as a release candidate (we won't pick
this sha/release as GA... so how could it be a real release candidate?).

Please correct me if I'm wrong here, but an RC is something that could be a
GA release, and due to the outstanding issues we don't meet that criterion,
irrespective of how those issues get resolved.

Even though these issues are not technical and don't represent the
readiness of 4.0 from an implementation perspective, we owe it to the
broader community to resolve these (as we all say community > code).

On Wed, Mar 31, 2021 at 6:53 AM Blake Eggleston
 wrote:

> +1
>
> > On Mar 29, 2021, at 6:05 AM, Mick Semb Wever  wrote:
> >
> > Proposing the test build of Cassandra 4.0-rc1 for release.
> >
> > sha1: 2facbc97ea215faef1735d9a3d5697162f61bc8c
> > Git:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-rc1-tentative
> > Maven Artifacts:
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1234/org/apache/cassandra/cassandra-all/4.0-rc1/
> >
> > The Source and Build Artifacts, and the Debian and RPM packages and
> > repositories, are available here:
> > https://dist.apache.org/repos/dist/dev/cassandra/4.0-rc1/
> >
> > The vote will be open for 72 hours (longer if needed). Everyone who has
> > tested the build is invited to vote. Votes by PMC members are considered
> > binding. A vote passes if there are at least three binding +1s and no
> > -1's.
> >
> > Known issues with this release, that are planned to be fixed in 4.0-rc2,
> > are:
> > - four files were missing copyright headers,
> > - LICENSE and NOTICE contain additional unneeded information,
> > - jar files under lib/ in the source artefact.
> >
> > These issues are actively being worked on, along with our expectation
> > that the ASF makes the policy around them more explicit so it is clear
> > exactly what is required of us.
> >
> >
> > [1]: CHANGES.txt:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-rc1-tentative
> > [2]: NEWS.txt:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-rc1-tentative
>
>

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: [VOTE] Release Apache Cassandra 4.0-rc1

2021-03-30 Thread Ben Bromhead
It was more that I felt bad about raining on the parade than that I was
worried about other reactions :)

It's good to hear that there is some confidence from folks that one of the
issues is potentially resolvable outside of this vote.

If we could tidy up the others quickly (I'm happy to submit a PR for
anything that is outstanding) I'm ready to jump on board the train!

On Wed, Mar 31, 2021 at 9:51 AM Mick Semb Wever  wrote:

> > I hate that I need to voice this opinion, …
>
> I think it is wonderful that you do! There needs to be more of this,
> without fear :-)
>
>
>
> > Please correct me if I'm wrong here, but an RC is something that could
> > be a GA release, and due to the outstanding issues we don't meet that
> > criterion, irrespective of how those issues get resolved.
> >
>
>
> I do not believe we relabel RC releases into GA releases. A new release is
> cut and voted on for the GA. Interestingly, the main issue at hand could be
> addressed in the release scripts, so the exact same SHA of 4.0-rc1 could in
> theory be cut into a GA release. (Though it looks like CASSANDRA-16391 is
> feasible.) And we are not looking at impacting QA or touching any
> compatibility aspect of the code.
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: [VOTE] Release Apache Cassandra 4.0-rc1

2021-03-30 Thread Ben Bromhead
https://issues.apache.org/jira/browse/CASSANDRA-16550 :)

On Wed, Mar 31, 2021 at 10:08 AM Mick Semb Wever  wrote:

> >
> > If we could tidy up the others quickly (I'm happy to submit a PR for
> > anything that is outstanding) I'm ready to jump on board the train!
> >
>
>
> The LICENSE and NOTICE issues remain unassigned, if you are keen!
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: [VOTE] Release Apache Cassandra 4.0-rc1 (take2)

2021-04-22 Thread Ben Bromhead
+1 (nb)

On Fri, Apr 23, 2021 at 2:37 AM Marcus Eriksson  wrote:

> +1
>
> On Wed, Apr 21, 2021 at 08:51:23PM +0200, Mick Semb Wever wrote:
> > Proposing the test build of Cassandra 4.0-rc1 for release.
> >
> > sha1: 3282f5ecf187ecbb56b8d73ab9a9110c010898b0
> > Git:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-rc1-tentative
> > Maven Artifacts:
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1235/org/apache/cassandra/cassandra-all/4.0-rc1/
> >
> > The Source and Build Artifacts, and the Debian and RPM packages and
> > repositories, are available here:
> > https://dist.apache.org/repos/dist/dev/cassandra/4.0-rc1/
> >
> > The vote will be open for 72 hours (longer if needed). Everyone who
> > has tested the build is invited to vote. Votes by PMC members are
> > considered binding. A vote passes if there are at least three binding
> > +1s and no -1's.
> >
> > [1]: CHANGES.txt:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-rc1-tentative
> > [2]: NEWS.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-rc1-tentative
> >

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: Welcome Stefan Miklosovic as Cassandra committer

2021-05-03 Thread Ben Bromhead
Congrats mate!

On Tue, May 4, 2021 at 4:20 AM Scott Andreas  wrote:

> Congratulations, Štefan!
>
> 
> From: David Capwell 
> Sent: Monday, May 3, 2021 10:53 AM
> To: dev@cassandra.apache.org
> Subject: Re: Welcome Stefan Miklosovic as Cassandra committer
>
> Congrats!
>
> > On May 3, 2021, at 9:47 AM, Ekaterina Dimitrova 
> wrote:
> >
> > Congrat Stefan! Well done!!
> >
> > On Mon, 3 May 2021 at 11:49, J. D. Jordan 
> wrote:
> >
> >> Well deserved!  Congrats Stefan.
> >>
> >>> On May 3, 2021, at 10:46 AM, Sumanth Pasupuleti <
> >> sumanth.pasupuleti...@gmail.com> wrote:
> >>>
> >>> Congratulations Stefan!!
> >>>
> >>>> On Mon, May 3, 2021 at 8:41 AM Brandon Williams 
> >> wrote:
> >>>>
> >>>> Congratulations, Stefan!
> >>>>
> >>>>> On Mon, May 3, 2021 at 10:38 AM Benjamin Lerer 
> >> wrote:
> >>>>>
> >>>>> The PMC's members are pleased to announce that Stefan Miklosovic has
> >>>>> accepted the invitation to become committer last Wednesday.
> >>>>>
> >>>>> Thanks a lot, Stefan,  for all your contributions!
> >>>>>
> >>>>> Congratulations and welcome
> >>>>>
> >>>>> The Apache Cassandra PMC members
> >>>>
> --

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: Welcome Caleb Rackliffe as Cassandra committer

2021-05-16 Thread Ben Bromhead
Congrats!

On Sat, May 15, 2021 at 1:11 PM Jordan West  wrote:

> Congrats Caleb!
>
> Jordan
>
> On Fri, May 14, 2021 at 10:43 AM Scott Andreas 
> wrote:
>
> > Congratulations, Caleb!
> >
> > — Scott
> >
> > > On May 14, 2021, at 10:29 AM, Andrés de la Peña <
> > a.penya.gar...@gmail.com> wrote:
> > >
> > > Congrats Caleb, well deserved! :)
> > >
> > >> On Fri, 14 May 2021 at 17:53, Paulo Motta 
> > wrote:
> > >>
> > >> Awesome, congratulations Caleb!! :)
> > >>
> > >>> On Fri., May 14, 2021 at 13:16, Patrick McFadin
> > >>> <pmcfa...@gmail.com> wrote:
> > >>
> > >>> YES! Love seeing this. A very much deserved congratulations Caleb!
> > >>>
> > >>> Patrick
> > >>>
> > >>> On Fri, May 14, 2021 at 9:12 AM David Capwell
> >  > >>>
> > >>> wrote:
> > >>>
> > >>>> Congrats!
> > >>>>
> > >>>>> On May 14, 2021, at 8:52 AM, Charles Cao 
> > >> wrote:
> > >>>>>
> > >>>>> Congrats Caleb! Well deserved :)
> > >>>>>
> > >>>>> ~Charles
> > >>>>>
> > >>>>>> On May 14, 2021, at 07:30, Yifan Cai  wrote:
> > >>>>>>
> > >>>>>> Congrats Caleb!
> > >>>>>>
> > >>>>>>> On May 14, 2021, at 6:56 AM, Joshua McKenzie <
> jmcken...@apache.org
> > >>>
> > >>>> wrote:
> > >>>>>>>
> > >>>>>>> Congrats Caleb!
> > >>>>>>>
> > >>>>>>>>> On Fri, May 14, 2021 at 9:10 AM Brandon Williams <
> > >> dri...@gmail.com
> > >>>>
> > >>>> wrote:
> > >>>>>>>>
> > >>>>>>>> Congrats Caleb! Well deserved.
> > >>>>>>>>
> > >>>>>>>>> On Fri, May 14, 2021, 8:03 AM Mick Semb Wever 
> > >>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>> The PMC members are pleased to announce that Caleb Rackliffe
> has
> > >>>>>>>>> accepted the invitation to become committer.
> > >>>>>>>>>
> > >>>>>>>>> Thanks heaps Caleb for helping make Cassandra awesome!
> > >>>>>>>>>
> > >>>>>>>>> Congratulations and welcome,
> > >>>>>>>>> The Apache Cassandra PMC members
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: Additions to Cassandra ecosystem page?

2021-06-22 Thread Ben Bromhead
There is certainly a lack of clarity in the grouping, as a number of those
services are not offering Apache Cassandra. I would suggest another
category along the lines of "Cassandra-protocol-compatible offerings".

That way users can easily distinguish ecosystem offerings where "the driver
works, but certain features might not" from an actual Apache Cassandra
offering.

We could then also add things like Yugabyte and Scylla into that category.

On Wed, Jun 23, 2021 at 11:15 AM Jonathan Koppenhofer 
wrote:

> No major opinion on the "cloud offerings" piece, but I agree people should
> know what they are getting into, and be able to make an informed decision.
> However, if someone is going down that path, I would hope they do the
> due-diligence to make sure it fits their requirements.
>
> One small update I would suggest: it seems like the DataStax Spring Boot
> entry would go in development frameworks as opposed to the sidecar section.
>
> On Tue, Jun 22, 2021, 5:39 PM bened...@apache.org 
> wrote:
>
> > Under Cloud Offerings, are we comfortable implicitly endorsing “API
> > compatible” offerings that aren’t actually Cassandra, and also don’t (as
> > far as I am aware) fully support Cassandra functionality? Should we at
> > least mention that this is the case?
> >
> >
> > From: Melissa Logan 
> > Date: Tuesday, 22 June 2021 at 21:39
> > To: u...@cassandra.apache.org ,
> > dev@cassandra.apache.org 
> > Subject: Additions to Cassandra ecosystem page?
> > Hi all,
> >
> > The Cassandra community recently updated its website and has added
> several
> new entries to the Ecosystem page: https://cassandra.apache.org/ecosystem/.
> >
> > If you have edits or know of other third-party Cassandra projects, tools,
> > products, etc that may be useful to others -- please get in touch and
> we'll
> > add to the next round of site updates in July.
> >
> > Thanks!
> >
> > Melissa
> > Apache Cassandra Contributor
> >
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: Additions to Cassandra ecosystem page?

2021-06-23 Thread Ben Bromhead
I'm also comfortable with a strict approach where we just list actual
Apache Cassandra offerings; that also provides good, solid clarity to users.

On Thu, Jun 24, 2021 at 3:06 AM bened...@apache.org 
wrote:

> +1
>
> From: Brandon Williams 
> Date: Wednesday, 23 June 2021 at 15:44
> To: dev@cassandra.apache.org 
> Subject: Re: Additions to Cassandra ecosystem page?
> On Wed, Jun 23, 2021 at 9:38 AM Joshua McKenzie 
> wrote:
> >
> > The obvious core responsibility of the website should be to ASLv2
> > permissively licensed Apache Cassandra and secondarily to CQL as a
> protocol
> > IMO. I don't think we as a project should be tracking derivative works,
> > forks, or other things built on top of the code-base and certainly not
> > things with wildly varied licensing (AGPL, proprietary closed, etc).
>
> I agree.  I don't see how it makes sense for us to promote less
> compatible derivatives with more restrictive licensing.  Imitation may
> be flattery but as you pointed out, we don't need to be the ones
> advertising it.
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | +64 27 383 8975


Re: [VOTE] Release Apache Cassandra 4.0-rc2

2021-06-28 Thread Ben Bromhead
+1 nb

On Tue, Jun 29, 2021 at 10:01 AM Scott Andreas  wrote:

> +1 nb
>
> From: Andrés de la Peña 
> Sent: Monday, June 28, 2021 1:04 PM
> To: dev@cassandra.apache.org
> Subject: Re: [VOTE] Release Apache Cassandra 4.0-rc2
>
> +1 (nb)
>
> On Mon, 28 Jun 2021 at 21:01, Jon Meredith  wrote:
>
> > +1 (nb)
> >
> > On Mon, Jun 28, 2021 at 9:47 AM Yifan Cai  wrote:
> > >
> > > +1
> > >
> > > - Yifan
> > >
> > > > On Jun 28, 2021, at 8:40 AM, Ekaterina Dimitrova
> > > > <e.dimitr...@gmail.com> wrote:
> > > >
> > > > +1 Thanks everyone!
> > > >
> > > >> On Mon, 28 Jun 2021 at 11:39, Aleksey Yeschenko  wrote:
> > > >>
> > > >> +1
> > > >>
> > > >>> On 28 Jun 2021, at 14:05, Gary Dusbabek  wrote:
> > > >>>
> > > >>> +1; yay!
> > > >>>
> > > >>>> On Sun, Jun 27, 2021 at 11:02 AM Mick Semb Wever  wrote:
> > > >>>>
> > > >>>> Proposing the test build of Cassandra 4.0-rc2 for release.
> > > >>>>
> > > >>>> sha1: 4c98576533e1d7663baf447e8877788096489165
> > > >>>> Git:
> > > >>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-rc2-tentative
> > > >>>> Maven Artifacts:
> > > >>>> https://repository.apache.org/content/repositories/orgapachecassandra-1237/org/apache/cassandra/cassandra-all/4.0-rc2/
> > > >>>>
> > > >>>> The Source and Build Artifacts, and the Debian and RPM packages and
> > > >>>> repositories, are available here:
> > > >>>> https://dist.apache.org/repos/dist/dev/cassandra/4.0-rc2/
> > > >>>>
> > > >>>> The vote will be open for 72 hours (longer if needed). Everyone who
> > > >>>> has tested the build is invited to vote. Votes by PMC members are
> > > >>>> considered binding. A vote passes if there are at least three
> > > >>>> binding +1s and no -1's.
> > > >>>>
> > > >>>> [1]: CHANGES.txt:
> > > >>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-rc2-tentative
> > > >>>> [2]: NEWS.txt:
> > > >>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-rc2-tentative
> > > >>>> [3]: The maven artifacts were accidentally prematurely made public.
> > > >>>> Docs have been updated to prevent this happening again.




Re: Additions to Cassandra ecosystem page?

2021-06-28 Thread Ben Bromhead
On Thu, Jun 24, 2021 at 2:38 AM Joshua McKenzie 
wrote:

>
> The obvious core responsibility of the website should be to ASLv2
> permissively licensed Apache Cassandra and secondarily to CQL as a protocol,
> IMO. I don't think we as a project should be tracking derivative works,
> forks, or other things built on top of the code-base, and certainly not
> things with wildly varied licensing (AGPL, proprietary closed, etc.).
>
> To go that route we either become fully inclusive of everything or become
> Kingmakers, and either way there's the consequence of inconsistent levels
> of vetting, maintenance, and dilution of what it means to "be Cassandra".
> There are plenty of other websites for other projects, and everyone has
> access to search engines.
>

This makes sense to me as a line in the sand to draw if we are going down a
strict path.

It would be up to whoever wants to be added to the list to demonstrate this
is the case.

There would still be some degree of honesty required as well on the service
providers' part.


Re: Additions to Cassandra ecosystem page?

2021-06-30 Thread Ben Bromhead
I would be sad to see us drop this just because it's a hard discussion with
a few different opinions. My apologies if this discussion is making folks
feel excluded.

Whilst I don't have a problem with a strict approach, and it does improve
user clarity, I can understand how it might feel exclusionary. Having
classifications can make the tent bigger and allow for things that are API
compatible to be celebrated (the owners of some of the API compatible
offerings make significant contributions to the community and I would love
for them to be on the list).

Having some classification would better allow us to celebrate the different
offerings in the community and be more inclusive, without misrepresenting
things to our users, while making it easy to meet our obligations around how
we talk about Apache trademarks.

Part of demonstrating the health of the project is to talk about the
broader ecosystem around it. Most other communities seem able to maintain
an ecosystem list that is fairly broad.

E.g.
Apache Kafka -> https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem
- Maintains a set of groupings and listings. Also, some of the listed
projects could be considered quite competitive.
Apache Spark -> A simple list of folk who use or do something with Spark:
https://spark.apache.org/powered-by.html
Apache Samza -> Again, a simple list:
http://samza.incubator.apache.org/powered-by/

Outside of the Apache landscape, the Postgres folk also simply have a list
of derived or adjacent PG databases, which is kinda cool:
https://wiki.postgresql.org/wiki/PostgreSQL_derived_databases

Personally, I think including the "API compatible offerings" is fine and
further demonstrates the power and reach of our community. For the
commercial vendors out there, we have our marketing budgets and will do fine
(as Patrick said), but I would hate to see an opportunity to demonstrate
the breadth and depth of our community be missed.

As demonstrated with some of the above links, there should be a good
inclusive solution out there.


On Thu, Jul 1, 2021 at 2:33 PM Erick Ramirez 
wrote:

> > And I'm thinking of anyone that has to update this list and reason
> > through all of the complex rulesets of why or why not. It's really not
> > fair to them.
> >
> > My proposal is that we completely drop the Cassandra Cloud Offering
> > section. Given that criteria, Professional Support and Education might
> > be on the chopping block as well.
> >
>
> +1 would definitely make my life easier when I'm reviewing/pushing updates
> to the site. 🍻
>




Re: [VOTE] Release Apache Cassandra 4.0.0 (take2)

2021-07-13 Thread Ben Bromhead
+1 (nb)

On Wed, Jul 14, 2021 at 12:26 PM Andrés de la Peña  wrote:

> +1
>
> On Wed, 14 Jul 2021 at 00:39, Patrick McFadin  wrote:
>
> > +1 (nb)
> >
> > On Tue, Jul 13, 2021 at 3:45 PM Brandon Williams  wrote:
> >
> > > +1
> > >
> > > On Tue, Jul 13, 2021, 5:14 PM Mick Semb Wever  wrote:
> > >
> > > > Proposing the test build of Cassandra 4.0.0 for release.
> > > >
> > > > sha1: 924bf92fab1820942137138c779004acaf834187
> > > > Git:
> > > > https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0.0-tentative
> > > > Maven Artifacts:
> > > > https://repository.apache.org/content/repositories/orgapachecassandra-1242/org/apache/cassandra/cassandra-all/4.0.0/
> > > >
> > > > The Source and Build Artifacts, and the Debian and RPM packages and
> > > > repositories, are available here:
> > > > https://dist.apache.org/repos/dist/dev/cassandra/4.0.0/
> > > >
> > > > The vote will be open for 72 hours (longer if needed). Everyone who
> > > > has tested the build is invited to vote. Votes by PMC members are
> > > > considered binding. A vote passes if there are at least three binding
> > > > +1s and no -1's.
> > > >
> > > > [1]: CHANGES.txt:
> > > > https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0.0-tentative
> > > > [2]: NEWS.txt:
> > > > https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0.0-tentative




Re: Thanks to Nate for his service as PMC Chair

2022-07-13 Thread Ben Bromhead
Thanks Nate and congrats Mick!

On Wed, Jul 13, 2022 at 1:44 PM Dinesh Joshi  wrote:

> Thank you Nate for your service!
>
> Welcome Mick!
>
>
> On Jul 11, 2022, at 5:54 AM, Paulo Motta  wrote:
>
> 
> Hi,
>
> I wanted to announce on behalf of the Apache Cassandra Project Management
> Committee (PMC) that Nate McCall (zznate) has stepped down from the PMC
> chair role. Thank you, Nate, for all the work you did as the PMC chair!
>
> The Apache Cassandra PMC has nominated Mick Semb Wever (mck) as the new
> PMC chair. Congratulations and good luck in the new role, Mick!
>
> The chair is an administrative position that interfaces with the Apache
> Software Foundation Board, by submitting regular reports about project
> status and health. Read more about the PMC chair role on Apache projects:
> - https://www.apache.org/foundation/how-it-works.html#pmc
> - https://www.apache.org/foundation/how-it-works.html#pmc-chair
> - https://www.apache.org/foundation/faq.html#why-are-PMC-chairs-officers
>
> The PMC as a whole is the entity that oversees and leads the project and
> any PMC member can be approached as a representative of the committee. A
> list of Apache Cassandra PMC members can be found on:
> https://cassandra.apache.org/_/community.html
>
> Kind regards,
>
> Paulo
>
>
