>> We fully intend to "engineer and test the snot out of" the changes
we are working on as the whole point of us working on them is so we
*can* run them in production, at our scale.
I'm not sure how the Apache team does this. Perhaps individual engineers
can run some modern version at a company of theirs, although that seems
unlikely, but as an Apache org, I just don't see how that happens.
To me it seems like the Apache Cassandra infrastructure itself needs to
stand up a multinode live instance running some 'real-world' example
that is getting pounded, so that we can stage feature branches to really
test them.
Otherwise we will forever be basing versions on the poor test saps who
decide they are willing to risk all to upgrade to the cutting edge, which
is why everyone believes in the adage: don't upgrade until at least .6
--dave
On 11/20/2016 09:50 AM, Jason Brown wrote:
Hey all,
One of the goals on my team, when working on large patches, is to get
community feedback on these initiatives before throwing them into prod.
This gets us a wider net of feedback (see Sylvain's continuing excellent
rounds of feedback to my work on CASSANDRA-8457), as well as making sure we
don't go too far off the deep end in terms of straying from the community
version. The latter point is crucial because if we make too many
incompatible changes to, for example, the internode messaging protocol or
the CQL protocol or the sstable file format, and deploy that, it may be
very difficult, if not impossible, to rectify with future, in-development
versions of Cassandra.
We fully intend to "engineer and test the snot out of" the changes we are
working on as the whole point of us working on them is so we *can* run them
in production, at our scale. We aren't expecting others in the community to
dog food it for us. There will be a delay between committing something
upstream, and us backporting it to a current version we run in production
and actually deploying it. However, you can be sure that any bugs we find
will be fixed ASAP; we have many users counting on it.
Thanks for listening,
-Jason
On Sat, Nov 19, 2016 at 11:04 AM, Blake Eggleston <beggles...@apple.com>
wrote:
I think Ed's just using gossip 2.0 as a hypothetical example. His point is
that we should only commit things when we have a high degree of confidence
that they work correctly, not with the expectation that they don't.
On November 19, 2016 at 10:52:38 AM, Michael Kjellman (
mkjell...@internalcircle.com) wrote:
Jason has asked for review and feedback many times. Maybe be constructive
and review his code instead of just complaining (once again)?
Sent from my iPhone
On Nov 19, 2016, at 1:49 PM, Edward Capriolo <edlinuxg...@gmail.com>
wrote:
I would say start with a mindset like 'people will run this in production',
not like 'why would you expect this to work'.
Now how does this logic affect feature development? Maybe use gossip 2.0 as
an example.
I will play my given Debbie Downer role. I could imagine 1 or 2 dtests and
the logic of 'don't expect it to work': unleash 4.0 onto hordes of newbies
with a Twitter announcement of the release and let bugs trickle in.
One could also do something comprehensive like test on clusters of 2 to
1000 nodes. Test with Jepsen to see what happens during partitions, inject
things like JVM pauses and account for behavior. Log convergence times
after given events.
Take a stand and say, look, "we engineered and beat the crap out of this
feature. I deployed this release feature at my company and eat my dogfood.
You are not my crash test dummy."
On Saturday, November 19, 2016, Jeff Jirsa <jji...@gmail.com> wrote:
Any proposal to solve the problem you describe?
--
Jeff Jirsa
On Nov 19, 2016, at 8:50 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
This is especially relevant if people wish to focus on removing things.
For example, gossip 2.0 sounds great, but seems geared toward huge clusters,
which is not likely a majority of users. For those with a 20 node cluster,
are the indirect benefits worth it?
Also there seems to be a first push to remove things like compact storage
or Thrift. Fine, great. But what is the realistic update path for someone?
If the big players are running 2.1 and maintaining backports, the average
shop without a dedicated team is going to be stuck saying (great features
in 4.0 that improve performance, I would probably switch but it's not stable
and we have that one compact storage CF and who knows what is going to
happen performance wise when)
We really need to lose this "release won't be stable for 6 minor versions"
concept.
On Saturday, November 19, 2016, Edward Capriolo <edlinuxg...@gmail.com>
wrote:
On Friday, November 18, 2016, Jeff Jirsa <jeff.ji...@crowdstrike.com>
wrote:
We should assume that we’re ditching tick/tock. I’ll post a thread on
4.0-and-beyond here in a few minutes.
The advantage of a prod release every 6 months is less incentive to push
unfinished work into a release.
The disadvantage of a prod release every 6 months is then we either have
a very short lifespan per-release, or we have to maintain lots of active
releases.
2.1 has been out for over 2 years, and a lot of people (including us) are
running it in prod – if we have a release every 6 months, that means we’d
be supporting 4+ releases at a time, just to keep parity with what we have
now? Maybe that’s ok, if we’re very selective about ‘support’ for 2+ year
old branches.
On 11/18/16, 3:10 PM, "beggles...@apple.com on behalf of Blake
Eggleston" <beggles...@apple.com> wrote:
While stability is important if we push back large "core" changes
until later we're just setting ourselves up to face the same issues later on
In theory, yes. In practice, when incomplete features are earmarked for
a certain release, those features are often rushed out, and not always
fully baked.
In any case, I don’t think it makes sense to spend too much time
planning what goes into 4.0, and what goes into the next major release, with
so many release strategy related decisions still up in the air. Are we
going to ditch tick-tock? If so, what will its replacement look like?
Specifically, when will the next “production” release happen? Without
knowing that, it's hard to say if something should go in 4.0, or 4.5, or
5.0, or whatever.
The reason I suggested a production release every 6 months is because
(in my mind) it’s frequent enough that people won’t be tempted to rush
features to hit a given release, but not so frequent that it’s not
practical to support. It wouldn’t be the end of the world if some of these
tickets didn’t make it into 4.0, because 4.5 would be fine.
On November 18, 2016 at 1:57:21 PM, kurt Greaves (k...@instaclustr.com)
wrote:
On 18 November 2016 at 18:25, Jason Brown <jasedbr...@gmail.com> wrote:
#11559 (enhanced node representation) - decided it's *not* something we
need wrt #7544 storage port configurable per node, so we are punting on
#12344 - Forward writes to replacement node with same address during replace
depends on #11559. To be honest I'd say #12344 is pretty important,
otherwise it makes it difficult to replace nodes without potentially
requiring client code/configuration changes. It would be nice to get #12344
in for 4.0. It's marked as an improvement but I'd consider it a bug and
thus think it could be included in a later minor release.
Introducing all of these in a single release seems pretty risky. I think it
would be safer to spread these out over a few 4.x releases (as they’re
finished) and give them time to stabilize before including them in an LTS
release. The downside would be having to maintain backwards compatibility
across the 4.x versions, but that seems preferable to delaying the release
of 4.0 to include these, and having another big bang release.
I don't think anyone expects 4.0.0 to be stable. It's a major version
change with lots of new features; in the production world people don't
normally move to a new major version until it has been out for quite some
time and several minor releases have passed. Really, most people are only
migrating to 3.0.x now. While stability is important if we push back large
"core" changes until later we're just setting ourselves up to face the same
issues later on. There should be enough uptake on the early releases of 4.0
from new users to help test and get it to a production-ready state.
Kurt Greaves
k...@instaclustr.com
I don't think anyone expects 4.0.0 to be stable
Someone previously described 3.0 as the "break everything release".
We know that many people are still on 2.1 and 3.0. Cassandra will always be
maintaining 3 or 4 active branches and have adoption issues if releases are
not stable and usable.
Given that Cassandra was 1.0 years ago, I expect things to be stable.
Half-working features, or "added this, broke that", are not appealing to me.
--
Sorry this was sent from mobile. Will do less grammar and spell check
than
usual.