Re: Proposal: release 2.2 (based on current trunk) before 3.0 (based on 8099)
+1

On Sat, May 9, 2015, at 06:38 PM, Jonathan Ellis wrote:
> With 8099 still weeks from being code complete, and even longer from being stable, I'm starting to think we should decouple everything that's already done in trunk from 8099. That is, ship 2.2 ASAP with:
>
> - Windows support
> - UDF
> - Role-based permissions
> - JSON
> - Compressed commitlog
> - Off-heap row cache
> - Message coalescing on by default
> - Native protocol v4
>
> and let 3.0 ship with 8099 and a few things that finish by then (vnode compaction, file-based hints, maybe materialized views).
>
> Remember that we had 7 release candidates for 2.1. Splitting 2.2 and 3.0 up this way will reduce the risk in both 2.2 and 3.0 by separating most of the new features from the big engine change. We might still have a lot of stabilization to do for either or both, but at the least this lets us get a head start on testing the new features in 2.2.
>
> This does introduce a new complication, which is that instead of 3.0 being an unusually long time after 2.1, it will be an unusually short time after 2.2. The "default" if we follow established practice would be to:
>
> - EOL 2.1 when 3.0 ships, and maintain 2.2.x and 3.0.x stabilization branches
>
> But this is probably not the best investment we could make for our users, since 2.2 and 3.0 are relatively close in functionality. I see a couple other options without jumping to 3 concurrent stabilization series:
>
> - Extend the 2.1.x series and 2.2.x until 4.0, but skip a 3.0.x stabilization series in favor of tick-tock 3.x
> - Extend the 2.1.x series until 4.0, but stop 2.2.x when 3.0 ships in favor of developing 3.0.x instead
>
> Thoughts?
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced
Re: Proposal: release 2.2 (based on current trunk) before 3.0 (based on 8099)
To clarify, I'm +1ing the creation of a stable 2.2 branch, prior to 8099, in order not to block certain key features, as mentioned. Neutral on any additional nuances.

-Tupshin

On Sun, May 10, 2015, at 08:05 AM, tups...@tupshin.com wrote:
> +1
>
> On Sat, May 9, 2015, at 06:38 PM, Jonathan Ellis wrote:
> > [full proposal snipped; quoted in the previous message]
Re: Research on scalability bug finder for Cassandra
Hi Haryadi,

Personally, I'd love to see your approach extended to test up to 10K nodes or so. There are not too many known instances of scaling past 1000 nodes, and as the need for scale grows, and as scale-out hardware becomes more commonplace (high density, but with lots of small servers, a la HP Moonshot, blade servers, etc.), 10K nodes is the next frontier. It would be great to demonstrate that your tool can find *new* bugs and limitations (which it certainly would at that scale), as opposed to just reproducing existing ones.

One other thought is to test with both non-vnodes and vnodes (and maybe multiple numbers of vnodes per node) at extreme scales like that, to get a sense of what kind of overhead vnodes add to the current gossip implementation at scale.

Regarding existing bugs that you might usefully reproduce, I'll leave that to others.

Thanks.

-Tupshin

On Fri, Apr 8, 2016, at 09:57 PM, Haryadi Gunawi wrote:
> Hi Jonathan,
>
> Thanks for the reply!
>
> We don't need a patched version of Cassandra. Specifically, this is what we'd like to get help from you with, if possible:
>
> Cassandra devs: "Here are recent JIRA entries that discuss scale-dependent bugs: CASSANDRA-X, -Y, -Z (where X, Y, Z are JIRA bug numbers)."
>
> Our side: we will study the bug discussions, download the affected Cassandra version (as mentioned in the JIRA), integrate that specific version with our framework, and reproduce the bug on one machine.
>
> Basically, we're interested to know whether there are still unresolved or newly resolved bugs (2015-2016) in the Cassandra JIRA that we could use to test our approach. (The bugs in our previous email are relatively old.)
>
> We're targeting a publication deadline one month from now. It'd be lovely if we got more sample bugs. After the deadline, we'd be happy to send you the draft of the paper.
>
> Please do let us know if you have any other questions.
> Thanks!
> -- Har
>
> On Fri, Apr 8, 2016 at 8:03 PM, Jonathan Ellis wrote:
> > Sounds very interesting! We'd love to hear more about your approach. In particular, does it require a patched version of Cassandra?
> >
> > On Thu, Apr 7, 2016 at 6:18 PM, Tanakorn Leesatapornwongsa <tanak...@cs.uchicago.edu> wrote:
> >> Dear Cassandra development team,
> >>
> >> We are computer science researchers at the University of Chicago. Our research is about the reliability of cloud-scale distributed systems. Samples of our work can be found here: http://ucare.cs.uchicago.edu
> >>
> >> We are reaching out to you because we are interested in reproducing any unsolved scalability bugs in Cassandra.
> >>
> >> We define scalability bugs as latent bugs that are scale-dependent. They don't arise in small-scale deployments, but do arise in large-scale production runs. For example, everything is fine in a 100-node deployment, but in a 500-node deployment the bug appears.
> >>
> >> We have created a scale-check methodology (SLCK) that can unearth scalability bugs on a single machine. With SLCK, we can run hundreds of nodes on a single machine and reproduce some old scalability bugs. For example, we have reproduced the following bugs on one machine:
> >>
> >> - https://issues.apache.org/jira/browse/CASSANDRA-6127 (a customer observed node flapping when bootstrapping 1000 nodes)
> >> - https://issues.apache.org/jira/browse/CASSANDRA-3831
> >>
> >> We are submitting SLCK for publication soon, and we can send you a draft a month from now if you are interested.
> >>
> >> To make a stronger publication submission, beyond reproducing old bugs, we thought it would be great if SLCK could reproduce new scalability bugs (if any) that you are still trying to resolve.
> >>
> >> We hope you find our work interesting, and we would really appreciate it if you could point us to any new scalability bugs that hopefully we can help you reproduce.
> >>
> >> Thank you very much for your attention!
> >>
> >> Best,
> >> Tanakorn L.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced
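Tupshin's question about vnode overhead in gossip at extreme scale can be framed with back-of-envelope arithmetic: every node must track ring state proportional to the total token count, which grows with nodes × tokens-per-node. The sketch below is illustrative arithmetic only, not a measurement of Cassandra's actual gossiper; the class and numbers are hypothetical.

```java
// Back-of-envelope sketch of ring-state growth with vnodes.
// Illustrative only; NOT a measurement of Cassandra's gossip implementation.
public class RingStateSketch {
    // Total ring entries every node must track: one per token in the cluster.
    static long ringEntries(int nodes, int tokensPerNode) {
        return (long) nodes * tokensPerNode;
    }

    public static void main(String[] args) {
        int[] clusterSizes = {100, 1000, 10000};
        for (int n : clusterSizes) {
            // 1 token per node (non-vnode) vs. a common vnode setting of 256
            System.out.printf("%5d nodes: %,8d entries (no vnodes) vs %,10d entries (256 vnodes)%n",
                    n, ringEntries(n, 1), ringEntries(n, 256));
        }
    }
}
```

At 10K nodes with 256 vnodes each, that is 2.56 million ring entries to gossip and keep consistent, which is why measuring vnode overhead at that scale would be informative.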
Re: Cassandra + RAMP transactions
Hi Jatin,

I believe there is a lot of interest in developing RAMP transactions for Cassandra, but no concrete activity yet. https://issues.apache.org/jira/browse/CASSANDRA-7056

-Tupshin

On Mon, Feb 9, 2015, at 11:30 PM, Jatin Ganhotra wrote:
> Hi,
>
> Please forgive me if this is not the right forum for this query. I recently read an article where Jonathan Ellis mentioned that ePaxos and RAMP transactions should soon be added to Cassandra.
>
> Is there any work going on in this direction?
>
> Thanks
> --
> Jatin Ganhotra
> Graduate Student, Computer Science
> University of Illinois at Urbana-Champaign
> http://jatinganhotra.com
> http://linkedin.com/in/jatinganhotra
Re: Proposal: require Java7 for Cassandra 2.0
+1

On Feb 6, 2013 5:22 PM, "Jonathan Ellis" wrote:
> Java 6 EOL is this month. Java 7 will be two years old when C* 2.0 comes out (July). Anecdotally, a bunch of people are running C* on Java 7 with no issues, except for the Snappy-on-OS-X problem (which will be moot if LZ4 becomes our default, as looks likely).
>
> Upgrading to Java 7 lets us take advantage of new (two-year-old) features, as well as simplifying interoperability with other dependencies; e.g., Jetty's BlockingArrayQueue requires Java 7.
>
> Thoughts?
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced
Re: Node side processing
Hi David,

Check out the ongoing discussion in https://issues.apache.org/jira/browse/CASSANDRA-6704, as well as some related tickets linked from that one. No consensus at this point, but I'm personally hoping to see something along the general lines of Hive's UDFs.

-Tupshin

On Thu, Feb 27, 2014 at 8:50 AM, David Semeria wrote:
> Hi List,
>
> I was wondering whether there have been any past proposals for implementing node-side processing (NSP) in C*. By NSP, I mean passing a reference to a Java class that would process the result set before it is returned to the client.
>
> In our particular use case, our clients typically loop through result sets of a million or more rows to produce a tiny amount of output (sums, means, variances, etc.). The bottleneck, quite obviously, is the need to transfer a million rows to the client before processing can take place. It would be extremely useful to execute this processing on the coordinator node and transfer only the results to the client.
>
> I mention this here because I can imagine other C* users having similar requirements.
>
> Thanks
>
> D.
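For readers finding this thread later: the UDF work tracked in CASSANDRA-6704, plus the follow-up aggregate support, eventually shipped in Cassandra 2.2. As a rough illustration of how David's sum/mean use case maps onto that later syntax (the table and column names below are hypothetical, and the exact CQL should be checked against the docs for your version):

```sql
-- Sketch of a server-side average as a user-defined aggregate (Cassandra 2.2+ syntax).
-- State is (count, sum); only the final double crosses the wire to the client.
CREATE OR REPLACE FUNCTION avg_state(state tuple<int, bigint>, val int)
    CALLED ON NULL INPUT
    RETURNS tuple<int, bigint>
    LANGUAGE java
    AS 'state.setInt(0, state.getInt(0) + 1);
        state.setLong(1, state.getLong(1) + val);
        return state;';

CREATE OR REPLACE FUNCTION avg_final(state tuple<int, bigint>)
    CALLED ON NULL INPUT
    RETURNS double
    LANGUAGE java
    AS 'return state.getInt(0) == 0 ? 0d : ((double) state.getLong(1)) / state.getInt(0);';

CREATE OR REPLACE AGGREGATE average(int)
    SFUNC avg_state
    STYPE tuple<int, bigint>
    FINALFUNC avg_final
    INITCOND (0, 0);

-- Hypothetical usage: SELECT average(reading) FROM sensor_data WHERE sensor_id = 42;
```

Note that Java UDFs must be explicitly enabled in cassandra.yaml before this will run.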
Re: Pointers on writing your own Compaction Strategy
In addition to what Marcus said, take a look at the latest patch in https://issues.apache.org/jira/browse/CASSANDRA-6602 for a relevant example.

-Tupshin

On Sep 4, 2014 2:28 PM, "Marcus Eriksson" wrote:
> 1. Create a class that extends AbstractCompactionStrategy (I would keep it in-tree while developing to avoid classpath issues, etc.).
> 2. Implement the abstract methods:
>    - getNextBackgroundTask: called when Cassandra wants to do a new minor (background) compaction; return a CompactionTask with the sstables you want compacted
>    - getMaximalTask: called when a user triggers a major compaction
>    - getUserDefinedTask: called when a user triggers a user-defined compaction from JMX
>    - getEstimatedRemainingTasks: return the estimated number of tasks before we are "done"
>    - getMaxSSTableBytes: if your compaction strategy puts a limit on the size of sstables
> 3. Execute this in cqlsh to enable your compaction strategy: ALTER TABLE foo WITH compaction = { 'class': 'Bar' }
> 4. Things to think about:
>    - Make sure you mark sstables as compacting before returning them from the compaction strategy (and check the return value!): https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java#L271
>    - If you do this on 2.1, don't mix repaired and unrepaired sstables (SSTableReader#isRepaired).
>
> Let me know if you need any more information.
>
> /Marcus
>
> On Thu, Sep 4, 2014 at 6:50 PM, Ghosh, Mainak wrote:
> > Hello,
> >
> > I am planning to write a new compaction strategy, and I was hoping someone could point me to the relevant functions and how they are related in the call hierarchy.
> >
> > Thanks for the help.
> >
> > Regards,
> > Mainak.
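Marcus's getNextBackgroundTask is where a strategy's selection heuristic lives. As a self-contained illustration of the kind of grouping logic involved, here is a toy version of size-tiered bucketing, with plain longs standing in for sstable sizes. This is NOT the real AbstractCompactionStrategy API, just a sketch of the heuristic's shape; the real code lives in SizeTieredCompactionStrategy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy sketch of size-tiered bucketing: group sstables of similar size so a
// minor compaction can merge one bucket. Longs stand in for sstable sizes.
public class ToyBucketing {
    // A size joins a bucket if it is within bucketHigh of the bucket's average.
    static List<List<Long>> bucketBySize(List<Long> sizes, double bucketHigh) {
        List<Long> sorted = new ArrayList<>(sizes);
        Collections.sort(sorted);
        List<List<Long>> buckets = new ArrayList<>();
        for (long size : sorted) {
            boolean placed = false;
            for (List<Long> bucket : buckets) {
                double avg = bucket.stream().mapToLong(Long::longValue).average().orElse(0);
                if (size <= avg * bucketHigh) { // close enough to this tier
                    bucket.add(size);
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                buckets.add(new ArrayList<>(Collections.singletonList(size)));
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Two natural tiers: ~100 MB files and ~1 GB files
        List<Long> sizes = Arrays.asList(95L, 100L, 110L, 1000L, 1050L);
        System.out.println(bucketBySize(sizes, 1.5).size()); // prints 2
    }
}
```

A real strategy would then hand the most interesting bucket to a CompactionTask, after marking those sstables as compacting, per Marcus's point 4.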
Re: 0.6.3
Contrary to my expectations and Jonathan's, Cassandra rebuilds cleanly against the latest Thrift source (and any recent snapshot), with no code changes. This is important because it includes the patches for https://issues.apache.org/jira/browse/THRIFT-601 which, in turn, resolves https://issues.apache.org/jira/browse/CASSANDRA-475

#475 is targeted for the 0.7 release, but I believe that was only because of the belief that it would require a medium-scale change to the Cassandra source, where, in fact, it requires none. I have been running stock 0.6.2 rebuilt against a recent Thrift build for the past week, and have seen no regressions and a complete fix for the Thrift exceptions (followed by OOMs) that I was previously seeing.

I would like to nominate #475 for inclusion in 0.6.3. It would require picking a stable recent snapshot of the Thrift libs, replacing the existing libthrift jar, and rebuilding interface/thrift/gen-java.

Thoughts?

-Tupshin

On 6/18/2010 12:12 PM, Eric Evans wrote:
> If there aren't any objections, I'd like to target the end of this month for the next point release, 0.6.3. If you have any show-stoppers that you feel should absolutely make it into the next release, please let me know; otherwise I'll plan on tagging 0.6.3 a week from today, Friday June 25th (open 0.6.3 issues will be moved to 0.6.4 at that time).
>
> Leaning into it,
Re: 0.6.3
On 6/18/2010 1:24 PM, Eric Evans wrote:
> On Fri, 2010-06-18 at 12:25 -0700, Tupshin Harper wrote:
> > I would like to nominate #475 for inclusion in 0.6.3. It would require picking a stable recent snapshot of the thrift libs, replacing the existing libthrift jar, and rebuilding interface/thrift/gen-java.
>
> If it means rebuilding our generated code, then it means (some) users will need to do the same, which means incorporating Thrift up/downgrades into their upgrade/rollback procedures. I think it's also important to realize that as nasty as this bug is, it's not a recent regression (it's not a regression at all, per se), and 0.7.0 will be along relatively soon. I'd be interested to hear what others think, but I'm personally not very comfortable introducing such a change into a minor release.

I would also like to hear other opinions, but if the consensus is that this is too big a change to introduce in 0.6.3, I'm fine with that too. I'm rolling my own releases now, and can continue to do so. I just know that the Thrift problem has bitten a number of people, and can be a hard one for new adopters to figure out.

-Tupshin
Re: network compatibility from 0.6 to 0.7
As long as network compatibility is in place, it is possible to incrementally upgrade a cluster by restricting thrift clients to talk only to the 0.6 nodes until half the cluster is upgraded, and then modifying them to talk to the 0.7 nodes. If network compatibility breaks, there is no way to avoid downtime, or even to test 0.7 under production load.

On Jul 22, 2010 9:50 AM, "Jonathan Ellis" wrote:
> How useful is this to insist on, given that the 0.7 thrift API is fairly incompatible with 0.6's? (The timestamp -> Clock change is the biggest problem there.)
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com