Re: Discussion: release quality

2011-11-30 Thread Bill Au
I think we definitely need better quality for the releases.  Just looked at
1.0.3 and 1.0.4.  I am willing to test out release candidates and report
back my findings on the mailing list.  Hopefully more folks can do that to
make the testing more comprehensive.  And folks with binding votes can take
the testing results into account before they vote.

Bill

On Tue, Nov 29, 2011 at 10:14 PM, Joe Stein  wrote:

> I need at least a week, maybe two to promote anything to staging which is
> mainly because we do weekly releases.   I could introduce a 2 day turn
> around but only with a more fixed type schedule.  I am running 0.8.6 in
> production and REALLY want to upgrade for nothing more than getting
> compression ( the cost of petabytes of uncompressed data is just stupid ).
>  So however I can help in changing my process OR better understanding the
> PMC here I am game for.
>
> One thing I use C* for is holding days worth of data and re-running those
> days for regression on our software... simulating production... It might
> not take much to reverse it.
>
> /*
> Joe Stein
> http://www.medialets.com
> Twitter: @allthingshadoop
> */
>
> On Nov 29, 2011, at 10:04 PM, Edward Capriolo wrote:
>
> > On Tue, Nov 29, 2011 at 6:16 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
> >
> >> I'd like to start a discussion about ideas to improve release quality for
> >> Cassandra.  Specifically I wonder if the community can do more to help the
> >> project as a whole become more solid.  Cassandra has an active and vibrant
> >> community using Cassandra for a variety of things.  If we all pitch in a
> >> little bit, it seems like we can make a difference here.
> >>
> >> Release quality is difficult, especially for a distributed system like
> >> Cassandra.  The core devs have done an amazing job with this considering
> >> how complicated it is.  Currently, there are several things in place to
> >> make sure that a release is generally usable:
> >> - review-then-commit
> >> - 72 hour voting period
> >> - at least 3 binding +1 votes
> >> - unit tests
> >> - integration tests
> >> Then there is the personal responsibility aspect - testing a release in a
> >> staging environment before pushing it to production.
> >>
> >> I wonder if more could be done here to give more confidence in releases.
> >> I wanted to see if there might be ways that the community could help out
> >> without being too burdensome on either the core devs or the community.
> >>
> >> Some ideas:
> >> More automation: run YCSB and stress with various setups.  Maybe people
> >> can rotate donating cloud instances (or simply money for them) but have a
> >> common set of scripts to do this in the source.
> >>
> >> Dedicated distributed test suite: I know there has been work done on
> >> various distributed test suites (which is great!) but none have really
> >> caught on so far.
> >>
> >> I know what the apache guidelines say, but what if the community could
> >> help out with the testing effort in a more formal way.  For example, for
> >> each release to be finalized, what if there needed to be 3 community
> >> members that needed to try it out in their own environment?
> >>
> >> What if there was a post release +1 vote for the community to sign off on
> >> - sort of a "works for me" kind of thing to reassure others that it's safe
> >> to try.  So when the release email gets posted to the user list, start a
> >> tradition of people saying +1 in reply if they've tested it out and it
> >> works for them.  That's happening informally now when there are problems,
> >> but it might be nice to see a vote of confidence.  Just another idea.
> >>
> >> Any other ideas or variations?
> >
> >
> > I am no software engineering guru, but whenever I +1 a Hive release I
> > actually do check out the code and run a couple of queries. Mostly I find that
> > because there are just so many things not unit testable like those gosh darn
> > bash scripts that launch Java applications. There have been times when even
> > after multiple patch revisions and passing unit tests something just does
> > not work in the real world. So I never +1 a binary release I don't spend an
> > hour with and if possible I try twisting the knobs on any new feature or at
> > least just trying the basics. Hive is aiming for something like quarterly
> > releases.
> >
> > So possibly better to have Cassandra do time based releases. It does not
> > have to be quarterly but if people want bleeding edge features (something
> > committed 2 days ago) really they should go out and build something from
> > trunk.
> >
> > It seems like Cassandra devs have the voting and releasing down to a
> > science but from my world the types of bugs I worry about are data file
> > corruption, and any weird bug that would result in data faults like
> > read_repair not working or writes not going to the write nodes, or bloom
> > filters giving a faulty result. New features are great and I love seeing
> > th

Re: [VOTE] Release Apache Cassandra 1.0.5

2011-11-30 Thread Eric Evans
On Tue, Nov 29, 2011 at 1:27 PM, Sylvain Lebresne  wrote:
> So 1.0.4 was actually pretty catastrophic. CASSANDRA-3540 is a clear blocker
> and CASSANDRA-3539 is critical too. For now, we've pulled the plug on 1.0.4 by
> removing it from the website (and I broke the debian upgrade to 1.0.4), but we
> need a fix asap.
>
> I've taken the 1.0.4 artifacts and on top of that I've:
>  - reverted CASSANDRA-3407 (to 'fix' CASSANDRA-3540)
>  - committed CASSANDRA-3539
> I thus propose the following artifacts for release as 1.0.5.
>
> SVN: 
> https://svn.apache.org/repos/asf/cassandra/branches/cassandra-1.0.5@1208016
> Artifacts: 
> https://repository.apache.org/content/repositories/orgapachecassandra-269/org/apache/cassandra/apache-cassandra/1.0.5/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-269/
>
> The artifacts as well as the debian package are also available here:
> http://people.apache.org/~slebresne/
>
> Given the situation, and given that the proposed artifacts differs only
> slightly from the one of 1.0.4, I propose an expedited vote of 24 hours
> (longer if needed).

I don't think it's anything new, but in addition to the previously
discussed ConsistencyLevelTest, I get the occasional failure of
CompactionsTest due to a timeout.  It is pretty long running, maybe we
just need to extend the timeout.

Other than that, +1

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: [VOTE] Release Apache Cassandra 1.0.5

2011-11-30 Thread Jake Luciani
+1

On Wed, Nov 30, 2011 at 10:55 AM, Eric Evans  wrote:

> On Tue, Nov 29, 2011 at 1:27 PM, Sylvain Lebresne wrote:
> > So 1.0.4 was actually pretty catastrophic. CASSANDRA-3540 is a clear blocker
> > and CASSANDRA-3539 is critical too. For now, we've pulled the plug on 1.0.4 by
> > removing it from the website (and I broke the debian upgrade to 1.0.4), but we
> > need a fix asap.
> >
> > I've taken the 1.0.4 artifacts and on top of that I've:
> >  - reverted CASSANDRA-3407 (to 'fix' CASSANDRA-3540)
> >  - committed CASSANDRA-3539
> > I thus propose the following artifacts for release as 1.0.5.
> >
> > SVN:
> > https://svn.apache.org/repos/asf/cassandra/branches/cassandra-1.0.5@1208016
> > Artifacts:
> > https://repository.apache.org/content/repositories/orgapachecassandra-269/org/apache/cassandra/apache-cassandra/1.0.5/
> > Staging repository:
> > https://repository.apache.org/content/repositories/orgapachecassandra-269/
> >
> > The artifacts as well as the debian package are also available here:
> > http://people.apache.org/~slebresne/
> >
> > Given the situation, and given that the proposed artifacts differs only
> > slightly from the one of 1.0.4, I propose an expedited vote of 24 hours
> > (longer if needed).
>
> I don't think it's anything new, but in addition to the previously
> discussed ConsistencyLevelTest, I get the occasional failure of
> CompactionsTest due to a timeout.  It is pretty long running, maybe we
> just need to extend the timeout.
>
> Other than that, +1
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
>



-- 
http://twitter.com/tjake


Re: [VOTE] Release Apache Cassandra 1.0.5

2011-11-30 Thread Gary Dusbabek
+1
 On Nov 29, 2011 1:28 PM, "Sylvain Lebresne"  wrote:

> So 1.0.4 was actually pretty catastrophic. CASSANDRA-3540 is a clear blocker
> and CASSANDRA-3539 is critical too. For now, we've pulled the plug on 1.0.4 by
> removing it from the website (and I broke the debian upgrade to 1.0.4), but we
> need a fix asap.
>
> I've taken the 1.0.4 artifacts and on top of that I've:
>  - reverted CASSANDRA-3407 (to 'fix' CASSANDRA-3540)
>  - committed CASSANDRA-3539
> I thus propose the following artifacts for release as 1.0.5.
>
> SVN:
> https://svn.apache.org/repos/asf/cassandra/branches/cassandra-1.0.5@1208016
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-269/org/apache/cassandra/apache-cassandra/1.0.5/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-269/
>
> The artifacts as well as the debian package are also available here:
> http://people.apache.org/~slebresne/
>
> Given the situation, and given that the proposed artifacts differs only
> slightly from the one of 1.0.4, I propose an expedited vote of 24 hours
> (longer if needed).
>
> [1]: http://goo.gl/YAgqE (CHANGES.txt)
> [2]: http://goo.gl/BsBRE (NEWS.txt)
>


Re: [VOTE] Release Apache Cassandra 1.0.5

2011-11-30 Thread Brandon Williams
+1

On Tue, Nov 29, 2011 at 1:27 PM, Sylvain Lebresne  wrote:
> So 1.0.4 was actually pretty catastrophic. CASSANDRA-3540 is a clear blocker
> and CASSANDRA-3539 is critical too. For now, we've pulled the plug on 1.0.4 by
> removing it from the website (and I broke the debian upgrade to 1.0.4), but we
> need a fix asap.
>
> I've taken the 1.0.4 artifacts and on top of that I've:
>  - reverted CASSANDRA-3407 (to 'fix' CASSANDRA-3540)
>  - committed CASSANDRA-3539
> I thus propose the following artifacts for release as 1.0.5.
>
> SVN: 
> https://svn.apache.org/repos/asf/cassandra/branches/cassandra-1.0.5@1208016
> Artifacts: 
> https://repository.apache.org/content/repositories/orgapachecassandra-269/org/apache/cassandra/apache-cassandra/1.0.5/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-269/
>
> The artifacts as well as the debian package are also available here:
> http://people.apache.org/~slebresne/
>
> Given the situation, and given that the proposed artifacts differs only
> slightly from the one of 1.0.4, I propose an expedited vote of 24 hours
> (longer if needed).
>
> [1]: http://goo.gl/YAgqE (CHANGES.txt)
> [2]: http://goo.gl/BsBRE (NEWS.txt)
>


RE: [VOTE] Release Apache Cassandra 1.0.5

2011-11-30 Thread declan
PLEASE take me off this email distribution list. Thanks

All the Best
Declan

-----Original Message-----
From: Jake Luciani [mailto:jak...@gmail.com] 
Sent: Wednesday, November 30, 2011 11:00 AM
To: dev@cassandra.apache.org
Subject: Re: [VOTE] Release Apache Cassandra 1.0.5

+1

On Wed, Nov 30, 2011 at 10:55 AM, Eric Evans  wrote:

> On Tue, Nov 29, 2011 at 1:27 PM, Sylvain Lebresne wrote:
> > So 1.0.4 was actually pretty catastrophic. CASSANDRA-3540 is a clear blocker
> > and CASSANDRA-3539 is critical too. For now, we've pulled the plug on 1.0.4 by
> > removing it from the website (and I broke the debian upgrade to 1.0.4), but we
> > need a fix asap.
> >
> > I've taken the 1.0.4 artifacts and on top of that I've:
> >  - reverted CASSANDRA-3407 (to 'fix' CASSANDRA-3540)
> >  - committed CASSANDRA-3539
> > I thus propose the following artifacts for release as 1.0.5.
> >
> > SVN:
> > https://svn.apache.org/repos/asf/cassandra/branches/cassandra-1.0.5@1208016
> > Artifacts:
> > https://repository.apache.org/content/repositories/orgapachecassandra-269/org/apache/cassandra/apache-cassandra/1.0.5/
> > Staging repository:
> > https://repository.apache.org/content/repositories/orgapachecassandra-269/
> >
> > The artifacts as well as the debian package are also available here:
> > http://people.apache.org/~slebresne/
> >
> > Given the situation, and given that the proposed artifacts differs only
> > slightly from the one of 1.0.4, I propose an expedited vote of 24 hours
> > (longer if needed).
>
> I don't think it's anything new, but in addition to the previously
> discussed ConsistencyLevelTest, I get the occasional failure of
> CompactionsTest due to a timeout.  It is pretty long running, maybe we
> just need to extend the timeout.
>
> Other than that, +1
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
>



-- 
http://twitter.com/tjake



Re: [VOTE] Release Apache Cassandra 1.0.5

2011-11-30 Thread Eric Evans
On Wed, Nov 30, 2011 at 10:06 AM, declan  wrote:
> PLEASE take me off this email distribution listthanks

http://threepanelsoul.com/2008/12/02/on-sea-lions/

> -----Original Message-----
> From: Jake Luciani [mailto:jak...@gmail.com]
> Sent: Wednesday, November 30, 2011 11:00 AM
> To: dev@cassandra.apache.org
> Subject: Re: [VOTE] Release Apache Cassandra 1.0.5
>
> +1
>
> On Wed, Nov 30, 2011 at 10:55 AM, Eric Evans  wrote:
>
>> On Tue, Nov 29, 2011 at 1:27 PM, Sylvain Lebresne wrote:
>> > So 1.0.4 was actually pretty catastrophic. CASSANDRA-3540 is a clear blocker
>> > and CASSANDRA-3539 is critical too. For now, we've pulled the plug on 1.0.4 by
>> > removing it from the website (and I broke the debian upgrade to 1.0.4), but we
>> > need a fix asap.
>> >
>> > I've taken the 1.0.4 artifacts and on top of that I've:
>> >  - reverted CASSANDRA-3407 (to 'fix' CASSANDRA-3540)
>> >  - committed CASSANDRA-3539
>> > I thus propose the following artifacts for release as 1.0.5.
>> >
>> > SVN:
>> > https://svn.apache.org/repos/asf/cassandra/branches/cassandra-1.0.5@1208016
>> > Artifacts:
>> > https://repository.apache.org/content/repositories/orgapachecassandra-269/org/apache/cassandra/apache-cassandra/1.0.5/
>> > Staging repository:
>> > https://repository.apache.org/content/repositories/orgapachecassandra-269/
>> >
>> > The artifacts as well as the debian package are also available here:
>> > http://people.apache.org/~slebresne/
>> >
>> > Given the situation, and given that the proposed artifacts differs only
>> > slightly from the one of 1.0.4, I propose an expedited vote of 24 hours
>> > (longer if needed).
>>
>> I don't think it's anything new, but in addition to the previously
>> discussed ConsistencyLevelTest, I get the occasional failure of
>> CompactionsTest due to a timeout.  It is pretty long running, maybe we
>> just need to extend the timeout.
>>
>> Other than that, +1
>>
>> --
>> Eric Evans
>> Acunu | http://www.acunu.com | @acunu
>>
>
>
>
> --
> http://twitter.com/tjake
>
>



-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Build failed in Jenkins: Cassandra-quick #146

2011-11-30 Thread Apache Jenkins Server
See 

Changes:

[brandonwilliams] Bulk loader is no longer a fat client, hadoop bulk loader output format.
Patch by brandonwilliams, reviewed by Yuki Morishita for CASSANDRA-3045

--
[...truncated 1633 lines...]
[junit] Testsuite: org.apache.cassandra.locator.OldNetworkTopologyStrategyTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.122 sec
[junit] 
[junit] Testsuite: org.apache.cassandra.locator.ReplicationStrategyEndpointCacheTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.397 sec
[junit] 
[junit] Testsuite: org.apache.cassandra.locator.SimpleStrategyTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.83 sec
[junit] 
[junit] Testsuite: org.apache.cassandra.locator.TokenMetadataTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.349 sec
[junit] 
[junit] Testsuite: org.apache.cassandra.service.AntiEntropyServiceCounterTest
[junit] Tests run: 6, Failures: 0, Errors: 1, Time elapsed: 2.234 sec
[junit] 
[junit] Testcase: testValidatorPrepare(org.apache.cassandra.service.AntiEntropyServiceCounterTest): Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 is in use by another process.  Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
[junit] at org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:261)
[junit] at org.apache.cassandra.net.MessagingService.listen(MessagingService.java:231)
[junit] at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:484)
[junit] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:461)
[junit] at org.apache.cassandra.service.AntiEntropyServiceTestAbstract.prepare(AntiEntropyServiceTestAbstract.java:80)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.AntiEntropyServiceCounterTest FAILED
[junit] Testsuite: org.apache.cassandra.service.AntiEntropyServiceStandardTest
[junit] Tests run: 6, Failures: 0, Errors: 1, Time elapsed: 2.251 sec
[junit] 
[junit] Testcase: testValidatorPrepare(org.apache.cassandra.service.AntiEntropyServiceStandardTest): Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 is in use by another process.  Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
[junit] at org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:261)
[junit] at org.apache.cassandra.net.MessagingService.listen(MessagingService.java:231)
[junit] at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:484)
[junit] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:461)
[junit] at org.apache.cassandra.service.AntiEntropyServiceTestAbstract.prepare(AntiEntropyServiceTestAbstract.java:80)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.AntiEntropyServiceStandardTest FAILED
[junit] Testsuite: org.apache.cassandra.service.CassandraServerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.524 sec
[junit] 
[junit] Testsuite: org.apache.cassandra.service.ConsistencyLevelTest
[junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 1.046 sec
[junit] 
[junit] ------------- Standard Error -----------------
[junit] ERROR 17:16:12,964 unknown endpoint /127.0.0.2
[junit] ERROR 17:16:12,968 unknown endpoint /127.0.0.2
[junit] ------------- ---------------- ---------------
[junit] Testcase: testReadWriteConsistencyChecks(org.apache.cassandra.service.ConsistencyLevelTest): FAILED
[junit] 
[junit] junit.framework.AssertionFailedError: 
[junit] at org.apache.cassandra.service.ConsistencyLevelTest.testReadWriteConsistencyChecks(ConsistencyLevelTest.java:165)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.ConsistencyLevelTest FAILED
[junit] Testsuite: org.apache.cassandra.service.EmbeddedCassandraServiceTest
[junit] Testsuite: org.apache.cassandra.service.EmbeddedCassandraServiceTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] 
[junit] Testcase: org.apache.cassandra.service.EmbeddedCassandraServiceTest:BeforeFirstTest: Caused an ERROR
[junit] Forked Java VM exited abnormally
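
The ConfigurationException above says that /127.0.0.1:7010 was already bound when the AntiEntropyService tests tried to listen on it. The log does not say what was holding the port. As a minimal sketch, a build host could be probed before a test run with something like the following. The helper name `port_is_free` is hypothetical and not part of Cassandra or Jenkins, and the check is inherently racy, since another process can still grab the port between the probe and the test start.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently be bound to host:port.

    Hypothetical helper for illustration only; it is not part of
    Cassandra, Ant, or Jenkins.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds it.
            return False
        return True


if __name__ == "__main__":
    # 7010 is the storage port named in the error message above.
    if port_is_free("127.0.0.1", 7010):
        print("127.0.0.1:7010 is free")
    else:
        print("127.0.0.1:7010 is in use by another process")
```

A result of "in use" on an otherwise idle build slave would suggest a leftover JVM from an earlier run, but that is only one possible explanation.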

Distributed testing

2011-11-30 Thread Sylvain Lebresne
Cassandra needs distributed regression testing. The current unit tests and
so-called 'system' tests are great, but limited in scope for a distributed
system.  As others pointed out very recently, this can only help towards the
goal of having bullet-proof releases (this is obviously not enough, but it's
needed all the same).

We have an attempt at distributed tests in-tree but it never caught on.
Among the reasons is probably the fact that they use whirr and
thus require an EC2 or Rackspace account.

To try to solve that problem, DataStax has developed a small distributed test
framework called cassandra-dtest (dtest for short). It is written in Python
and uses nosetests and ccm (https://github.com/pcmanus/ccm). It is
open-source and available at https://github.com/riptano/cassandra-dtest. The
number of tests is still limited but already far exceeds that of the in-tree
distributed tests. It includes in particular multi-DC tests and upgrade tests.

This is very much young code and all input will be greatly appreciated. The
clear goal however is that, unless someone has anything better to propose, we
use this as the default distributed test framework for Cassandra, which means
getting into the habit of requiring a regression test (unit or distributed)
with each bug fix.

--
Sylvain


Re: Distributed testing

2011-11-30 Thread Zach Richardson
One thing that I think would help with this, that I noticed when
wanting to shoot myself in the head when trying to write unit tests to
test anything, is getting rid of the ridiculously large number of
static classes.  I understand this would be a very large effort (and
would make the code more verbose) since the dependencies on accessing
them are littered throughout the entire codebase.  It would make it
much easier to create mock objects for dependency injection so that
you can more thoroughly test individual components without having to
run it in a distributed environment to even know if it is working.

This is just my 2-cents, and if there is an interest in refactoring to
remove some of the static classes (beyond my own interest), I would be
more than happy to help.

Zach
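
To illustrate the point about static classes and mock objects, here is a small sketch in Python (chosen only because dtest is Python; Cassandra itself is Java). Every name in it (`GlobalSettings`, `EndpointDescriber`, `storage_port`) is hypothetical and does not come from the Cassandra codebase. It contrasts a function that reads global state directly with a class that receives its dependency from the caller.

```python
from unittest import mock


class GlobalSettings:
    """Stand-in for a static class that callers reach directly (hypothetical)."""
    storage_port = 7000


def describe_endpoint_static() -> str:
    # Hard-wired lookup: a test can only change the result by patching
    # process-wide state, which leaks into every other test.
    return f"127.0.0.1:{GlobalSettings.storage_port}"


class EndpointDescriber:
    """Same logic, but the settings object is handed in by the caller."""

    def __init__(self, settings):
        self.settings = settings

    def describe(self) -> str:
        return f"127.0.0.1:{self.settings.storage_port}"


if __name__ == "__main__":
    # A mock replaces the real settings without touching global state.
    fake_settings = mock.Mock(storage_port=7010)
    print(EndpointDescriber(fake_settings).describe())
    print(describe_endpoint_static())
```

With the injected version, a unit test can exercise one component against a fake collaborator, which is the kind of isolation the message above is asking for. The cost is the extra constructor plumbing it mentions.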

On Wed, Nov 30, 2011 at 11:16 AM, Sylvain Lebresne  wrote:
> Cassandra needs distributed regression testing. The current unit tests and
> so-called 'system' tests are great, but limited in scope for a distributed
> system.  As pointed by others very recently, this can only help towards the
> goal of having bullet-proof releases (this is obviously not enough, but it's
> needed all the same).
>
> We have an attempt at distributed tests in-tree but it never caught up.
> Amongst the reason, is probably the fact that they are using whirr and
> thus require an EC2 or Rackspace account.
>
> To try to solve that problem, Datastax has developed a small distributed test
> framework called cassandra-dtest (dtest for short). It is written in python
> and uses nosetests and ccm (https://github.com/pcmanus/ccm). It is
> open-source and available at https://github.com/riptano/cassandra-dtest. The
> number of tests is yet limited but already far exceed those of the in-tree
> distributed tests. It include in particular multi-DC tests and upgrade tests.
>
> This is very much young code and all inputs will be greatly appreciated. The
> clear goal however is that unless someone has anything better to propose, we
> use this as the default distributed test framework for Cassandra. Which means
> getting into the habit of requiring a regression test (unit or distributed)
> with each bug fix.
>
> --
> Sylvain
>


Re: Discussion: release quality

2011-11-30 Thread Jonathan Ellis
There's a couple more points to be made here, I think.

First, we've also gone to a great deal of effort to make upgrading
seamless, and we recently (1.0.3) added support for seamless
downgrading as well.  Anyone with a staging cluster (which should be
everyone) can drop 1.0.4 on a single node, see if there's any
problems, and roll back to 1.0.3 if there are.  Which is, as near as I
can tell, what happened.  Granted, it's always better to not release
bugs at all, but it happens to *everyone*, so defense in depth is a
Good Thing.

Second, I don't think you're going to be able to mandate more
prerelease community testing.  It's kind of a law of nature that the
bulk of testing happens post-release.  Whether in databases, web
frameworks, or OS kernels, new projects or mature, you see the same
pattern everywhere.  (A big thank you to everyone who *did* test
prerelease 1.0.x artifacts, you guys are awesome!)

IMO the best we can do is get more automated coverage of the
distributed side of things.  We've had a framework for this in-tree
for a while but it's so incredibly painful to actually write tests for
that we only have a handful.  DataStax has been working on a next-gen
dtest framework to improve this situation -- Sylvain just posted about
that, so I'll defer to that thread now.

On Tue, Nov 29, 2011 at 5:16 PM, Jeremy Hanna
 wrote:
> I'd like to start a discussion about ideas to improve release quality for 
> Cassandra.  Specifically I wonder if the community can do more to help the 
> project as a whole become more solid.  Cassandra has an active and vibrant 
> community using Cassandra for a variety of things.  If we all pitch in a 
> little bit, it seems like we can make a difference here.
>
> Release quality is difficult, especially for a distributed system like 
> Cassandra.  The core devs have done an amazing job with this considering how 
> complicated it is.  Currently, there are several things in place to make sure 
> that a release is generally usable:
> - review-then-commit
> - 72 hour voting period
> - at least 3 binding +1 votes
> - unit tests
> - integration tests
> Then there is the personal responsibility aspect - testing a release in a 
> staging environment before pushing it to production.
>
> I wonder if more could be done here to give more confidence in releases.  I 
> wanted to see if there might be ways that the community could help out 
> without being too burdensome on either the core devs or the community.
>
> Some ideas:
> More automation: run YCSB and stress with various setups.  Maybe people can 
> rotate donating cloud instances (or simply money for them) but have a common 
> set of scripts to do this in the source.
>
> Dedicated distributed test suite: I know there has been work done on various 
> distributed test suites (which is great!) but none have really caught on so 
> far.
>
> I know what the apache guidelines say, but what if the community could help 
> out with the testing effort in a more formal way.  For example, for each 
> release to be finalized, what if there needed to be 3 community members that 
> needed to try it out in their own environment?
>
> What if there was a post release +1 vote for the community to sign off on - 
> sort of a "works for me" kind of thing to reassure others that it's safe to 
> try.  So when the release email gets posted to the user list, start a 
> tradition of people saying +1 in reply if they've tested it out and it works 
> for them.  That's happening informally now when there are problems, but it 
> might be nice to see a vote of confidence.  Just another idea.
>
> Any other ideas or variations?



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: [VOTE] Release Apache Cassandra 1.0.5

2011-11-30 Thread Sylvain Lebresne
I'm counting 5 binding +1's and no -1's, the vote passes.
I'll get the artifacts published.

ps: I know it's been more like 23 hours but I want to get this out before it's
too late here and I doubt one more hour would change anything here.

--
Sylvain
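
For readers unfamiliar with how such a tally works, the following is a rough sketch of the counting rule, not the project's actual tooling. It assumes the usual Apache release-vote convention: at least three binding +1 votes (as listed earlier in this digest) and more binding +1 than binding -1 votes. The second condition is taken from the general Apache voting guidelines rather than from this thread. `Vote` and `release_vote_passes` are hypothetical names, and the voters are placeholders.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int      # +1, 0, or -1
    binding: bool   # True when the voter holds a binding vote


def release_vote_passes(votes: Iterable[Vote],
                        min_binding_plus_ones: int = 3) -> bool:
    """Apply the assumed rule: >= 3 binding +1s and more binding +1s than -1s."""
    binding_values = [v.value for v in votes if v.binding]
    plus = sum(1 for value in binding_values if value > 0)
    minus = sum(1 for value in binding_values if value < 0)
    return plus >= min_binding_plus_ones and plus > minus


if __name__ == "__main__":
    # Five binding +1s and no -1s, matching the count reported above.
    tally = [Vote(f"voter-{i}", +1, True) for i in range(1, 6)]
    print(release_vote_passes(tally))
```

Non-binding votes are ignored by the function, which mirrors the distinction made earlier in the digest between community feedback and binding votes.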

On Wed, Nov 30, 2011 at 5:10 PM, Eric Evans  wrote:
> On Wed, Nov 30, 2011 at 10:06 AM, declan  wrote:
>> PLEASE take me off this email distribution listthanks
>
> http://threepanelsoul.com/2008/12/02/on-sea-lions/
>
>> -----Original Message-----
>> From: Jake Luciani [mailto:jak...@gmail.com]
>> Sent: Wednesday, November 30, 2011 11:00 AM
>> To: dev@cassandra.apache.org
>> Subject: Re: [VOTE] Release Apache Cassandra 1.0.5
>>
>> +1
>>
>> On Wed, Nov 30, 2011 at 10:55 AM, Eric Evans  wrote:
>>
>>> On Tue, Nov 29, 2011 at 1:27 PM, Sylvain Lebresne wrote:
>>> > So 1.0.4 was actually pretty catastrophic. CASSANDRA-3540 is a clear blocker
>>> > and CASSANDRA-3539 is critical too. For now, we've pulled the plug on 1.0.4 by
>>> > removing it from the website (and I broke the debian upgrade to 1.0.4), but we
>>> > need a fix asap.
>>> >
>>> > I've taken the 1.0.4 artifacts and on top of that I've:
>>> >  - reverted CASSANDRA-3407 (to 'fix' CASSANDRA-3540)
>>> >  - committed CASSANDRA-3539
>>> > I thus propose the following artifacts for release as 1.0.5.
>>> >
>>> > SVN:
>>> > https://svn.apache.org/repos/asf/cassandra/branches/cassandra-1.0.5@1208016
>>> > Artifacts:
>>> > https://repository.apache.org/content/repositories/orgapachecassandra-269/org/apache/cassandra/apache-cassandra/1.0.5/
>>> > Staging repository:
>>> > https://repository.apache.org/content/repositories/orgapachecassandra-269/
>>> >
>>> > The artifacts as well as the debian package are also available here:
>>> > http://people.apache.org/~slebresne/
>>> >
>>> > Given the situation, and given that the proposed artifacts differs only
>>> > slightly from the one of 1.0.4, I propose an expedited vote of 24 hours
>>> > (longer if needed).
>>>
>>> I don't think it's anything new, but in addition to the previously
>>> discussed ConsistencyLevelTest, I get the occasional failure of
>>> CompactionsTest due to a timeout.  It is pretty long running, maybe we
>>> just need to extend the timeout.
>>>
>>> Other than that, +1
>>>
>>> --
>>> Eric Evans
>>> Acunu | http://www.acunu.com | @acunu
>>>
>>
>>
>>
>> --
>> http://twitter.com/tjake
>>
>>
>
>
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu


Build failed in Jenkins: Cassandra #1228

2011-11-30 Thread Apache Jenkins Server
See 

Changes:

[brandonwilliams] Bulk loader is no longer a fat client, hadoop bulk loader output format.
Patch by brandonwilliams, reviewed by Yuki Morishita for CASSANDRA-3045

--
[...truncated 2301 lines...]
[junit] Testsuite: org.apache.cassandra.locator.ReplicationStrategyEndpointCacheTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.402 sec
[junit] 
[junit] Testsuite: org.apache.cassandra.locator.SimpleStrategyTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.816 sec
[junit] 
[junit] Testsuite: org.apache.cassandra.locator.TokenMetadataTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.344 sec
[junit] 
[junit] Testsuite: org.apache.cassandra.service.AntiEntropyServiceCounterTest
[junit] Tests run: 6, Failures: 0, Errors: 1, Time elapsed: 2.081 sec
[junit] 
[junit] Testcase: testValidatorPrepare(org.apache.cassandra.service.AntiEntropyServiceCounterTest): Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 is in use by another process.  Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
[junit] at org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:261)
[junit] at org.apache.cassandra.net.MessagingService.listen(MessagingService.java:231)
[junit] at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:484)
[junit] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:461)
[junit] at org.apache.cassandra.service.AntiEntropyServiceTestAbstract.prepare(AntiEntropyServiceTestAbstract.java:80)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.AntiEntropyServiceCounterTest FAILED
[junit] Testsuite: org.apache.cassandra.service.AntiEntropyServiceStandardTest
[junit] Tests run: 6, Failures: 0, Errors: 1, Time elapsed: 2.238 sec
[junit] 
[junit] Testcase: testValidatorPrepare(org.apache.cassandra.service.AntiEntropyServiceStandardTest): Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 is in use by another process.  Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
[junit] at org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:261)
[junit] at org.apache.cassandra.net.MessagingService.listen(MessagingService.java:231)
[junit] at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:484)
[junit] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:461)
[junit] at org.apache.cassandra.service.AntiEntropyServiceTestAbstract.prepare(AntiEntropyServiceTestAbstract.java:80)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.AntiEntropyServiceStandardTest FAILED
[junit] Testsuite: org.apache.cassandra.service.CassandraServerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.522 sec
[junit] 
[junit] Testsuite: org.apache.cassandra.service.ConsistencyLevelTest
[junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 1.019 sec
[junit] 
[junit] ------------- Standard Error -----------------
[junit] ERROR 18:06:04,352 unknown endpoint /127.0.0.2
[junit] ERROR 18:06:04,357 unknown endpoint /127.0.0.2
[junit] ------------- ---------------- ---------------
[junit] Testcase: testReadWriteConsistencyChecks(org.apache.cassandra.service.ConsistencyLevelTest): FAILED
[junit] 
[junit] junit.framework.AssertionFailedError: 
[junit] at org.apache.cassandra.service.ConsistencyLevelTest.testReadWriteConsistencyChecks(ConsistencyLevelTest.java:165)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.ConsistencyLevelTest FAILED
[junit] Testsuite: org.apache.cassandra.service.EmbeddedCassandraServiceTest
[junit] Testsuite: org.apache.cassandra.service.EmbeddedCassandraServiceTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] 
[junit] Testcase: org.apache.cassandra.service.EmbeddedCassandraServiceTest:BeforeFirstTest: Caused an ERROR
[junit] Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
[junit] junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please 

Jenkins build is still unstable: Cassandra-Coverage #178

2011-11-30 Thread Apache Jenkins Server
See