Re: 1.1 freeze approaching

2011-12-20 Thread Jeremy Hanna
I like this part of that thread (w00t for a distributed test suite):

> # Automate all tests
>
> I think the only way that we can keep people close to trunk and stay
> stable is to build automated tests for *everything*. All code should
> be exercised by thorough unit tests and distributed black-box tests.
> Every regression should get a test.

Agreed.

On Dec 20, 2011, at 1:16 AM, Brandon Williams wrote:

> On Tue, Dec 20, 2011 at 12:45 AM, Radim Kolar  wrote:
>> Can you make the release cycle slower? It's better to have more new features and
>> do major upgrades less often. It saves time needed for testing and
>> migrations.
> 
> http://www.mail-archive.com/dev@cassandra.apache.org/msg01549.html
> 
> -Brandon



major version release schedule

2011-12-20 Thread Radim Kolar

http://www.mail-archive.com/dev@cassandra.apache.org/msg01549.html

I read it, but things are different now because the magic 1.0 is out. If you
implement 1.0 and put it into production, you really do not want to
retest your app on a new version every 4 months, and it's unlikely that you will
get a migration approved by management unless you present clear benefits
for such a migration. Compression was a nice new feature of 1.0, but it was
rejected by a lot of IT managers as "too risky" for now.


While you can test an application quite easily, testing cluster stability
is way harder in a test environment because it's not usually possible to
fully replicate workload and data volume there, and
migrating back is difficult because Cassandra currently does not have
a tool for fast sstable downgrade (1.0 -> 0.8).


For production use, a long time between major releases is better. I would
double the time between major releases, maybe not for 1.1/1.2 but later for
sure. Take a look at the PostgreSQL project: they release 1 major version per
year and they support 4 major versions with bugfixes, and older PostgreSQL
versions are still common in production.


Did you ask people running mission-critical workloads for their
opinion? Another possibility is to use an ISV like DataStax to provide
long-term support.


Re: major version release schedule

2011-12-20 Thread Jonathan Ellis
Nobody's forcing you to upgrade.  If you want twice as much time
between upgrading, just wait for 1.2.  In the meantime, people who
need the features in 1.1 also get those early (no, running trunk in
production isn't a serious option).  I don't see any real benefit for
you in forcing your preference on everyone, and I see a big negative
for some.

It's also worth noting that waiting for 2x as many features for freeze
will result in MORE than 2x as much complexity for tracking down
regressions.  Given the limited testing we get during freeze, I think
that's a pretty strong argument in favor of more-frequent, smaller
releases.

On Tue, Dec 20, 2011 at 7:42 AM, Radim Kolar  wrote:
> http://www.mail-archive.com/dev@cassandra.apache.org/msg01549.html
>
> I read it, but things are different now because the magic 1.0 is out. If you
> implement 1.0 and put it into production, you really do not want to retest
> your app on a new version every 4 months, and it's unlikely that you will get
> a migration approved by management unless you present clear benefits for such
> a migration. Compression was a nice new feature of 1.0, but it was rejected by
> a lot of IT managers as "too risky" for now.
>
> While you can test an application quite easily, testing cluster stability is
> way harder in a test environment because it's not usually possible to fully
> replicate workload and data volume there, and migrating back is
> difficult because Cassandra currently does not have a tool for fast sstable
> downgrade (1.0 -> 0.8).
>
> For production use, a long time between major releases is better. I would
> double the time between major releases, maybe not for 1.1/1.2 but later for
> sure. Take a look at the PostgreSQL project: they release 1 major version per
> year and they support 4 major versions with bugfixes, and older PostgreSQL
> versions are still common in production.
>
> Did you ask people running mission-critical workloads for their opinion?
> Another possibility is to use an ISV like DataStax to provide long-term
> support.



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: major version release schedule

2011-12-20 Thread Tatu Saloranta
On Tue, Dec 20, 2011 at 6:16 AM, Jonathan Ellis  wrote:
> Nobody's forcing you to upgrade.  If you want twice as much time
> between upgrading, just wait for 1.2.  In the meantime, people who
> need the features in 1.1 also get those early (no, running trunk in
> production isn't a serious option).  I don't see any real benefit for
> you in forcing your preference on everyone, and I see a big negative
> for some.
>
> It's also worth noting that waiting for 2x as many features for freeze
> will result in MORE than 2x as much complexity for tracking down
> regressions.  Given the limited testing we get during freeze, I think
> that's a pretty strong argument in favor of more-frequent, smaller
> releases.

+1. I really don't see why anyone would feel forced to upgrade just
because a new version is available.

-+ Tatu +-


Re: 1.1 freeze approaching

2011-12-20 Thread Jeremiah Jordan
Unless you need the new features, you don't need to upgrade.  And the current
version won't stop getting updates.  As mentioned in the thread where the
project moved to a 4 month major version cycle, smaller changes between major
versions mean they will be more stable.  Even with the smaller cycle it takes
a bit to get all the kinks out of new features (see all the 1.0.X releases).  If
the release cycle is longer, you have even longer to wait for the new release
to really be stable, like you did with 0.7, where it took 5-6 months.

-Jeremiah

On Dec 20, 2011, at 12:45 AM, Radim Kolar wrote:

> 
>> Just a reminder that for us to meet our four-month major release
>> schedule (i.e., 1.1 = Feb 18),
> Can you make the release cycle slower? It's better to have more new features and
> do major upgrades less often. It saves time needed for testing and migrations.



Re: major version release schedule

2011-12-20 Thread Peter Schuller
Here is another thing to consider: There is considerable cost involved
in running/developing on old branches as the divergence between the
version you're running and trunk increases.

For those actively doing development, such divergence actually causes
extra work and slows down development.

A more reasonable approach IMO is to make sure that important
*bugfixes* are backported to branches that are sufficiently old to
satisfy the criteria of the OP. But that is orthogonal to how often
new releases happen.

The OP compares with PostgreSQL, and they're in a similar position.
You can run on a fairly old version and still get critical bug fixes,
meaning that if you don't actually need the new version there is no
one telling you that you must upgrade.

It seems to me that what matters are mostly two things:

(1) When you *do* need/want to upgrade, the upgrade path you care
about should be stable and working, and the version you're upgrading to
should be stable.
(2) Critical fixes still need to be maintained for the version you're
running (else you are in fact kind of forced to upgrade).

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Build failed in Jenkins: Cassandra-quick #186

2011-12-20 Thread Apache Jenkins Server
See 

Changes:

[jbellis] merge #3335 from 1.0

--
[...truncated 2166 lines...]
[junit] at 
org.apache.cassandra.service.StorageService.initClient(StorageService.java:380)
[junit] at 
org.apache.cassandra.service.InitClientTest.testInitClientStartup(InitClientTest.java:33)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.InitClientTest FAILED
[junit] Testsuite: org.apache.cassandra.service.LeaveAndBootstrapTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 2.133 sec
[junit] 
[junit] - Standard Error -
[junit]  WARN 20:53:22,530 Overriding RING_DELAY to 1000ms
[junit]  WARN 20:53:24,240 Node /127.0.0.3 'leaving' token mismatch. Long 
network partition?
[junit] -  ---
[junit] Testsuite: org.apache.cassandra.service.MoveTest
[junit] Testsuite: org.apache.cassandra.service.MoveTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] 
[junit] Testcase: org.apache.cassandra.service.MoveTest:BeforeFirstTest:
Caused an ERROR
[junit] Timeout occurred. Please note the time in the report does not 
reflect the time until the timeout.
[junit] junit.framework.AssertionFailedError: Timeout occurred. Please note 
the time in the report does not reflect the time until the timeout.
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.MoveTest FAILED (timeout)
[junit] Testsuite: org.apache.cassandra.service.RemoveTest
[junit] Tests run: 3, Failures: 0, Errors: 3, Time elapsed: 1.204 sec
[junit] 
[junit] - Standard Error -
[junit]  WARN 20:54:25,603 Overriding RING_DELAY to 1000ms
[junit] -  ---
[junit] Testcase: testBadToken(org.apache.cassandra.service.RemoveTest):
Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change 
listen_address:storage_port in cassandra.yaml to values that do not conflict 
with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 
is in use by another process.  Change listen_address:storage_port in 
cassandra.yaml to values that do not conflict with other services
[junit] at 
org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:279)
[junit] at 
org.apache.cassandra.net.MessagingService.listen(MessagingService.java:249)
[junit] at 
org.apache.cassandra.service.RemoveTest.setup(RemoveTest.java:71)
[junit] 
[junit] 
[junit] Testcase: testLocalToken(org.apache.cassandra.service.RemoveTest):  
Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change 
listen_address:storage_port in cassandra.yaml to values that do not conflict 
with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 
is in use by another process.  Change listen_address:storage_port in 
cassandra.yaml to values that do not conflict with other services
[junit] at 
org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:279)
[junit] at 
org.apache.cassandra.net.MessagingService.listen(MessagingService.java:249)
[junit] at 
org.apache.cassandra.service.RemoveTest.setup(RemoveTest.java:71)
[junit] 
[junit] 
[junit] Testcase: testRemoveToken(org.apache.cassandra.service.RemoveTest): 
Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change 
listen_address:storage_port in cassandra.yaml to values that do not conflict 
with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 
is in use by another process.  Change listen_address:storage_port in 
cassandra.yaml to values that do not conflict with other services
[junit] at 
org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:279)
[junit] at 
org.apache.cassandra.net.MessagingService.listen(MessagingService.java:249)
[junit] at 
org.apache.cassandra.service.RemoveTest.setup(RemoveTest.java:71)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.RemoveTest FAILED
[junit] Testsuite: org.apache.cassandra.service.RowResolverTest
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.506 sec
[junit] 
[junit] - Standard Error -
[junit]  WARN 20:54:27,603 Overriding RING_DELAY to 1000ms
[junit] -  ---
[junit] Testsuite: org.apache.cassandra.service.SerializationsTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.285 sec
[junit] 
[junit] - Standard Error -
[junit]  WARN 20:54:28,747 Overriding RING_DELAY to 1000ms
[junit] - ---

Build failed in Jenkins: Cassandra #1263

2011-12-20 Thread Apache Jenkins Server
See 

Changes:

[jbellis] merge #3335 from 1.0

--
[...truncated 2827 lines...]
[junit] at 
org.apache.cassandra.service.InitClientTest.testInitClientStartup(InitClientTest.java:33)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.InitClientTest FAILED
[junit] Testsuite: org.apache.cassandra.service.LeaveAndBootstrapTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 2.254 sec
[junit] 
[junit] - Standard Error -
[junit]  WARN 21:18:44,585 Overriding RING_DELAY to 1000ms
[junit]  WARN 21:18:46,415 Node /127.0.0.3 'leaving' token mismatch. Long 
network partition?
[junit] -  ---
[junit] Testsuite: org.apache.cassandra.service.MoveTest
[junit] Testsuite: org.apache.cassandra.service.MoveTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] 
[junit] Testcase: org.apache.cassandra.service.MoveTest:BeforeFirstTest:
Caused an ERROR
[junit] Timeout occurred. Please note the time in the report does not 
reflect the time until the timeout.
[junit] junit.framework.AssertionFailedError: Timeout occurred. Please note 
the time in the report does not reflect the time until the timeout.
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.MoveTest FAILED (timeout)
[junit] Testsuite: org.apache.cassandra.service.RemoveTest
[junit] Tests run: 3, Failures: 0, Errors: 3, Time elapsed: 1.368 sec
[junit] 
[junit] - Standard Error -
[junit]  WARN 21:19:47,834 Overriding RING_DELAY to 1000ms
[junit] -  ---
[junit] Testcase: testBadToken(org.apache.cassandra.service.RemoveTest):
Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change 
listen_address:storage_port in cassandra.yaml to values that do not conflict 
with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 
is in use by another process.  Change listen_address:storage_port in 
cassandra.yaml to values that do not conflict with other services
[junit] at 
org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:279)
[junit] at 
org.apache.cassandra.net.MessagingService.listen(MessagingService.java:249)
[junit] at 
org.apache.cassandra.service.RemoveTest.setup(RemoveTest.java:71)
[junit] 
[junit] 
[junit] Testcase: testLocalToken(org.apache.cassandra.service.RemoveTest):  
Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change 
listen_address:storage_port in cassandra.yaml to values that do not conflict 
with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 
is in use by another process.  Change listen_address:storage_port in 
cassandra.yaml to values that do not conflict with other services
[junit] at 
org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:279)
[junit] at 
org.apache.cassandra.net.MessagingService.listen(MessagingService.java:249)
[junit] at 
org.apache.cassandra.service.RemoveTest.setup(RemoveTest.java:71)
[junit] 
[junit] 
[junit] Testcase: testRemoveToken(org.apache.cassandra.service.RemoveTest): 
Caused an ERROR
[junit] /127.0.0.1:7010 is in use by another process.  Change 
listen_address:storage_port in cassandra.yaml to values that do not conflict 
with other services
[junit] org.apache.cassandra.config.ConfigurationException: /127.0.0.1:7010 
is in use by another process.  Change listen_address:storage_port in 
cassandra.yaml to values that do not conflict with other services
[junit] at 
org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:279)
[junit] at 
org.apache.cassandra.net.MessagingService.listen(MessagingService.java:249)
[junit] at 
org.apache.cassandra.service.RemoveTest.setup(RemoveTest.java:71)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.service.RemoveTest FAILED
[junit] Testsuite: org.apache.cassandra.service.RowResolverTest
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.511 sec
[junit] 
[junit] - Standard Error -
[junit]  WARN 21:19:49,990 Overriding RING_DELAY to 1000ms
[junit] -  ---
[junit] Testsuite: org.apache.cassandra.service.SerializationsTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.284 sec
[junit] 
[junit] - Standard Error -
[junit]  WARN 21:19:51,143 Overriding RING_DELAY to 1000ms
[junit] -  ---
[junit] Testsuite: org.apache.cassandra.service.StorageProxyTest
[j

Re: major version release schedule

2011-12-20 Thread Radim Kolar



> Nobody's forcing you to upgrade.  If you want twice as much time
> between upgrading, just wait for 1.2.

Currently the 1.0 branch is still less stable than 0.8; I still get OOMs on
some nodes. Adding the 1.1 feature set on top will make it less stable.

> It's also worth noting that waiting for 2x as many features for freeze
> will result in MORE than 2x as much complexity for tracking down
> regressions.

Then make releases from 2 branches, stable and dev. It's a common practice
used in a lot of software projects.



> I really don't see why anyone would feel forced to upgrade just because a new
> version is available.

You are less likely to get bugfixes once a new branch comes out. Also, client
libraries need some time to catch up with a release.




CQL support for compound columns

2011-12-20 Thread Eric Evans
There has been a discussion taking place in CASSANDRA-2474[1]
regarding the language and semantics of compound columns in CQL.
Though the issue was only opened in July, and despite extended periods
of inactivity, it is monstrously long.  Additionally, the discussion
necessarily includes inline visual aids (tables, graphics, and
verbatim code snippets) that are constantly being revised, which only
compounds (pun intended) the problem.  I feel as though this is not
only making the discussion less constructive, but that it may be
scaring people off (and IMO, this issue could stand to be discussed
among a larger group anyway).

I propose two things, 1) that we move the discussion to this mailing
list, and 2) that we track the various approaches in the wiki.

For the latter of these, I've stubbed out a page[2] that would
hopefully serve as a starting point.

Thoughts?


[1]: https://issues.apache.org/jira/browse/CASSANDRA-2474
[2]: http://wiki.apache.org/cassandra/Cassandra2475

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: major version release schedule

2011-12-20 Thread Eric Evans
On Tue, Dec 20, 2011 at 8:16 AM, Jonathan Ellis  wrote:
> Nobody's forcing you to upgrade.  If you want twice as much time
> between upgrading, just wait for 1.2.  In the meantime, people who
> need the features in 1.1 also get those early (no, running trunk in
> production isn't a serious option).  I don't see any real benefit for
> you in forcing your preference on everyone, and I see a big negative
> for some.
>
> It's also worth noting that waiting for 2x as many features for freeze
> will result in MORE than 2x as much complexity for tracking down
> regressions.  Given the limited testing we get during freeze, I think
> that's a pretty strong argument in favor of more-frequent, smaller
> releases.

Until recently we were working hard to reach a set of goals that
culminated in a 1.0 release.  I'm not sure we've had a formal
discussion on it, but just talking to people, there seems to be
consensus around the idea that we're now shifting our goals and
priorities around some (usability, stability, etc).  If that's the
case, I think we should at least be open to reevaluating our release
process and schedule accordingly (whether that means lengthening,
shortening, and/or simply shifting the barrier-to-entry for stable
updates).

> On Tue, Dec 20, 2011 at 7:42 AM, Radim Kolar  wrote:
>> http://www.mail-archive.com/dev@cassandra.apache.org/msg01549.html
>>
>> I read it, but things are different now because the magic 1.0 is out. If you
>> implement 1.0 and put it into production, you really do not want to retest
>> your app on a new version every 4 months, and it's unlikely that you will get
>> a migration approved by management unless you present clear benefits for such
>> a migration. Compression was a nice new feature of 1.0, but it was rejected by
>> a lot of IT managers as "too risky" for now.

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: major version release schedule

2011-12-20 Thread Peter Schuller
> Until recently we were working hard to reach a set of goals that
> culminated in a 1.0 release.  I'm not sure we've had a formal
> discussion on it, but just talking to people, there seems to be
> consensus around the idea that we're now shifting our goals and
> priorities around some (usability, stability, etc).  If that's the
> case, I think we should at least be open to reevaluating our release
> process and schedule accordingly (whether that means lengthening,
> shortening, and/or simply shifting the barrier-to-entry for stable
> updates).

Personally I am all for added stability, quality, and testing. But I
don't see how a decreased release frequency will cause more stability.
It may be that decreased release frequency is the necessary *result*
of more stability, but I don't think the causality points in the other
direction unless developers ship things early to get it into the
release.

But also keep in mind: If we reach a point where major users of
Cassandra need to run on significantly divergent versions of Cassandra
because the release is just too old, the "normal" mainstream release
will end up getting even less testing.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Jenkins build is still unstable: Cassandra-Coverage #193

2011-12-20 Thread Apache Jenkins Server
See