[VOTE] Release Apache Cassandra 1.0.9
1.0.8 was released more than a month ago, we have made quite a few bug fixes since, and we don't have any major outstanding issues open. I thus propose the following artifacts for release as 1.0.9.

sha1: 4457839b9da623d9d4a090fa444614c35d39bb4c
Git: http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/1.0.9-tentative
Artifacts: https://repository.apache.org/content/repositories/orgapachecassandra-001/org/apache/cassandra/apache-cassandra/1.0.9/
Staging repository: https://repository.apache.org/content/repositories/orgapachecassandra-001/

The artifacts as well as the debian package are also available here:
http://people.apache.org/~slebresne/

The vote will be open for 72 hours (longer if needed).

[1]: http://goo.gl/CsEDg (CHANGES.txt)
[2]: http://goo.gl/4ByoR (NEWS.txt)
digest query: why relying on value?
Hello,

Why does the digest read response include a hash of the column value? Isn't the timestamp sufficient?

Maybe an answer: is the value hash computed to cope with the (I presume rare) race-condition scenario where two nodes end up with the same column name and the same column timestamp, but with a different column value? But then I wonder how to decide which value wins!

Sincerely,

Nicolas.
Re: digest query: why relying on value?
Look at Column.reconcile.

On Mon, Apr 2, 2012 at 9:17 AM, Nicolas Romanetti wrote:
> Hello,
>
> Why does the digest read response include a hash of the column value? Isn't
> the timestamp sufficient?
>
> May be an answer:
> Is the value hash computed to cope with (I presume rare) race condition
> scenario where 2 nodes would end up with same col. name and same col.
> timestamp but with a different col. value ?
> But then I wonder how to decide which value wins!
>
> Sincerely,
>
> Nicolas.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: [VOTE] Release Apache Cassandra 1.0.9
+1

On Mon, Apr 2, 2012 at 8:33 AM, Sylvain Lebresne wrote:
> 1.0.8 has been release more than a month ago, we made quite a few bug fixes
> and don't have any major outstanding issue open. I thus propose the following
> artifacts for release as 1.0.9.
>
> sha1: 4457839b9da623d9d4a090fa444614c35d39bb4c
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/1.0.9-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-001/org/apache/cassandra/apache-cassandra/1.0.9/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-001/
>
> The artifacts as well as the debian package are also available here:
> http://people.apache.org/~slebresne/
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: http://goo.gl/CsEDg (CHANGES.txt)
> [2]: http://goo.gl/4ByoR (NEWS.txt)

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: [VOTE] Release Apache Cassandra 1.0.9
+1

--
Pavel Yaskevich

On Monday 2 April 2012 at 17:25, Jonathan Ellis wrote:
> +1
>
> On Mon, Apr 2, 2012 at 8:33 AM, Sylvain Lebresne (mailto:sylv...@datastax.com) wrote:
> > 1.0.8 has been release more than a month ago, we made quite a few bug fixes
> > and don't have any major outstanding issue open. I thus propose the
> > following artifacts for release as 1.0.9.
> >
> > sha1: 4457839b9da623d9d4a090fa444614c35d39bb4c
> > Git:
> > http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/1.0.9-tentative
> > Artifacts:
> > https://repository.apache.org/content/repositories/orgapachecassandra-001/org/apache/cassandra/apache-cassandra/1.0.9/
> > Staging repository:
> > https://repository.apache.org/content/repositories/orgapachecassandra-001/
> >
> > The artifacts as well as the debian package are also available here:
> > http://people.apache.org/~slebresne/
> >
> > The vote will be open for 72 hours (longer if needed).
> >
> > [1]: http://goo.gl/CsEDg (CHANGES.txt)
> > [2]: http://goo.gl/4ByoR (NEWS.txt)
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
Re: digest query: why relying on value?
A digest query is about computing one digest for many columns, not one digest per column. If it were one digest per column, then yes, the timestamp would be an option.

--
Sylvain

On Mon, Apr 2, 2012 at 4:25 PM, Jonathan Ellis wrote:
> Look at Column.reconcile.
>
> On Mon, Apr 2, 2012 at 9:17 AM, Nicolas Romanetti wrote:
>> Hello,
>>
>> Why does the digest read response include a hash of the column value? Isn't
>> the timestamp sufficient?
>>
>> May be an answer:
>> Is the value hash computed to cope with (I presume rare) race condition
>> scenario where 2 nodes would end up with same col. name and same col.
>> timestamp but with a different col. value ?
>> But then I wonder how to decide which value wins!
>>
>> Sincerely,
>>
>> Nicolas.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
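To illustrate the point about one digest covering many columns, here is a minimal, hypothetical sketch (this is not Cassandra's actual code; the `Column` class and `digestFor` method are illustrative names). Every column's name, value, and timestamp feed one hash, so two replicas produce the same digest only if they agree on all three for every column — which is why the value has to be part of the hash to surface the rare same-timestamp/different-value divergence:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Illustrative sketch only: one digest over many columns, in the spirit of
// the discussion above. Not Cassandra's real Column or digest code.
class RowDigest {
    static class Column {
        final String name;
        final String value;
        final long timestamp;
        Column(String name, String value, long timestamp) {
            this.name = name; this.value = value; this.timestamp = timestamp;
        }
    }

    // Fold every column's name, value and timestamp into a single MD5 digest.
    static byte[] digestFor(List<Column> columns) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (Column c : columns) {
                md.update(c.name.getBytes(StandardCharsets.UTF_8));
                md.update(c.value.getBytes(StandardCharsets.UTF_8));
                md.update(ByteBuffer.allocate(8).putLong(c.timestamp).array());
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("MD5 is guaranteed by the JDK", e);
        }
    }
}
```

Under this scheme a replica whose column differs only in value still produces a different digest, triggering a full data read and reconciliation.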
Re: digest query: why relying on value?
Spot on, thanks!

It would be interesting to have some metrics on how rare this case is:

    // break ties by comparing values.
    if (timestamp() == column.timestamp())
        return value().compareTo(column.value()) < 0 ? column : this;

If it is extremely rare, it might be more efficient not to hash the value and to fall back on it only when hitting such a case (OK, easy to say :-)).

On Mon, Apr 2, 2012 at 4:25 PM, Jonathan Ellis wrote:
> Look at Column.reconcile.
>
> On Mon, Apr 2, 2012 at 9:17 AM, Nicolas Romanetti wrote:
> > Hello,
> >
> > Why does the digest read response include a hash of the column value? Isn't
> > the timestamp sufficient?
> >
> > May be an answer:
> > Is the value hash computed to cope with (I presume rare) race condition
> > scenario where 2 nodes would end up with same col. name and same col.
> > timestamp but with a different col. value ?
> > But then I wonder how to decide which value wins!
> >
> > Sincerely,
> >
> > Nicolas.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com

--
Nicolas Romanetti
06 18 65 03 89
twitter: @nromanetti
http://www.jaxio.com/
http://www.springfuse.com/
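The timestamp-then-value rule quoted above can be sketched as a self-contained toy (the class and field names here are illustrative, not Cassandra's actual `Column` implementation):

```java
// Toy sketch of the timestamp-then-value reconcile rule discussed above.
// Not Cassandra's code; class and field names are illustrative.
class ReconcileSketch {
    static class Column {
        final String value;
        final long timestamp;
        Column(String value, long timestamp) {
            this.value = value; this.timestamp = timestamp;
        }
        // The higher timestamp wins; on a timestamp tie, the lexically
        // greater value wins, so every replica deterministically picks the
        // same winner without any coordination.
        Column reconcile(Column other) {
            if (timestamp != other.timestamp)
                return timestamp > other.timestamp ? this : other;
            return value.compareTo(other.value) < 0 ? other : this;
        }
    }
}
```

The key property is that the tie-break is deterministic and symmetric: `a.reconcile(b)` and `b.reconcile(a)` always select the same column, which is what makes the rule safe to apply independently on each replica.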
Re: ranges
Just List for the most part. If there are exactly two, maybe Pair.

On Mon, Apr 2, 2012 at 6:30 PM, Mark Dewey wrote:
> Is there an object that is standard for specifying a compound range? (eg
> [W, X] + [Y, Z])
>
> Mark

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
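As a hypothetical illustration of the List-of-pairs suggestion, a compound range like [W, X] + [Y, Z] could be carried around as a list of bound pairs (here `AbstractMap.SimpleEntry` stands in for a Pair type, since the JDK has no dedicated one; the class and method names are made up for this sketch):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;

// Hypothetical sketch: a compound range [W, X] + [Y, Z] as a List of
// (start, end) pairs. Names are illustrative, not a standard API.
class CompoundRange {
    static List<SimpleEntry<String, String>> of(String w, String x, String y, String z) {
        return List.of(new SimpleEntry<>(w, x), new SimpleEntry<>(y, z));
    }

    // True if the key falls inside any sub-range (both bounds inclusive).
    static boolean contains(List<SimpleEntry<String, String>> ranges, String key) {
        for (SimpleEntry<String, String> r : ranges)
            if (key.compareTo(r.getKey()) >= 0 && key.compareTo(r.getValue()) <= 0)
                return true;
        return false;
    }
}
```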
kudos...
I just wanted to let you guys know that I gave you a shout out...
http://brianoneill.blogspot.com/2012/04/cassandra-vs-couchdb-mongodb-riak-hbase.html

thanks for all the support,
brian

--
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile: 215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/
implementation choice with regard to multiple range slice query filters
Hi guys,

I'm a PhD student and I'm trying to dip my feet in the water with regard to Cassandra development, as I'm a long-time fan. I'm implementing CASSANDRA-3885, which pertains to supporting returning multiple slices of a row.

After looking around at the portion of the code involved, two implementation options come to mind, and I'd like to get feedback from you on which you think might work best (or even on whether I'm on the right track).

As a first approach I simply subclassed SliceQueryFilter (setting start and finish to firstRange.start and lastRange.finish) and made the subclass not return the elements in between the ranges (spinning to the first element of the next range whenever the final element of the previous one was found). This approach only uses one IndexedSliceReader, but it scans from firstRange.start to lastRange.finish.

Still, as I was finishing, it occurred to me that in cases where the filter's selectivity is very low, i.e., the ranges are a sparse selection of the total number of columns, I might be doing a full row scan for nothing. So another option came to mind: an iterator of iterators, where I use one IndexedSliceReader per required slice range and simply iterate through them.

Which do you think is the better option? Am I making any sense, or am I completely off track?

Any help would be greatly appreciated.

Cheers,
David Ribeiro Alves
Re: kudos...
Good post. Thanks, Brian!

On Mon, Apr 2, 2012 at 11:04 PM, Brian O'Neill wrote:
> I just wanted to let you guys know that I gave you a shout out...
> http://brianoneill.blogspot.com/2012/04/cassandra-vs-couchdb-mongodb-riak-hbase.html
>
> thanks for all the support,
> brian
>
> --
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile: 215.588.6024
> blog: http://weblogs.java.net/blog/boneill42/
> blog: http://brianoneill.blogspot.com/

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: implementation choice with regard to multiple range slice query filters
That would work, but I think the best approach would actually be to push multiple ranges down into ISR itself; otherwise you could waste a lot of time reading the row header redundantly (the skipBloomFilter/deserializeIndex part).

The tricky part would be getting IndexedBlockFetcher to not do extra work in the case where the ranges' index blocks overlap -- in other words, the best of both worlds, where we "skip ahead" when the index says we can at the end of one range, but do a seq scan when that is more efficient.

(Here's where I admit that I've asked several people to implement 3885 as a technical interview problem for DataStax. For the purposes of that interview, this last part is optional.)

On Mon, Apr 2, 2012 at 11:19 PM, David Alves wrote:
> Hi guys
>
> I'm a PhD student and I'm trying to dip my feet in the water wrt to
> cassandra development, as I'm a long time fan.
> I'm implementing CASSANDRA-3885 which pertains to supporting returning
> multiple slices of a row.
>
> After looking around at the portion of the code that is involved two
> implementation options come to mind and I'd like to get feedback from you on
> whichever you think might work best (or even if I'm in the right track).
>
> As a first approach I simply subclassed SliceQueryFilter (setting
> start and finish to firstRange.start and lastRange.finish) and made the
> subclass not return the elements in between the ranges (spinning to the first
> element of the next range whenever the final element of the previous was
> found). This approach only uses one IndexedSliceReader but it scans from
> firstRange.start to lastRange.finish.
>
> Still when I was finishing It came to mind that in cases where the
> filter's selectivity is very low i.e., the ranges are a sparse selection of
> the total number of columns, I might be doing a full row scan for nothing, so
> another option came to mind: an iterator of iterators where I use multiple
> IndexedSliceReader's for each of the required slice ranges and simply iterate
> though them.
>
> Which do you think is the better option? Am I making any sense, or am
> I completely off track?
>
> Any help would be greatly appreciated.
>
> Cheers
> David Ribeiro Alves

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
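The "iterator of iterators" shape discussed in this thread can be sketched as a generic chaining iterator (a toy, under the assumption that each slice range yields its own ordered iterator; the generic `T` stands in for a column, and the per-range iterators stand in for IndexedSliceReader instances — this is not Cassandra's actual code):

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Sketch of the "iterator of iterators" idea: one logical iterator that
// drains a per-range sub-iterator in order. Illustrative, not Cassandra's
// code; the sub-iterators stand in for IndexedSliceReader instances.
class ChainedSliceIterator<T> implements Iterator<T> {
    private final Iterator<Iterator<T>> ranges;
    private Iterator<T> current = Collections.emptyIterator();

    ChainedSliceIterator(List<Iterator<T>> perRangeReaders) {
        this.ranges = perRangeReaders.iterator();
    }

    @Override
    public boolean hasNext() {
        // Advance past exhausted (or empty) ranges: the sparse space between
        // slices costs nothing here, which is the win over one wide scan.
        while (!current.hasNext() && ranges.hasNext())
            current = ranges.next();
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext())
            throw new NoSuchElementException();
        return current.next();
    }
}
```

Pushing the ranges down into the reader itself, as suggested above, goes further than this sketch: it additionally avoids re-reading the row header per range and lets the index decide between skipping ahead and scanning sequentially.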