Re: Write into composite columns
First, this is a compound primary key, not a composite [partition] key. You don't have a column definition for dpci, your partition key. Compact storage requires precisely and exactly and only one non-primary key column, but you have two. Maybe dpci and item were supposed to be the same. -- Jack Krupansky On Fri, Jul 10, 2015 at 1:43 AM, Ajay Chander wrote: > Any one here came across a situation like this ? Thank you! > > On Thursday, July 9, 2015, Ajay Chander wrote: > > > More information: > > > > > > Below is my cassandra bolt. > > > > > > public CassandraTest withCassandraBolt() { > > > > String[] rowKeyFields = {“item","location”}; > > > > HashMap clientConfig = newHashMap(); > > > > clientConfig.put(StormCassandraConstants.CASSANDRA_HOST, > > > > this.configuration.getCassandraBoltServer()); > > > > clientConfig.put(StormCassandraConstants.CASSANDRA_KEYSPACE, Arrays > > > > .asList(new String[] {this.projectConfiguration > > > > .getCassandraBoltKeyspace() })); > > > > this.stormConfig.put( > > > > this.configuration.getCassandraBoltConfigKey(), > > > > clientConfig); > > > > cassandraBolt = new CassandraBatchingBolt( > > > > this.configuration.getCassandraBoltConfigKey(), > > > > new CompositeRowTupleMapper( > > > > this.configuration.getCassandraBoltKeyspace(), > > > > this.configuration > > > > .getCassandraBoltColumnFamily(), > > > > rowKeyFields)); > > > > cassandraBolt.setAckStrategy(AckStrategy.ACK_ON_WRITE); > > > > return this; > > > > } > > > > > > > > This is my table in Cassandra: > > > > > > CREATE TABLE store ( item text, location text, type text, > > PRIMARY KEY (dpci, location) ) WITH COMPACT STORAGE; > > > > > > > > Error I am getting is below: > > > > > > 15810 [batch-bolt-thread] WARN > > com.netflix.astyanax.connectionpool.impl.Slf4jConnectionPoolMonitorImpl - > > BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=21(21), > > attempts=1]InvalidRequestException(why:Not enough bytes to read value of > > component 0) > > > > 15811 [batch-bolt-thread] ERROR > > com.hmsonline.storm.cassandra.bolt.CassandraBatchingBolt - Unable to > write > > batch. 
> > > > com.netflix.astyanax.connectionpool.exceptions.BadRequestException: > > BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=21(21), > > attempts=1]InvalidRequestException(why:Not enough bytes to read value of > > component 0) > > > > at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException( > > ThriftConverter.java:159) ~[astyanax-thrift-1.56.44.jar:na] > > > > at com.netflix.astyanax.thrift.AbstractOperationImpl.execute( > > AbstractOperationImpl.java:65) ~[astyanax-thrift-1.56.44.jar:na] > > > > at com.netflix.astyanax.thrift.AbstractOperationImpl.execute( > > AbstractOperationImpl.java:28) ~[astyanax-thrift-1.56.44.jar:na] > > > > at > > > com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute( > > ThriftSyncConnectionFactoryImpl.java:151) > > ~[astyanax-thrift-1.56.44.jar:na] > > > > at > > > com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation( > > AbstractExecuteWithFailoverImpl.java:119) ~[astyanax-core-1.56.44.jar:na] > > > > at > > > com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover( > > AbstractHostPartitionConnectionPool.java:338) > > ~[astyanax-core-1.56.44.jar:na] > > > > at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation( > > ThriftKeyspaceImpl.java:493) ~[astyanax-thrift-1.56.44.jar:na] > > > > at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000( > > ThriftKeyspaceImpl.java:79) ~[astyanax-thrift-1.56.44.jar:na] > > > > at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute( > > ThriftKeyspaceImpl.java:123) ~[astyanax-thrift-1.56.44.jar:na] > > > > at com.hmsonline.storm.cassandra.client.AstyanaxClient.writeTuples( > > AstyanaxClient.java:417) ~[classes/:na] > > > > at com.hmsonline.storm.cassandra.bolt.CassandraBolt.writeTuples( > > CassandraBolt.java:67) ~[classes/:na] > > > > at com.hmsonline.storm.cassandra.bolt.CassandraBatchingBolt.executeBatch( > > CassandraBatchingBolt.java:49) ~[classes/:na] > > > > at > com.hmsonline.storm.cassandra.bolt.AbstractBatchingBolt$BatchThread.run( >
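For reference, a sketch of a table definition that would satisfy the constraints described above - assuming dpci and item really were meant to be the same column, so that the partition key column is actually defined and only one non-primary-key column (type) remains:

CREATE TABLE store (
    item text,
    location text,
    type text,
    PRIMARY KEY (item, location)
) WITH COMPACT STORAGE;

This is only an illustration of the schema change implied by the discussion; the intended definition of dpci was never shown, so the actual key may differ.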
Re: UML sequence diagrams on Wiki for explaining read/write path
Great stuff! You wrote "When the replica node comes back online the coordinator node will send the data to the replica node", which is partially true - if the replica comes back online within the timeout window of three hours. So, you probably want to say something like: "If the replica node comes back online within the hinted handoff window the coordinator node will send the data to the replica node, otherwise the replica node will need to be repaired." And maybe mention the configuration of the window. Change "The data is stored for a default period of 3 hours" to "The data is stored for a default period of 3 hours, configurable using the max_hint_window_in_ms property in cassandra.yaml." -- Jack Krupansky On Mon, Nov 30, 2015 at 2:24 AM, Michael Edge wrote: > Write path docs updated on Wiki - please review diagram/text and let me > have your comments (or update text in place). > > https://wiki.apache.org/cassandra/WritePathForUsers > > Cheers, > > Michael > > On 26 November 2015 at 11:25, Michael Shuler > wrote: > > > On 11/25/2015 07:36 PM, Michael Edge wrote: > > > I'd like to update the read/write path description on the wiki (see > link > > > below) by adding a couple of UML sequence diagrams I drew a while ago. > I > > > think they are much better than long textual descriptions for > describing > > > the order of operations on components. Before publishing them I'd > prefer > > a > > > knowledgeable volunteer to review them to ensure they are accurate. > > > > > > Let me know if you're interested and I'll send you a copy. > > > > > > https://wiki.apache.org/cassandra/ArchitectureOverview > > > > I don't think the mailing list likes attachments, so throw them up on a > > web server somewhere and send the list link(s) to your diagrams. > > Alternatively, go ahead and post them on the wiki and ask for feedback. > > > > -- > > Kind regards, > > Michael > > >
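For reference, the setting being discussed is a single property in cassandra.yaml; the shipped default is the three hours mentioned above, expressed in milliseconds:

max_hint_window_in_ms: 10800000

Hints accumulated for a node that stays down longer than this window are not replayed, which is why such a node has to be repaired instead.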
Re: UML sequence diagrams on Wiki for explaining read/write path
I just remembered... the new Materialized View support in 3.0 - writes to the materialized views get triggered when a write occurs to the base table. That needs to be in the write path flow/description as well. -- Jack Krupansky On Mon, Nov 30, 2015 at 9:30 PM, Michael Edge wrote: > Thanks for the feedback guys. I've made the updates. > > On 1 December 2015 at 00:56, Jack Krupansky > wrote: > > > Great stuff! > > > > You wrote "When the replica node comes back online the coordinator node > > will send the data to the replica node", which is partially true - if the > > replica comes back online within the timeout window of three hours. So, > you > > probably want to say something like: > > > > "If the replica node comes back online within the hinted handoff window > the > > coordinator node will send the data to the replica node, otherwise the > > replica node will need to be repaired." > > > > And maybe mention the configuration of the window. Change "The data is > > stored for a default period of 3 hours" to "The data is stored for a > > default period of 3 hours, configurable using the > > max_hint_window_in_ms property > > in cassandra.yaml." > > > > -- Jack Krupansky > > > > On Mon, Nov 30, 2015 at 2:24 AM, Michael Edge > > wrote: > > > > > Write path docs updated on Wiki - please review diagram/text and let me > > > have your comments (or update text in place). > > > > > > https://wiki.apache.org/cassandra/WritePathForUsers > > > > > > Cheers, > > > > > > Michael > > > > > > On 26 November 2015 at 11:25, Michael Shuler > > > wrote: > > > > > > > On 11/25/2015 07:36 PM, Michael Edge wrote: > > > > > I'd like to update the read/write path description on the wiki (see > > > link > > > > > below) by adding a couple of UML sequence diagrams I drew a while > > ago. > > > I > > > > > think they are much better than long textual descriptions for > > > describing > > > > > the order of operations on components. Before publishing them I'd > > > prefer > > > > a > > > > > knowledgeable volunteer to review them to ensure they are accurate. > > > > > > > > > > Let me know if you're interested and I'll send you a copy. > > > > > > > > > > https://wiki.apache.org/cassandra/ArchitectureOverview > > > > > > > > I don't think the mailing list likes attachments, so throw them up > on a > > > > web server somewhere and send the list link(s) to your diagrams. > > > > Alternatively, go ahead and post them on the wiki and ask for > feedback. > > > > > > > > -- > > > > Kind regards, > > > > Michael > > > > > > > > > >
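As a concrete illustration of the 3.0 feature in question (the table and view names here are hypothetical, not taken from the wiki page):

CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    email text
);

CREATE MATERIALIZED VIEW users_by_email AS
    SELECT * FROM users
    WHERE email IS NOT NULL AND id IS NOT NULL
    PRIMARY KEY (email, id);

With a view like this defined, every write to users also triggers a corresponding write to users_by_email, which is the extra step the write-path description needs to cover.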
Re: UML sequence diagrams on Wiki for explaining read/write path
Thanks. I took a quick look and it seems fine, but... I don't have the depth of expertise in that code to be 100% sure of each detail. Hopefully Carl or Jake, et al can review. One additional point on MV: When a new MV is created for a base table that is already populated, that kicks off a backfilling process that will read each existing row from the base table and apply the MV update process. I'm not sure of the precise details there either (like, which consistency those backfill writes use.) And that backfilling process has to occur on each node of the cluster (but not each replica.) Again, Carl, Jake, et al need to review precise details. -- Jack Krupansky On Wed, Dec 2, 2015 at 1:02 AM, Michael Edge wrote: > Yes - good point. > > I've updated the text for the write path page below - could you please > review? Once my understanding is correct I'll update the sequence diagram > to match. > > https://wiki.apache.org/cassandra/WritePathForUsers > > On 1 December 2015 at 22:51, Jack Krupansky > wrote: > > > I just remembered... the new Materialized View support in 3.0 - writes to > > the materialized views get triggered when a write occurs to the base > table. > > That needs to be in the write path flow/description as well. > > > > -- Jack Krupansky > > > > On Mon, Nov 30, 2015 at 9:30 PM, Michael Edge > > wrote: > > > > > Thanks for the feedback guys. I've made the updates. > > > > > > On 1 December 2015 at 00:56, Jack Krupansky > > > wrote: > > > > > > > Great stuff! > > > > > > > > You wrote "When the replica node comes back online the coordinator > node > > > > will send the data to the replica node", which is partially true - if > > the > > > > replica comes back online within the timeout window of three hours. > So, > > > you > > > > probably want to say something like: > > > > > > > > "If the replica node comes back online within the hinted handoff > window > > > the > > > > coordinator node will send the data to the replica node, otherwise > the > > > > replica node will need to be repaired." > > > > > > > > And maybe mention the configuration of the window. Change "The data > is > > > > stored for a default period of 3 hours" to "The data is stored for a > > > > default period of 3 hours, configurable using the > > > > max_hint_window_in_ms property > > > > in cassandra.yaml." > > > > > > > > -- Jack Krupansky > > > > > > > > On Mon, Nov 30, 2015 at 2:24 AM, Michael Edge < > edge.mich...@gmail.com> > > > > wrote: > > > > > > > > > Write path docs updated on Wiki - please review diagram/text and > let > > me > > > > > have your comments (or update text in place). > > > > > > > > > > https://wiki.apache.org/cassandra/WritePathForUsers > > > > > > > > > > Cheers, > > > > > > > > > > Michael > > > > > > > > > > On 26 November 2015 at 11:25, Michael Shuler < > mich...@pbandjelly.org > > > > > > > > wrote: > > > > > > > > > > > On 11/25/2015 07:36 PM, Michael Edge wrote: > > > > > > > I'd like to update the read/write path description on the wiki > > (see > > > > > link > > > > > > > below) by adding a couple of UML sequence diagrams I drew a > while > > > > ago. > > > > > I > > > > > > > think they are much better than long textual descriptions for > > > > > describing > > > > > > > the order of operations on components. Before publishing them > I'd > > > > > prefer > > > > > > a > > > > > > > knowledgeable volunteer to review them to ensure they are > > accurate. > > > > > > > > > > > > > > Let me know if you're interested and I'll send you a copy. 
> > > > > > > > > > > > > > https://wiki.apache.org/cassandra/ArchitectureOverview > > > > > > > > > > > > I don't think the mailing list likes attachments, so throw them > up > > > on a > > > > > > web server somewhere and send the list link(s) to your diagrams. > > > > > > Alternatively, go ahead and post them on the wiki and ask for > > > feedback. > > > > > > > > > > > > -- > > > > > > Kind regards, > > > > > > Michael > > > > > > > > > > > > > > > > > > > > >
Re: Apache Cassandra - Question about data model
It's best to ask usage and data modeling questions on the user email list - this list is the dev list, for development of Cassandra itself, not for development of applications. See: http://cassandra.apache.org/ -- Jack Krupansky On Thu, Dec 31, 2015 at 8:36 AM, Lior Menashe wrote: > Hi, > > Just got your mail from the #cassandra channel on the web chat because i > couldn't get an answer... > > I have a question that i'll be glad if you can help me or give me a > direction. > > I have an activity feed like the activity feed on Instagram. When user > (lets say UserA) enters his page he can see all the activities that are > related to him, > for example, user B liked your post.user C commented on your post etc... > > the cassandra data model that i thought about is: > > userID UDID (partition key) > datetimeadded timestamp (clustering column DESC) > userID_Name text > userID_Picture_URL text > userID_From UDID (this is userB from the example) > userID_From_Name text > userID_From_Picture_URL > > With this structure i can get the different activities to a user and it > works just fine. My problem is that userID_From can change his name and his > pictire and i need this data to be updated all arround the different tables > because i want to show the current right values. > > The problem is that the update is a table scan and it's not efficient. > Should i hold only the ID and every time that i select a slice of the data > and get a several ID's i'll do a nother query to query about > the values of the users name and picture path? Should i do something else? > > Best regards, > Lior >
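For readers who land on this thread anyway, a rough CQL rendering of the model described in the question (column names adapted, and the "UDID" type read as uuid):

CREATE TABLE activity_feed (
    user_id uuid,
    datetime_added timestamp,
    from_user_id uuid,
    from_user_name text,
    from_user_picture_url text,
    PRIMARY KEY (user_id, datetime_added)
) WITH CLUSTERING ORDER BY (datetime_added DESC);

The denormalized from_user_name and from_user_picture_url columns are exactly what makes the described update expensive; that trade-off is what the question is really asking about.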
Re: Versioning policy?
Mark, what would the policy dictate if the bug is in a feature that was introduced in 3.x, the feature wasn't in 3.0, and the current release is 3.x+k where k is greater than two? Technically 3.x is no longer supported, so does that mean no backporting of the fix, and only 3.x+k+1 would get the bug fix (or maybe 3.x+k-1.y in your rare cases)? In this case the user on 3.x would have to upgrade more than two releases to get the bug fix even though 3.x may only have been released a few months earlier. I think this is the gist of the original discussion that the EOL policy for 3.x is way too short for more conservative customers. Maybe the right answer for them is to only use x.0.z releases and not use features introduced in x.y releases. But even then... if they adopt 3.0.z just a month or two before x+1.0 comes out they are back in that same boat that no matter what release they pick it will go EOL within just a few months. -- Jack Krupansky On Thu, Jan 14, 2016 at 11:39 AM, Mark Dewey wrote: > Let's do a couple examples: > >1. Current release: 3.4.0, bug found in 3.1.0 that also exists in >subsequent versions; the bug fix will be ported back to 3.0.x and 3.5.0. >2. Current release 3.3.0, bug found in 3.3.0; the bug fix will be ported >back to 3.0.x and 3.4.0. In rare cases, a 3.3.1 may also be released > (we've >already seen one example of this). >3. Current release 3.6.0, bug found in 3.1.0; the bug fix will be ported >back to 3.0.x and 3.7. > > Essentially, any bug fixes will be released in the next minor version and > backported to the 3.0.x branches (unless the feature doesn't exist in 3.0). > We have seen one instance where we released a point version as well, but > that was for a major regression that was discovered almost immediately > after a release. > > On Wed, Jan 13, 2016 at 12:55 PM Maciek Sakrejda > wrote: > > > Can anyone chime in here? We're getting ready to run a decent number of > > nodes; we'd like to have a better idea of what to expect with respect to > > patching and upgrading. A clear versioning policy like the one laid out > by > > Postgres would be very helpful. > > > > >
Re: Versioning policy?
Jonathan, just to complete the list, it would help to state:

3.1.x will be maintained until
3.2 will be maintained until

And in general, 3.x (x != 0) will be maintained until (and does x even vs. odd affect the rule?). And what exactly is the general rule/criteria for when 3.x will be considered the official "stable release"? And will 3.0.x always be considered the recommended non-production release until 4.0 comes out, or is there general guidance/criteria for when a 3.x would become recommended for non-production? Or will tick-tock completely replace that "traditional" section? In which case, the question of criteria for defining "stable release" remains, unless it becomes no different than the latest tick-tock release. -- Jack Krupansky On Thu, Jan 14, 2016 at 12:57 PM, Anuj Wadehra wrote: > Hi Jonathan, > Thanks for the crisp communication regarding the tick tock release & EOL. > I think its worth considering some points regarding EOL policy and it > would be great if you can share your thoughts on below points: > 1. EOL of a release should be based on "most stable"/"production ready" > version date rather than "GA" date of subsequent major releases. > 2. I think we should have "Formal EOL Announcement" on Apache Cassandra > website. > 3. "Formal EOL Announcement" should come at least 6 months before the EOL, > so that users get reasonable time to upgrade. > 4. EOL Policy (even if flexible) should be stated on Apache Cassandra > website > > EOL thread on users mailing list ended with the conclusion of raising a > Wishlist JIRA but I think above points are more about working on policy and > processes rather than just a wish list. > > ThanksAnuj > > > > Sent from Yahoo Mail on Android > > On Thu, 14 Jan, 2016 at 10:57 pm, Jonathan Ellis > wrote: Hi Maciek, > > First let's talk about the tick-tock series, currently 3.x. This is pretty > simple: outside of the regular monthly releases, we will release fixes for > critical bugs against the most recent bugfix release, the way we did > recently with 3.1.1 for CASSANDRA-10822 [1]. No older tick-tock releases > will be patched. > > Now, we also have three other release series currently being supported: > > 2.1.x: supported with critical fixes only until 4.0 is released, projected > in November 2016 [2] > 2.2.x: maintained until 4.0 is released > 3.0.x: maintained for 6 months after 4.0, i.e. projected until May 2017 > > I will add this information to the releases page [3]. > > [1] > > https://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201512.mbox/%3CCAKkz8Q3StqRFHfMgCMRYaaPdg+HE5N5muBtFVt-=v690pzp...@mail.gmail.com%3E > [2] 4.0 will be an ordinary tick-tock release after 3.11, but we will be > sunsetting deprecated features like Thrift so bumping the major version > seems appropriate > [3] http://cassandra.apache.org/download/ > > On Sun, Jan 10, 2016 at 9:29 PM, Maciek Sakrejda > wrote: > > > There was a discussion recently about changing the Cassandra EOL policy > on > > the users list [1], but it didn't really go anywhere. I wanted to ask > here > > instead to clear up the status quo first. What's the current versioning > > policy? The tick-tock versioning blog post [2] states in passing that two > > major releases are maintained, but I have not found this as an official > > policy stated anywhere. For comparison, the Postgres project lays this > out > > very clearly [3]. 
To be clear, I'm not looking for any official support, > > I'm just asking for clarification regarding the maintenance policy: if a > > critical bug or security vulnerability is found in version X.Y.Z, when > can > > I expect it to be fixed in a bugfix patch to that major version, and when > > do I need to upgrade to the next major version. > > > > [1]: http://www.mail-archive.com/user@cassandra.apache.org/msg45324.html > > [2]: http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/ > > [3]: http://www.postgresql.org/support/versioning/ > > > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder, http://www.datastax.com > @spyced > >
Re: Versioning policy?
Thanks, Jonathan. The end-of-life (EOL) question is still dangling out there - when does 3.x go off support, after 3.x+3 or six months after 4.0? Or... six months after 5.0? -- Jack Krupansky On Thu, Jan 14, 2016 at 6:15 PM, Jonathan Ellis wrote: > On Thu, Jan 14, 2016 at 4:26 PM, Jack Krupansky > wrote: > > > Jonathan, just to complete the list, it would be help to state: > > > > 3.1.x will be maintained until > > 3.2 will be maintained until > > > > One of the confusing things about tick tock is that we're stuck with > numbers that look like the old ones but mean different things. > > In the old world, 2.1 was a release that took a year of work, and it got > maintained with roughly-monthly updates of 2.1.x. > > In the tick tock world, the corresponding series is just "3," and the > monthly updates are 3.1, 3.2, and so forth, with new features allowed in > the even releases every two months. So in general, there will be no 3.1.x > or 3.2.y releases. When a bug is critical enough to make an exception to > the "wait for the next monthly release" rule, it will be fixed in the most > recent bugfix tock. > > will tick-tock completely replace that "traditional" > > section? > > > Yes. > > > > In which case, the question of criteria for defining "stable > > release" remains, unless it becomes no different than the latest > tick-tock > > release. > > > > That's the idea, and that's why we're getting very religious about test > engineering, so that those monthly releases will always be stable. >
3.1 status?
It's great to see clear support status marked on the 3.0.x and 2.x releases on the download page now. A couple more questions... 1. What is the support and stability status of 3.1 and 3.2 (as opposed to 3.2.1)? Are they "for non-production development only"? Are they considered "stable"? The page should say. 2. Is there simply no "stable" release for 3.x, or is the latest tick-tock release by definition considered "stable"? 3. The first paragraph says "If a critical bug is found, a patch will be released against the most recent bug fix release", but in fact the latest critical patch (3.2.1) is against a feature release, not a bug fix release. Should that simply say "... against the most recent tick-tock release" regardless of whether it was an even (feature) or odd (bug fix) release? Thanks. -- Jack Krupansky
Re: DSE Release Planned That Corresponds with Cassandra 3.2
Stack Overflow has reasonable coverage for DataStax Enterprise - at least I paid attention to it when I was consulting for DataStax on DSE Search. See: http://stackoverflow.com/questions/tagged/datastax-enterprise http://stackoverflow.com/questions/34293295/cassandra-3-x-with-datastax-enterprise That doesn't give you a desirable answer to your question, but that's the best you can do for now, other than the sweet nothings that a sales person is willing to whisper in your ear for private consumption in the name of enhancing revenue for DataStax. -- Jack Krupansky On Wed, Feb 3, 2016 at 10:36 AM, Corry Opdenakker wrote: > Thanks Ryan, I just sent them a copy of the post. > Cheers, Corry > > On Wed, Feb 3, 2016 at 4:34 PM, Ryan Svihla wrote: > > > I’d just contact sa...@datastax.com instead. > > > > > On Feb 3, 2016, at 9:32 AM, Corry Opdenakker > wrote: > > > > > > You are Right Ryan, but since there is no DSE mailinglist, I thought I > > > could ask it overhere. > > > Thanks anyway for the reply. > > > Regards, Corry > > > > > > On Wed, Feb 3, 2016 at 4:30 PM, Ryan Svihla wrote: > > > > > >> Corry, > > >> > > >> This is the Cassandra developer mailing list aimed at contributors to > > the > > >> Cassandra code base (not all of whom work for DataStax for example). > > Can I > > >> suggest you contact DataStax and ask the same question? > > >> > > >> > > >> Regards, > > >> > > >> Ryan Svihla > > >> > > >>> On Feb 3, 2016, at 9:27 AM, Corry Opdenakker > > wrote: > > >>> > > >>> Hi everyone, > > >>> > > >>> Yesterday evening I installed DSE for the first time at my macbook, > > and I > > >>> must say that untill now it runs very fast with 0 users and 0 data > > >> records:) > > >>> > > >>> I was reading the release notes of C* 3.x and it reports "The Storage > > >>> Engine has been refactored", "Java 8 required", and the introduction > of > > >>> "Materialized Views" as some of the most important changes. > > >>> When will there be a DSE version released that Corresponds with > > Cassandra > > >>> 3.2 or any other 3.x subversion? > > >>> If there isn't one planned yet then I'll probably switch to C* 3.2 > > before > > >>> starting development. > > >>> > > >>> I've searched at several places but I couldn't find a DSE release > > >>> announcement or product roadmap that covers this topic. > > >>> If anyone could mention the relevant page, then that would be great. > > >>> > > >>> Cheers, Corry > > >> > > >> > > > > > > > > > -- > > > -- > > > Bestdata.be > > > Optimised ict > > > Tel:+32(0)496609576 > > > co...@bestdata.be > > > -- > > > > > > > -- > -- > Bestdata.be > Optimised ict > Tel:+32(0)496609576 > co...@bestdata.be > -- >
Re: Custom Java class for secondary index implementation
Is RegularColumnIndex representative of what a typical custom index needs... or not? https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/composites/RegularColumnIndex.java Ditto for CassandraIndex (abstract class) - should all (or at least most) custom indexes extend it? https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/CassandraIndex.java -- Jack Krupansky On Sun, Mar 6, 2016 at 11:03 PM, Henry Manasseh wrote: > Thank you. This is a perfect class for to start experimenting with. > > > On Sun, Mar 6, 2016 at 2:38 PM Sam Tunnicliffe wrote: > > > You might find o.a.c.i.StubIndex in the test source tree useful. > > On 6 Mar 2016 19:24, "Henry Manasseh" wrote: > > > > > Hello, > > > I was wondering if anyone is aware of a minimal reference > implementation > > > for a java class implementing a secondary index or some documentation > of > > > the interface(s) I would need to implement (I looked at the SASI 2i > code > > > but I am trying to find the bare bones test or sample class for a > newbie > > if > > > it exists). > > > > > > e.g. I am referring to the class you would use for > > > 'path.to.the.IndexClass'. > > > > > > CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass'; > > > > > > > > > > > > > I am trying to understand if it is possible to index a UDT collection > > based > > > on only one field of the UDT and still use the cassandra index file > > > management (I don't want to provide my own file storage). In other > > words, I > > > am looking to see if the custom index class can serve as a > transformation > > > function to only index my subfield. This probably will need a tweak to > > the > > > CQL to prevent a syntax error but not concerned about that at the > moment. > > > > > > Thank you for any helps and tips, > > > Henry > > > > > >
Re: Wiki
Interesting question as to what the future of the nodetool wiki page is. You can get more/full detail in the DataStax doc: https://docs.datastax.com/en/cassandra/3.x/cassandra/tools/toolsNodetool.html -- Jack Krupansky On Mon, Apr 4, 2016 at 12:08 PM, Anubhav Kale wrote: > There are others missing as well. For instance, getendpoints (which is > quite handy). > > -Original Message- > From: Dave Brosius [mailto:dbros...@mebigfatguy.com] > Sent: Sunday, April 3, 2016 8:13 AM > To: dev@cassandra.apache.org > Subject: Re: Wiki > > Done > > On 04/03/2016 10:56 AM, Pedro Gordo wrote: > > Hi Jonathan > > > > I forgot about that, sorry. It's "PedroGordo". > > > > Best regards > > > > Pedro Gordo > > > > On 3 April 2016 at 15:41, Jonathan Ellis wrote: > > > >> Hi Pedro, > >> > >> We need your wiki username to add it to the editors list. Thanks! > >> > >> On Sun, Apr 3, 2016 at 9:15 AM, Pedro Gordo > >> > >> wrote: > >> > >>> Hi > >>> > >>> I would like to contribute to C* Wiki if possible, so sending the > >>> email > >> as > >>> requested on the wiki's front page. The reason for this is that the > >>> nodetool page > >>> <https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fwi > >>> ki.apache.org%2fcassandra%2fNodeTool&data=01%7c01%7cAnubhav.Kale% > 40microsoft.com%7c1d8be3522d2d494617df08d35bd273e5%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=UBQCfJSoDBisqLOYqqk34DE5y8a1YIf3z%2fbl6TRSEqw%3d> > is missing the "gossipinfo" command. I haven't checked if there's more > commands missing. > >>> > >>> Let me know if I can help! > >>> > >>> Best regards > >>> Pedro Gordo > >>> > >> > >> > >> -- > >> Jonathan Ellis > >> Project Chair, Apache Cassandra > >> co-founder, > >> https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fwww.d > >> atastax.com&data=01%7c01%7cAnubhav.Kale%40microsoft.com%7c1d8be3522d2 > >> d494617df08d35bd273e5%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=Lj > >> hh7VAb8bqQ5aTq6DYIlEXnhAFqyTX7aZeHRqGzCG4%3d > >> @spyced > >> > >
max_mutation_size_in_kb addition not noted in CHANGES.txt
I don't find mention of max_mutation_size_in_kb in CHANGES.txt, but I do see it mentioned for 3.0.0 in NEWS.txt. It doesn't have (DataStax) doc either. It looks like it was added on 8/19/2015: "patch by Aleksey Yeschenko; reviewed by Benedict Elliott Smith for CASSANDRA-6230": https://github.com/apache/cassandra/commit/96d41f0e0e44d9b3114a5d80dedf12053d36a76b#diff-b66584c9ce7b64019b5db5a531deeda1 It probably should be (or have been) added to CHANGES.txt as well. I do see that comments on this were added to the yaml file on 10/2/2015 as part of: https://issues.apache.org/jira/browse/CASSANDRA-10256 -- Jack Krupansky
Typo in CommitLog.java: maxiumum s.b. maximum
I was baffled why I couldn't find a user's reported log message of "Mutation 32MB too large for maximum size of 16Mb" even when I searched GitHub for "too large for maximum size". The reason my search failed was that the user (or their email client) must have corrected the typo in the message that the code actually produces - the code has the text "too large for the maxiumum size", when "maxiumum" should be "maximum".

In CommitLog.java, the current code in the add method in trunk:

if (totalSize > MAX_MUTATION_SIZE)
{
    throw new IllegalArgumentException(String.format("Mutation of %s is too large for the maxiumum size of %s",
                                                     FBUtilities.prettyPrintMemory(totalSize),
                                                     FBUtilities.prettyPrintMemory(MAX_MUTATION_SIZE)));
}

The corrected code:

if (totalSize > MAX_MUTATION_SIZE)
{
    throw new IllegalArgumentException(String.format("Mutation of %s is too large for the maximum size of %s",
                                                     FBUtilities.prettyPrintMemory(totalSize),
                                                     FBUtilities.prettyPrintMemory(MAX_MUTATION_SIZE)));
}

-- Jack Krupansky
COMPACT STORAGE in 4.0?
My understanding is that Thrift is being removed from Cassandra in 4.0, but will COMPACT STORAGE be removed as well? Clearly the two are related, but COMPACT STORAGE had a performance advantage in addition to Thrift compatibility, so its status is ambiguous. I recall vague chatter, but no explicit deprecation notice or 4.0 plan for removal of COMPACT STORAGE. Actually, I don't even see a deprecation notice for Thrift itself in CHANGES.txt. Will a table with only a single non-PK column automatically be implemented with efficiency comparable to the old/current COMPACT STORAGE? That will still leave the question of how to migrate a non-Thrift COMPACT STORAGE table (i.e., used for performance by a CQL-oriented developer rather than Thrift compatibility per se) to pure CQL. -- Jack Krupansky
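For concreteness, the kind of table in question (names hypothetical) looks something like:

CREATE TABLE readings (
    sensor_id text,
    reading_time timestamp,
    value blob,
    PRIMARY KEY (sensor_id, reading_time)
) WITH COMPACT STORAGE;

and the migration question is essentially whether the same definition, minus the WITH COMPACT STORAGE clause, ends up just as efficient on disk under the new storage engine.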
Undocumented config properties
When checking into doc for max_mutation_size_in_kb I noticed that there are more Config properties that are neither in the yaml nor the DataStax doc - or the old (outdated) Config wiki. For example, the first is permissions_cache_max_entries. Before suggesting/requesting that the DataStax doc guys add all the missing properties, are there any properties in Config.java that should not be documented or disclosed to users? A separate question is whether "doc" for all Config properties should be added to the yaml file, even if the actual properties are commented out. If the answer is that not all of the properties should be documented, there should be an annotation convention for those that should be hidden. See: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/config/Config.java https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml For (outdated) reference: http://wiki.apache.org/cassandra/StorageConfiguration -- Jack Krupansky
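To illustrate the point (the value shown is just an example, not an asserted default): since cassandra.yaml is deserialized directly into Config, a line such as

permissions_cache_max_entries: 1000

is accepted by the server even though the property never appears in the shipped yaml file or in the docs.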
Re: COMPACT STORAGE in 4.0?
Thanks, Benedict. Is this only true as of 3.x (new storage engine), or was the equivalent efficiency also true with 2.x? It would be good to have an explicit statement on this efficiency question in the spec/doc since the spec currently does say: "The option also *provides a slightly more compact layout of data on disk* but at the price of diminished flexibility and extensibility for the table." So, if that "slightly more compact layout of data on disk" benefit is no longer true, away with it. See: https://cassandra.apache.org/doc/cql3/CQL-3.0.html And I would recommend that your own statement be added there instead. -- Jack Krupansky On Mon, Apr 11, 2016 at 5:03 PM, Jeremiah Jordan wrote: > As I understand it "COMPACT STORAGE" only has meaning in the CQL parser > for backwards compatibility as of 3.0. The on disk storage is not affected > by its usage. > > > On Apr 11, 2016, at 3:33 PM, Benedict Elliott Smith > wrote: > > > > Compact storage should really have been named "not wasteful storage" - > now > > everything is "not wasteful storage" so it's void of meaning. This is > true > > without constraint. You do not need to limit yourself to a single non-PK > > column; you can have many and it will remain as or more efficient than > > "compact storage" > > > > On Mon, 11 Apr 2016 at 15:04, Jack Krupansky > > wrote: > > > >> My understanding is Thrift is being removed from Cassandra in 4.0, but > will > >> COMPACT STORAGE be removed as well? Clearly the two are related, but > >> COMPACT STORAGE had a performance advantage in addition to Thrift > >> compatibility, so its status is ambiguous. > >> > >> I recall vague chatter, but no explicit deprecation notice or 4.0 plan > for > >> removal of COMPACT STORAGE. Actually, I don't even see a deprecation > notice > >> for Thrift itself in CHANGES.txt. > >> > >> Will a table with only a single non-PK column automatically be > implemented > >> at a comparable level of efficiency compared to the old/current Compact > >> STORAGE? That will still leave the question of how to migrate a > non-Thrift > >> COMPACT STORAGE table (i.e., used for performance by a CQL-oriented > >> developer rather than Thrift compatibility per se) to pure CQL. > >> > >> -- Jack Krupansky > >> >
CQL spec error: COUNT(column)
The CQL spec for COUNT says: "It also can be used to count the non null value of a given column. Example: SELECT COUNT(scores) FROM plays;" But, the parser only recognizes COUNT(*) and COUNT(1). See: https://cassandra.apache.org/doc/cql3/CQL-3.0.html https://github.com/apache/cassandra/blob/trunk/src/antlr/Parser.g#L276 Does this need a Jira, or can somebody just fix it to avoid the paperwork? Or, was this actually supposed to work and the bug is some missing implementation? Thanks. -- Jack Krupansky
Re: CQL spec error: COUNT(column)
No, I didn't test, I was just reading the code, but I hadn't checked for all occurrences of K_COUNT, so I hadn't noticed that it also occurs in the allowedFunctionName grammar production rule. And I found the code that dynamically creates a count function for each type here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L885 So, sorry for the confusion. Some comments in the grammar would have helped. And it's not in the DataStax doc, but that's another issue I can follow up on. -- Jack Krupansky On Mon, Apr 18, 2016 at 10:12 AM, Benjamin Lerer < benjamin.le...@datastax.com> wrote: > May be I misunderstood you. > Do you mean that you tested it and that it is not working on the version > you used? > > On Mon, Apr 18, 2016 at 3:50 PM, Jack Krupansky > wrote: > > > The CQL spec for COUNT says: > > > > "It also can be used to count the non null value of a given column. > > Example: > > > > SELECT COUNT(scores) FROM plays;" > > > > But, the parser only recognizes COUNT(*) and COUNT(1). > > > > See: > > https://cassandra.apache.org/doc/cql3/CQL-3.0.html > > https://github.com/apache/cassandra/blob/trunk/src/antlr/Parser.g#L276 > > > > Does this need a Jira, or can somebody just fix it to avoid the > paperwork? > > > > Or, was this actually supposed to work and the bug is some missing > > implementation? > > > > Thanks. > > > > -- Jack Krupansky > > >
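So both forms from the spec do parse; using the spec's own example table:

SELECT COUNT(*) FROM plays;        -- counts all rows
SELECT COUNT(scores) FROM plays;   -- counts only rows with a non-null scores value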
Re: C-136A Questionnaire for Apache Cassandra
Here's a recent security assessment discussion - list of questions, proposed response, and discussion, which you might find helpful, courtesy of Oleg Yusim: https://docs.google.com/document/d/13-yu-1a0MMkBiJFPNkYoTd1Hzed9tgKltWi6hFLZbsk/edit?ts=56c3a130#heading=h.xq6exsjcda8 -- Jack Krupansky On Mon, Apr 18, 2016 at 7:31 AM, Johnson, Tom (ES & CSO) < thomas.johns...@ngc.com> wrote: > Apache Cassandra Support, > > > > In order to install this software on our servers, I need help from Planet > Cassandra in completing the attached questionnaire for Northrop Grumman > Information Security. Please provide as much detail as possible for all > questions. If you have any questions for me, please let me know. Thanks. > > > > > > *Thomas (Tom) Johnson* > > *Software Development Analyst* > > *Engineering Tools Support Team (ETST)* > > *Northrop Grumman Information Technology* > > *Office: 321-674-3168 <321-674-3168>* > > *thomas.johns...@ngc.com * > > >
Typo in comment for maxFunctionForCounter in AggregateFcts.java
Comment typo for maxFunctionForCounter in AggregateFcts.java:

/**
 * AVG function for counter column values.
 */
public static final AggregateFunction maxFunctionForCounter =
        new NativeAggregateFunction("max", CounterColumnType.instance, CounterColumnType.instance)

That comment should be "MAX" rather than "AVG". See:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L760

Oops... I see more copy-and-paste typos, where the type for the function is wrong in the comment:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L284
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L320
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L362
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L398

Let me know if this needs a Jira ticket or if somebody can just fix it without the paperwork.

-- Jack Krupansky
Re: Typo in comment for maxFunctionForCounter in AggregateFcts.java
Thanks for the prompt service! -- Jack Krupansky On Tue, Apr 19, 2016 at 12:16 PM, Benjamin Lerer < benjamin.le...@datastax.com> wrote: > Done. Thanks for reporting the problem. > > On Tuesday, April 19, 2016, Benjamin Lerer > wrote: > > > I will fix it. I am the one that has probably done them anyway. > > > > Benjamin > > > > On Tue, Apr 19, 2016 at 4:57 PM, Jack Krupansky < > jack.krupan...@gmail.com > > > wrote: > > > >> Comment typo for maxFunctionForCounter in AggregateFcts.java: > >> > >> /** > >> * AVG function for counter column values. > >> */ > >> public static final AggregateFunction maxFunctionForCounter = > >> new NativeAggregateFunction("max", CounterColumnType.instance, > >> CounterColumnType.instance) > >> > >> That comment should be "MAX" rather than "AVG" > >> > >> See: > >> > >> > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L760 > >> > >> Oops... I see more copy and paste typos, where the type for the function > >> is > >> wrong in the comment: > >> > >> > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L284 > >> > >> > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L320 > >> > >> > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L362 > >> > >> > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L398 > >> > >> Let me know if this needs a Jia ticket or if somebody can just fix it > >> without the paperwork. > >> > >> -- Jack Krupansky > >> > > > > >
Re: Criteria for upgrading to 3.x releases in PROD
Is the question whether a new application can go into production with 3.x, or whether an existing application in production with 2.x.y should be upgraded to 3.x? For the latter, a "If it ain't broke, don't fix it" philosophy is best. And if there are critical bug fixes needed, simply upgrade the 2.x line that you are already on. Or if your production is on 3.0.x, upgrade to 3.0.x+k. For the former, we aren't hearing people hollering that 3.x is crap, so it is reasonably safe for a new app going into production, subject to your own testing. Given the relative stability of 3.x due to the tick-tock and "trunk always releasable" strategies, users are no longer faced with the kind of wild instabilities of the past. Ultimately, stability really is subjective and in the eye of the beholder - how conservative or adventurous are you and your organization. Sure, maybe 2.2.x is more stable in some abstract sense, but for a new app, why start so far behind the curve? In fact, for a new app you should be trying to take advantage of new features and performance improvements, like materialized views, SASI, and wide rows coming soon. In the past, upgrading from 2.x to 2.y was a big deal. That just isn't a problem with upgrading from 3.x to 3.y. At least in theory, and again, nobody has been hollering about having problems doing that. For EOL, you will have to judge for yourself how long it may take your organization to carefully migrate a production 2.x system to 3.x somewhere down the road. No need to rush, but don't wait until the last minute either. And I suspect that you won't even want to think about upgrading 2.x to 4.x - IOW, upgrade to 3.x well before 3.x EOL. -- Jack Krupansky On Sat, Apr 23, 2016 at 3:28 PM, Anuj Wadehra < anujw_2...@yahoo.co.in.invalid> wrote: > Jonathan, > I understand you point. In my perspective, people in production usually > prefer stability over features and would always want at least emergency fix > releases if not fully supported versions.I am glad that today we have such > releases which are very stable and not yet EOL. Its just that users are > tempted to use latest odd releases as per the tick-tock strategy > highlighted on the website and then probably fallback to previous ones > after discussing stable versions on various forums. I just wanted to make > their decisions simpler :) I agree with you - Every thing cant be white and > black..stable and unstable..At the same..I feel.. most of the time there > would be a single stable release which is not EOL. > Thanks for your time. > > > Anuj > Sent from Yahoo Mail on Android > > On Tue, 19 Apr, 2016 at 7:06 AM, Jonathan Ellis > wrote: Anuj, > > The problem is that this question defies a simplistic answer like "version > X is the most stable" (are you willing to use unsupported releases? what > about emergency-fix-only? what features can you not live without?) so > we're intentionally resisting the urge to oversimplify the situation. > > On Mon, Apr 18, 2016 at 8:25 PM, Anuj Wadehra < > anujw_2...@yahoo.co.in.invalid> wrote: > > > Hi All, > > Let me reiterate, my question is not about selecting right Cassandra for > > me. The intent is to get dev community response on below question. > > Question: > > Would it be a wise decision to mention the "most stable/production > > ready" version (as it used to be before 3.x) on the Apache website till > > tick-tock release strategy evolves and matures? 
> > > > Drivers for posting above info on website: > > I have read all the posts/forums and realized that there is no absolute > > answer for selecting Production Ready Cassandra version one should > > use..Even now, people often hesitate to recommend latest releases for > Prod > > and go back to 2.1 and 2.2..In every suggestion there are too many > > ifs..like I said...if you want features x..if u want rock solid y..if you > > are adventurous zno offense but who would not want a rock solid > > version for Production? Who would want features for stability in Prod? > And > > who would want to take risks in Prod? > > The stability of a release should NOT depend my risk appetite and use > > case..if some version of 2.1 or 2.2 or 3.0.x is stable for production why > > not put that info until tick-tock matures? > > > > Please realize that everyone goes for thorough testing before upgrading > > but the scope of application testing cant uncover most critical > > bugs..Community guidance and a bigger picture on stability can help the > > community until tick-tock matures and we deliver stable production ready > > releases. > > > > > > > > ThanksAnuj > > Sent from Yahoo Ma
Re: Criteria for upgrading to 3.x releases in PROD
The old "most stable" labeling was super-important in the old days since newer releases tended to have a fair amount of instability - hence the admonition to wait for a x.y.5 release before using the x.y release line. But that dramatic degree of release instability is no longer the case. Besides, we already have 3.5 and 3.0.5, so that x.y.5 point is now moot. And, generally, there won't be any 3.x.y releases other than 3.0.y and 3.x.0 unless something especially unusual goes wrong - there have only been two, 3.1.1 and 3.2.1. The simple fact that there have only been five subsequent patch releases for 3.0.0 in over five months is a testament to its stability. That's not an argument for why anybody should switch from 2.x to 3.x if 2.x is working fine for them, but a challenge to the presumption that 3.x is unstable and unsuitable for production. To put it more simply, the goal is that no 3.x (or x.y for x > 2) goes out the door unless it is suitable for production. To put it even more strongly, I think the development and release process is now robust enough that no new releases, 2.x or 3.x, get out the door unless suitable for production. In short, the only reason to go with 2.2.x at this stage is if you are currently using 2.2.x and simply don't want to rock the boat in even the smallest way. That's reasonable. But if you are using 2.1.x or earlier and feel the need to upgrade due to EOL or whatever, in may be less risky to complete the full upgrade path to 3.x rather than go halfway since sometimes halfhearted measures don't get the degree of attention needed for the full effort, such as people trying to cut corners to do it on the cheap. -- Jack Krupansky On Sat, Apr 23, 2016 at 10:50 PM, Anuj Wadehra < anujw_2...@yahoo.co.in.invalid> wrote: > Jack, > The question was about publishing "most stable" release on Apache website > as it done before 3.x. > Regarding your comments, I still feel adventure cant happen in production > systems. And you should certainly test every release before upgrading but > you woulf not like to upgrade to latest releases based on your limited > testing. I feel that you cant do exhaustive testing of the database and can > easily miss critical corner cases which may trigger in production. But its > just my perspective of looking at things. People may think differently. > Thanks All of you for your comments !! > > ThanksAnuj > Sent from Yahoo Mail on Android > > On Sun, 24 Apr, 2016 at 1:28 AM, Jack Krupansky > wrote: Is the question whether a new application can go into production > with 3.x, > or whether an existing application in production with 2.x.y should be > upgraded to 3.x? > > For the latter, a "If it ain't broke, don't fix it" philosophy is best. And > if there are critical bug fixes needed, simply upgrade the 2.x line that > you are already on. Or if your production is on 3.0.x, upgrade to 3.0.x+k. > > For the former, we aren't hearing people hollering that 3.x is crap, so it > is reasonably safe for a new app going into production, subject to your own > testing. > > Given the relative stability of 3.x due to the tick-tock and "trunk always > releasable" strategies, users are no longer faced with the kind of wild > instabilities of the past. > > Ultimately, stability really is subjective and in the eye of the beholder - > how conservative or adventurous are you and your organization. Sure, maybe > 2.2.x is more stable in some abstract sense, but for a new app, why start > so far behind the curve? 
In fact, for a new app you should be trying to > take advantage of new features and performance improvements, like > materialized views, SASI, and wide rows coming soon. > > In the past, upgrading from 2.x to 2.y was a big deal. That just isn't a > problem with upgrading from 3.x to 3.y. At least in theory, and again, > nobody has been hollering about having problems doing that. > > For EOL, you will have to judge for yourself how long it may take your > organization to carefully migrate a production 2.x system to 3.x somewhere > down the road. No need to rush, but don't wait until the last minute > either. And I suspect that you won't even want to think about upgrading 2.x > to 4.x - IOW, upgrade to 3.x well before 3.x EOL. > > -- Jack Krupansky > > On Sat, Apr 23, 2016 at 3:28 PM, Anuj Wadehra < > anujw_2...@yahoo.co.in.invalid> wrote: > > > Jonathan, > > I understand you point. In my perspective, people in production usually > > prefer stability over features and would always want at least emergency > fix > > releases if not fully supported versions.I am glad that today we have > such > > releases which are very stable and not yet EOL. Its just that users
Re: [Proposal] Mandatory comments
+1 for some kind of (unspecified) improvement on the comment front. +1 for updated style guide giving recommended practices for commenting Specific rules/guidelines? So hard to get it right at an abstract level - what can sound great in theory can fail miserably in practice. And what can work great initially, when people are all excited about it, can slowly rot as that initial enthusiasm fades. How to start? Suggestion: Instead of trying to seek consensus on abstract rules/guidelines, just randomly pick some module that is in need of better comments and have dueling patches for how to best comment it. And then once the dust settles and there is some general consensus on what the real/implied rules/guidelines should be, based on the reality of that initial module, pick another module.and see if the deduced rules/guidelines from the first module can be methodically applied. -- Jack Krupansky On Mon, May 2, 2016 at 1:29 PM, Josh McKenzie wrote: > o.a.c.hints/package-info.java is a pretty good example of what that looks > like. My previous statement about dangers of comment atrophy and needing to > be diligent holds doubly-true if the comments themselves aren't localized > to the files we're touching on modification. > > I see a case for better documentation on all three: public API's for > classes, classes, and package level. Each serve different and important > purposes IMO. > > On Mon, May 2, 2016 at 1:16 PM, Jonathan Ellis wrote: > > > What I'd like to see is more comments like the one in StreamSession: > > something that can give me the "big picture" for a piece of > functionality. > > > > I wonder if focusing on class-based comments might miss an opportunity > > here. StreamSession was chosen somewhat arbitrarily to be where we > > described the streaming life cycle. If we focused just on describing > each > > class in isolation then we might miss something more valuable. > > > > Is this a case for package-level javadoc, and organizing our class > > hierarchy better along those lines? > > > > On Mon, May 2, 2016 at 11:26 AM, Sylvain Lebresne > > wrote: > > > > > There could be disagreement on this, but I think that we are in general > > not > > > very good at commenting Cassandra code and I would suggest that we > make a > > > collective effort to improve on this. And to help with that goal, I > would > > > like > > > to suggest that we add the following rule to our code style guide > > > (https://wiki.apache.org/cassandra/CodeStyle): > > > - Every public class and method must have a quality javadoc. That > > > javadoc, on > > > top of describing what the class/method does, should call > particular > > > attention to the assumptions that the class/method does, if any. > > > > > > And of course, we'd also start enforcing that rule by not +1ing code > > unless > > > it > > > adheres to this rule. > > > > > > Note that I'm not pretending this arguably simplistic rule will > magically > > > make > > > comments perfect, it won't. It's still up to us to write good and > > complete > > > comments, and it doesn't even talk about comments within methods that > are > > > important too. But I think some basic rule here would be beneficial and > > > that > > > one feels sensible to me. > > > > > > Looking forward to other's opinions and feedbacks on this proposal. > > > > > > -- > > > Sylvain > > > > > > > > > > > -- > > Jonathan Ellis > > Project Chair, Apache Cassandra > > co-founder, http://www.datastax.com > > @spyced > > >
Re: [Proposal] Mandatory comments
Not so much wiggle room in that case as a guideline for commenting getters and setters and the field they access. Some consensus is needed on whether there should be some rote comment for getters and setters, or whether the Javadoc should simply be skipped for simple getters and setters, provided that there is separate doc for the field that they name. I'm not sure what the best way is to document or distinguish a private field vs. a pseudo-field whose getters and setters are implemented by calling methods rather than merely returning or setting the private field. I mean, you don't want to clutter the public Javadoc with details of every private field, but those private fields that have getters and setters clearly need to be documented - at least in terms that will make sense to the users of the getters and setters. One alternative is to document the field/pseudo-field on either the getter or the setter, with a "See..." link on the other. I mean, if you are looking at the code for one, you should be able to quickly get to the doc for the field/pseudo-field itself. And if there are any special rules or checks in the setter, they should have Javadoc, either for the setter or the field. -- Jack Krupansky On Tue, May 3, 2016 at 12:57 PM, Eric Evans wrote: > On Mon, May 2, 2016 at 11:26 AM, Sylvain Lebresne > wrote: > > Looking forward to other's opinions and feedbacks on this proposal. > > We might want to leave just a little wiggle room for judgment on the > part of the reviewer, for the very simple cases. Documenting > something like setFoo(int) with "Sets foo" can get pretty tiresome for > everyone, and doesn't add any value. > > Otherwise I think this is perfectly reasonable; +1 > > > -- > Eric Evans > john.eric.ev...@gmail.com >
Re: [Proposal] Mandatory comments
FWIW, I recently wrote up a bunch of notes on Code Quality and published them on Medium. There are notes on comments and consistency and boilerplate buried in there. WARNING: There's a lot of stuff there and it is not for the faint of heart or those not truly committed to code quality. tl;dr - I'm not a fan of boiler plate just to say you did something, but... I am a fan of consistency, but that doesn't mean every situation is the same, just that similar situations should be treated similarly - unless there is some reasonable reason to do otherwise. See: https://medium.com/@jackkrupansky/code-quality-preamble-932626a3131c#.ynrjbryus https://medium.com/@jackkrupansky/software-and-product-quality-notes-no-1-346ab1d8df24#.xzg1ihuxb https://medium.com/@jackkrupansky/code-quality-notes-no-1-4dc522a5e29c#.cm7tan2zu https://medium.com/@jackkrupansky/code-quality-notes-no-2-7939377b73c6#.zco8oq3dj -- Jack Krupansky On Thu, May 5, 2016 at 10:55 AM, Eric Evans wrote: > On Wed, May 4, 2016 at 12:14 PM, Jonathan Ellis wrote: > > On Wed, May 4, 2016 at 2:27 AM, Sylvain Lebresne > > wrote: > > > >> On Tue, May 3, 2016 at 6:57 PM, Eric Evans > >> wrote: > >> > >> > On Mon, May 2, 2016 at 11:26 AM, Sylvain Lebresne < > sylv...@datastax.com> > >> > wrote: > >> > > Looking forward to other's opinions and feedbacks on this proposal. > >> > > >> > We might want to leave just a little wiggle room for judgment on the > >> > part of the reviewer, for the very simple cases. Documenting > >> > something like setFoo(int) with "Sets foo" can get pretty tiresome for > >> > everyone, and doesn't add any value. > >> > > >> > >> I knew someone was going to bring this :). In principle, I don't really > >> disagree. In practice though, > >> I suspect it's sometimes just easier to adhere to such simple rule > somewhat > >> strictly. In particular, > >> I can guarantee that we don't all agree where the border lies between > what > >> warrants a javadoc > >> and what doesn't. Sure, there is a few cases where you're just > paraphrasing > >> the method name > >> (and while it might often be the case for getters and setters, it's > worth > >> noting that we don't really > >> do much of those in C*), but how hard is it to write a one line comment? > >> Surely that's a negligeable > >> part of writing a patch and we're not that lazy. > >> > > > > I'm more concerned that this kind of boilerplate commenting obscures > rather > > than clarifies. When I'm reading code i look for comments to help me > > understand key points, points that aren't self-evident. If we institute > a > > boilerplate "comment everything" rule then I lose that signpost. > > This. > > Additionally you could also probably argue that it obscures the true > purpose to leaving a comment; It becomes a check box to tick, having > some javadoc attached to every method, rather than genuinely looking > for the value that could be added with quality comments (or even > altering the approach so that the code is more obvious in the absence > of them). > > The reason I suggested "wiggle room", is that I think everyone > basically agrees that the default should be to leave good comments > (and that that hasn't been the case), that we should start making this > a requirement to successful review, and that we can afford to leave > some room for judgment on the part of the reviewer. 
Worse-case is > that we find in doing so that there isn't much common ground on what > constitutes a quality comment versus useless boilerplate, and that we > have to remove any wiggle room and make it 100% mandatory (I don't > think that will (has to) be the case, though). > > > -- > Eric Evans > john.eric.ev...@gmail.com >
Re: Expiring Tables or Columns?
The "default_time_to_live" property for CREATE TABLE. See: http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/tabProp.html That's in the DataStax CQL doc, but I don't see it in the Apache Cassandra CQL spec: https://cassandra.apache.org/doc/cql3/CQL-2.0.html -- Jack Krupansky -Original Message- From: Brandon Williams Sent: Tuesday, July 15, 2014 5:36 PM To: dev@cassandra.apache.org Subject: Re: Expiring Tables or Columns? https://issues.apache.org/jira/browse/CASSANDRA-3974 On Tue, Jul 15, 2014 at 4:14 PM, Shehaaz Saif wrote: Hi! I just received a question from someone at work about setting a TTL for a table...I said that they can't do that...it can only be set at the column level. Is that right? These are the options that they have: http://www.datastax.com/documentation/cql/3.0/cql/cql_using/use_remove_data_c.html <http://s.bl-1.com/h/hPXFFQr> Thank you! -Shehaaz -- Shehaaz Saif Twitter: @Shehaaz <http://s.bl-1.com/h/hPXYLQ9> Blog: www.shehaaz.com <http://s.bl-1.com/h/hPXYRpC> Tel: 425-516-2160 <http://s.bl-1.com/h/hPXYWCF>
Re: Embedding Cassandra
Are you trying to use Cassandra as a simple single-node data store within a process, or are you trying to wrap Cassandra and run a cluster of custom nodes? IOW, what are you trying to accomplish? You could always run Cassandra as a spawned, background process, fully controlled by a parent process. Who is doing the queries, the program in which Cassandra is embedded, or an external client over the net? -- Jack Krupansky On Tue, Mar 24, 2015 at 5:38 AM, Ersin Er wrote: > Hi all, > > As far as I understand it seems to be enough to initialize and activate > CassandraDaemon for embedding Cassandra. (Right?) > > My question is what's the next step? Which class or classes should I use in > order to execute CQL queries for example? > > Any pointers would be great. > > Thanks! > > -- > Ersin Er >