Re: Write into composite columns

2015-07-10 Thread Jack Krupansky
First, this is a compound primary key, not a composite [partition] key.

You don't have a column definition for dpci, your partition key.

Compact storage requires exactly one non-primary-key column, but you have
two. Maybe dpci and item were supposed to be the same.
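
The two problems above can be checked mechanically. What follows is only a
minimal sketch in Java: CompactStorageCheck is a hypothetical helper written
for this illustration (it is not part of Cassandra or of any driver), and it
encodes just the two rules stated in this reply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not a Cassandra API.
final class CompactStorageCheck {

    // Returns the problems found in a COMPACT STORAGE table definition,
    // given the declared column names and the primary key column names.
    static List<String> problems(List<String> declaredColumns, List<String> primaryKey) {
        List<String> found = new ArrayList<>();

        // Every primary key column must also be declared as a column.
        for (String key : primaryKey) {
            if (!declaredColumns.contains(key)) {
                found.add("primary key column '" + key + "' is not defined");
            }
        }

        // Count the declared columns that are not part of the primary key.
        int nonKeyColumns = 0;
        for (String column : declaredColumns) {
            if (!primaryKey.contains(column)) {
                nonKeyColumns++;
            }
        }
        if (nonKeyColumns != 1) {
            found.add("compact storage needs exactly one non-primary-key column, found "
                    + nonKeyColumns);
        }
        return found;
    }

    public static void main(String[] args) {
        // The table from the original post: dpci is in the key but never declared.
        System.out.println(problems(Arrays.asList("item", "location", "type"),
                                    Arrays.asList("dpci", "location")));
        // One possible fix, assuming dpci and item were meant to be the same column.
        System.out.println(problems(Arrays.asList("item", "location", "type"),
                                    Arrays.asList("item", "location")));
    }
}
```

The first call reports both problems; the second, which assumes item was the
intended partition key, reports none.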

-- Jack Krupansky

On Fri, Jul 10, 2015 at 1:43 AM, Ajay Chander  wrote:

> Has anyone here come across a situation like this? Thank you!
>
> On Thursday, July 9, 2015, Ajay Chander  wrote:
>
> > More information:
> >
> >
> > Below is my cassandra bolt.
> >
> >
> > public CassandraTest withCassandraBolt() {
> >     // Fields of the tuple that make up the composite row key.
> >     String[] rowKeyFields = {"item", "location"};
> >
> >     // Client settings consumed by the Cassandra bolt.
> >     HashMap<String, Object> clientConfig = new HashMap<String, Object>();
> >     clientConfig.put(StormCassandraConstants.CASSANDRA_HOST,
> >             this.configuration.getCassandraBoltServer());
> >     clientConfig.put(StormCassandraConstants.CASSANDRA_KEYSPACE,
> >             Arrays.asList(new String[] {
> >                     this.projectConfiguration.getCassandraBoltKeyspace() }));
> >     this.stormConfig.put(
> >             this.configuration.getCassandraBoltConfigKey(),
> >             clientConfig);
> >
> >     cassandraBolt = new CassandraBatchingBolt(
> >             this.configuration.getCassandraBoltConfigKey(),
> >             new CompositeRowTupleMapper(
> >                     this.configuration.getCassandraBoltKeyspace(),
> >                     this.configuration.getCassandraBoltColumnFamily(),
> >                     rowKeyFields));
> >     cassandraBolt.setAckStrategy(AckStrategy.ACK_ON_WRITE);
> >     return this;
> > }
> >
> >
> >
> > This is my table in Cassandra:
> >
> >
> > CREATE TABLE store ( item text, location text, type text,
> > PRIMARY KEY (dpci, location) ) WITH COMPACT STORAGE;
> >
> >
> >
> > Error I am getting is below:
> >
> >
> > 15810 [batch-bolt-thread] WARN com.netflix.astyanax.connectionpool.impl.Slf4jConnectionPoolMonitorImpl - BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=21(21), attempts=1]InvalidRequestException(why:Not enough bytes to read value of component 0)
> >
> > 15811 [batch-bolt-thread] ERROR com.hmsonline.storm.cassandra.bolt.CassandraBatchingBolt - Unable to write batch.
> >
> > com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=21(21), attempts=1]InvalidRequestException(why:Not enough bytes to read value of component 0)
> >     at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:159) ~[astyanax-thrift-1.56.44.jar:na]
> >     at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65) ~[astyanax-thrift-1.56.44.jar:na]
> >     at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28) ~[astyanax-thrift-1.56.44.jar:na]
> >     at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151) ~[astyanax-thrift-1.56.44.jar:na]
> >     at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:119) ~[astyanax-core-1.56.44.jar:na]
> >     at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:338) ~[astyanax-core-1.56.44.jar:na]
> >     at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:493) ~[astyanax-thrift-1.56.44.jar:na]
> >     at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79) ~[astyanax-thrift-1.56.44.jar:na]
> >     at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123) ~[astyanax-thrift-1.56.44.jar:na]
> >     at com.hmsonline.storm.cassandra.client.AstyanaxClient.writeTuples(AstyanaxClient.java:417) ~[classes/:na]
> >     at com.hmsonline.storm.cassandra.bolt.CassandraBolt.writeTuples(CassandraBolt.java:67) ~[classes/:na]
> >     at com.hmsonline.storm.cassandra.bolt.CassandraBatchingBolt.executeBatch(CassandraBatchingBolt.java:49) ~[classes/:na]
> >     at com.hmsonline.storm.cassandra.bolt.AbstractBatchingBolt$BatchThread.run(
> 

Re: UML sequence diagrams on Wiki for explaining read/write path

2015-11-30 Thread Jack Krupansky
Great stuff!

You wrote "When the replica node comes back online the coordinator node
will send the data to the replica node", which is partially true - if the
replica comes back online within the timeout window of three hours. So, you
probably want to say something like:

"If the replica node comes back online within the hinted handoff window the
coordinator node will send the data to the replica node, otherwise the
replica node will need to be repaired."

And maybe mention the configuration of the window. Change "The data is
stored for a default period of 3 hours" to "The data is stored for a
default period of 3 hours, configurable using the
max_hint_window_in_ms property
in cassandra.yaml."
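
To make the suggested wording concrete, here is a minimal sketch in Java of
the rule as phrased above. The class and method names are assumptions made
for this illustration and are not Cassandra code; the only value taken from
the text is the 3 hour default for max_hint_window_in_ms, and treating the
boundary as inclusive is also an assumption.

```java
import java.util.concurrent.TimeUnit;

// Illustrative only; the class and method names are assumptions, not Cassandra code.
final class HintWindowSketch {

    // Default max_hint_window_in_ms: 3 hours expressed in milliseconds.
    static final long DEFAULT_MAX_HINT_WINDOW_MS = TimeUnit.HOURS.toMillis(3);

    // True if a replica that was down for downtimeMs came back inside the
    // hinted handoff window; false means the replica needs to be repaired.
    static boolean hintsCoverOutage(long downtimeMs, long maxHintWindowMs) {
        return downtimeMs <= maxHintWindowMs;
    }

    public static void main(String[] args) {
        long oneHour = TimeUnit.HOURS.toMillis(1);
        long fiveHours = TimeUnit.HOURS.toMillis(5);
        System.out.println(DEFAULT_MAX_HINT_WINDOW_MS);                              // 10800000
        System.out.println(hintsCoverOutage(oneHour, DEFAULT_MAX_HINT_WINDOW_MS));   // true
        System.out.println(hintsCoverOutage(fiveHours, DEFAULT_MAX_HINT_WINDOW_MS)); // false
    }
}
```

A replica that was down for one hour is covered by hints, while one that was
down for five hours falls outside the default window and has to be repaired.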

-- Jack Krupansky

On Mon, Nov 30, 2015 at 2:24 AM, Michael Edge 
wrote:

> Write path docs updated on Wiki - please review diagram/text and let me
> have your comments (or update text in place).
>
> https://wiki.apache.org/cassandra/WritePathForUsers
>
> Cheers,
>
> Michael
>
> On 26 November 2015 at 11:25, Michael Shuler 
> wrote:
>
> > On 11/25/2015 07:36 PM, Michael Edge wrote:
> > > I'd like to update the read/write path description on the wiki (see
> > > link below) by adding a couple of UML sequence diagrams I drew a while
> > > ago. I think they are much better than long textual descriptions for
> > > describing the order of operations on components. Before publishing
> > > them I'd prefer a knowledgeable volunteer to review them to ensure they
> > > are accurate.
> > >
> > > Let me know if you're interested and I'll send you a copy.
> > >
> > > https://wiki.apache.org/cassandra/ArchitectureOverview
> >
> > I don't think the mailing list likes attachments, so throw them up on a
> > web server somewhere and send the list link(s) to your diagrams.
> > Alternatively, go ahead and post them on the wiki and ask for feedback.
> >
> > --
> > Kind regards,
> > Michael
> >
>


Re: UML sequence diagrams on Wiki for explaining read/write path

2015-12-01 Thread Jack Krupansky
I just remembered... the new Materialized View support in 3.0 - writes to
the materialized views get triggered when a write occurs to the base table.
That needs to be in the write path flow/description as well.

-- Jack Krupansky

On Mon, Nov 30, 2015 at 9:30 PM, Michael Edge 
wrote:

> Thanks for the feedback guys. I've made the updates.


Re: UML sequence diagrams on Wiki for explaining read/write path

2015-12-02 Thread Jack Krupansky
Thanks. I took a quick look and it seems fine, but... I don't have the
depth of expertise in that code to be 100% sure of each detail. Hopefully
Carl or Jake, et al can review.

One additional point on MV: When a new MV is created for a base table that
is already populated, that kicks off a backfilling process that will read
each existing row from the base table and apply the MV update process. I'm
not sure of the precise details there either (like, which consistency those
backfill writes use.) And that backfilling process has to occur on each
node of the cluster (but not each replica.) Again, Carl, Jake, et al need
to review precise details.

-- Jack Krupansky

On Wed, Dec 2, 2015 at 1:02 AM, Michael Edge  wrote:

> Yes - good point.
>
> I've updated the text for the write path page below - could you please
> review? Once my understanding is correct I'll update the sequence diagram
> to match.
>
> https://wiki.apache.org/cassandra/WritePathForUsers


Re: Apache Cassandra - Question about data model

2015-12-31 Thread Jack Krupansky
It's best to ask usage and data modeling questions on the user email list -
this list is the dev list, for development of Cassandra itself, not for
development of applications.

See:
http://cassandra.apache.org/


-- Jack Krupansky

On Thu, Dec 31, 2015 at 8:36 AM, Lior Menashe 
wrote:

> Hi,
>
> Just got your mail from the #cassandra channel on the web chat because I
> couldn't get an answer...
>
> I have a question, and I'd be glad if you could help me or point me in the
> right direction.
>
> I have an activity feed like the activity feed on Instagram. When a user
> (let's say user A) opens his page he can see all the activities that are
> related to him, for example: user B liked your post, user C commented on
> your post, etc.
>
> the cassandra data model that i thought about is:
>
> userID UDID (partition key)
> datetimeadded timestamp (clustering column DESC)
> userID_Name text
> userID_Picture_URL text
> userID_From UDID (this is userB from the example)
> userID_From_Name text
> userID_From_Picture_URL
>
> With this structure I can get the different activities for a user and it
> works just fine. My problem is that userID_From can change his name and his
> picture, and I need this data to be updated across all the different tables
> because I want to show the current values.
>
> The problem is that the update requires a table scan, which is not efficient.
> Should I store only the ID and, every time I select a slice of the data
> and get several IDs, run another query for the users' names and picture
> paths? Or should I do something else?
>
> Best regards,
> Lior
>


Re: Versioning policy?

2016-01-14 Thread Jack Krupansky
Mark, what would the policy dictate if the bug is in a feature that was
introduced in 3.x and the feature wasn't in 3.0 and the current release is
3.x+k where k is greater than two - technically 3.x is no longer supported,
so does that mean no backporting of the fix and only 3.x+k+1 would get the
bug fix (or maybe 3.x+k-1.y in your rare cases)? In this case the user on
3.x would have to upgrade more than two releases to get the bug fix even
though 3.x may only have been released a few months earlier. I think this
is the gist of the original discussion that the EOL policy for 3.x is way
too short for more conservative customers.

Maybe the right answer for them is to only use x.0.z releases and not use
features introduced in x.y releases. But even then... if they adopt 3.0.z
just a month or two before x+1.0 comes out they are back in that same boat
that no matter what release they pick it will go EOL within just a few
months.


-- Jack Krupansky

On Thu, Jan 14, 2016 at 11:39 AM, Mark Dewey  wrote:

> Let's do a couple examples:
>
> 1. Current release: 3.4.0, bug found in 3.1.0 that also exists in
>    subsequent versions; the bug fix will be ported back to 3.0.x and 3.5.0.
> 2. Current release 3.3.0, bug found in 3.3.0; the bug fix will be ported
>    back to 3.0.x and 3.4.0. In rare cases, a 3.3.1 may also be released
>    (we've already seen one example of this).
> 3. Current release 3.6.0, bug found in 3.1.0; the bug fix will be ported
>    back to 3.0.x and 3.7.
>
> Essentially, any bug fixes will be released in the next minor version and
> backported to the 3.0.x branches (unless the feature doesn't exist in 3.0).
> We have seen one instance where we released a point version as well, but
> that was for a major regression that was discovered almost immediately
> after a release.
>
> On Wed, Jan 13, 2016 at 12:55 PM Maciek Sakrejda 
> wrote:
>
> > Can anyone chime in here? We're getting ready to run a decent number of
> > nodes; we'd like to have a better idea of what to expect with respect to
> > patching and upgrading. A clear versioning policy like the one laid out
> > by Postgres would be very helpful.
> >
>


Re: Versioning policy?

2016-01-14 Thread Jack Krupansky
Jonathan, just to complete the list, it would be helpful to state:

3.1.x will be maintained until 
3.2 will be maintained until 
And in general, 3.x (x != 0) will be maintained until  (and does x
even vs. odd affect the rule)

And what exactly is the general rule/criteria for when 3.x will be
considered the official "stable release"? And will 3.0.x always be
considered the recommended non-production release until 4.0 comes out or is
there general guidance/criteria for when a 3.x would become recommended for
non-production? Or will tick-tock completely replace that "traditional"
section? In which case, the question of criteria for defining "stable
release" remains, unless it becomes no different than the latest tick-tock
release.


-- Jack Krupansky

On Thu, Jan 14, 2016 at 12:57 PM, Anuj Wadehra 
wrote:

> Hi Jonathan,
> Thanks for the crisp communication regarding the tick tock release & EOL.
> I think it's worth considering some points regarding the EOL policy, and it
> would be great if you could share your thoughts on the points below:
> 1.  EOL of a release should be based on "most stable"/"production ready"
> version date rather than "GA" date of subsequent major releases.
> 2.  I think we should have "Formal EOL Announcement" on Apache Cassandra
> website.
> 3. "Formal EOL Announcement" should come at least 6 months before the EOL,
> so that users get reasonable time to  upgrade.
> 4. EOL Policy (even if flexible) should be stated on Apache Cassandra
> website
>
> EOL thread on users mailing list ended with the conclusion of raising a
> Wishlist JIRA but I think above points are more about working on policy and
> processes rather than just a wish list.
>
> Thanks
> Anuj
>
> On Thu, 14 Jan, 2016 at 10:57 pm, Jonathan Ellis wrote:
>
> Hi Maciek,
>
> First let's talk about the tick-tock series, currently 3.x.  This is pretty
> simple: outside of the regular monthly releases, we will release fixes for
> critical bugs against the most recent bugfix release, the way we did
> recently with 3.1.1 for CASSANDRA-10822 [1].  No older tick-tock releases
> will be patched.
>
> Now, we also have three other release series currently being supported:
>
> 2.1.x: supported with critical fixes only until 4.0 is released, projected
> in November 2016 [2]
> 2.2.x: maintained until 4.0 is released
> 3.0.x: maintained for 6 months after 4.0, i.e. projected until May 2017
>
> I will add this information to the releases page [3].
>
> [1]
>
> https://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201512.mbox/%3CCAKkz8Q3StqRFHfMgCMRYaaPdg+HE5N5muBtFVt-=v690pzp...@mail.gmail.com%3E
> [2] 4.0 will be an ordinary tick-tock release after 3.11, but we will be
> sunsetting deprecated features like Thrift so bumping the major version
> seems appropriate
> [3] http://cassandra.apache.org/download/
>
> On Sun, Jan 10, 2016 at 9:29 PM, Maciek Sakrejda 
> wrote:
>
> > There was a discussion recently about changing the Cassandra EOL policy
> > on the users list [1], but it didn't really go anywhere. I wanted to ask
> > here instead to clear up the status quo first. What's the current
> > versioning policy? The tick-tock versioning blog post [2] states in
> > passing that two major releases are maintained, but I have not found this
> > as an official policy stated anywhere. For comparison, the Postgres
> > project lays this out very clearly [3]. To be clear, I'm not looking for
> > any official support, I'm just asking for clarification regarding the
> > maintenance policy: if a critical bug or security vulnerability is found
> > in version X.Y.Z, when can I expect it to be fixed in a bugfix patch to
> > that major version, and when do I need to upgrade to the next major
> > version.
> >
> > [1]: http://www.mail-archive.com/user@cassandra.apache.org/msg45324.html
> > [2]: http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/
> > [3]: http://www.postgresql.org/support/versioning/
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced
>
>


Re: Versioning policy?

2016-01-14 Thread Jack Krupansky
Thanks, Jonathan. The end-of-life (EOL) question is still dangling out
there - when does 3.x go off support, after 3.x+3 or six months after 4.0?
Or... six months after 5.0?


-- Jack Krupansky

On Thu, Jan 14, 2016 at 6:15 PM, Jonathan Ellis  wrote:

> On Thu, Jan 14, 2016 at 4:26 PM, Jack Krupansky 
> wrote:
>
> > Jonathan, just to complete the list, it would be helpful to state:
> >
> > 3.1.x will be maintained until 
> > 3.2 will be maintained until 
> >
>
> One of the confusing things about tick tock is that we're stuck with
> numbers that look like the old ones but mean different things.
>
> In the old world, 2.1 was a release that took a year of work, and it got
> maintained with roughly-monthly updates of 2.1.x.
>
> In the tick tock world, the corresponding series is just "3," and the
> monthly updates are 3.1, 3.2, and so forth, with new features allowed in
> the even releases every two months.  So in general, there will be no 3.1.x
> or 3.2.y releases.  When a bug is critical enough to make an exception to
> the "wait for the next monthly release" rule, it will be fixed in the most
> recent bugfix tock.
>
> > will tick-tock completely replace that "traditional"
> > section?
>
>
> Yes.
>
>
> > In which case, the question of criteria for defining "stable
> > release" remains, unless it becomes no different than the latest
> > tick-tock release.
> >
>
> That's the idea, and that's why we're getting very religious about test
> engineering, so that those monthly releases will always be stable.
>


3.1 status?

2016-01-19 Thread Jack Krupansky
It's great to see clear support status marked on the 3.0.x and 2.x releases
on the download page now. A couple more questions...

1. What is the support and stability status of 3.1 and 3.2 (as opposed to
3.2.1)? Are they "for non-production development only"? Are they considered
"stable"? The page should say.

2. Is there simply no "stable" release for 3.x, or is the latest tick-tock
release by definition considered "stable"?

3. The first paragraph says "If a critical bug is found, a patch will be
released against the most recent bug fix release", but in fact the latest
critical patch (3.2.1) is against a feature release, not a bug fix release.
Should that simply say "... against the most recent tick-tock release"
regardless of whether it was an even (feature) or odd (bug fix) release?
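
For reference, the even/odd convention behind question 3 can be written down
as a small Java sketch. The class and enum names are made up for this
illustration, and excluding 3.0 (which is maintained as its own 3.0.x series)
is an assumption of the sketch rather than a stated project rule.

```java
// Illustrative only: encodes the even/odd convention discussed in these threads.
final class TickTockSketch {

    enum Kind { FEATURE, BUGFIX }

    // For a 3.x tick-tock release, even minor numbers are feature releases
    // and odd minor numbers are bug fix releases. Assumption: 3.0 is treated
    // as outside the monthly series and rejected here.
    static Kind kindOf(int minor) {
        if (minor < 1) {
            throw new IllegalArgumentException("expected a 3.x release with x >= 1");
        }
        return (minor % 2 == 0) ? Kind.FEATURE : Kind.BUGFIX;
    }

    public static void main(String[] args) {
        System.out.println("3.1 -> " + kindOf(1)); // 3.1 -> BUGFIX
        System.out.println("3.2 -> " + kindOf(2)); // 3.2 -> FEATURE
    }
}
```

Under this convention 3.2 is a feature release, which is why a 3.2.1 patch
does not match the "most recent bug fix release" wording on the page.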

Thanks.

-- Jack Krupansky


Re: DSE Release Planned That Corresponds with Cassandra 3.2

2016-02-03 Thread Jack Krupansky
Stack Overflow has reasonable coverage for DataStax Enterprise - at least I
paid attention to it when I was consulting for DataStax on DSE Search.

See:
http://stackoverflow.com/questions/tagged/datastax-enterprise
http://stackoverflow.com/questions/34293295/cassandra-3-x-with-datastax-enterprise

That doesn't give you a desirable answer to your question, but that's the
best you can do for now, other than the sweet nothings that a sales person
is willing to whisper in your ear for private consumption in the name of
enhancing revenue for DataStax.


-- Jack Krupansky

On Wed, Feb 3, 2016 at 10:36 AM, Corry Opdenakker  wrote:

> Thanks Ryan, I just sent them a copy of the post.
> Cheers, Corry
>
> On Wed, Feb 3, 2016 at 4:34 PM, Ryan Svihla  wrote:
>
> > I’d just contact sa...@datastax.com instead.
> >
> > > On Feb 3, 2016, at 9:32 AM, Corry Opdenakker 
> wrote:
> > >
> > > You are right, Ryan, but since there is no DSE mailing list, I thought I
> > > could ask it over here.
> > > Thanks anyway for the reply.
> > > Regards, Corry
> > >
> > > On Wed, Feb 3, 2016 at 4:30 PM, Ryan Svihla  wrote:
> > >
> > >> Corry,
> > >>
> > >> This is the Cassandra developer mailing list aimed at contributors to
> > >> the Cassandra code base (not all of whom work for DataStax for
> > >> example). Can I suggest you contact DataStax and ask the same question?
> > >>
> > >>
> > >> Regards,
> > >>
> > >> Ryan Svihla
> > >>
> > >>> On Feb 3, 2016, at 9:27 AM, Corry Opdenakker 
> > wrote:
> > >>>
> > >>> Hi everyone,
> > >>>
> > >>> Yesterday evening I installed DSE for the first time on my MacBook,
> > >>> and I must say that until now it runs very fast with 0 users and 0
> > >>> data records :)
> > >>>
> > >>> I was reading the release notes of C* 3.x and it reports "The Storage
> > >>> Engine has been refactored", "Java 8 required", and the introduction
> > >>> of "Materialized Views" as some of the most important changes.
> > >>> When will there be a DSE version released that corresponds with
> > >>> Cassandra 3.2 or any other 3.x subversion?
> > >>> If there isn't one planned yet then I'll probably switch to C* 3.2
> > >>> before starting development.
> > >>>
> > >>> I've searched at several places but I couldn't find a DSE release
> > >>> announcement or product roadmap that covers this topic.
> > >>> If anyone could mention the relevant page, then that would be great.
> > >>>
> > >>> Cheers, Corry
> > >>
> > >>
> > >
> > >
> > > --
> > > --
> > > Bestdata.be
> > > Optimised ict
> > > Tel:+32(0)496609576
> > > co...@bestdata.be
> > > --
> >
> >
>
>
> --
> --
> Bestdata.be
> Optimised ict
> Tel:+32(0)496609576
> co...@bestdata.be
> --
>


Re: Custom Java class for secondary index implementation

2016-03-06 Thread Jack Krupansky
Is RegularColumnIndex representative of what a typical custom index
needs... or not?

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/composites/RegularColumnIndex.java

Ditto for CassandraIndex (abstract class) - should all (or at least most)
custom indexes extend it?

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/CassandraIndex.java


-- Jack Krupansky

On Sun, Mar 6, 2016 at 11:03 PM, Henry Manasseh 
wrote:

> Thank you. This is a perfect class for me to start experimenting with.
>
>
> On Sun, Mar 6, 2016 at 2:38 PM Sam Tunnicliffe  wrote:
>
> > You might find o.a.c.i.StubIndex in the test source tree useful.
> > On 6 Mar 2016 19:24, "Henry Manasseh"  wrote:
> >
> > > Hello,
> > > I was wondering if anyone is aware of a minimal reference
> > > implementation for a java class implementing a secondary index or some
> > > documentation of the interface(s) I would need to implement (I looked
> > > at the SASI 2i code but I am trying to find the bare bones test or
> > > sample class for a newbie if it exists).
> > >
> > > e.g. I am referring to the class you would use for
> > > 'path.to.the.IndexClass'.
> > >
> > > CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass';
> > > >
> > >
> > >
> > > I am trying to understand if it is possible to index a UDT collection
> > > based on only one field of the UDT and still use the Cassandra index
> > > file management (I don't want to provide my own file storage). In other
> > > words, I am looking to see if the custom index class can serve as a
> > > transformation function to only index my subfield. This will probably
> > > need a tweak to the CQL to prevent a syntax error, but I am not
> > > concerned about that at the moment.
> > >
> > > Thank you for any help and tips,
> > > Henry
> > >
> >
>


Re: Wiki

2016-04-04 Thread Jack Krupansky
Interesting question as to what the future of the nodetool wiki page is.
You can get more/full detail in the DataStax doc:

https://docs.datastax.com/en/cassandra/3.x/cassandra/tools/toolsNodetool.html


-- Jack Krupansky

On Mon, Apr 4, 2016 at 12:08 PM, Anubhav Kale 
wrote:

> There are others missing as well. For instance, getendpoints (which is
> quite handy).
>
> -Original Message-
> From: Dave Brosius [mailto:dbros...@mebigfatguy.com]
> Sent: Sunday, April 3, 2016 8:13 AM
> To: dev@cassandra.apache.org
> Subject: Re: Wiki
>
> Done
>
> On 04/03/2016 10:56 AM, Pedro Gordo wrote:
> > Hi Jonathan
> >
> > I forgot about that, sorry. It's "PedroGordo".
> >
> > Best regards
> >
> > Pedro Gordo
> >
> > On 3 April 2016 at 15:41, Jonathan Ellis  wrote:
> >
> >> Hi Pedro,
> >>
> >> We need your wiki username to add it to the editors list.  Thanks!
> >>
> >> On Sun, Apr 3, 2016 at 9:15 AM, Pedro Gordo
> >> 
> >> wrote:
> >>
> >>> Hi
> >>>
> >>> I would like to contribute to C* Wiki if possible, so sending the
> >>> email as requested on the wiki's front page. The reason for this is
> >>> that the nodetool page <https://wiki.apache.org/cassandra/NodeTool>
> >>> is missing the "gossipinfo" command. I haven't checked if there's
> >>> more commands missing.
> >>>
> >>> Let me know if I can help!
> >>>
> >>> Best regards
> >>> Pedro Gordo
> >>>
> >>
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder, http://www.datastax.com
> >> @spyced
> >>
>
>


max_mutation_size_in_kb addition not noted in CHANGES.txt

2016-04-10 Thread Jack Krupansky
I don't find mention of max_mutation_size_in_kb in CHANGES.txt, but I do
see it mentioned for 3.0.0 in NEWS.txt. It doesn't have (DataStax) doc
either.

It looks like it was added on 8/19/2015: "patch by Aleksey Yeschenko;
reviewed by Benedict Elliott Smith for CASSANDRA-6230":
https://github.com/apache/cassandra/commit/96d41f0e0e44d9b3114a5d80dedf12053d36a76b#diff-b66584c9ce7b64019b5db5a531deeda1

It probably should be (or have been) added to CHANGES.txt as well.

I do see that a comment on this was added to the yaml file on 10/2/2015 as
part of:
https://issues.apache.org/jira/browse/CASSANDRA-10256

-- Jack Krupansky


Typo in CommitLog.java: maxiumum s.b. maximum

2016-04-10 Thread Jack Krupansky
I was baffled why I couldn't find a user's reported log message of
"Mutation 32MB too large for maximum size of 16Mb" even when I searched
GitHub for "too large for maximum size". The reason my search failed was
that the user (or their email client) must have corrected the typo in the
message that the code actually produces - the code has the text "too large
for the maxiumum size", when "maxiumum" should be "maximum".

In CommitLog.java, the current code in the add method in trunk:

if (totalSize > MAX_MUTATION_SIZE)
{
    throw new IllegalArgumentException(String.format("Mutation of %s is too large for the maxiumum size of %s",
                                                     FBUtilities.prettyPrintMemory(totalSize),
                                                     FBUtilities.prettyPrintMemory(MAX_MUTATION_SIZE)));
}

The corrected code:

if (totalSize > MAX_MUTATION_SIZE)
{
    throw new IllegalArgumentException(String.format("Mutation of %s is too large for the maximum size of %s",
                                                     FBUtilities.prettyPrintMemory(totalSize),
                                                     FBUtilities.prettyPrintMemory(MAX_MUTATION_SIZE)));
}
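
For anyone who wants to reproduce the message outside of Cassandra, here is a minimal self-contained sketch of the same check. The class name, the 16MB limit (chosen only to match the user's report), and the simplified prettyPrint helper are assumptions for illustration; they stand in for CommitLog and FBUtilities.prettyPrintMemory and are not the Cassandra code.

```java
// Hypothetical stand-in for the size check in CommitLog.add(); not Cassandra code.
final class MutationSizeCheck
{
    // Assumed limit, chosen only to match the "16MB" in the user's report.
    static final long MAX_MUTATION_SIZE = 16L * 1024 * 1024;

    // Simplified replacement for FBUtilities.prettyPrintMemory (assumption).
    static String prettyPrint(long bytes)
    {
        return String.format("%dMB", bytes / (1024 * 1024));
    }

    // Rejects a mutation whose serialized size exceeds the limit.
    static void check(long totalSize)
    {
        if (totalSize > MAX_MUTATION_SIZE)
        {
            throw new IllegalArgumentException(String.format("Mutation of %s is too large for the maximum size of %s",
                                                             prettyPrint(totalSize),
                                                             prettyPrint(MAX_MUTATION_SIZE)));
        }
    }

    public static void main(String[] args)
    {
        check(1024); // well under the limit, no exception
        try
        {
            check(32L * 1024 * 1024); // over the limit
        }
        catch (IllegalArgumentException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```

Searching the sources for the exact text a sketch like this prints is what makes the spelling in the format string matter.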

-- Jack Krupansky


COMPACT STORAGE in 4.0?

2016-04-11 Thread Jack Krupansky
My understanding is Thrift is being removed from Cassandra in 4.0, but will
COMPACT STORAGE be removed as well? Clearly the two are related, but
COMPACT STORAGE had a performance advantage in addition to Thrift
compatibility, so its status is ambiguous.

I recall vague chatter, but no explicit deprecation notice or 4.0 plan for
removal of COMPACT STORAGE. Actually, I don't even see a deprecation notice
for Thrift itself in CHANGES.txt.

Will a table with only a single non-PK column automatically be implemented
at a level of efficiency comparable to the old/current COMPACT
STORAGE? That will still leave the question of how to migrate a non-Thrift
COMPACT STORAGE table (i.e., used for performance by a CQL-oriented
developer rather than Thrift compatibility per se) to pure CQL.

-- Jack Krupansky


Undocumented config properties

2016-04-11 Thread Jack Krupansky
When checking into doc for max_mutation_size_in_kb I noticed that there are
more Config properties that are neither in the yaml nor the DataStax doc -
or the old (outdated) Config wiki. For example, the first is
permissions_cache_max_entries.

Before suggesting/requesting that the DataStax doc guys add all the missing
properties, are there any properties in Config.java that should not be
documented or disclosed to users?

A separate question is whether "doc" for all Config properties should be
added to the yaml file, even if the actual properties are commented out.

If the answer is that not all of the properties should be documented, there
should be an annotation convention for those that should be hidden.
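
To make the annotation idea concrete, here is one possible shape for it, purely as a sketch. The @Undocumented annotation, the ExampleConfig class, and hypothetical_internal_flag are all invented for illustration; only the two property names max_mutation_size_in_kb and permissions_cache_max_entries come from Config.java. Nothing like this exists in the code base as far as I know.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker for properties that should stay out of the user docs.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Undocumented
{
    String reason() default "";
}

// Toy stand-in for Config.java; the third field is made up.
final class ExampleConfig
{
    public Integer max_mutation_size_in_kb;
    public int permissions_cache_max_entries;

    @Undocumented(reason = "internal testing only")
    public boolean hypothetical_internal_flag;
}

final class ConfigDocCheck
{
    // Names of the public instance fields that are expected to have user documentation.
    static List<String> documentable(Class<?> configClass)
    {
        List<String> names = new ArrayList<>();
        for (Field field : configClass.getDeclaredFields())
        {
            int mods = field.getModifiers();
            if (!Modifier.isPublic(mods) || Modifier.isStatic(mods) || field.isSynthetic())
                continue;
            if (field.isAnnotationPresent(Undocumented.class))
                continue;
            names.add(field.getName());
        }
        Collections.sort(names); // getDeclaredFields() has no guaranteed order
        return names;
    }

    public static void main(String[] args)
    {
        System.out.println(documentable(ExampleConfig.class));
    }
}
```

A list produced this way could be compared against the yaml file and the doc to flag properties that are missing from either.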

See:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/config/Config.java
https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml

For (outdated) reference:
http://wiki.apache.org/cassandra/StorageConfiguration

-- Jack Krupansky


Re: COMPACT STORAGE in 4.0?

2016-04-11 Thread Jack Krupansky
Thanks, Benedict. Is this only true as of 3.x (new storage engine), or was
the equivalent efficiency also true with 2.x?

It would be good to have an explicit statement on this efficiency question
in the spec/doc since the spec currently does say: "The option also *provides
a slightly more compact layout of data on disk* but at the price of
diminished flexibility and extensibility for the table." So, if that "slightly
more compact layout of data on disk" benefit is no longer true, away with
it. See:
https://cassandra.apache.org/doc/cql3/CQL-3.0.html

And I would recommend that your own statement be added there instead.

-- Jack Krupansky

On Mon, Apr 11, 2016 at 5:03 PM, Jeremiah Jordan 
wrote:

> As I understand it "COMPACT STORAGE" only has meaning in the CQL parser
> for backwards compatibility as of 3.0. The on disk storage is not affected
> by its usage.
>
> > On Apr 11, 2016, at 3:33 PM, Benedict Elliott Smith 
> wrote:
> >
> > Compact storage should really have been named "not wasteful storage" -
> > now everything is "not wasteful storage" so it's void of meaning. This
> > is true without constraint. You do not need to limit yourself to a
> > single non-PK column; you can have many and it will remain as or more
> > efficient than "compact storage"
> >
> > On Mon, 11 Apr 2016 at 15:04, Jack Krupansky 
> > wrote:
> >
> >> My understanding is Thrift is being removed from Cassandra in 4.0, but
> will
> >> COMPACT STORAGE be removed as well? Clearly the two are related, but
> >> COMPACT STORAGE had a performance advantage in addition to Thrift
> >> compatibility, so its status is ambiguous.
> >>
> >> I recall vague chatter, but no explicit deprecation notice or 4.0 plan
> for
> >> removal of COMPACT STORAGE. Actually, I don't even see a deprecation
> notice
> >> for Thrift itself in CHANGES.txt.
> >>
> >> Will a table with only a single non-PK column automatically be
> implemented
> >> at a comparable level of efficiency compared to the old/current Compact
> >> STORAGE? That will still leave the question of how to migrate a
> non-Thrift
> >> COMPACT STORAGE table (i.e., used for performance by a CQL-oriented
> >> developer rather than Thrift compatibility per se) to pure CQL.
> >>
> >> -- Jack Krupansky
> >>
>


CQL spec error: COUNT(column)

2016-04-18 Thread Jack Krupansky
The CQL spec for COUNT says:

"It also can be used to count the non null value of a given column. Example:

SELECT COUNT(scores) FROM plays;"

But, the parser only recognizes COUNT(*) and COUNT(1).

See:
https://cassandra.apache.org/doc/cql3/CQL-3.0.html
https://github.com/apache/cassandra/blob/trunk/src/antlr/Parser.g#L276

Does this need a Jira, or can somebody just fix it to avoid the paperwork?

Or, was this actually supposed to work and the bug is some missing
implementation?

Thanks.

-- Jack Krupansky


Re: CQL spec error: COUNT(column)

2016-04-18 Thread Jack Krupansky
No, I didn't test, I was just reading the code, but I hadn't checked for
all occurrences of K_COUNT, so I hadn't noticed that it also occurs in the
allowedFunctionName grammar production rule. And I found the code that
dynamically creates a count function for each type here:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L885

So, sorry for the confusion. Some comments in the grammar would have helped.
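
For the record, the difference the spec describes between COUNT(*) and COUNT(column) can be modeled in a few lines. This is a toy model of the semantics only, with made-up class and method names; it is not how AggregateFcts implements it.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the two COUNT flavours; not the AggregateFcts implementation.
final class CountSemantics
{
    // COUNT(*) / COUNT(1): every row counts.
    static long countRows(List<?> rows)
    {
        return rows.size();
    }

    // COUNT(column): only rows where the column value is non-null count.
    static long countNonNull(List<?> columnValues)
    {
        long count = 0;
        for (Object value : columnValues)
        {
            if (value != null)
                count++;
        }
        return count;
    }

    public static void main(String[] args)
    {
        // Five rows, two of which have no value for the column.
        List<Integer> scores = Arrays.asList(10, null, 7, null, 3);
        System.out.println(countRows(scores));    // 5
        System.out.println(countNonNull(scores)); // 3
    }
}
```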

And it's not in the DataStax doc, but that's another issue I can follow up
on.

-- Jack Krupansky

On Mon, Apr 18, 2016 at 10:12 AM, Benjamin Lerer <
benjamin.le...@datastax.com> wrote:

> May be I misunderstood you.
> Do you mean that you tested it and that it is not working on the version
> you used?
>
> On Mon, Apr 18, 2016 at 3:50 PM, Jack Krupansky 
> wrote:
>
> > The CQL spec for COUNT says:
> >
> > "It also can be used to count the non null value of a given column.
> > Example:
> >
> > SELECT COUNT(scores) FROM plays;"
> >
> > But, the parser only recognizes COUNT(*) and COUNT(1).
> >
> > See:
> > https://cassandra.apache.org/doc/cql3/CQL-3.0.html
> > https://github.com/apache/cassandra/blob/trunk/src/antlr/Parser.g#L276
> >
> > Does this need a Jira, or can somebody just fix it to avoid the
> paperwork?
> >
> > Or, was this actually supposed to  work and the bug is some missing
> > implementation?
> >
> > Thanks.
> >
> > -- Jack Krupansky
> >
>


Re: C-136A Questionnaire for Apache Cassandra

2016-04-18 Thread Jack Krupansky
Here's a recent security assessment discussion - list of questions,
proposed response, and discussion, which you might find helpful, courtesy
of Oleg Yusim:
https://docs.google.com/document/d/13-yu-1a0MMkBiJFPNkYoTd1Hzed9tgKltWi6hFLZbsk/edit?ts=56c3a130#heading=h.xq6exsjcda8


-- Jack Krupansky

On Mon, Apr 18, 2016 at 7:31 AM, Johnson, Tom (ES & CSO) <
thomas.johns...@ngc.com> wrote:

> Apache Cassandra Support,
>
>
>
> In order to install this software on our servers, I need help from Planet
> Cassandra in completing the attached questionnaire for Northrop Grumman
> Information Security.  Please provide as much detail as possible for all
> questions.  If you have any questions for me, please let me know.  Thanks.
>
>
>
>
>
> *Thomas (Tom) Johnson*
>
> *Software Development Analyst*
>
> *Engineering Tools Support Team (ETST)*
>
> *Northrop Grumman Information Technology*
>
> *Office: 321-674-3168*
>
> *thomas.johns...@ngc.com *
>
>
>


Typo in comment for maxFunctionForCounter in AggregateFcts.java

2016-04-19 Thread Jack Krupansky
Comment typo for maxFunctionForCounter in AggregateFcts.java:

/**
 * AVG function for counter column values.
 */
public static final AggregateFunction maxFunctionForCounter =
new NativeAggregateFunction("max", CounterColumnType.instance,
CounterColumnType.instance)

That comment should be "MAX" rather than "AVG"

See:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L760

Oops... I see more copy and paste typos, where the type for the function is
wrong in the comment:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L284
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L320
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L362
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L398

Let me know if this needs a Jira ticket or if somebody can just fix it
without the paperwork.

-- Jack Krupansky


Re: Typo in comment for maxFunctionForCounter in AggregateFcts.java

2016-04-19 Thread Jack Krupansky
Thanks for the prompt service!

-- Jack Krupansky

On Tue, Apr 19, 2016 at 12:16 PM, Benjamin Lerer <
benjamin.le...@datastax.com> wrote:

> Done. Thanks for reporting the problem.
>
> On Tuesday, April 19, 2016, Benjamin Lerer 
> wrote:
>
> > I will fix it. I am the one that has probably done them anyway.
> >
> > Benjamin
> >
> > On Tue, Apr 19, 2016 at 4:57 PM, Jack Krupansky <
> jack.krupan...@gmail.com
> > > wrote:
> >
> >> Comment typo for maxFunctionForCounter in AggregateFcts.java:
> >>
> >> /**
> >>  * AVG function for counter column values.
> >>  */
> >> public static final AggregateFunction maxFunctionForCounter =
> >> new NativeAggregateFunction("max", CounterColumnType.instance,
> >> CounterColumnType.instance)
> >>
> >> That comment should be "MAX" rather than "AVG"
> >>
> >> See:
> >>
> >>
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L760
> >>
> >> Oops... I see more copy and paste typos, where the type for the function
> >> is
> >> wrong in the comment:
> >>
> >>
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L284
> >>
> >>
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L320
> >>
> >>
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L362
> >>
> >>
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java#L398
> >>
> >> Let me know if this needs a Jia ticket or if somebody can just fix it
> >> without the paperwork.
> >>
> >> -- Jack Krupansky
> >>
> >
> >
>


Re: Criteria for upgrading to 3.x releases in PROD

2016-04-23 Thread Jack Krupansky
Is the question whether a new application can go into production with 3.x,
or whether an existing application in production with 2.x.y should be
upgraded to 3.x?

For the latter, a "If it ain't broke, don't fix it" philosophy is best. And
if there are critical bug fixes needed, simply upgrade the 2.x line that
you are already on. Or if your production is on 3.0.x, upgrade to 3.0.x+k.

For the former, we aren't hearing people hollering that 3.x is crap, so it
is reasonably safe for a new app going into production, subject to your own
testing.

Given the relative stability of 3.x due to the tick-tock and "trunk always
releasable" strategies, users are no longer faced with the kind of wild
instabilities of the past.

Ultimately, stability really is subjective and in the eye of the beholder -
how conservative or adventurous are you and your organization? Sure, maybe
2.2.x is more stable in some abstract sense, but for a new app, why start
so far behind the curve? In fact, for a new app you should be trying to
take advantage of new features and performance improvements, like
materialized views, SASI, and wide rows coming soon.

In the past, upgrading from 2.x to 2.y was a big deal. That just isn't a
problem with upgrading from 3.x to 3.y. At least in theory, and again,
nobody has been hollering about having problems doing that.

For EOL, you will have to judge for yourself how long it may take your
organization to carefully migrate a production 2.x system to 3.x somewhere
down the road. No need to rush, but don't wait until the last minute
either. And I suspect that you won't even want to think about upgrading 2.x
to 4.x - IOW, upgrade to 3.x well before 3.x EOL.

-- Jack Krupansky

On Sat, Apr 23, 2016 at 3:28 PM, Anuj Wadehra <
anujw_2...@yahoo.co.in.invalid> wrote:

> Jonathan,
> I understand your point. In my perspective, people in production usually
> prefer stability over features and would always want at least emergency fix
> releases if not fully supported versions. I am glad that today we have such
> releases which are very stable and not yet EOL. It's just that users are
> tempted to use the latest odd releases as per the tick-tock strategy
> highlighted on the website and then probably fall back to previous ones
> after discussing stable versions on various forums. I just wanted to make
> their decisions simpler :) I agree with you - everything can't be black and
> white, stable and unstable. At the same time, I feel that most of the time
> there would be a single stable release which is not EOL.
> Thanks for your time.
>
>
> Anuj
> Sent from Yahoo Mail on Android
>
>   On Tue, 19 Apr, 2016 at 7:06 AM, Jonathan Ellis
> wrote:   Anuj,
>
> The problem is that this question defies a simplistic answer like "version
> X is the most stable" (are you willing to use unsupported releases?  what
> about emergency-fix-only?  what features can you not live without?) so
> we're intentionally resisting the urge to oversimplify the situation.
>
> On Mon, Apr 18, 2016 at 8:25 PM, Anuj Wadehra <
> anujw_2...@yahoo.co.in.invalid> wrote:
>
> > Hi All,
> > Let me reiterate, my question is not about selecting right Cassandra for
> > me. The intent is to get dev community response on below question.
> > Question:
> > Would it be a wise decision to mention the "most stable/production
> > ready" version (as it used to be before 3.x) on the Apache website till
> > tick-tock release strategy evolves and matures?
> >
> > Drivers for posting above info on website:
> >  I have read all the posts/forums and realized that there is no absolute
> > answer for selecting the Production Ready Cassandra version one should
> > use. Even now, people often hesitate to recommend latest releases for
> > Prod and go back to 2.1 and 2.2. In every suggestion there are too many
> > ifs, like I said: if you want features, x; if you want rock solid, y; if
> > you are adventurous, z. No offense, but who would not want a rock solid
> > version for Production? Who would want features over stability in Prod?
> > And who would want to take risks in Prod?
> >  The stability of a release should NOT depend on my risk appetite and use
> > case. If some version of 2.1 or 2.2 or 3.0.x is stable for production, why
> > not put that info until tick-tock matures?
> >
> > Please realize that everyone goes for thorough testing before upgrading
> > but the scope of application testing can't uncover most critical
> > bugs. Community guidance and a bigger picture on stability can help the
> > community until tick-tock matures and we deliver stable production ready
> > releases.
> >
> >
> >
> > > Thanks, Anuj
> > Sent from Yahoo Ma

Re: Criteria for upgrading to 3.x releases in PROD

2016-04-24 Thread Jack Krupansky
The old "most stable" labeling was super-important in the old days since
newer releases tended to have a fair amount of instability - hence the
admonition to wait for an x.y.5 release before using the x.y release line.
But that dramatic degree of release instability is no longer the case.
Besides, we already have 3.5 and 3.0.5, so that x.y.5 point is now moot.
And, generally, there won't be any 3.x.y releases other than 3.0.y and
3.x.0 unless something especially unusual goes wrong - there have only been
two, 3.1.1 and 3.2.1.

The simple fact that there have only been five subsequent patch releases
for 3.0.0 in over five months is a testament to its stability.

That's not an argument for why anybody should switch from 2.x to 3.x if 2.x
is working fine for them, but a challenge to the presumption that 3.x is
unstable and unsuitable for production.

To put it more simply, the goal is that no 3.x (or x.y for x > 2) goes out
the door unless it is suitable for production.

To put it even more strongly, I think the development and release process
is now robust enough that no new releases, 2.x or 3.x, get out the door
unless suitable for production.

In short, the only reason to go with 2.2.x at this stage is if you are
currently using 2.2.x and simply don't want to rock the boat in even the
smallest way. That's reasonable.

But if you are using 2.1.x or earlier and feel the need to upgrade due to
EOL or whatever, it may be less risky to complete the full upgrade path to
3.x rather than go halfway since sometimes halfhearted measures don't get
the degree of attention needed for the full effort, such as people trying
to cut corners to do it on the cheap.

-- Jack Krupansky

On Sat, Apr 23, 2016 at 10:50 PM, Anuj Wadehra <
anujw_2...@yahoo.co.in.invalid> wrote:

> Jack,
> The question was about publishing the "most stable" release on the Apache
> website, as was done before 3.x.
> Regarding your comments, I still feel adventure can't happen in production
> systems. You should certainly test every release before upgrading, but
> you would not want to upgrade to the latest releases based on your limited
> testing. I feel that you can't do exhaustive testing of the database and can
> easily miss critical corner cases which may trigger in production. But it's
> just my perspective of looking at things. People may think differently.
> Thanks to all of you for your comments!
>
> Thanks, Anuj
> Sent from Yahoo Mail on Android
>
>   On Sun, 24 Apr, 2016 at 1:28 AM, Jack Krupansky
> wrote:   Is the question whether a new application can go into production
> with 3.x,
> or whether an existing application in production with 2.x.y should be
> upgraded to 3.x?
>
> For the latter, a "If it ain't broke, don't fix it" philosophy is best. And
> if there are critical bug fixes needed, simply upgrade the 2.x line that
> you are already on. Or if your production is on 3.0.x, upgrade to 3.0.x+k.
>
> For the former, we aren't hearing people hollering that 3.x is crap, so it
> is reasonably safe for a new app going into production, subject to your own
> testing.
>
> Given the relative stability of 3.x due to the tick-tock and "trunk always
> releasable" strategies, users are no longer faced with the kind of wild
> instabilities of the past.
>
> Ultimately, stability really is subjective and in the eye of the beholder -
> how conservative or adventurous are you and your organization. Sure, maybe
> 2.2.x is more stable in some abstract sense, but for a new app, why start
> so far behind the curve? In fact, for a new app you should be trying to
> take advantage of new features and performance improvements, like
> materialized views, SASI, and wide rows coming soon.
>
> In the past, upgrading from 2.x to 2.y was a big deal. That just isn't a
> problem with upgrading from 3.x to 3.y. At least in theory, and again,
> nobody has been hollering about having problems doing that.
>
> For EOL, you will have to judge for yourself how long it may take your
> organization to carefully migrate a production 2.x system to 3.x somewhere
> down the road. No need to rush, but don't wait until the last minute
> either. And I suspect that you won't even want to think about upgrading 2.x
> to 4.x - IOW, upgrade to 3.x well before 3.x EOL.
>
> -- Jack Krupansky
>
> On Sat, Apr 23, 2016 at 3:28 PM, Anuj Wadehra <
> anujw_2...@yahoo.co.in.invalid> wrote:
>
> > Jonathan,
> > I understand you point. In my perspective, people in production usually
> > prefer stability over features and would always want at least emergency
> fix
> > releases if not fully supported versions.I am glad that today we have
> such
> > releases which are very stable and not yet EOL. Its just that users

Re: [Proposal] Mandatory comments

2016-05-02 Thread Jack Krupansky
+1 for some kind of (unspecified) improvement on the comment front.

+1 for updated style guide giving recommended practices for commenting

Specific rules/guidelines? So hard to get it right at an abstract level -
what can sound great in theory can fail miserably in practice. And what can
work great initially, when people are all excited about it, can slowly rot
as that initial enthusiasm fades.

How to start? Suggestion: Instead of trying to seek consensus on abstract
rules/guidelines, just randomly pick some module that is in need of better
comments and have dueling patches for how to best comment it. And then once
the dust settles and there is some general consensus on what the
real/implied rules/guidelines should be, based on the reality of that
initial module, pick another module and see if the deduced rules/guidelines
from the first module can be methodically applied.

-- Jack Krupansky

On Mon, May 2, 2016 at 1:29 PM, Josh McKenzie  wrote:

> o.a.c.hints/package-info.java is a pretty good example of what that looks
> like. My previous statement about dangers of comment atrophy and needing to
> be diligent holds doubly-true if the comments themselves aren't localized
> to the files we're touching on modification.
>
> I see a case for better documentation on all three: public API's for
> classes, classes, and package level. Each serve different and important
> purposes IMO.
>
> On Mon, May 2, 2016 at 1:16 PM, Jonathan Ellis  wrote:
>
> > What I'd like to see is more comments like the one in StreamSession:
> > something that can give me the "big picture" for a piece of
> functionality.
> >
> > I wonder if focusing on class-based comments might miss an opportunity
> > here.  StreamSession was chosen somewhat arbitrarily to be where we
> > described the streaming life cycle.  If we focused just on describing
> each
> > class in isolation then we might miss something more valuable.
> >
> > Is this a case for package-level javadoc, and organizing our class
> > hierarchy better along those lines?
> >
> > On Mon, May 2, 2016 at 11:26 AM, Sylvain Lebresne 
> > wrote:
> >
> > > There could be disagreement on this, but I think that we are in general
> > not
> > > very good at commenting Cassandra code and I would suggest that we
> make a
> > > collective effort to improve on this. And to help with that goal, I
> would
> > > like
> > > to suggest that we add the following rule to our code style guide
> > > (https://wiki.apache.org/cassandra/CodeStyle):
> > >   - Every public class and method must have a quality javadoc. That
> > > javadoc, on
> > > top of describing what the class/method does, should call
> particular
> > > attention to the assumptions that the class/method does, if any.
> > >
> > > And of course, we'd also start enforcing that rule by not +1ing code
> > unless
> > > it
> > > adheres to this rule.
> > >
> > > Note that I'm not pretending this arguably simplistic rule will
> magically
> > > make
> > > comments perfect, it won't. It's still up to us to write good and
> > complete
> > > comments, and it doesn't even talk about comments within methods that
> are
> > > important too. But I think some basic rule here would be beneficial and
> > > that
> > > one feels sensible to me.
> > >
> > > Looking forward to other's opinions and feedbacks on this proposal.
> > >
> > > --
> > > Sylvain
> > >
> >
> >
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder, http://www.datastax.com
> > @spyced
> >
>


Re: [Proposal] Mandatory comments

2016-05-03 Thread Jack Krupansky
Not wiggle room in that case so much as a guideline for commenting
getters and setters and the field they access.

Some consensus is needed whether there should be some rote comment for
getters and setters, or whether the Javadoc should be simply skipped for
simple getters and setters, provided that there is separate doc for the
field that they name.

I'm not sure what the best way is to document or distinguish a private
field vs. a pseudo-field whose getters and setters are implemented by
calling methods rather than merely returning or setting the private field.

I mean, you don't want to clutter the public Javadoc with details of every
private field, but those private fields that have getters and setters
clearly need to be documented - at least in terms that will make sense to
the users of the getters and setters.

One alternative is to document the field/pseudo-field on either the getter
or the setter with a "See..." linked on the other. I mean, if you are
looking at the code for one, you should be able to quickly get to the doc
for the field/pseudo-field itself.

And if there are any special rules or checks in the setter, they should
have Javadoc, either for the setter or the field.
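
As a sketch of the "document once, link from the other" alternative, something like the following is what I have in mind. The RetryPolicy class and its maxAttempts field are made up for illustration and are not from the Cassandra code base.

```java
/**
 * Hypothetical example class; the name and the field are made up for illustration.
 */
final class RetryPolicy
{
    // Private storage; the user-facing description lives on the getter.
    private int maxAttempts = 3;

    /**
     * Maximum number of attempts before a request is failed.
     * Always at least 1.
     *
     * @return the current maximum number of attempts
     */
    public int getMaxAttempts()
    {
        return maxAttempts;
    }

    /**
     * See {@link #getMaxAttempts()} for the meaning of this property.
     *
     * @param maxAttempts new value; must be at least 1
     * @throws IllegalArgumentException if {@code maxAttempts} is less than 1
     */
    public void setMaxAttempts(int maxAttempts)
    {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1, got " + maxAttempts);
        this.maxAttempts = maxAttempts;
    }

    public static void main(String[] args)
    {
        RetryPolicy policy = new RetryPolicy();
        policy.setMaxAttempts(5);
        System.out.println(policy.getMaxAttempts());
    }
}
```

The property is described once, on the getter; the setter only links to it and documents its own check, so neither comment is rote.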


-- Jack Krupansky

On Tue, May 3, 2016 at 12:57 PM, Eric Evans 
wrote:

> On Mon, May 2, 2016 at 11:26 AM, Sylvain Lebresne 
> wrote:
> > Looking forward to other's opinions and feedbacks on this proposal.
>
> We might want to leave just a little wiggle room for judgment on the
> part of the reviewer, for the very simple cases.  Documenting
> something like setFoo(int) with "Sets foo" can get pretty tiresome for
> everyone, and doesn't add any value.
>
> Otherwise I think this is perfectly reasonable; +1
>
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
>


Re: [Proposal] Mandatory comments

2016-05-05 Thread Jack Krupansky
FWIW, I recently wrote up a bunch of notes on Code Quality and published
them on Medium. There are notes on comments and consistency and boilerplate
buried in there.

WARNING: There's a lot of stuff there and it is not for the faint of heart
or those not truly committed to code quality.

tl;dr - I'm not a fan of boilerplate just to say you did something, but...
I am a fan of consistency, but that doesn't mean every situation is the
same, just that similar situations should be treated similarly - unless
there is some good reason to do otherwise.

See:
https://medium.com/@jackkrupansky/code-quality-preamble-932626a3131c#.ynrjbryus
https://medium.com/@jackkrupansky/software-and-product-quality-notes-no-1-346ab1d8df24#.xzg1ihuxb
https://medium.com/@jackkrupansky/code-quality-notes-no-1-4dc522a5e29c#.cm7tan2zu
https://medium.com/@jackkrupansky/code-quality-notes-no-2-7939377b73c6#.zco8oq3dj


-- Jack Krupansky

On Thu, May 5, 2016 at 10:55 AM, Eric Evans 
wrote:

> On Wed, May 4, 2016 at 12:14 PM, Jonathan Ellis  wrote:
> > On Wed, May 4, 2016 at 2:27 AM, Sylvain Lebresne 
> > wrote:
> >
> >> On Tue, May 3, 2016 at 6:57 PM, Eric Evans 
> >> wrote:
> >>
> >> > On Mon, May 2, 2016 at 11:26 AM, Sylvain Lebresne <
> sylv...@datastax.com>
> >> > wrote:
> >> > > Looking forward to other's opinions and feedbacks on this proposal.
> >> >
> >> > We might want to leave just a little wiggle room for judgment on the
> >> > part of the reviewer, for the very simple cases.  Documenting
> >> > something like setFoo(int) with "Sets foo" can get pretty tiresome for
> >> > everyone, and doesn't add any value.
> >> >
> >>
> >> I knew someone was going to bring this :). In principle, I don't really
> >> disagree. In practice though,
> >> I suspect it's sometimes just easier to adhere to such simple rule
> somewhat
> >> strictly. In particular,
> >> I can guarantee that we don't all agree where the border lies between
> what
> >> warrants a javadoc
> >> and what doesn't. Sure, there is a few cases where you're just
> paraphrasing
> >> the method name
> >> (and while it might often be the case for getters and setters, it's
> worth
> >> noting that we don't really
> >> do much of those in C*), but how hard is it to write a one line comment?
> >> Surely that's a negligeable
> >> part of writing a patch and we're not that lazy.
> >>
> >
> > I'm more concerned that this kind of boilerplate commenting obscures
> rather
> > than clarifies.  When I'm reading code i look for comments to help me
> > understand key points, points that aren't self-evident.  If we institute
> a
> > boilerplate "comment everything" rule then I lose that signpost.
>
> This.
>
> Additionally you could also probably argue that it obscures the true
> purpose to leaving a comment; It becomes a check box to tick, having
> some javadoc attached to every method, rather than genuinely looking
> for the value that could be added with quality comments (or even
> altering the approach so that the code is more obvious in the absence
> of them).
>
> The reason I suggested "wiggle room", is that I think everyone
> basically agrees that the default should be to leave good comments
> (and that that hasn't been the case), that we should start making this
> a requirement to successful review, and that we can afford to leave
> some room for judgment on the part of the reviewer.  Worse-case is
> that we find in doing so that there isn't much common ground on what
> constitutes a quality comment versus useless boilerplate, and that we
> have to remove any wiggle room and make it 100% mandatory (I don't
> think that will (has to) be the case, though).
>
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
>


Re: Expiring Tables or Columns?

2014-07-15 Thread Jack Krupansky

The "default_time_to_live" property for CREATE TABLE. See:
http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/tabProp.html

That's in the DataStax CQL doc, but I don't see it in the Apache Cassandra 
CQL spec:

https://cassandra.apache.org/doc/cql3/CQL-2.0.html
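
For illustration, here is roughly what the two options look like as CQL, wrapped in a small Java helper that only builds the statement text. The table name "events", its columns, and the helper class are made up; the sketch does not talk to a cluster.

```java
// Builds CQL text only; the table and column names are made up for illustration.
final class TtlStatements
{
    // Table-level default: rows expire ttlSeconds after they are written,
    // unless a write supplies its own TTL.
    static String createTable(String table, int ttlSeconds)
    {
        return String.format("CREATE TABLE %s (id uuid PRIMARY KEY, payload text) WITH default_time_to_live = %d",
                             table, ttlSeconds);
    }

    // Per-write TTL, for comparison; it applies to the columns written by this statement.
    static String insertWithTtl(String table, int ttlSeconds)
    {
        return String.format("INSERT INTO %s (id, payload) VALUES (?, ?) USING TTL %d", table, ttlSeconds);
    }

    public static void main(String[] args)
    {
        System.out.println(createTable("events", 86400));
        System.out.println(insertWithTtl("events", 3600));
    }
}
```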

-- Jack Krupansky

-----Original Message-----
From: Brandon Williams

Sent: Tuesday, July 15, 2014 5:36 PM
To: dev@cassandra.apache.org
Subject: Re: Expiring Tables or Columns?

https://issues.apache.org/jira/browse/CASSANDRA-3974


On Tue, Jul 15, 2014 at 4:14 PM, Shehaaz Saif  wrote:


Hi!

I just received a question from someone at work about setting a TTL for a
table...I said that they can't do that...it can only be set at the column
level. Is that right?

These are the options that they have:

http://www.datastax.com/documentation/cql/3.0/cql/cql_using/use_remove_data_c.html

Thank you!

-Shehaaz

--
Shehaaz Saif
Twitter: @Shehaaz
Blog: www.shehaaz.com
Tel: 425-516-2160





Re: Embedding Cassandra

2015-03-24 Thread Jack Krupansky
Are you trying to use Cassandra as a simple single-node data store within a
process, or are you trying to wrap Cassandra and run a cluster of custom
nodes? IOW, what are you trying to accomplish?

You could always run Cassandra as a spawned, background process, fully
controlled by a parent process.
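
A rough sketch of that spawned-process approach, using only the JDK. The install path /opt/cassandra and the class name are assumptions; "-f" is the standard option to keep Cassandra in the foreground so that it stays attached to the parent.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Sketch of a parent process that owns a Cassandra child process.
final class CassandraLauncher
{
    // Builds the launch command; "-f" keeps Cassandra in the foreground
    // so the child stays attached to the parent.
    static ProcessBuilder builder(File cassandraHome)
    {
        List<String> command = Arrays.asList(new File(cassandraHome, "bin/cassandra").getPath(), "-f");
        return new ProcessBuilder(command)
               .directory(cassandraHome)
               .redirectErrorStream(true); // merge stderr into stdout
    }

    public static void main(String[] args) throws IOException, InterruptedException
    {
        // Assumed install location; pass the real one as the first argument.
        File home = new File(args.length > 0 ? args[0] : "/opt/cassandra");
        Process cassandra = builder(home).inheritIO().start();
        // Stop the child when the parent JVM exits.
        Runtime.getRuntime().addShutdownHook(new Thread(cassandra::destroy));
        System.exit(cassandra.waitFor());
    }
}
```

The parent would then issue its queries through a regular client driver, the same as any external application.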

Who is doing the queries, the program in which Cassandra is embedded, or an
external client over the net?


-- Jack Krupansky

On Tue, Mar 24, 2015 at 5:38 AM, Ersin Er  wrote:

> Hi all,
>
> As far as I understand it seems to be enough to initialize and activate
> CassandraDaemon for embedding Cassandra. (Right?)
>
> My question is what's the next step? Which class or classes should I use in
> order to execute CQL queries for example?
>
> Any pointers would be great.
>
> Thanks!
>
> --
> Ersin Er
>