Best place to discuss CQL Binary Protocol spec?
Hey, all,

I've been working on a greenfield Perl client for the CQL Binary Protocol. Since this is a client-in-progress and my question is really about the protocol, dev@ seemed like the better list, but please let me know if I should relocate to client-dev@.

As always happens when working from a spec, I have ended up with a quick clarification request and a more involved question, and I would also like to know how best to contribute to the document.

* 4.1.2. CREDENTIALS

My quick clarification is from this bit of text:

    The body is a list of key/value informations. It is a [short] n,
    followed by n pair of [string]. These key/value pairs [...]

Is this just a string map, and the text just isn't using consistent terminology?

* 4.2.5.2. Rows

My more involved question is about this text describing the column contents:

    - <rows_content> is composed of <row_1>...<row_m> where m is
      <rows_count>. Each <row_i> is composed of <value_1>...<value_n>
      where n is <columns_count> and where <value_j> is a [bytes]
      representing the value returned for the jth column of the ith
      row. In other words, <rows_content> is composed of
      (<rows_count> * <columns_count>) [bytes].

I read this and thought, "Oh, sure, I'll need to figure out the width of the Java types for the different columns, tedious but easily doable", and then noticed that some of the options are things like Blob or Varchar, both of which I would assume to be variable width. So how should one determine how many bytes to read for different types?

I'm guessing the actual information about how much space the different values take up is located somewhere else. At the very least, it seems like that should be mentioned; ideally, all of that information would be called out in the spec itself.

* Updating the docs

Which brings me to my final question: what would be the best way to contribute cleanups, etc. for the document, and how far could I take it?

At the very least, there are a lot of typos I'd be happy to fix. I also think the text could be tightened up in various ways, and that some things could be moved around to make the spec more accessible to implementors.

Most importantly, I think it needs to be in some format that can produce a hyperlinked document, because having to scroll back and forth through everything is tedious indeed.

It seems improbable to me that this is the native format for the document. Did someone really do that TOC by hand? So is there a source doc where it would be best to work on edits? And if not, could I contribute by converting it to textile (which already seems to be in use in the tree) or perhaps markdown?

Mike.
Re: Best place to discuss CQL Binary Protocol spec?
I can't usefully speak to your other questions, but the answers to the technical questions are below.

On Mon, Feb 18, 2013 at 1:16 PM, Michael Alan Dorman <mdor...@ironicdesign.com> wrote:

> * 4.1.2. CREDENTIALS
>
> My quick clarification is from this bit of text:
>
>     The body is a list of key/value informations. It is a [short] n,
>     followed by n pair of [string]. These key/value pairs [...]
>
> Is this just a string map, and the text just isn't using consistent
> terminology?

It has the same structure as a string map, but might not necessarily *be* a string map. I would guess that this phrasing is used because it may be possible to have multiple identical "keys" in this structure, which would not make sense in a [string map]. (Although I don't think it's explicitly stated, it seems safe to infer that [string map] is intended to be a plain lookup table, not a set of arbitrary pairs.)

> * 4.2.5.2. Rows
>
> My more involved question is about this text describing the column
> contents:
>
>     - <rows_content> is composed of <row_1>...<row_m> where m is
>       <rows_count>. Each <row_i> is composed of <value_1>...<value_n>
>       where n is <columns_count> and where <value_j> is a [bytes]
>       representing the value returned for the jth column of the ith
>       row. In other words, <rows_content> is composed of
>       (<rows_count> * <columns_count>) [bytes].
>
> I read this and thought, "Oh, sure I'll need to figure out the width of
> the java types for the different columns, tedious but easily doable",
> and then noticed some of the options are things like Blob or Varchar,
> both which I would assume to be variable width. So how should one
> determine how many bytes to read for different types?

As the doc says, each <value_j> is a [bytes], which means it's represented on the wire as an [int] x followed by x bytes.

p
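To make the [bytes] framing above concrete, here is a minimal Python sketch (Python rather than Perl, purely for illustration). The names `read_bytes` and `read_rows_content` are hypothetical. The sketch assumes that [int] is a 4-byte big-endian signed integer and that a negative length stands for a null value with no bytes following; both assumptions should be checked against the spec.

```python
import struct


def read_bytes(buf, offset):
    """Read one [bytes] value starting at offset.

    Assumed layout: a 4-byte big-endian signed [int] x, then x bytes.
    Returns (value, new_offset); value is None for a negative length.
    """
    (length,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    if length < 0:
        # Assumption: negative length means null, nothing follows.
        return None, offset
    end = offset + length
    if end > len(buf):
        raise ValueError("buffer ends before the declared [bytes] length")
    return buf[offset:end], end


def read_rows_content(buf, offset, rows_count, columns_count):
    """Read rows_count * columns_count consecutive [bytes] values.

    The width of every cell comes from its own length prefix, so no
    knowledge of the column types is needed to split the rows.
    """
    rows = []
    for _ in range(rows_count):
        row = []
        for _ in range(columns_count):
            value, offset = read_bytes(buf, offset)
            row.append(value)
        rows.append(row)
    return rows, offset
```

The point of the design is that variable-width types such as Blob or Varchar need no special handling at this layer: the reader only ever follows length prefixes and hands back raw byte strings.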
Re: Best place to discuss CQL Binary Protocol spec?
paul cannon writes:

> It has the same structure as a string map, but might not necessarily *be* a
> string map. I would guess that this phrasing is used because it may be
> possible to have multiple identical "keys" in this structure, which would
> not make sense in a [string map]. (Although I don't think it's explicitly
> stated, it seems safe to imply that [string map] is intended to be a plain
> lookup table, not a set of arbitrary pairs.)

OK, so it is distinct. Thanks for the clarification.

> As the doc says, each <value_j> is a [bytes], which means it's represented
> on the wire as an [int] x followed by x bytes.

Thank you for pointing out what I had succeeded in reading repeatedly without actually processing. ;)

At the same time, that seems to gloss over the structure of the content. It's not all encoded as string values, or is it?

Mike.
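For what it's worth, the distinction discussed above can be sketched in Python as follows. The names `read_string` and `read_pairs` are hypothetical, and the sketch assumes that [short] is a 2-byte big-endian unsigned integer and that [string] is a [short] length followed by that many bytes of UTF-8. It keeps the CREDENTIALS body as an ordered list of pairs rather than a dict, so repeated keys would survive if they are in fact allowed (which, as noted, is only a guess).

```python
import struct


def read_string(buf, offset):
    """Read one [string]: assumed [short] n, then n bytes of UTF-8."""
    (length,) = struct.unpack_from(">H", buf, offset)
    offset += 2
    end = offset + length
    if end > len(buf):
        raise ValueError("buffer ends before the declared [string] length")
    return buf[offset:end].decode("utf-8"), end


def read_pairs(buf, offset):
    """Read a [short] n followed by n pairs of [string].

    Returns (pairs, new_offset), where pairs is a list of (key, value)
    tuples in wire order. Nothing is merged or dropped.
    """
    (count,) = struct.unpack_from(">H", buf, offset)
    offset += 2
    pairs = []
    for _ in range(count):
        key, offset = read_string(buf, offset)
        value, offset = read_string(buf, offset)
        pairs.append((key, value))
    return pairs, offset
```

A client that is happy to treat the body as a lookup table can still call `dict(pairs)` afterwards; doing the conversion at the call site just makes the loss of duplicate keys an explicit choice.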
Re: Best place to discuss CQL Binary Protocol spec?
On Mon, Feb 18, 2013 at 6:48 PM, Michael Alan Dorman <mdor...@ironicdesign.com> wrote:

> paul cannon writes:
> > As the doc says, each <value_j> is a [bytes], which means it's represented
> > on the wire as an [int] x followed by x bytes.
>
> Thank you for pointing out what I had succeeded in reading repeatedly
> without actually processing. ;)
>
> At the same time, that seems to gloss over the structure of the
> content. It's not all encoded as string values, or is it?

No, the values are serialized according to whatever data type definitions they have. The data types and serializations are technically details of Cassandra usage in general (not specific to the native protocol), and the types aren't limited to the ones that are assigned type IDs in the native protocol, so it is arguably appropriate to leave type serialization details out of the native protocol document (I could see it either way).

If you do need details, the built-in Cassandra types and serialization formats are defined in the various org.apache.cassandra.db.marshal.*Type classes. Alternatively, read the deserialization code in the other C* libraries.

p
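As a rough illustration of what "serialized according to the data type" means for a client, here is a minimal Python sketch of a second decoding step applied to the raw [bytes] payload of each cell. The table `DECODERS`, the function `decode_value`, and the use of type names as keys are all hypothetical. The formats shown (big-endian fixed-width numbers, UTF-8 text, raw bytes for blobs) are assumptions that should be confirmed against the org.apache.cassandra.db.marshal.*Type classes before being relied on.

```python
import struct

# Hypothetical lookup from a column's type name to a decoder for its
# raw payload. Formats are assumptions; verify against the marshal classes.
DECODERS = {
    "int": lambda raw: struct.unpack(">i", raw)[0],     # 4-byte signed
    "bigint": lambda raw: struct.unpack(">q", raw)[0],  # 8-byte signed
    "double": lambda raw: struct.unpack(">d", raw)[0],  # 8-byte IEEE 754
    "varchar": lambda raw: raw.decode("utf-8"),         # UTF-8 text
    "blob": lambda raw: raw,                            # bytes as-is
}


def decode_value(type_name, raw):
    """Turn the payload of one [bytes] cell into a Python value.

    None (a null cell) passes through unchanged. Types without a
    registered decoder are returned as raw bytes instead of guessed at.
    """
    if raw is None:
        return None
    decoder = DECODERS.get(type_name)
    if decoder is None:
        return raw
    return decoder(raw)
```

Falling back to raw bytes for unknown types fits the point made above: the set of types is not limited to those with protocol type IDs, so a client cannot assume it knows every serialization.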