+1

On Wed, May 3, 2017 at 5:43 PM Anthony Baker <aba...@pivotal.io> wrote:
>
> 2) I think we should pull out the message fragmentation support to avoid
> some significant complexity. We can later add a fragmentation / envelope
> layer on top without disrupting the current proposal. I do think we should
> add the capability for chunking data (more on that below).
>

+10

As with any good engineering practice, we need to keep objects well encapsulated and focused on a single task. A message would not be represented as an object of "partials" but as a whole message object, so why treat it any differently when serialized? The layer below it can chunk it if necessary.

Initially there is nothing between the message (the lowest level in our stack) and the TCP socket. TCP will fragment as needed, and the full message is delivered up the stack. If in the future we want to multichannel/interleave/pipeline (whatever you want to call it), we can negotiate the support with the client and inject a layer between the message and TCP layers that identifies unique streams of data channels. In the interim, the naive approach to multiple channels is to open a second socket. The important thing is that the message layer doesn't know and doesn't care.

> 4) Following is an alternative definition with these characteristics:
>
> - Serialized data can either be primitive or encoded values. Encoded
> values are chunked as needed to break up large objects into a series of
> smaller parts.
>

+1

> - Because values can be chunked, the size field is removed. This allows
> the message to be streamed to the socket incrementally.
>

+1

> - The apiVersion is removed because we can just define a new body type
> with a new apiId (e.g. GetRequest2 with apiId = 1292).
>

+1

Think of the message as a class: you don't want a class that has more than a single personality. If the first argument to your class (version) is the personality, then you need to think about a new class.
You don't want the writer of the protocol to have to deduce the personality of the object from an argument and then have to decide which fields are required, optional, or obsolete. By making a new message you strongly type the messages both in definition and in implementation.

> - The GetRequest tells the server what kind of encoding the client is able
> to understand.
>

+1

I would suggest that a default ordered list be established at initial handshake. If a list is not provided at handshake, then ALL are supported. On individual request messages, if a list of encodings is given, it overrides the allowed list for that single request. If no list is provided on the request, the handshake-negotiated list is assumed. If a value being returned is not encoded in any of the listed encodings, it is transcoded to the highest-priority encoding that has an available transcoder between the source and destination encodings.

> GetRequest => 0 acceptEncoding key
> 0 => the API id
> acceptEncoding => (define some encodings for byte[], JSON, PDX, *, etc)
> key => EncodedValue
>

Change: acceptedEncodings => encodingId*

Would it make sense to make 'key' a 'key+', or do GetAllRequest and GetAllResponse vary that much from GetRequest and GetResponse?

> PutRequest => 2 eventId key value
> 2 => the API id
> eventId => clientId threadId sequenceId
> clientId => string
> threadId => integer
> sequenceId => integer
> key => EncodedValue
> value => EncodedValue
>

The eventId is really just a once token, right? Meaning that it's rather opaque to the server and intended to keep the server from replaying a request that the client may have retried but that was actually successful. If it is opaque to the server, then why encode all these specific identifiers? Seems to me it could be optional, for one, and could simply be a variant int or byte[]. The server just needs to stash the once tokens and make sure it doesn't get a duplicate on this client stream.

-Jake
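
A rough sketch of the encoding fallback described above, in Python purely for illustration. The names `ALL_ENCODINGS`, `effective_encodings`, `choose_encoding`, and the transcoder table are hypothetical and not part of the proposal. It also assumes that a value already stored in an accepted encoding is returned as-is, which the thread implies but does not spell out.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical encoding ids; the thread only names byte[], JSON and PDX as candidates.
ALL_ENCODINGS = ["PDX", "JSON", "BYTES"]


def effective_encodings(handshake: Optional[List[str]],
                        request: Optional[List[str]]) -> List[str]:
    """Return the ordered list of encodings that applies to one request."""
    if request is not None:
        # A list on the request overrides the handshake list for that request only.
        return request
    if handshake is not None:
        # No list on the request: fall back to the handshake-negotiated list.
        return handshake
    # No list at handshake either: all encodings are supported.
    return ALL_ENCODINGS


def choose_encoding(stored: str,
                    accepted: List[str],
                    transcoders: Set[Tuple[str, str]]) -> Optional[str]:
    """Pick the encoding to send a value in, or None if nothing fits."""
    if stored in accepted:
        # Assumption: a value already in an accepted encoding is sent unchanged.
        return stored
    # Otherwise take the highest-priority accepted encoding that the
    # stored encoding can actually be transcoded into.
    for target in accepted:
        if (stored, target) in transcoders:
            return target
    return None
```

For example, with a single PDX-to-JSON transcoder available, a PDX value requested with `["BYTES", "JSON"]` would go out as JSON, since no PDX-to-BYTES transcoder exists in this hypothetical setup.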