On Sunday, January 19, 2014 12:48:36 PM UTC-6, Brian Craft wrote:
>
> That helps, thanks. It's still unclear to me that this is important enough
> to worry about. What application or service is hindered by string-encoding
> a date in JSON? An example would really help. It's not compelling to assert
> or imagine some hypothetical application that benefits from knowing a field
> is a date without having any other knowledge of it. I would guess that
> cases where this matters are vanishingly few.
>
This isn't about dates, but it's along the same lines:
I'm working on the central hub of a communications distributor/router/thing
which has to deal with a wide range of diverse clients that each speak
whatever "format" was the most expedient for whoever wrote that piece. The
closest thing that we have to a standard is "we mostly want to use JSON."
A huge chunk of our messages include UUIDs. This left me with two real
options:
1. Special-case every incoming message, based on what I know about the
   sender, and convert the fields that I know are supposed to represent a
   UUID (based on extremely informal verbal "specs") as each message is read.
2. Take the generic, weakly coupled approach: pass every incoming message
   through a parser that converts every string value into a UUID (if that
   conversion is possible).
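A minimal sketch of option 2 in Python, to show the shape of the idea. The names `coerce_uuids` and `read_message` are hypothetical, and the sketch assumes UUIDs arrive in the canonical dashed 8-4-4-4-12 form; it is not the actual hub code.

```python
import json
import re
import uuid

# Canonical 8-4-4-4-12 hex form only (an assumption of this sketch).
# uuid.UUID() on its own also accepts undashed, braced, and "urn:uuid:"
# strings, which would widen the set of values that get converted.
_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def coerce_uuids(value):
    """Recursively replace UUID-shaped strings with uuid.UUID objects."""
    if isinstance(value, str):
        return uuid.UUID(value) if _UUID_RE.fullmatch(value) else value
    if isinstance(value, list):
        return [coerce_uuids(item) for item in value]
    if isinstance(value, dict):
        # Only values are converted; keys stay plain strings.
        return {key: coerce_uuids(item) for key, item in value.items()}
    return value  # numbers, booleans, None pass through untouched


def read_message(raw):
    """Parse one JSON message and coerce every UUID-looking string value."""
    return coerce_uuids(json.loads(raw))
```

The trade-off is the one described above: nothing is coupled to a particular sender, but any string that happens to look like a UUID gets converted whether or not the sender meant it as one.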
The pain here could be alleviated with a formalized schema, but we're
working too fast and furious for that. We tossed out the last one of those
we had almost 2 months ago.
Dates really present the same challenge, but everything that's using those
is tied to a relational database. So there, at least, we're forced to stick
to something fairly standardized. (Though I *do* have another conversion
function for converting those messages, which tends to change about once a
week when some other developer decides to change column names...but that's
a different story).
FWIW,
James
>
> On Sunday, January 19, 2014 9:03:53 AM UTC-8, jonah wrote:
>>
>> I read these self-describing, extensible points in the context of EDN,
>> which has a syntax/wire format for some types (maps, strings, etc.) and
>> also has an extensibility syntax:
>>
>> #myapp/Person {:first "Fred" :last "Mertz"}
>>
>> These tagged elements are "extensions" because they allow values of types
>> not known to EDN to be included in the stream, and are "self-describing" in
>> two senses:
>>
>> * if a wire format reader does know how to create a myapp/Person, that
>> blob of data contains all the information needed to do so
>> * if a wire format reader doesn't know how to create a myapp/Person, it
>> can still read past this particular element in the stream, because tags
>> have a defined envelope, so a reader can figure out where the data
>> comprising this element ends
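
The two cases in the list above can be sketched roughly as follows. This is not an EDN parser: it assumes some parser has already turned each tagged element into a `Tagged(tag, value)` pair, and `Tagged`, `Person`, `resolve`, and the `other/Thing` tag are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Tagged:
    """A tag plus the value it wraps, as a parser would hand it over."""
    tag: str
    value: Any


@dataclass(frozen=True)
class Person:
    first: str
    last: str


def resolve(element, handlers):
    """Walk parsed data, building known tagged values and keeping unknown ones."""
    if isinstance(element, Tagged):
        inner = resolve(element.value, handlers)
        handler = handlers.get(element.tag)
        # Known tag: build the application type from the enclosed data.
        # Unknown tag: keep the envelope intact and move on.
        return handler(inner) if handler else Tagged(element.tag, inner)
    if isinstance(element, dict):
        return {key: resolve(value, handlers) for key, value in element.items()}
    if isinstance(element, list):
        return [resolve(item, handlers) for item in element]
    return element


stream = [
    Tagged("myapp/Person", {"first": "Fred", "last": "Mertz"}),
    Tagged("other/Thing", {"x": 1}),  # no handler registered for this tag
]
handlers = {"myapp/Person": lambda fields: Person(**fields)}
result = resolve(stream, handlers)
```

Here `result[0]` becomes a `Person`, while `result[1]` stays a `Tagged` value that the reader can pass along or ignore without understanding it.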
>>
>> The JSON example is mostly about the "extensibility" attribute. JSON's
>> format natively supports some types (like strings) but not others (like
>> dates), and for those others, JSON's format does not include a way to
>> "bucket" or "envelope" data comprising those unknown types. So JSON is not
>> extensible.
>>
>> The Google example is mostly about the "self-describing" attribute, and
>> to my mind is more accurately framed as a statement about the Internet as a
>> whole. Hypothetically, if all data exchange occurred using data formats
>> whose details were private arrangements between writers and readers (for
>> instance, if all servers spoke only ProtocolBuffers and used a different
>> schema for each client), there would be no Internet at all, much less a
>> Google that, as a third party, is able to broadly read and understand data
>> made available by servers. (Or, to your point, any ability of clients
>> lacking knowledge of the schema to parse anything useful from a server
>> data stream would at best be inferential and heuristic: possible, but
>> infeasible on a large scale.)
>>
>> With all that said, my read is that Rich bundled those two points
>> together in the JSON date example: JSON doesn't have an extensibility
>> syntax to support dates, but people still have to transmit dates over JSON,
>> so how do they do that? One way is by adopting a "convention", which in
>> some ways is better than an out-of-band schema, because, as you say, a
>> convention gives a reader additional information to heuristically interpret
>> the stream, but in other ways is worse because it isn't consistent: some
>> people will want date fields to look like "dateModified", others will want
>> "modifiedDate", and others will use "modificationDatetime".
>>
>> So in a broad sense, it is not desirable to use a data format that does
>> not include an extensibility capability which itself is self-describing,
>> because a format that lacks extensibility creates a combinatorial explosion
>> in conventions to convey values not known to the format, and extensions
>> that are not self-describing require out-of-band agreements between readers
>> and writers that can preclude the scalable third-party interoperability
>> that is so important to the Internet.
>>
>> Hope that helps.
>>
>>
>> On Sat, Jan 18, 2014 at 6:08 PM, Brian Craft <[email protected]> wrote:
>>
>>> Ok, so consider a different system (besides google) that handles the
>>> JSON example. If it has no prior knowledge of the date field, of what use
>>> is it to know that it's a date? What is a situation where a system reading
>>> the JSON needs to know a field is a date, but has no idea what the field is
>>> for?
>>>
>>>
>>> On Saturday, January 18, 2014 1:27:31 PM UTC-8, Jonas wrote:
>>>>
>>>> IIRC in that particular part of the talk he was specifically talking
>>>> about (non-self-describing) protocol buffers and not JSON.
>>>>
>>>> On Saturday, January 18, 2014 10:00:09 PM UTC+2, Brian Craft wrote:
>>>>>
>>>>> Regarding Rich's talk (http://www.youtube.com/watch?v=ROor6_NGIWU),
>>>>> can anyone explain the points he's trying to make about self-describing
>>>>> and extensible data formats, with the JSON and google examples?
>>>>>
>>>>> He argues that google couldn't exist if the web depended on
>>>>> out-of-band schemas. He gives as an example of such a schema a JSON
>>>>> encoding where an out-of-band agreement is made that field names with
>>>>> substring "date" refer to string-encoded dates.
>>>>>
>>>>> However, this is exactly the sort of thing google does. It finds
>>>>> dates, and other data types, heuristically, and not through the formats
>>>>> of the web being self-describing or extensible.
>>>>>
>>>>>
>>>>> --
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To post to this group, send email to [email protected]
>>> Note that posts from new members are moderated - please be patient with
>>> your first post.
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/clojure?hl=en
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>