Hi Andrew,

I have to agree with Pieter here. What you want to be thinking about is: What data should be where? And what should happen with it once it is there?

I feel this is what you want to do as well, but I think Pieter is right in pointing out that trying to think about all possible data flows is a mistake, and that protocols are a better vantage point. The problem, imho, is briefly this: "data flow" encompasses all possible events in your system, whereas a protocol defines only which events are legal in your system.

To use your analogy: the national highway system in Germany isn't defined in any great detail anywhere (with its countless driveways, rest stops, roadworks, etc.), and I don't suppose any one person in Germany knows *all* about it, let alone how all vehicles behave at any one point in time on it. However, the German traffic law (STVO) is just 53 paragraphs long, and only one of them (§18) is concerned specifically with highways. Still, our highway system works, every day, with all its countless vehicles. I'd say the STVO has scaled extremely well!

And here is the interesting part: in almost all cases, accidents are not the fault of the driving rules but of drivers not upholding them. As a computer scientist, you could say that bad drivers are the bugs of the road system. And in the few cases where the rules themselves turn out to be faulty, you only have to change one or two sentences in your law/protocol, not all the drivers.

So I would also recommend that you not think about all the moving parts in all their possible configurations (i.e. all the cars on the road), but really boil it down to: What data should be where? And how should receivers of that data respond to it? That is, write a protocol.
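
To make that concrete, here is a toy sketch (the mini-protocol and all names are made up for illustration, this is not your application): a client sends tasks to a worker, and the law says what the worker may answer.

    /*  A hypothetical mini-protocol, only to illustrate the idea.  */
    typedef enum {
        MSG_TASK,       /*  client -> worker: here is work to do          */
        MSG_RESULT,     /*  worker -> client: legal only after MSG_TASK   */
        MSG_ERROR       /*  worker -> client: legal only after MSG_TASK   */
    } msg_type_t;

    /*  The law:
     *  1. A worker that receives MSG_TASK MUST answer with exactly one
     *     MSG_RESULT or MSG_ERROR.
     *  2. Any other sequence is illegal; a peer may drop the connection.
     */

Two sentences of law, and you never had to enumerate where every single message in the running system might be at any given moment.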

Best,
alex.

P.S. This is also why I have come to love finite state machines: they translate your protocol-law perfectly into code.
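
Here is a minimal sketch of what I mean, in C, continuing the made-up mini-protocol from above (none of this is real ZeroMQ API, just an illustration): the whole law becomes one small transition function, and every illegal event is caught in one place.

    #include <stdio.h>

    /*  States of one worker connection.  */
    typedef enum { S_IDLE, S_BUSY } state_t;

    /*  Events, i.e. the messages of the mini-protocol above.  */
    typedef enum { E_TASK, E_RESULT, E_ERROR } event_t;

    /*  The protocol-law as a transition function: returns the next state
     *  for a legal event, or -1 for an illegal one (a "bad driver").  */
    static int next_state (state_t state, event_t event)
    {
        switch (state) {
        case S_IDLE:
            if (event == E_TASK)
                return S_BUSY;              /*  accept new work  */
            break;
        case S_BUSY:
            if (event == E_RESULT || event == E_ERROR)
                return S_IDLE;              /*  work is finished  */
            break;
        }
        return -1;                          /*  protocol violation  */
    }

    int main (void)
    {
        state_t state = S_IDLE;
        /*  The last event is illegal: a second result without a task.  */
        event_t trace [] = { E_TASK, E_RESULT, E_RESULT };

        for (size_t i = 0; i < sizeof trace / sizeof *trace; i++) {
            int next = next_state (state, trace [i]);
            if (next < 0) {
                printf ("event %d in state %d: protocol violation\n",
                    (int) trace [i], (int) state);
                break;
            }
            state = (state_t) next;
        }
        return 0;
    }

If the law ever turns out to be wrong, you change next_state() (one or two lines), not every component that drives on the road.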

On 25.08.2016 at 21:36, Andrew Hume wrote:
pieter,

        i can’t figure out what you’re saying here. based on my several years’ experience with you, i have to conclude that i have somehow misspoken.

        i have a distributed application. that is, there are a number of discrete components that (for this discussion) are distinct processes running on some number of servers. some are accessed thru RPC calls, others generate work that is then processed by some number of worker components.

        in order for this to work, i clearly need to figure out how data will flow through this system. as i have understood the design methodology around zeromq, this means figuring out the fixed points in the data flow and organising the components around those fixed points. some of these allow for “scaling” (as in increasing the size of a worker pool).

        so i don’t understand
1) how can this sort of design be a “mistake”? how can you do anything without (trying to) understand where the data needs to go?

2) many protocols (which i guess you mean things defined by RFCs) don’t scale; they simply detail the bi-lateral contract between (roughly) user and supplier of a service (like NTP). i know some address this directly (like the Gossip protocol), but surely you don’t mean just those?

        andrew


On Aug 24, 2016, at 11:06 AM, Pieter Hintjens <[email protected]> wrote:

FWIW I have come to believe that trying to design or even understand
the overall flows of data is a mistake, at least when you want to
scale.

What does scale is to speak of protocols and implementations. I know
this isn't a happy answer yet it's a proven one (RFCs, Internet).

_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
