Here it is https://docs.google.com/document/d/1YDFNvLTX6Sg3WDrNFKiWLaJvuEtK4eyxEaA0w9cVlG4/edit#heading=h.d6uy2uxfs2xq
cheers /karthik On Sun, May 6, 2018 at 8:20 AM, Bill Graham <[email protected]> wrote: > Can you share the doc please? > > On Sat, May 5, 2018 at 4:18 PM Ning Wang <[email protected]> wrote: > > > Thanks. > > > > Yeah I have read the design doc. It has a section for scaling and covers > > some designs but not reaching this level of details I am afraid. > > > > On Sat, May 5, 2018 at 9:45 AM, Bill Graham <[email protected]> > wrote: > > > >> The stateful processing design included a large section on scaling, > which > >> was intended to be done as a future phase. It's very similar to what's > >> being described. Sanjeev and I worked on it about a 1.5 years ago with > >> Maosong and it was in a google doc. Sanjeev do you have that design > doc? I > >> can't seem locate it. > >> > >> On Sat, May 5, 2018 at 12:03 AM, Ning Wang <[email protected]> > wrote: > >> > >> > If we go this way, we need key -> state map for each component so that > >> the > >> > state data can be repartitioned. > >> > > >> > On Fri, May 4, 2018 at 11:44 PM, Karthik Ramasamy <[email protected] > > > >> > wrote: > >> > > >> > > Instead - if it references > >> > > > >> > > topology name + component name + key range > >> > > > >> > > will it be better? > >> > > > >> > > cheers > >> > > /karthik > >> > > > >> > > > >> > > On Fri, May 4, 2018 at 11:23 PM, Ning Wang <[email protected]> > >> wrote: > >> > > > >> > > > Currently I think each Instance serializes the state object into a > >> byte > >> > > > array and checkpoint manager saves the byte array into a file. The > >> file > >> > > is > >> > > > referenced by topology name + component name + instance id. > >> > > > > >> > > > On Fri, May 4, 2018 at 11:10 PM, Karthik Ramasamy < > >> [email protected]> > >> > > > wrote: > >> > > > > >> > > > > I am not sure I understand why the state is tied to an instance? > >> > > > > > >> > > > > cheers > >> > > > > /karthik > >> > > > > > >> > > > > On Fri, May 4, 2018 at 4:36 PM, Thomas Cooper < > >> > [email protected]> > >> > > > > wrote: > >> > > > > > >> > > > > > Yeah, state recovery is a bit more difficult with Heron's > >> > > architecture. > >> > > > > In > >> > > > > > Storm, the task IDs are not just values used for routing they > >> > > actually > >> > > > > > equate to a task instance within the executor. An executor > which > >> > > > > currently > >> > > > > > processes the keys 4-8 actually contains 5 task instances of > the > >> > same > >> > > > > > component. So for each task, they just save its state attached > >> to > >> > the > >> > > > > > single task ID and reassemble executors with the new task > >> > instances. > >> > > > > > > >> > > > > > We don't want or have to do that with Heron instances but we > >> would > >> > > need > >> > > > > to > >> > > > > > have some way to have a state change tied to the task (or > >> routing > >> > key > >> > > > if > >> > > > > we > >> > > > > > go to the key range idea). For something like a word count you > >> > might > >> > > > save > >> > > > > > counts using a nested map like: { routing key : {word : count > >> }}. > >> > The > >> > > > > > routing key could be included in the Tuple instance. However, > >> > whether > >> > > > > this > >> > > > > > pattern would work for more generic state cases I don't know? > >> > > > > > > >> > > > > > Tom Cooper > >> > > > > > W: www.tomcooper.org.uk | Twitter: @tomncooper > >> > > > > > <https://twitter.com/tomncooper> > >> > > > > > > >> > > > > > > >> > > > > > On Fri, 4 May 2018 at 15:54, Neng Lu <[email protected]> > >> wrote: > >> > > > > > > >> > > > > > > +1 for this idea. As long as the predefined key space is > large > >> > > > enough, > >> > > > > it > >> > > > > > > should work for most of the cases. > >> > > > > > > > >> > > > > > > Based on my experience with topologies, I never saw one > >> component > >> > > has > >> > > > > > more > >> > > > > > > than 1000 instances in a topology. > >> > > > > > > > >> > > > > > > For recovering states from an update, there will be some > >> problems > >> > > > > though. > >> > > > > > > Since the states stored in heron are strongly connected with > >> each > >> > > > > > instance, > >> > > > > > > we either need to have > >> > > > > > > some resolver does the state repartitioning or stores states > >> with > >> > > the > >> > > > > key > >> > > > > > > instead of with each instance. > >> > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > On Fri, May 4, 2018 at 3:01 PM, Karthik Ramasamy < > >> > > > [email protected]> > >> > > > > > > wrote: > >> > > > > > > > >> > > > > > > > Thanks for sharing. I like the Storm approach > >> > > > > > > > > >> > > > > > > > - keeps the implementation simpler > >> > > > > > > > - state is deterministic across restarts > >> > > > > > > > - makes it easy to reason and debug > >> > > > > > > > > >> > > > > > > > The hard limit is not a problem at all since most of the > >> > > topologies > >> > > > > > will > >> > > > > > > > be never that big. > >> > > > > > > > If you can handle Twitter topologies cleanly, it is more > >> that > >> > > > > > sufficient > >> > > > > > > I > >> > > > > > > > believe. > >> > > > > > > > > >> > > > > > > > cheers > >> > > > > > > > /karthik > >> > > > > > > > > >> > > > > > > > > On May 4, 2018, at 2:31 PM, Thomas Cooper < > >> > > > [email protected]> > >> > > > > > > > wrote: > >> > > > > > > > > > >> > > > > > > > > Hi all, > >> > > > > > > > > > >> > > > > > > > > A while ago I emailed about the issue of how fields > (key) > >> > > grouped > >> > > > > > > routing > >> > > > > > > > > in Heron was not consistent across an update and how > this > >> > makes > >> > > > > > > > preserving > >> > > > > > > > > state across an update very difficult and also makes it > >> > > > > > > > > difficult/impossible to analyse or predict tuple flows > >> > through > >> > > a > >> > > > > > > > > current/proposed topology physical plan. > >> > > > > > > > > > >> > > > > > > > > I suggested adopting Storms approach of pre-defining a > >> > routing > >> > > > key > >> > > > > > > > > space for each component (eg 0-999), so that instead of > an > >> > > > instance > >> > > > > > > > having > >> > > > > > > > > a single task id that gets reset at every update (eg 10) > >> it > >> > > has a > >> > > > > > range > >> > > > > > > > of > >> > > > > > > > > id's (eg 10-16) that changes depending on the > parallelism > >> of > >> > > the > >> > > > > > > > component. > >> > > > > > > > > This has the advantage that a key will always hash to > the > >> > same > >> > > > task > >> > > > > > ID > >> > > > > > > > for > >> > > > > > > > > the lifetime of the topology. Meaning recovering state > >> for an > >> > > > > > instance > >> > > > > > > > > after a crash or update is just a case of pulling the > >> state > >> > > > linked > >> > > > > to > >> > > > > > > the > >> > > > > > > > > keys in its task ID range. > >> > > > > > > > > > >> > > > > > > > > I know the above proposal has issues, not least of all > >> > placing > >> > > a > >> > > > > hard > >> > > > > > > > upper > >> > > > > > > > > limit on the scale out of a component, and that some > >> > > alternative > >> > > > > > ideas > >> > > > > > > > are > >> > > > > > > > > being floated for solving the stateful update issue. > >> > However, I > >> > > > > just > >> > > > > > > > wanted > >> > > > > > > > > to throw some more weight behind the Storm approach. > There > >> > was > >> > > a > >> > > > > > recent > >> > > > > > > > > paper about high-performance network load balancing > >> > > > > > > > > <https://blog.acolyer.org/2018/05/03/stateless- > >> > > > > > > > datacenter-load-balancing-with-beamer/>that > >> > > > > > > > > describes an approach using a fixed key space similar to > >> > > Storm's > >> > > > > (see > >> > > > > > > the > >> > > > > > > > > section called Stable Hashing - they assign a range 100x > >> the > >> > > > > expected > >> > > > > > > > > connection pool size - which we could do with heron to > >> > prevent > >> > > > ever > >> > > > > > > > hitting > >> > > > > > > > > the upper scaling limit). Also, this new load balancer, > >> > Beamer, > >> > > > > > claims > >> > > > > > > to > >> > > > > > > > > be twice as fast as Google's Maglev > >> > > > > > > > > <https://blog.acolyer.org/2016/03/21/maglev-a-fast-and- > >> > > > > > > > reliable-software-network-load-balancer/> > >> > > > > > > > > which again uses a pre-defined keyspace and ID ranges to > >> > create > >> > > > > > look-up > >> > > > > > > > > tables deterministically. > >> > > > > > > > > > >> > > > > > > > > I know a load balancer is a different beast to a stream > >> > > grouping > >> > > > > but > >> > > > > > > > there > >> > > > > > > > > are some interesting ideas in those papers (The links > >> point > >> > to > >> > > > > > summary > >> > > > > > > > blog > >> > > > > > > > > posts so you don't have to read the whole paper). > >> > > > > > > > > > >> > > > > > > > > Anyway, I just thought I would those papers out there > and > >> see > >> > > > what > >> > > > > > > people > >> > > > > > > > > think. > >> > > > > > > > > > >> > > > > > > > > Tom Cooper > >> > > > > > > > > W: www.tomcooper.org.uk | Twitter: @tomncooper > >> > > > > > > > > <https://twitter.com/tomncooper> > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > > > > -- > Sent from Gmail Mobile >
