On Thursday, April 27, 2017 5:20 AM, Jim Jagielski <j...@jagunet.com> wrote:
> This proposal looks like something I'd be interested in helping
> out with... some basic questions: (1) How does this compare/contrast
> w/ Apache Kafka and (2) What is seen as the most applicable use-case
> for Pulsar?
There are a few design choices in which Pulsar differs from Kafka. These choices
were primarily dictated by the requirement that Pulsar be a core low-level
platform backing serving systems:
* Multi-Tenant platform
* Scale to ~1M topics in one cluster
* Guaranteed durability. Everything must be replicated and flushed to disk
before acknowledgment.
* Low publish latency (99th percentile < 5 ms) with that durability, even under
conditions where dispatch has to do cold reads from disk
* Geo-Replication as a 1st class feature
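
As a rough illustration of the durability requirement above (replicate and
flush to disk before acknowledging), here is a toy sketch. It is not Pulsar or
BookKeeper code: ToyReplica and publish are hypothetical names, the replicas
are just local files, and waiting for every replica (rather than some quorum)
is a simplifying assumption.

```python
import os
import tempfile


class ToyReplica:
    """Hypothetical stand-in for one storage node (not a BookKeeper API)."""

    def __init__(self, path):
        self.path = path

    def append(self, payload: bytes) -> None:
        # Write a length-prefixed entry and force it to stable storage
        # before returning to the caller.
        with open(self.path, "ab") as f:
            f.write(len(payload).to_bytes(4, "big") + payload)
            f.flush()
            os.fsync(f.fileno())


def publish(replicas, payload: bytes) -> bool:
    """Return the acknowledgment only after every replica has persisted."""
    for replica in replicas:
        # Any failure raises here, so no acknowledgment is ever returned
        # for an entry that is not on disk everywhere.
        replica.append(payload)
    return True


if __name__ == "__main__":
    # Usage: three file-backed replicas in a scratch directory.
    scratch = tempfile.mkdtemp()
    nodes = [ToyReplica(os.path.join(scratch, f"node{i}.log")) for i in range(3)]
    print("acked:", publish(nodes, b"hello"))
```

The point of the sketch is only the ordering: the producer sees an
acknowledgment strictly after the fsync on each copy has completed.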
Some of these requirements are made possible by using Apache BookKeeper as the
storage for Pulsar topics, and separating brokers from storage. There are
several advantages in having 2 different layers (brokers + storage) in the
system:
* Shift traffic across different brokers (automatic load balancer, rapid
failover)
* Easily scale up each layer independently
* New machines automatically start serving traffic
* Use disk I/O effectively
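
To make the broker/storage split a little more concrete, here is a toy sketch
of why stateless brokers are easy to rebalance: topic ownership is only a
mapping, so moving a topic copies no data. ToyCluster and its methods are
hypothetical names for illustration, and the least-loaded assignment policy is
an assumption, not a description of Pulsar's actual load balancer.

```python
class ToyCluster:
    """Hypothetical model: brokers own topics but hold no topic data."""

    def __init__(self, brokers):
        self.brokers = set(brokers)
        self.owner = {}  # topic name -> broker currently serving it

    def _least_loaded(self):
        if not self.brokers:
            raise RuntimeError("no brokers available")
        load = {b: 0 for b in self.brokers}
        for b in self.owner.values():
            load[b] += 1
        # Sort first so that ties are broken deterministically by name.
        return min(sorted(load), key=load.get)

    def lookup(self, topic):
        # Assign an unowned topic to the least-loaded broker on first use.
        if topic not in self.owner:
            self.owner[topic] = self._least_loaded()
        return self.owner[topic]

    def add_broker(self, broker):
        # A new machine is immediately eligible for new assignments.
        self.brokers.add(broker)

    def fail_broker(self, broker):
        # Failover: drop the broker and reassign its topics elsewhere.
        # No data moves, because the data lives in the storage layer.
        self.brokers.discard(broker)
        orphaned = [t for t, b in self.owner.items() if b == broker]
        for t in orphaned:
            del self.owner[t]
        for t in orphaned:
            self.lookup(t)
```

In this model, adding a broker or losing one only changes the ownership map,
which is the property that allows each layer to scale independently.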
> (2) What is seen as the most applicable use-case for Pulsar?
At Yahoo, Pulsar has been supporting a broad range of applications. The main
usage patterns can be summarized as:
* Message queues
* Notification / Application Integration
* Data source and sink for stream processing
* Durable message bus for other platforms (storage, dbs)
In particular, the low publish latency and high availability make Pulsar
suitable for use in the critical data path of online event
processing/serving. Multi-tenancy allows one organization to operate a
single cluster serving multiple applications.
> Funny aside: the 1st new car I ever bought with my own money was
> a Nissan Pulsar. It was a piece of crap :)
Our intention is certainly to do better than that :)
We have been running Pulsar in production at scale for about 2 years now, and
a more detailed write-up (though a little dated) is available here:
https://yahooeng.tumblr.com/post/150078336821/open-sourcing-pulsar-pub-sub-messaging-at-scale
Cheers,
Joe
> On Apr 26, 2017, at 5:19 PM, Joe Francis <j...@yahoo-inc.com.INVALID> wrote:
>
>
> Dear Apache Incubator Community,
>
>
> We would like to submit the Pulsar proposal to the incubator. Our draft is
> available at:
> https://wiki.apache.org/incubator/PulsarProposal
>
> A quick overview of Pulsar:
>
> Pulsar is a highly scalable, low latency messaging platform running on
> commodity hardware. It provides simple pub-sub semantics over topics,
> guaranteed at-least-once delivery of messages, automatic cursor management for
> subscribers, and cross-datacenter replication.
>
> We are obviously looking for feedback and comments on the proposal, as well
> as a few mentors. Bryan Call has accepted to be our Champion.
>
> Thank you,
>
> -Joe Francis
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>