liangyepianzhou commented on PR #25516:
URL: https://github.com/apache/pulsar/pull/25516#issuecomment-4285812281
> > I have a question: if my ScalableTopic has a large number of
> > sub-partitions (say, 1 million), would every broker need to perform
> > calculations across all 1 million sub-partitions of that ScalableTopic
> > just to determine which sub-partitions belong to it? I'm a bit worried
> > about scalability issues caused by this kind of metadata computation.
>
> The number of "segments" is dynamically adjusted based on several factors
> (e.g., current/historic traffic, consumer parallelism), though there would
> never be a reason to have such a large number of segments. The most likely
> scenario is for segments to number at most a few hundred, and in some
> exceptional cases to reach the thousands.
>
> The reverse can be true though: e.g., having millions of scalable topics
> (with most of them having few or just one segment)

Consider users who require strict ordering and have very slow consumption
speeds. Key_Shared (k-shared) mode may cause out-of-order delivery when a
consumer disconnects, and it becomes unusable when a proxy is inserted between
the broker and the client. In those cases the only option is to increase the
number of partitions to boost consumption concurrency. This could result in
a single ScalableTopic having millions of sub-partitions.
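
To make the concern concrete, here is a rough back-of-the-envelope sketch of
how the partition count grows in that situation. It assumes one strictly
ordered consumer per partition; the function name and the workload numbers are
hypothetical and for illustration only, not measurements or anything taken
from the PR.

```python
import math


def required_partitions(msgs_per_sec: float, secs_per_msg: float) -> int:
    """Estimate the minimum number of partitions needed to keep up.

    Assumption: each partition is drained by a single consumer that handles
    messages one at a time, in order, so one partition sustains at most
    1 / secs_per_msg messages per second.
    """
    if msgs_per_sec < 0 or secs_per_msg <= 0:
        raise ValueError("msgs_per_sec must be >= 0 and secs_per_msg must be > 0")
    # Partitions needed = arrival rate * per-message processing time, rounded up.
    return math.ceil(msgs_per_sec * secs_per_msg)


if __name__ == "__main__":
    # Hypothetical workload: 100,000 msg/s arriving, 10 s to process each message.
    print(required_partitions(100_000, 10))
```

With these made-up numbers the estimate comes out at 1,000,000 partitions,
which is the order of magnitude I am worried about.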
In such scenarios, can the new design avoid the scalability issues caused by
metadata computation?