Are there any ongoing efforts to amortize the cost of Cassandra's
background tasks over time?
Specifically, the cost of compaction, anti-entropy (AE) repair,
rebalancing, etc. seems to be a problem for some users who expect
steadier performance. While this may sometimes be the result of a
cluster running at the margin of its capacity, users are still
surprised by the performance hit or downtime required for common
operations. Letting the cluster make finer-grained, measurable
progress towards its ideal state may help other users, too.

Is there a feasible design or enhancement which may allow these types
of background tasks to be broken apart into smaller pieces without
compromising overall consistency?
It would be excellent if the user could see the overall state of the
storage cluster and choose the proportion of resources allocated to
recovering backlog vs. servicing clients.
Even better would be some basic heuristics that work well for the
general case, so that users would only have to look at the scheduling
plan in special situations.
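
To make the idea concrete, here is a rough sketch of the kind of knob I
have in mind. It is not how Cassandra works today, and every name in it
(ProportionalScheduler, background_share, and so on) is hypothetical.
It assumes background tasks have already been split into small,
independent work units, and it simply interleaves them with client
requests according to a configurable share.

```python
from collections import deque


class ProportionalScheduler:
    """Toy scheduler: interleaves client requests with small background units.

    background_share is the fraction of scheduling slots given to background
    work (compaction chunks, repair ranges, ...) while both queues are busy.
    All names here are illustrative, not Cassandra APIs.
    """

    def __init__(self, background_share):
        if not 0.0 <= background_share <= 1.0:
            raise ValueError("background_share must be between 0 and 1")
        self.background_share = background_share
        self.client = deque()
        self.background = deque()
        self._credit = 0.0  # accumulated entitlement of background work

    def submit_client(self, item):
        self.client.append(item)

    def submit_background(self, item):
        self.background.append(item)

    def backlog(self):
        """Number of background units still pending (the visible 'state')."""
        return len(self.background)

    def next_item(self):
        """Return ('client' | 'background', item), or None when idle."""
        if not self.client and not self.background:
            return None
        # If only one kind of work is waiting, run it: idle capacity is
        # never wasted on enforcing the ratio.
        if not self.client:
            return ("background", self.background.popleft())
        if not self.background:
            return ("client", self.client.popleft())
        # Both queues are busy: background earns credit on every slot and
        # runs once it has accumulated a full slot's worth.
        self._credit += self.background_share
        if self._credit >= 1.0:
            self._credit -= 1.0
            return ("background", self.background.popleft())
        return ("client", self.client.popleft())
```

With background_share=0.25, one slot in four goes to the backlog while
clients are waiting, and the backlog drains at full speed whenever the
node is otherwise idle. The backlog() count is the sort of measurable
progress I would like to be able to see.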

How would you go about doing that? Does the current architecture lend
itself to this type of optimization, or otherwise?
