This has come up several time over the years:
https://gcc.gnu.org/legacy-ml/gcc/2006-07/msg00158.html
https://gcc.gnu.org/legacy-ml/gcc/2006-07/msg00155.html
https://gcc.gnu.org/pipermail/gcc/2010-March/190234.html

, but maybe now (or maybe a while ago) is the right time to do this, considering the changes in relative costs of basic operations?
Multiply and barrel shift are cheap on many modern microarchitectures.
Control flow and non-linear memory access is expensive.

FWIW, [Dietz92]
https://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1312&context=ecetr
mentions multiply in passing as unpractical for SPARC because of cost, but modern CPUs can often do a multiply in a single cycle.

Approximating division and scaling, for a case value x, we can calculate an index or offset into a table as
f(x) = x*C1 >> C2 & M

For an index, M masks off the upper bits so that the index fits into
a table that has a number of elements that is a power of two.
For architectures where a non-scaled index in cheaper to use than
a scaled one, we compute an offset by having M also mask off the lower
bits.

Each table entry contains a jump address (or offset) and a key - at least if both are the same size; for different sizes, it might be cheaper to have two tables. If we have found values for C1 and C2 that give a perfect hash, we can immediately dispatch to the default case for a non-match; otherwise, we can have decision trees at the jump destinations, each using the comparison with the key from the table for the first decision. No separate range check is necessary, so if the multiply is fast enough, this should be close in performance to an ordinary tablejump.

This dispatch method can be used for tables that are too sparse for a tablejump,but have enough cases to justify the overhead (depending on multiple conditional vs single indirect branch costs, the latter might be a low bar).

I suppose we could make tree-switch-conversion.cc use rtx costs to compare the hash implementation to a decision tree, or have a hook make the decision - and the default for the hook might use rtx costs...

Reply via email to