https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100045
Bug ID: 100045 Summary: Precomputing division Product: gcc Version: unknown Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- Created attachment 50567 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50567&action=edit Test case We use the method given in "Division by Invariant Integers using Multiplication" by Granlund and Montgomery for optimizing division by divisors known to be constant at compile time. There can also be an advantage if many numbers are divided by the same numbers; in this case, the invariant inverse can be moved out of the loop. This is target-dependent. The attached test case performs 10000000 unsigned divisions of uint32_t values read in randomly by a constant randomly chosen to be 12345678 - using the method from figure 4.1 from the publication cited above (timing in seconds given as pre_divide) - using a simple loop with divisions (timing in seconcs given as divide). On a AMD Ryzen 7 1700X, the timings are pre_divide: t = 0.013330 s divide : t = 0.052511 s OTOH, on POWER (gcc135), the difference is so small so that is very probably not worth the bother: pre_divide: t = 0.015183 s divide : t = 0.017454 s