https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100045

            Bug ID: 100045
           Summary: Precomputing division
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

Created attachment 50567
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50567&action=edit
Test case

We use the method given in "Division by Invariant Integers using
Multiplication"
by Granlund and Montgomery for optimizing division by divisors known to
be constant at compile time.

There can also be an advantage if many numbers are divided by the
same numbers; in this case, the invariant inverse can be moved out of
the loop.  This is target-dependent.

The attached test case performs 10000000 unsigned divisions of uint32_t
values read in randomly by a constant randomly chosen to be 12345678

- using the method from figure 4.1 from the publication cited above
  (timing in seconds given as pre_divide)

- using a simple loop with divisions (timing in seconcs given as divide).

On a AMD Ryzen 7 1700X, the timings are

pre_divide: t = 0.013330 s
divide    : t = 0.052511 s

OTOH, on POWER (gcc135), the difference is so small so that is very probably
not worth the bother:

pre_divide: t = 0.015183 s
divide    : t = 0.017454 s

Reply via email to