https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99937
antto <antto at mail dot bg> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |antto at mail dot bg

--- Comment #6 from antto <antto at mail dot bg> ---
i hit this too. from what i found out, the Cortex-M0+ "MULS" instruction can
be implemented either with the "small" option (iterative, takes many clock
cycles) or the "single-cycle" one, which takes 1 clock cycle.

my chip has the "single-cycle" one, but when i have a multiplication by a
constant, GCC emits a long sequence of instructions instead of MULS.

i guess there should perhaps be a "-mcpu=cortex-m0plus.fast-multiply", but
there doesn't seem to be one, nor any other way to tell GCC that MULS is
*not* slow and should be preferred.

https://arm.godbolt.org/z/16T6G4cc5

From ARM's documents about this core:
---
The MULS instruction provides a 32-bit x 32-bit multiply that returns the
least-significant 32-bits of the result. The processor can implement MULS in
one of two ways:
• As a fast single-cycle array.
• As a 32-cycle iterative multiplier.
The iterative multiplier has no impact on interrupt response time because the
processor abandons multiply operations to take any pending interrupt.
---

the chip i'm using (note the 3rd line in the Features):
https://onlinedocs.microchip.com/oxy/GUID-22527069-B4D6-49B9-BACC-3AF1C52EB48C-en-US-20/GUID-3F4ED249-7406-42EB-8EF5-C3FEED4A4889.html
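
To make the multiplication-by-constant case concrete, here is a minimal sketch. It is not the code behind the godbolt link above; the constant 100 and the function names mul_direct and mul_shift_add are hypothetical, picked only for illustration. The first function is the kind of source where a single MULS would be the natural choice on a core with the single-cycle multiplier. The second spells out an equivalent shift-and-add decomposition, which is the general shape of sequence a compiler may synthesize when it assumes the multiply is expensive.

```c
#include <stdint.h>

/* Hypothetical example: multiplication by a compile-time constant.
   With a single-cycle multiplier, one MULS would be the cheapest form. */
uint32_t mul_direct(uint32_t x)
{
    return x * 100u;
}

/* The same value computed with shifts and adds, using 100 = 64 + 32 + 4.
   Unsigned arithmetic wraps modulo 2^32, so both functions agree for
   every input. */
uint32_t mul_shift_add(uint32_t x)
{
    return (x << 6) + (x << 5) + (x << 2);
}
```

Compiling something like this with, for example, `arm-none-eabi-gcc -O2 -mthumb -mcpu=cortex-m0plus -S` and reading the assembly for mul_direct shows which form a given GCC version picks. The result probably depends on the GCC version and the constant, so no particular output is claimed here.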