https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122871
--- Comment #12 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
(In reply to Torbjorn SVENSSON from comment #11)
> The new test case fails for Cortex-M0 and Cortex-M23. Is this a thumb2-only
> improvement?
In principle the optimization is valid for thumb1 cores, but since we lack
shift+add patterns there, we should end up with something like:
        lsls    r2, r0, #1
        adds    r1, r1, r2
        bx      lr
That's still much better than the sequence using adc(s), though not quite as
simple as a single shift+add instruction. It also won't match the expected
output in the current testcase.
The current test should probably add
        /* { dg-require-effective-target arm32 } */
and we should then create a separate test for thumb1 targets.
If the thumb1 code generator isn't generating something like the above
sequence, we should create a new PR for that, as it's likely a costing issue
in the backend.
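
For illustration only, a separate thumb1 test along the lines suggested above
might look roughly like the sketch below. The function add_low_shifted is
hypothetical and is not taken from the PR; it is simply a 64-bit operation
whose high word becomes high + (low << 1), which is what the three-instruction
sequence above computes with the low word in r0 and the high word in r1. The
options and scan patterns are likewise assumptions and have not been checked
against actual compiler output.

```c
/* Hypothetical thumb1 variant of the test.  The function below is an
   assumption made for illustration, not the testcase from the PR.  */
/* { dg-do compile } */
/* { dg-require-effective-target arm_thumb1_ok } */
/* { dg-options "-O2 -mthumb" } */

#include <stdint.h>

/* Adds the low word of A, shifted left by 33, to A.  The low word of the
   result is unchanged and the high word becomes high + (low << 1), so no
   carry propagation should be needed.  */
uint64_t
add_low_shifted (uint64_t a)
{
  return a + ((uint64_t) (uint32_t) a << 33);
}

/* Assumed expectations: a plain shift followed by an add, and no
   add-with-carry.  */
/* { dg-final { scan-assembler "lsls\t" } } */
/* { dg-final { scan-assembler-not "adcs?\t" } } */
```

If the scan-assembler-not check fails on a thumb1 target, that would point at
the costing issue mentioned above rather than at the test itself.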