https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702
--- Comment #14 from Avinash Jayakar <avinashd at linux dot ibm.com> ---
(In reply to Surya Kumari Jangala from comment #12)
> Ok. We also need to tackle the original issue, which is that a shift left
> can be optimized by generating a vector add. Perhaps tackle this issue first?

I looked further into how a vector multiply is lowered to shifts. This happens
in the "tree vectorization slp" pass, which transforms the gimple into
vectorized form. The logic for handling these generic patterns is written as
pattern recognition functions, and the specific function here,
"vect_synth_mult_by_constant", does the same thing as "expand_mult_const" in
the rtl expand pass, but on gimple trees. If I disable this multiply pattern
during vectorization, the expand pass converts the mult to a shift instruction
instead.

If we want to convert the mult to an add in a machine-dependent pass, without
changing the machine-independent gimple and rtl passes, then the only way I
see is to combine the two instructions (splat and shift) into one add.

@Segher/@Surya, do you have any other suggestions?

(In reply to Segher Boessenkool from comment #13)
> mults. And in most cases additions are faster than shifts (or you can do more
> of them concurrently or similar), so in many cases they are preferred, but that
> is not so super obvious already. You might be able to do four adds concurrently,
> but you might be able to do two shifts concurrently additionally, so it all

By concurrency here, do you mean instruction-level parallelism? That is, how
many shifts or adds can execute concurrently depends on the number of
functional units in the processor, right? Just want to make sure we are on the
same page.
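
To make the splat + shift versus single add point concrete, here is a minimal sketch in plain scalar C. It is not GCC code and not the actual generated sequence: the 4-lane layout and the function names (mul2_by_shift, mul2_by_add, check_forms_agree) are made up for illustration. It only shows that, for a shift count of 1, shifting each lane by a splatted count gives the same result as adding the vector to itself, so the shift-count vector is not needed.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4 /* hypothetical lane count, chosen only for illustration */

/* v * 2 as a shift: every lane is shifted left by a splatted count of 1. */
void mul2_by_shift(const uint32_t in[LANES], uint32_t out[LANES])
{
    const uint32_t count[LANES] = {1, 1, 1, 1}; /* stands in for the "splat" */
    for (int i = 0; i < LANES; i++)
        out[i] = in[i] << count[i];
}

/* v * 2 as a single add: v + v, with no separate shift-count vector. */
void mul2_by_add(const uint32_t in[LANES], uint32_t out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = in[i] + in[i];
}

/* Asserts that both forms produce the same value in every lane.
   Unsigned arithmetic wraps modulo 2^32 in both cases. */
void check_forms_agree(const uint32_t in[LANES])
{
    uint32_t a[LANES], b[LANES];
    mul2_by_shift(in, a);
    mul2_by_add(in, b);
    for (int i = 0; i < LANES; i++)
        assert(a[i] == b[i]);
}
```

Whether the add form is actually cheaper on a given processor is a separate question that depends on the issue and latency characteristics discussed in comment #13.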