https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100
--- Comment #7 from GCC Commits ---
The master branch has been updated by Paul-Antoine Arras:
https://gcc.gnu.org/g:3ada458d344b13a49183278435d372fe9c7fef4b
commit r16-1418-g3ada458d344b13a49183278435d372fe9c7fef4b
Author: Paul-Antoine Arras
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100
Paul-Antoine Arras changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100
--- Comment #5 from GCC Commits ---
The master branch has been updated by Paul-Antoine Arras:
https://gcc.gnu.org/g:b437418bc9547073ec2704398c85c52e060e1fab
commit r16-1071-gb437418bc9547073ec2704398c85c52e060e1fab
Author: Paul-Antoine Arras
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100
Andrew Waterman changed:
What|Removed |Added
CC||andrew at sifive dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100
--- Comment #3 from Paul-Antoine Arras ---
(In reply to Jeffrey A. Law from comment #2)
> Paul -- have you run your patch on any design? And if so what did you run
> and what was the performance delta before/after?
Thanks for your input, Jeff!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100
--- Comment #2 from Jeffrey A. Law ---
It's even more complicated than that. You have to consider that there can be a
cost to move data across the units; i.e., it may actually be cheaper to use the
variant that broadcasts the value across a vect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100
--- Comment #1 from Richard Biener ---
Doesn't late-combine and/or forwprop not have the single-BB restriction? Also,
when the vec-duplicate is hoisted out of a loop, does this then become only a
register-pressure issue in the vector vs. scalar regset?