On Wed, 24 Sep 2025, Avinash Jayakar wrote:
> On Tue, 2025-09-23 at 15:56 +0200, Richard Biener wrote:
> > On Tue, 23 Sep 2025, Avinash Jayakar wrote:
> >
> > > Hi,
> > >
> > > I had a question regarding the function vect_pattern_recog, which is
> > > triggered in the slp/vectorization pass.
> > > Consider the case where the original code is already in vector form.
> > > For example, below is the original gimple dump of a vector function:
> > >
> > >
> > > ;; Function lshift1_64 (null)
> > > ;; enabled by -tree-original
> > >
> > >
> > > {
> > > __vector unsigned long long D.4059 = { 2, 2 };
> > >
> > > return <<< Unknown tree: compound_literal_expr
> > > __vector unsigned long long D.4059 = { 2, 2 }; >>> * a;
> > > }
> > >
> > >
> > > This however does not go through the pattern recognition since
> > > 1. It is already in vector form
> > > 2. This check fails in slp pass
> > > if (bb_vinfo->grouped_stores.is_empty ()
> > > && bb_vinfo->roots.is_empty ())
> > >
> > > So my main question is: if this line of code (in this example
> > > v1 * {2, 2}) did go through pattern recognition, it could have
> > > generated better code such as v1 + v1 or v1 << {1, 1}, which could
> > > allow vectorization on targets that do not have native double word
> > > multiplication support but might have double word shift or add.
> > > Is this intended, or is there a way to have this code pattern
> > > recognized?
> >
> > The vectorizer is currently not set up for re-vectorizing already
> > vectorized code. Instead for the situation you describe, a target
> > without vector multiplication support, it would be the task of the
> > vector lowering pass (tree-vect-generic.cc) to turn this into a
> > supported operation.
> >
> I looked into tree-vect-generic.cc, specifically the
> expand_vector_operations_1 function.
>
>
> I encountered this while fixing PR119702. If I write simple
> scalar code such as
>
> void lshift1_64(uint64_t *a) {
> a[0] *= 2;
> a[1] *= 2;
> }
> This does get vectorized as a << {1,1}. But writing this
> vector unsigned long long
> lshift1_64 (vector unsigned long long a, vector unsigned long long b)
> {
> return a * (vector unsigned long long) { 2, 2 };
> }
> gets converted to scalar code during veclower pass.
That's expected.
> I see 2 ways of fixing it for multiply expression
> 1. In expand_vector_operations_1 function, before lowering to scalar we
> can check if code is MULT_EXPR and see if it can be implemented with
> shifts/add/sub (as done in pattern recognition), like it is done for
> LROTATE_EXPR and RROTATE_EXPR.
Yes.
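
As a rough illustration of the rewrite option 1 is after, here is a small
sketch using the GNU C vector extension (accepted by GCC and Clang). It only
checks that the multiply by { 2, 2 } agrees lane by lane with the add and
shift forms. The names v2du, mul_by_2, add_self, shift_by_1, forms_agree and
self_check are made up for this example and are not GCC internals.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit unsigned lanes, using the GNU C vector extension.  */
typedef uint64_t v2du __attribute__ ((vector_size (16)));

/* The multiply as written in the source.  */
v2du mul_by_2 (v2du a) { return a * (v2du) { 2, 2 }; }

/* Candidate replacements that need no vector multiply.  */
v2du add_self (v2du a) { return a + a; }
v2du shift_by_1 (v2du a) { return a << (v2du) { 1, 1 }; }

/* Return 1 if all three forms agree in every lane for A, else 0.
   Unsigned arithmetic wraps, so this also holds on overflow.  */
int forms_agree (v2du a)
{
  v2du m = mul_by_2 (a);
  v2du s = add_self (a);
  v2du h = shift_by_1 (a);
  for (int i = 0; i < 2; i++)
    if (m[i] != s[i] || m[i] != h[i])
      return 0;
  return 1;
}

/* A few spot checks, including values that wrap around.  */
void self_check (void)
{
  assert (forms_agree ((v2du) { 3, 5 }));
  assert (forms_agree ((v2du) { 0, 1 }));
  assert (forms_agree ((v2du) { UINT64_MAX, 0x8000000000000001ULL }));
}
```

Whether the add or the shift form is preferable would depend on what the
target actually provides for the vector mode in question.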
> 2. Implement mulv2di3 for this specific target (which does exactly what
> scalar code would do), and let expand pass (expand_mult) take care of
> converting mult to shift/add/sub.
The expand pass wouldn't do this when the target implements mulv2di3.
Currently only 1. is viable, I think, which is of course a bit unfortunate
as it leads to duplicating code in another place. Long-term the plan
is to replace vector lowering by re-vectorizing, but that needs
a) vector-to-vector support in the vectorizer and b) the ability to
"vectorize" using scalar code as the ultimate fallback.
I'll note that I'd like to play with removing this kind of pattern
from vectorizer scalar pattern recognition and instead have the
vector code emission transparently handle such cases. We'd then
have something like can_vec_mult_p (vectype, op0, op1) and emit_vec_mult,
which should be usable in both places: vectorizer code analysis
and emission, and vector lowering.
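
To make the analysis/emission split a bit more concrete, here is a toy
model in plain C. It is only a sketch under stated assumptions: struct
toy_target, choose_mult_strategy, toy_can_vec_mult_p and toy_emit_vec_mult
are hypothetical names invented here, the "vector" is a two-element array,
and only a uniform constant multiplier is handled. Nothing below reflects
actual GCC interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy description of what a target supports for one vector type.  */
struct toy_target { bool has_mult, has_shift, has_add; };

enum mult_strategy { MULT_NATIVE, MULT_AS_SHIFT, MULT_AS_ADD, MULT_SCALARIZE };

/* If C is a power of two return log2 (C), otherwise return -1.  */
int exact_log2_u64 (uint64_t c)
{
  if (c == 0 || (c & (c - 1)) != 0)
    return -1;
  int n = 0;
  while ((c >>= 1) != 0)
    n++;
  return n;
}

/* Shared decision: how would a multiply by the uniform constant C be
   emitted on target T?  */
enum mult_strategy
choose_mult_strategy (const struct toy_target *t, uint64_t c)
{
  if (t->has_mult)
    return MULT_NATIVE;
  if (exact_log2_u64 (c) >= 0 && t->has_shift)
    return MULT_AS_SHIFT;
  if (c == 2 && t->has_add)
    return MULT_AS_ADD;
  return MULT_SCALARIZE;
}

/* Analysis half: can the multiply stay in vector form?  */
bool toy_can_vec_mult_p (const struct toy_target *t, uint64_t c)
{
  return choose_mult_strategy (t, c) != MULT_SCALARIZE;
}

/* Emission half: apply the chosen strategy to both lanes.  */
void toy_emit_vec_mult (const struct toy_target *t, uint64_t lanes[2],
                        uint64_t c)
{
  enum mult_strategy s = choose_mult_strategy (t, c);
  int shift = exact_log2_u64 (c);
  for (int i = 0; i < 2; i++)
    switch (s)
      {
      case MULT_AS_SHIFT:
        assert (shift >= 0);
        lanes[i] <<= shift;
        break;
      case MULT_AS_ADD:
        lanes[i] += lanes[i];
        break;
      default:
        /* Native multiply or the per-lane scalar fallback.  */
        lanes[i] *= c;
        break;
      }
}
```

The point of the shape is that the predicate and the emitter consult the
same decision function, so analysis and code generation cannot disagree.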
Richard.
> Thanks and regards,
> Avinash Jayakar
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)