> Hmm, reassoc isn't supposed to apply width > 1 before vectorization;
> IIRC I "fixed" that at some point, but this is what I see:
>
>   # sum_126 = PHI <sum_145(5), 0(2)>
> ...
>   _23 = *pix_134; 
>   _24 = (unsigned int) _23;
>   _31 = MEM[(uint8_t *)pix_134 + 1B]; 
>   _32 = (unsigned int) _31;
>   _118 = _24 + _32;
>   sum_33 = _118 + sum_126; 
> ...
>
> I think the place you add the condition to on swap_ops_for_binary_stmt
> simply lacks a !reassoc_insert_powi_p check like we have on the earlier
> rewrite_expr_tree_parallel.

An earlier version of my patch had a less strict check, i.e. wouldn't allow
the swap for more sequences.  This resulted in two regressions on x86
where we expect this exact swap to happend in order for the bswap pass
to recognize a bswap idiom.

It was something like
bleh = ...
l1 = MEM... << 8
l2 = MEM... >> 8
bla1 = l1 & ...
bla2 = l2 & ...
res1 = bla1 | bleh
res2 = bla2 | res1

reassoc would reorder res1, bla1, and bla2 so that bswap sees bla1 | bla2
and recognizes this as bswap.  When not allowing the swap before
vectorization this will fail.

Or did I miss the point?

Reply via email to