> Pengxuan Zheng writes:
> > There was a bug in aarch64_evpc_reencode which could leave zero_op0_p
> > and zero_op1_p of the struct "newd" uninitialized.
> > r16-701-gd77c3bc1c35e303 fixed the issue by zero initializing "newd."
> > This patch provides an alternative fix as suggested by Richard
> >
> Pengxuan Zheng writes:
> > Some fields (e.g., zero_op0_p and zero_op1_p) of the struct "newd" may
> > be left uninitialized in aarch64_evpc_reencode. This can cause reading
> > of uninitialized data. I found this oversight when testing my patches
> > on and/fmov optimizations. This patch fixes t
> Pengxuan Zheng writes:
> > diff --git a/gcc/testsuite/gcc.target/aarch64/fmov-3-le.c
> > b/gcc/testsuite/gcc.target/aarch64/fmov-3-le.c
> > new file mode 100644
> > index 000..adbf87243f6
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/fmov-3-le.c
> > @@ -0,0 +1,130 @@
> > +
> Pengxuan Zheng writes:
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index 15f08cebeb1..98ce85dfdae 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -23621,6 +23621,36 @@ aarch64_simd_valid_and_imm (rtx op)
> >
> Pengxuan Zheng writes:
> **...
> **and v0.8b, (?:v0.8b, v[0-9]+.8b|v[0-9]+.8b, v0.8b)
> **ret
>
> Same for other tests that can't use a move immediate.
>
> Please leave 24 hours for others to comment on the target-independent part,
> but otherwise the patch is ok with the chang
> Richard Biener writes:
> > On Sat, Apr 26, 2025 at 2:42 AM Pengxuan Zheng wrote:
> >>
> >> Certain permutes that blend a vector with zero can be interpreted as
> >> an AND of a mask. This idea was suggested by Richard Sandiford when
> >> he was reviewing my patch which tries to optimize cert
> Pengxuan Zheng writes:
> > We can optimize AND with certain vector of immediates as FMOV if the
> > result of the AND is as if the upper lane of the input vector is set
> > to zero and the lower lane remains unchanged.
> >
> > For example, at present:
> >
> > v4hi
> > f_v4hi (v4hi x)
> > {
> >
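The AND-to-FMOV condition described in the snippet above can be sketched in plain C. This is only an illustration, not the patch's code: `pack_v4hi` and `and_mask_fmov_width` are hypothetical names, lane 0 is assumed to sit in the least significant bits of the 64-bit register, and only the 16-bit and 32-bit widths are modelled.

```c
#include <stdint.h>

/* Hypothetical helper: pack four 16-bit lanes into one 64-bit value,
   assuming lane 0 occupies the least significant bits.  */
uint64_t
pack_v4hi (const uint16_t lanes[4])
{
  uint64_t bits = 0;
  for (int i = 0; i < 4; i++)
    bits |= (uint64_t) lanes[i] << (16 * i);
  return bits;
}

/* Hypothetical helper: if ANDing with MASK keeps exactly the low 16 or
   low 32 bits and clears everything above them, return that width.
   Otherwise return 0, meaning this sketch sees no FMOV-like form.  */
int
and_mask_fmov_width (uint64_t mask)
{
  if (mask == UINT64_C (0xffff))
    return 16;
  if (mask == UINT64_C (0xffffffff))
    return 32;
  return 0;
}
```

Under these assumptions, the v4hi mask { 0xffff, 0, 0, 0 } packs to 0xffff and gives width 16, while { 0, 0xffff, 0, 0 } gives 0 because the low lane is not preserved.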
> Pengxuan Zheng writes:
> > Similar to the canonicalization done in combine, we canonicalize
> > vec_merge with swap_commutative_operands_p in
> > simplify_ternary_operation too.
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-protos.h (aarch64_exact_log2_inverse): New.
> > * con
> Richard Sandiford writes:
> > I think this would also simplify the evpc detection, since the
> > requirement for using AND is the same for big-endian and
> > little-endian, namely that index I of the result must either come from
> > index I of the nonzero vector or from any element of the zero v
> > Pengxuan Zheng writes:
> > > This patch optimizes certain vector permute expansion with the FMOV
> > > instruction when one of the input vectors is a vector of all zeros
> > > and the result of the vector permute is as if the upper lane of the
> > > non-zero input vector is set to zero and the
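The AND requirement stated in Richard Sandiford's reply (index I of the result comes either from index I of the nonzero vector or from any element of the zero vector) can be sketched as a small check. This is a minimal illustration, not GCC's evpc code: the function name is hypothetical, and the selector encoding (0..N-1 for the nonzero input, N..2N-1 for the zero input) and the 16-bit lane size are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code.  SEL describes a two-input permute
   of N lanes: values 0..N-1 select a lane of the nonzero input, values
   N..2N-1 select a lane of the all-zeros input.  Return 1 and fill MASK
   (all-ones or all-zeros per lane) if the permute is equivalent to
   "nonzero_input AND MASK"; return 0 otherwise.  */
int
blend_with_zero_as_and (const unsigned *sel, size_t n, uint16_t *mask)
{
  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    {
      if (sel[i] == i)
        mask[i] = 0xffff;  /* Lane I is kept from the nonzero input.  */
      else if (sel[i] >= n && sel[i] < 2 * n)
        mask[i] = 0;       /* Any lane of the zero input yields zero.  */
      else
        return 0;          /* A lane moves, so a plain AND cannot do it.  */
    }
  return 1;
}
```

For example, with N = 4 the selector { 0, 4, 2, 7 } passes and produces the mask { 0xffff, 0, 0xffff, 0 }, whereas { 1, 4, 2, 7 } is rejected because result lane 0 would come from lane 1 of the nonzero input. The check never refers to endianness, which matches the point made in the reply.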
> Pengxuan Zheng writes:
> > This patch optimizes certain vector permute expansion with the FMOV
> > instruction when one of the input vectors is a vector of all zeros and
> > the result of the vector permute is as if the upper lane of the
> > non-zero input vector is set to zero and the lower lan
11 matches