https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121126
--- Comment #5 from Robin Dapp ---
I'll have a look. I'm currently tied down with other things, so maybe next
week.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120297
Robin Dapp changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121065
--- Comment #3 from Robin Dapp ---
Should be fixed. My armhf qemu tests weren't really successful due to a qemu
configuration issue. I did verify that the test here (and others) don't ICE,
of course.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121073
--- Comment #2 from Robin Dapp ---
Yes, the issue is that Wdm was a memory constraint before, giving reload more
freedom. In the case here we have a real mask operand that only the strided
alternatives support. Need to think of another solutio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121073
--- Comment #1 from Robin Dapp ---
That's very likely due to my recent broadcast changes. Will have a look.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120930
--- Comment #2 from Robin Dapp ---
I'm seeing a difference between -O2 and -O3 where the -O2 version gets the
proper result (3). In the -O3 version we completely unroll the loop but don't
seem to populate the "b" array entirely but just the fir
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120297
Robin Dapp changed:
What|Removed |Added
Status|WAITING |ASSIGNED
--- Comment #8 from Robin Dapp -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120297
--- Comment #7 from Robin Dapp ---
Picking a random commit in May (r16-649-g5c012971969db9) also shows the issue.
It looks as if we pick the wrong LMUL for a store and this rule is to blame:
DEF_SEW_LMUL_RULE (
ratio_and_ge_sew, sew_only, se
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120297
--- Comment #6 from Robin Dapp ---
I was able to reproduce it on our internal tree. Disabling scheduling as well
as using the simple vsetvl strategy make the problem disappear so everything
points to a vsetvl issue.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120297
--- Comment #5 from Robin Dapp ---
Tried to reproduce again with the latest trunk and didn't succeed. I'm always
getting 234635118 no matter the VLEN and options. I'll try to bisect a failing
commit.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121048
Robin Dapp changed:
What|Removed |Added
CC||rdapp at gcc dot gnu.org
--- Comment #2
|RESOLVED
CC||rdapp at gcc dot gnu.org
--- Comment #3 from Robin Dapp ---
Fixed on trunk. I don't suppose we want to backport this as it's pretty
harmless (all div variants expand to u/sdiv anyway)?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118734
Robin Dapp changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120930
Robin Dapp changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rdapp at gcc dot gnu.org
|--- |FIXED
CC||rdapp at gcc dot gnu.org
--- Comment #2 from Robin Dapp ---
Fixed on trunk.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120461
Robin Dapp changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922
--- Comment #6 from Robin Dapp ---
(In reply to Tamar Christina from comment #5)
> Question, can I count on
>
> -march=rv64gcv_zvl1024b -mrvv-vector-bits=zvl -mrvv-max-lmul=m8
>
> always being available as a codegen option for RVV? or do I nee
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120461
Robin Dapp changed:
What|Removed |Added
CC||rdapp at gcc dot gnu.org
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714
Robin Dapp changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120782
--- Comment #8 from Robin Dapp ---
The vlse comes from a vec_duplicate:V2DI that has a reg pointing to a
"real(kind=4)", so a float.
What's interesting, though, is that the MEM is supposedly 64-bit aligned (see
below, A64).
(insn 285 282 287 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120782
--- Comment #7 from Robin Dapp ---
Ok, I was able to reproduce it with r15-9904-g2498cbbcdb23da.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120782
--- Comment #5 from Robin Dapp ---
I tried reproducing this with a recent trunk (r16-1965-gc512c9090f52e7) but
didn't see the exact code sequence. wrf also ran to completion on the Banana
Pi.
Did you use a stock GCC 15.1 or a specific commit?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120782
--- Comment #3 from Robin Dapp ---
(In reply to Jeffrey A. Law from comment #2)
> Yea, this bug may have been filed while we were discussing it in a team
> meeting.
>
> I think the question is whether or not to include the new guards in the
> c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120782
Robin Dapp changed:
What|Removed |Added
Last reconfirmed||2025-6-23
--- Comment #1 from Robin Dapp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639
--- Comment #5 from Robin Dapp ---
> Well, consider the desired index vector being a real induction (just
> store it somewhere). If we can handle that, we should be able to
> handle the scatter. If not, we can't handle the scatter.
Hmm, I thi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639
--- Comment #3 from Robin Dapp ---
> We could use scatter stores, building the index vector somehow cleverly with
> i_width contiguous indexes interspaced by i_dst_stride. In fact this vector
> could be built as inductions when building the i_h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120687
--- Comment #3 from Robin Dapp ---
Yeah, for 8 elements we still have a mode but beyond 8 we at least cannot do a
segment access anymore. Then we try with even/odd or interleaved permutations.
I kind of wonder why the cost model doesn't reject
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639
--- Comment #1 from Robin Dapp ---
I'm just realizing that without knowing the stride statically, we'd generate a
lot of code as we don't have a way of setting an element size for loads
dynamically. Although riscv offers a dynamic element size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639
Robin Dapp changed:
What|Removed |Added
Severity|normal |enhancement
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: rdapp at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
Target Milestone: ---
Target: riscv
In x264 we have several variations of the following loop:
void foo (uint8_t *dst, int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110812
--- Comment #14 from Robin Dapp ---
I managed to have a look now but the whole builtin and LTO machinery is kind of
new to me.
As Andreas mentioned already the issue is that we do not register vector
builtins when the current target is !TARGET_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120459
Robin Dapp changed:
What|Removed |Added
CC||rdapp at gcc dot gnu.org
--- Comment #2
|--- |FIXED
CC||rdapp at gcc dot gnu.org
--- Comment #2 from Robin Dapp ---
Fixed on trunk.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110812
--- Comment #11 from Robin Dapp ---
Tried building highway to reproduce and hit another error in fre...
Do we have a minimal example for this issue?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120378
--- Comment #3 from Robin Dapp ---
vnclipu is basically a scaling (narrowing), rounding shift with subsequent
"clip" i.e. saturation. Its input and output is unsigned, though, so for the
function above we first need to "clip" the negative value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120378
--- Comment #4 from Robin Dapp ---
Does it make sense to have the vmax/vmin/truncate pattern as a fallback for
other targets? On riscv it would save one predicated instruction.
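For reference, the vmax/vmin/truncate shape mentioned above can be sketched in scalar C. This is only an illustration under assumptions: the helper name `clip_uint8_minmax` and the signed 16-bit input type are hypothetical and not taken from x264 or from any GCC pattern.

```c
#include <stdint.h>

/* Hypothetical helper (not from x264 or GCC): clamp a signed 16-bit
   value to [0, 255] with a max and a min, then truncate to 8 bits.  */
static inline uint8_t
clip_uint8_minmax (int16_t x)
{
  int16_t lo = x < 0 ? 0 : x;        /* max (x, 0): drops negative values */
  int16_t hi = lo > 255 ? 255 : lo;  /* min (lo, 255): saturates the top end */
  return (uint8_t) hi;               /* truncation is safe, value fits in 8 bits */
}
```

Written this way the negative case is handled by the max, so no separate unsigned saturating step is needed before the narrowing.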
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120297
--- Comment #4 from Robin Dapp ---
I can reproduce this, but only with a qemu VLEN=128. VLEN >= 256 results in the
correct value of 234635118.
Assignee: unassigned at gcc dot gnu.org
Reporter: rdapp at gcc dot gnu.org
Target Milestone: ---
Target: riscv
x264 contains a variation of the following loop (in hpel_filter):
typedef unsigned char uint8_t;
typedef short int16_t;
inline
uint8_t
x264_clip_uint8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120362
--- Comment #11 from Robin Dapp ---
> Yes. I am sure. And SPIKE and QEMU have no problem.
So vlre/vsre should execute despite a VILL in VTYPE? At first sight I don't
find any specifics in the vector spec.
qemu is not very pedantic in that res
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120362
--- Comment #9 from Robin Dapp ---
> No. vlre should not depend on vtype. It should be hardware bug.
Are you sure about that? vmv1r also doesn't depend on a specific vtype, each
one is OK, but the vtype must at least be valid. So we get a SIG
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120362
--- Comment #6 from Robin Dapp ---
(In reply to Kito Cheng from comment #5)
> Oh, vsetvli/vill issue should only appeared for whole reg move not whole reg
> load store
On the Banana Pi I get a SIGILL for
int
main() {
asm volatile ("lui a5, 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120362
--- Comment #4 from Robin Dapp ---
> I see, but when I changed to
>
> addi a5,a5,912
>
> aka load from 0xdd390, the board still has the illegal insn. 0xdd390 is
> aligned for -O2 -march=rv64gcv -mrvv-vector-bits=zvl build, right?
Hmm, righ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120362
--- Comment #1 from Robin Dapp ---
That's a misaligned vector load I suppose?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118950
Robin Dapp changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118950
--- Comment #13 from Robin Dapp ---
Going to push this to the 14 branch later today if the x86 testsuite shows no
regressions.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120067
Robin Dapp changed:
What|Removed |Added
CC||rdapp at gcc dot gnu.org
--- Comment #4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119577
--- Comment #3 from Robin Dapp ---
I managed to have a quick look at the code now. It looks like we force live
every induction and build slp instances for the IV increments.
I don't think adjusting the actual IV creation in vectorizable_inducti
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119832
Robin Dapp changed:
What|Removed |Added
CC||rdapp at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547
Robin Dapp changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116595
Robin Dapp changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547
--- Comment #16 from Robin Dapp ---
> Yes, it is precisely the issue I have encountered in cvtScale8s64f (actually
> in cvt_64f). After the commit 34ae3a99, the default value of
> LOGICAL_OP_NON_SHORT_CIRCUIT has changed from 0 into 1, it will c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119672
--- Comment #8 from Robin Dapp ---
(In reply to Jakub Jelinek from comment #7)
> Thanks, I've posted it to gcc-patches in case some CI picks it up too:
> https://gcc.gnu.org/pipermail/gcc-patches/2025-April/680408.html
Testing looked good on rv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119672
Robin Dapp changed:
What|Removed |Added
CC||rdapp at gcc dot gnu.org
--- Comment #6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547
--- Comment #15 from Robin Dapp ---
> Yes, it is precisely the issue I have encountered in cvtScale8s64f (actually
> in cvt_64f). After the commit 34ae3a99, the default value of
> LOGICAL_OP_NON_SHORT_CIRCUIT has changed from 0 into 1, it will c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547
--- Comment #13 from Robin Dapp ---
Hmm, now I compiled with -O3 on top of --param logical-op-non-short-circuit=0
(which shouldn't actually be necessary or change anything as it's the default)
but there is a segmentation fault in
_ZN2cv12cpu_b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547
--- Comment #12 from Robin Dapp ---
> I recompile the opencv application with current gcc(commit b6aafe9a5b), and
> it still reproduce this bug. Do you have apply the patch of step 3 which
> enable vector implement of cvt_64f function?
Yes, I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547
--- Comment #10 from Robin Dapp ---
> 4. run
> ```
> export LD_LIBRARY_PATH=//lib
> ./opencv_test_core --gtest_filter="Core_ConvertScale/ElemWiseTest.accuracy/0"
> ```
[==] Running 1 test from 1 test case.
[--] Global test e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116595
--- Comment #7 from Robin Dapp ---
Ah, not a regression but just a checking assert, sorry.
||a/show_bug.cgi?id=119547
CC||jeffreyalaw at gmail dot com,
||rdapp at gcc dot gnu.org
--- Comment #6 from Robin Dapp ---
This happens when compiling opencv (commit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547
--- Comment #9 from Robin Dapp ---
> cmake --build cross-build/$BUILD_DIR-gcc --target opencv_test_core -j10
> ```
> 4. run
> ```
> export LD_LIBRARY_PATH=//lib
> ./opencv_test_core --gtest_filter="Core_ConvertScale/ElemWiseTest.accuracy/0"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119373
--- Comment #5 from Robin Dapp ---
> The analysis of SPEC2017's 510.parest_r shows that the topmost basic block
> is a tight loop (see attached reducer). Once vectorised, by unrolling and
> mutualising 4 instructions, AArch64 achieves a 22% redu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119572
Robin Dapp changed:
What|Removed |Added
Priority|P1 |P3
--- Comment #3 from Robin Dapp ---
(In
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: rdapp at gcc dot gnu.org
Target Milestone: ---
Target: riscv
Given this function (basically vect-early-break_133_pfa1.c):
#define SZ 1020
char string[SZ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547
--- Comment #5 from Robin Dapp ---
Do you happen to have an execution test ready so I can have a look?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119361
--- Comment #4 from Robin Dapp ---
(In reply to Edwin Lu from comment #3)
> I'm not familiar enough with how the two modes interact with each other but
> I guess my question is, why do we have so many conversions between the two
> modes? What's
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119361
--- Comment #2 from Robin Dapp ---
I looked into this some more and it points to a general deficiency in how we
handle the split between VLA and VLS modes.
With ...bits=zvl the RVVM1SI etc. modes become VLS modes. In turn, this means
that whene
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119361
--- Comment #1 from Robin Dapp ---
The issue is due to:
_279 = BIT_FIELD_REF <_480, 64, 0>;
_330 = BIT_FIELD_REF <_480, 64, 64>;
_340 = BIT_FIELD_REF <_481, 64, 0>;
_350 = BIT_FIELD_REF <_481, 64, 64>;
Ideally they expand to simple sl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
Robin Dapp changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116398
Robin Dapp changed:
What|Removed |Added
CC||rdapp at gcc dot gnu.org
--- Comment #17
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117955
Robin Dapp changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119115
Robin Dapp changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119224
--- Comment #7 from Robin Dapp ---
> So this why you weren't seeing it but I'm confused about the rationale...
> I unpack above to following statements
>
> 1. -mno-vector-strict-align allows us to unroll - seems ok.
> 2. Otherwise (-mvector-str
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119224
--- Comment #4 from Robin Dapp ---
Ah, sorry, I always specify -mno-vector-strict-align by default. It's always
that option that allows us to unroll, otherwise unrolling will lead to
misaligned accesses. And -mtune=generic-ooo defaults to
-mno
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119224
--- Comment #2 from Robin Dapp ---
I'm afraid that's due to scheduling (and not RA spilling). Of course there
shouldn't be any vector stores in this loop and with -fno-schedule-insns there
aren't any.
It's much worse for zvl128b even. While t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #22 from Robin Dapp ---
> Is that not happening? What value does _164 actually end up being?
>
> In other words, if the XOR is happening in GPRs, it doesn't matter whether
> the register holds 1 or -1 (or 3) for a true boolean. Th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #20 from Robin Dapp ---
Hmm, so right now we return "1" or "0" when extracting from a mask, not "-1" or
"0" and that's what aarch64/SVE does as well. We cannot start returning a
sign-extended -1 all of a sudden.
There is an inconsi
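
The arithmetic behind the 0/1 versus 0/-1 question can be shown with a small C sketch. The function and parameter names are hypothetical stand-ins for the two operands of the XOR discussed in this PR; the sketch shows only what happens when the two "true" encodings differ, not which encoding is the right one.

```c
/* Hypothetical stand-ins: 'extracted' models a value taken from a mask
   (0 or 1), 'converted' models a boolean widened to 0 or -1.  */
static int
xor_flags (int extracted, int converted)
{
  return extracted ^ converted;
}
```

With both operands true, a consistent encoding gives `xor_flags (-1, -1) == 0`, whereas the mixed encoding gives `xor_flags (1, -1) == -2`, which is nonzero.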
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #17 from Robin Dapp ---
> No you got it wrong.
> _121 will either be -1 or 0. _11 should be -1 or 0 too.
> So the question is what was the VEC_EXTRACT doing the right thing? Is it
> 0/-1 or 0/1?
I literally mentioned VEC_EXTRACT in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #11 from Robin Dapp ---
/* In GIMPLE, getting rid of 2 conversions for one new results
in smaller IL. */
(simplify
(convert (bitop:cs@2 (nop_convert:s @0) @1))
(if (GIMPLE
&& TREE_CODE (@1) != INTEGER_CST
&&
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #10 from Robin Dapp ---
The test passes with -fno-vrp, so maybe the optimized tree isn't correct after
all?
Folding statement: _157 = _26 ? -1 : 0;
Matching expression match.pd:161, gimple-match-10.cc:33
Matching expression match.pd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #9 from Robin Dapp ---
I suspect the problem lies somewhere here:
_11 = .VEC_EXTRACT (mask__83.22_110, 0);
_23 = MEM[(short int *)&t + 20B];
_24 = _23 & _132;
_25 = _24 != 0;
_121 = () _25;
_157 = _11 ^ _121;
For
_121
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #6 from Robin Dapp ---
As convoluted (and redundant) as it looks, the optimized tree looks at least
correct to me. Maybe a backend issue?
But I don't see costing for what we emit in the vectorizer and I didn't yet
find where we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #4 from Robin Dapp ---
Very weird indeed. It looks like we're not even vectorizing? I mean, sure, we
use vector instructions but they are all broadcast from scalars?
(VMAT_INVARIANT) And in the end we extract the first element wit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119115
--- Comment #5 from Robin Dapp ---
The problematic vsetvl is
vsetvli zero,a3,e16,m1,ta,ma
which was a
vsetvli a4,a3,e8,mf2,ta,ma
vsetvli t1,a3,e8,mf2,ta,ma
with the simple strategy.
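
As a side note on the configurations quoted above, the SEW/LMUL ratio and the resulting VLMAX can be worked out with a short C sketch. It relies on the RVV relation VLMAX = VLEN * LMUL / SEW; the struct and function names are hypothetical. It only illustrates the arithmetic and says nothing about why the fused vsetvl is wrong here.

```c
/* Hypothetical description of a vtype setting.  LMUL is stored as a
   fraction lmul_num / lmul_den, e.g. m1 = 1/1 and mf2 = 1/2.  */
struct vtype_cfg
{
  unsigned sew;
  unsigned lmul_num;
  unsigned lmul_den;
};

/* SEW / LMUL, the ratio that determines VLMAX for a given VLEN.  */
static unsigned
sew_lmul_ratio (struct vtype_cfg c)
{
  return c.sew * c.lmul_den / c.lmul_num;
}

/* VLMAX = VLEN * LMUL / SEW = VLEN / (SEW / LMUL).  */
static unsigned
vlmax (unsigned vlen, struct vtype_cfg c)
{
  return vlen / sew_lmul_ratio (c);
}
```

Both e16,m1 and e8,mf2 have a ratio of 16, so for VLEN=128 either setting yields a VLMAX of 8 elements; the two differ in element width and register group size.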
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119115
Robin Dapp changed:
What|Removed |Added
Last reconfirmed||2025-3-4
--- Comment #3 from Robin Dapp -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119115
--- Comment #2 from Robin Dapp ---
(In reply to Andrew Pinski from comment #1)
> Could this be another one of the vsetivli failures?
100% as I get "0" with --param=vsetvl-strategy=simple. But at first sight
unrelated to the previous ones. Wil
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117955
Robin Dapp changed:
What|Removed |Added
CC||rdapp at gcc dot gnu.org
--- Comment #7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118950
--- Comment #11 from Robin Dapp ---
I figured this particular problem on RISC-V won't be fixed on GCC 14 because we
don't have the zeroing of masked elements there. But you're referring to
backporting just this patch, right?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118595
--- Comment #2 from Robin Dapp ---
Hmm I'm not seeing those locally with -march=rv64gcv_zvl256b at least. Which
exact options were used to run the test suite? Or have those fails disappeared
in the meantime?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118950
Robin Dapp changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116773
Bug 116773 depends on bug 114516, which changed state.
Bug 114516 Summary: RISC-V: TSVC2 s315 has spill with dynamic lmul
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114516
What|Removed |Added
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114516
Robin Dapp changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114516
--- Comment #1 from Robin Dapp ---
The issue is that we're not considering pattern statements for costing. It's
rather straightforward to include those as well which would fix this PR.
I'm going to test a patch locally.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116686
Robin Dapp changed:
What|Removed |Added
CC||rdapp at gcc dot gnu.org
--- Comment #7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118950
Robin Dapp changed:
What|Removed |Added
CC||rguenth at gcc dot gnu.org
--- Comment #6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118950
--- Comment #5 from Robin Dapp ---
Yeah, the original statement is recognized as a mask conversion pattern:
pr118950.c:9:21: note: vect_recog_mask_conversion_pattern: detected: _152 =
.MASK_LOAD (_230, 8B, _229, 0);
pr118950.c:9:21: note: m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118950
--- Comment #4 from Robin Dapp ---
It indeed appears as if we need zeroing of the loaded gather values but
bool type_mode_padding_p
= TYPE_PRECISION (scalar_type) < GET_MODE_PRECISION (GET_MODE_INNER
(mode));
is false.
The last of the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116242
Bug 116242 depends on bug 115703, which changed state.
Bug 115703 Summary: [15 Regression] rv64gcv_zvl256b miscompile since
r15-1579-g792f97b44ff
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115703
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115703
Robin Dapp changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116351
--- Comment #3 from Robin Dapp ---
I started with a fix here:
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671939.html
but, due to other priorities, dropped the ball :/ Feel free to pick up from
there.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832
--- Comment #8 from Robin Dapp ---
I think for vec_duplicate the idea is the same as for all the other splits -
keep it in simple shape so we can combine/fwprop etc. It also helps converting
e.g.
vmv.v.x v3,a3
vadd.vv v1, v2, v3
into
vad