https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
Bug ID: 111376
Summary: missed optimization of one bit test on MIPS32r1
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Componen
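For orientation, a single-bit test can be written either with a mask or with a shift that moves the tested bit into the top position. The sketch below only shows that the two forms agree. The function names are hypothetical, and the assumption that the report concerns this kind of rewrite is inferred from the summary and the later comments about shift cost, not from the truncated snippet.

```c
#include <assert.h>
#include <stdint.h>

/* Mask form: isolate bit n (0..31) and compare with zero. */
static int bit_set_mask(uint32_t x, unsigned n)
{
    return (x & (UINT32_C(1) << n)) != 0;
}

/* Shift form: move bit n up to bit 31, then look only at that top bit. */
static int bit_set_shift(uint32_t x, unsigned n)
{
    return ((uint32_t)(x << (31u - n)) >> 31) != 0;
}

/* Hypothetical self-check: both forms agree for every bit position. */
static void check_bit_test(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x8000u, 0x00010000u, 0x80000000u, 0xdeadbeefu, 0xffffffffu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned n = 0; n < 32; n++)
            assert(bit_set_mask(samples[i], n) == bit_set_shift(samples[i], n));
}
```

Whether the shift form is cheaper depends on the target: a mask above bit 15 may not fit in a 16-bit immediate, while a shift amount always does.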
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111378
Bug ID: 111378
Summary: Missed optimization for comparing with exact_log2
constants
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
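As an illustration of the kind of comparison the summary names, an unsigned compare against a power of two can be replaced by a shift and a test for zero. This is a minimal sketch with hypothetical function names. The report's own test case is not visible in this listing, so the exact pattern it uses is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Compare form: x < 2^n for n in 0..31. */
static int below_pow2_cmp(uint32_t x, unsigned n)
{
    return x < (UINT32_C(1) << n);
}

/* Shift form: x < 2^n exactly when no bit at position n or above is set. */
static int below_pow2_shift(uint32_t x, unsigned n)
{
    return (x >> n) == 0;
}

/* Hypothetical self-check: both forms agree around each power of two. */
static void check_below_pow2(void)
{
    for (unsigned n = 0; n < 32; n++) {
        uint32_t p = UINT32_C(1) << n;
        assert(below_pow2_cmp(p - 1u, n) == below_pow2_shift(p - 1u, n));
        assert(below_pow2_cmp(p, n) == below_pow2_shift(p, n));
        assert(below_pow2_cmp(UINT32_MAX, n) == below_pow2_shift(UINT32_MAX, n));
    }
}
```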
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111384
Bug ID: 111384
Summary: missed optimization: GCC adds extra any extend when
storing subreg#0 multiple times
Product: gcc
Version: unknown
Status: UNCONFIRMED
S
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111384
--- Comment #2 from Siarhei Volkau ---
Well, here is what godbolt says with -O2 -fomit-frame-pointer.
ARM:
uxth r0, r0 @ << zero extend
strh r0, [r1]
strh r0, [r2]
bx lr
ARM64:
and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111626
Bug ID: 111626
Summary: missed optimization combining offset of array member
in struct with offset inside the array
Product: gcc
Version: unknown
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #3 from Siarhei Volkau ---
I know that the patch breaks condmove cases, that's why it is silly.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #6 from Siarhei Volkau ---
Well, it mostly works well.
However, it still has issues, addressed in my patch:
1) Doesn't work for -Os: most likely a costing issue.
2) Breaks condmoves, as mine does. I have no idea how to avoid tha
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #8 from Siarhei Volkau ---
Created attachment 58377
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58377&action=edit
condmove testcase
Tested with current GCC master branch:
- Works with -Os: confirmed.
- Condmove issue present
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835
Bug ID: 111835
Summary: Suboptimal codegen: zero extended load instead of sign
extended one
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104387
--- Comment #4 from Siarhei Volkau ---
*** Bug 111384 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111384
Siarhei Volkau changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835
--- Comment #3 from Siarhei Volkau ---
I don't think that it is a duplicate of bug 104387, because there's only one
store.
And this bug simply disappears if we change the source code a bit.
e.g.
- change (int8_t)*src; to *(int8_t*)src;
or c
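The two spellings mentioned in the comment produce the same value. The sketch below shows only that equivalence at the source level. The function names are hypothetical, and the surrounding code of the original test case is not shown in this listing. It assumes GCC's documented modulo behaviour when an out-of-range value is converted to int8_t.

```c
#include <assert.h>
#include <stdint.h>

/* Load an unsigned byte, then convert the value to a signed byte. */
static int32_t load_via_value_cast(const uint8_t *src)
{
    return (int8_t)*src;
}

/* Reinterpret the pointer and load a signed byte directly. */
static int32_t load_via_pointer_cast(const uint8_t *src)
{
    return *(const int8_t *)src;
}

/* Hypothetical self-check: both forms agree for every byte value. */
static void check_sign_extending_loads(void)
{
    for (unsigned v = 0; v < 256; v++) {
        uint8_t byte = (uint8_t)v;
        assert(load_via_value_cast(&byte) == load_via_pointer_cast(&byte));
    }
}
```

Since the results are identical, a compiler is free to use a sign-extending load for either form.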
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112398
Bug ID: 112398
Summary: Suboptimal code generation for xor pattern on subword
data
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Pr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112398
--- Comment #3 from Siarhei Volkau ---
Well, let's rewrite it this way:
void neg8 (uint8_t *restrict dst, const uint8_t *restrict src)
{
    uint8_t work = ~*src; // or *src ^ 0xff;
    dst[0] = (work >> 4) | (work << 4);
}
Wherever upper b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112474
Bug ID: 112474
Summary: MIPS: missed optimization for assigning HI reg to zero
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Comp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112474
--- Comment #1 from Siarhei Volkau ---
Minimal example to showcase the issue:
#include <stdint.h>
uint64_t mthi_example(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint64_t ret;
    ret = (uint64_t)a * b + (uint64_t)c * d + 1u;
    return ret;
}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60749
Siarhei Volkau changed:
What|Removed |Added
CC||lis8215 at gmail dot com
--- Comment #2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #12 from Siarhei Volkau ---
Most likely it's because of a data dependency rather than the direct cost of
shift operations on LoongArch, although I can't find information to prove that.
So, I guess it still might get performance benefit in cas
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #15 from Siarhei Volkau ---
Created attachment 58437
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58437&action=edit
application to test performance of shift
Here is the test application (MIPS32 specific) I wrote.
It allows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #16 from Siarhei Volkau ---
Might it be that LoongArch has a register reuse dependency?
I observed similar behavior on XBurst with load/store/reuse pattern:
e.g. this code
LW $v0, 0($t1) # XBurst load latency is 4 but it has bypa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115505
Bug ID: 115505
Summary: missing optimization: thumb1 use ldmia/stmia for load
store DI/DF data when possible
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Sev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115921
Bug ID: 115921
Summary: Missed optimization: and->ashift might be cheaper than
ashift->and on typical RISC targets
Product: gcc
Version: 15.0
Status: UNCONFIRMED
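The reordering named in the summary relies on the fact that a left shift distributes over a bitwise AND. Below is a minimal sketch with hypothetical function names and constants chosen only for illustration. The cost argument in the comments is an assumption about typical RISC immediate fields, not something quoted from the report.

```c
#include <assert.h>
#include <stdint.h>

/* Shift first: the mask 0x70000000 is wide and may need its own constant load. */
static uint32_t shift_then_and(uint32_t x)
{
    return (x << 16) & UINT32_C(0x70000000);
}

/* Mask first: 0x7000 fits in a 16-bit immediate, then shift the result. */
static uint32_t and_then_shift(uint32_t x)
{
    return (x & UINT32_C(0x7000)) << 16;
}

/* Hypothetical self-check: both orders give the same result. */
static void check_and_shift_order(void)
{
    static const uint32_t samples[] = {
        0u, 0x1000u, 0x3000u, 0x7000u, 0x8000u, 0xdeadbeefu, 0xffffffffu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(shift_then_and(samples[i]) == and_then_shift(samples[i]));
}
```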
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115922
Bug ID: 115922
Summary: Missed optimization: MIPS: clear bit 15
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
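To make the summary concrete: clearing bit 15 with a plain AND needs the mask 0xffff7fff, which does not fit in a zero-extended 16-bit immediate, whereas setting the bit and then flipping it uses only the constant 0x8000. The sketch below shows that the two forms are equivalent. The function names are hypothetical, and whether this is the exact sequence the report proposes is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Direct form: AND with the inverted single-bit mask. */
static uint32_t clear_bit15_and(uint32_t x)
{
    return x & ~UINT32_C(0x8000);
}

/* Two-step form: force bit 15 to 1, then toggle it back to 0. */
static uint32_t clear_bit15_or_xor(uint32_t x)
{
    return (x | UINT32_C(0x8000)) ^ UINT32_C(0x8000);
}

/* Hypothetical self-check: both forms agree on a few sample values. */
static void check_clear_bit15(void)
{
    static const uint32_t samples[] = {
        0u, 0x7fffu, 0x8000u, 0xffffu, 0xdeadbeefu, 0xffffffffu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(clear_bit15_and(samples[i]) == clear_bit15_or_xor(samples[i]));
}
```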
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115921
--- Comment #1 from Siarhei Volkau ---
Also take into account examples like this:
uint32_t high_const_and_compare(uint32_t x)
{
    if ( (x & 0x7000) == 0x3000)
        return do_some();
    return do_other();
}
It might be profitable to u
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70557
Siarhei Volkau changed:
What|Removed |Added
CC||lis8215 at gmail dot com
--- Comment #9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70557
--- Comment #10 from Siarhei Volkau ---
Created attachment 59965
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59965&action=edit
MIPS patch
Ditto for 32-bit MIPS.
MIPS emits:
move $3,$0
move $2,$0
sw $3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70557
--- Comment #12 from Siarhei Volkau ---
Hi Jeffrey,
Thanks for your interest in those patches. But unfortunately I'm not sure that
I can and will pass all required steps to make these patches ready for review.
I have no experience with the RV32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70557
--- Comment #13 from Siarhei Volkau ---
Moreover, I think that the patches handle only a limited set of cases.
E.g. if the upper or lower part of DI memory is to be set to zero, the patch
won't help.
It seems feasible to make a special code path for zer
28 matches