https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121062
--- Comment #4 from Uroš Bizjak ---
Created attachment 61864
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61864&action=edit
Actually tested version v3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121062
Uroš Bizjak changed:
What|Removed |Added
Attachment #61860|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121062
--- Comment #2 from Uroš Bizjak ---
Created attachment 61860
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61860&action=edit
Slightly cleaned version
Slightly cleaned-up version of the patch that also fixes the splitter for constant
vector store
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120973
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |INVALID
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120908
--- Comment #2 from Uroš Bizjak ---
I think the patch should be committed to all release branches (after some soak
time in the mainline to avoid surprises).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120881
--- Comment #23 from Uroš Bizjak ---
(In reply to H.J. Lu from comment #22)
> Should we add TARGET_FENTRY to default to -mfentry on Linux?
IMO, an --enable-fentry configure option would be better.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120881
--- Comment #19 from Uroš Bizjak ---
(In reply to H.J. Lu from comment #18)
> > Perhaps a better approach is to error out for -fstack-protector-all -pg
> > without -fentry.
>
> It can happen without -fstack-protector-all.
I see.
I suggest to w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120881
--- Comment #17 from Uroš Bizjak ---
(In reply to H.J. Lu from comment #16)
> SHRINK_WRAPPING_ENABLED is for checking if shrink wrapping is enabled.
True, but we could place mcount at entry only for functions where it is
strictly required. IIRC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120881
--- Comment #15 from Uroš Bizjak ---
Comment on attachment 61783
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61783
A patch to place mcount at the function entry with shrink
>@@ -493,7 +493,7 @@ ix86_using_red_zone (void)
> static bool
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98612
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106113
Uroš Bizjak changed:
What|Removed |Added
CC||guillaume.piolat at gmail dot
com
--- Co
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120839
Uroš Bizjak changed:
What|Removed |Added
Keywords|needs-bisection |
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120818
--- Comment #3 from Uroš Bizjak ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #2)
> Hi Lili,
>
> > Thank you for reporting this issue and giving the actual output. I have
> > relaxed
> > the testcase check. Could you test this pa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120719
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Target Milestone|---
at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #3 from Uroš Bizjak ---
Created attachment 61714
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61714&action=edit
Proposed patch
Patch in testing.
,
||ubizjak at gmail dot com
--- Comment #2 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #1)
> Confirmed. the backend needs some crc expanders added.
Indeed, crcMN4 expanders are missing to handle:
uint32_t crc_v1(uint32_t x, uint8_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120691
Uroš Bizjak changed:
What|Removed |Added
CC||hjl.tools at gmail dot com
--- Comment #2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120604
--- Comment #16 from Uroš Bizjak ---
(In reply to David Binderman from comment #15)
> Uros writes:
> > if ((diff > 0) != ((cf < 0) != (ct < 0) ? cf < 0 : cf < ct))
>
> Crikey. IMHO that would fail any code review I took part in.
That's because
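A hedged illustration of what the quoted condition checks (a standalone sketch, not the code from i386-expand.cc): assuming diff holds the wrapped result of ct - cf, the expression compares the sign of diff with the exact ordering of cf and ct, so it is true precisely when the subtraction overflowed. The function name diff_sign_mismatch and the int64_t types are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: returns true when the wrapped
   difference ct - cf disagrees in sign with the exact difference.  */
bool
diff_sign_mismatch (int64_t ct, int64_t cf)
{
  /* Subtract in unsigned arithmetic so that wraparound is well defined.
     Converting back to int64_t is implementation-defined in ISO C;
     GCC documents it as modular.  */
  int64_t diff = (int64_t) ((uint64_t) ct - (uint64_t) cf);

  /* Exact answer to "is cf < ct?", written as in the quoted condition:
     if the signs differ, cf < ct holds exactly when cf is negative.  */
  bool ct_greater = ((cf < 0) != (ct < 0)) ? (cf < 0) : (cf < ct);

  return (diff > 0) != ct_greater;
}
```

For example, diff_sign_mismatch (INT64_MAX, -1) is true because INT64_MAX + 1 is not representable, while diff_sign_mismatch (5, 3) is false.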
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120604
--- Comment #13 from Uroš Bizjak ---
Created attachment 61627
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61627&action=edit
Additional patch to make sure we can represent the difference
Actually, we have to make sure we can represent t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426
Bug 63426 depends on bug 120604, which changed state.
Bug 120604 Summary: runtime error in ix86_expand_int_movcc
i386/i386-expand.cc:3612:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120604
What|Removed |Added
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120604
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120604
--- Comment #8 from Uroš Bizjak ---
Created attachment 61617
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61617&action=edit
Proposed patch
David, can you please bootstrap with the attached patch?
at gcc dot gnu.org |ubizjak at gmail dot com
Target Milestone|--- |16.0
Ever confirmed|0 |1
Status|UNCONFIRMED |ASSIGNED
--- Comment #7 from Uroš Bizjak ---
(In reply to Richard Biener from comment #6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120604
--- Comment #4 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #3)
> (In reply to David Binderman from comment #0)
>
> > After a full git download, git blame says:
> >
> > 2bf6d93547e5 gcc/config/i386/i386-expand.c (Martin Liska
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120604
--- Comment #3 from Uroš Bizjak ---
(In reply to David Binderman from comment #0)
> After a full git download, git blame says:
>
> 2bf6d93547e5 gcc/config/i386/i386-expand.c (Martin Liska
> 2019-05-06 09:18:26 +0200 3612) d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120553
Uroš Bizjak changed:
What|Removed |Added
Last reconfirmed||2025-06-05
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120553
--- Comment #8 from Uroš Bizjak ---
(In reply to Richard Biener from comment #1)
> might be also interesting on x86-64 when using bts can use a smaller
> immediate than the now used orq and thus improve instruction size (but it
> clobbers flags)
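A small sketch of the kind of source the quoted remark is about (the function name set_bit40 and the choice of bit 40 are assumptions for illustration, not the testcase from this PR). On x86-64, orq accepts at most a sign-extended 32-bit immediate, so a mask such as 1 << 40 has to be materialized in a register first, whereas bts encodes the bit index as an 8-bit immediate. Which instruction a given GCC version actually emits is not claimed here.

```c
#include <stdint.h>

/* Hypothetical example: set bit 40 of a 64-bit value.  The mask does not
   fit in the 32-bit immediate field of orq.  */
uint64_t
set_bit40 (uint64_t x)
{
  return x | (UINT64_C (1) << 40);
}
```

Compiling this with -O2 -S and inspecting the assembly is one way to see which form the compiler chose.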
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120553
--- Comment #5 from Uroš Bizjak ---
This patch fixes the non-optimal testcase in Comment #4 for x86_64:
--cut here--
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 40b43cf092a..8eee44756eb 100644
--- a/gcc/config/i386/i386
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120553
--- Comment #4 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #3)
> (In reply to Richard Biener from comment #1)
> > might be also interesting on x86-64 when using bts can use a smaller
> > immediate than the now used orq and thus im
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120553
--- Comment #3 from Uroš Bizjak ---
(In reply to Richard Biener from comment #1)
> might be also interesting on x86-64 when using bts can use a smaller
> immediate than the now used orq and thus improve instruction size (but it
> clobbers flags)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120553
--- Comment #2 from Uroš Bizjak ---
(In reply to Richard Biener from comment #1)
> might be also interesting on x86-64 when using bts can use a smaller
> immediate than the now used orq and thus improve instruction size (but it
> clobbers flags)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111901
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|NEW
Assignee|ubizjak at gmail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111901
--- Comment #16 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #15)
> FTR, clobbers are already marked as having side effects:
Oops, I got the logic wrong. Please disregard this comment.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111901
--- Comment #15 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #13)
> (In reply to Eric Botcazou from comment #12)
>
> > So what about just marking inline asms with memory clobber as having side
> > effects?
> I guess this would wor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111901
--- Comment #13 from Uroš Bizjak ---
(In reply to Eric Botcazou from comment #12)
> So what about just marking inline asms with memory clobber as having side
> effects?
I guess this would work, the compiler must preserve side effects when
optim
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111901
--- Comment #11 from Uroš Bizjak ---
(In reply to Eric Botcazou from comment #10)
> Is it really invalid to perform CSE on val though?
Later in the compilation pipeline, DCE removes insns 6 and 8 as dead code,
leaving:
_.c.331r.cmpelim:
10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111901
Uroš Bizjak changed:
What|Removed |Added
CC||ebotcazou at gcc dot gnu.org,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111901
--- Comment #8 from Uroš Bizjak ---
Compiling the testcase for aarch64-linux-gnu (-O2 -funroll-all-loops) still
performs CSE, despite the patch:
_.c.325r.reload:
6: {x0:SI=asm_operands;clobber [scratch];}
8: {x2:SI=asm_operands;clobber
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111901
Uroš Bizjak changed:
What|Removed |Added
Keywords||patch
--- Comment #7 from Uroš Bizjak --
at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #6 from Uroš Bizjak ---
Created attachment 61544
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61544&action=edit
Proposed patch
Proposed patch avoids simplifications of ASMs that have memory clobber.
The testcase now c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120294
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Uroš Bizjak changed:
What|Removed |Added
CC||kaelfandrew at gmail dot com
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120019
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120019
--- Comment #9 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #8)
> Created attachment 61306 [details]
> Simplified patch
Rainer, can you please test this patch on your target?
at gcc dot gnu.org |ubizjak at gmail dot com
Status|UNCONFIRMED |ASSIGNED
Ever confirmed|0 |1
--- Comment #8 from Uroš Bizjak ---
Created attachment 61306
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61306&acti
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120019
--- Comment #7 from Uroš Bizjak ---
Comment on attachment 61270
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61270
Proposed patch
># HG changeset patch
># Parent 45d1a47b563d28797adaefc1f063c6e3d2541ec3
>i386: Fix rep movs[qldb] handli
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120019
--- Comment #6 from Uroš Bizjak ---
(In reply to Rainer Orth from comment #4)
> (In reply to Uroš Bizjak from comment #3)
> > (In reply to Uroš Bizjak from comment #2)
> >
> > > Please detect support during configure time and create an operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102266
--- Comment #6 from Uroš Bizjak ---
(In reply to H. Peter Anvin from comment #5)
> (I'm asking because of our is far enough back then we can convert the kernel
> code immediately.)
I think I already converted to %a in d689863c1a60b9.
https://g
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120019
--- Comment #3 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #2)
> Please detect support during configure time and create an operand modifier
> that will output "gs " or "fs " for non-default address spaces. Then output
> e.g. "%^%
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120019
--- Comment #2 from Uroš Bizjak ---
(In reply to Rainer Orth from comment #0)
> Between 20250428 and 20250429, Solaris/x86 bootstrap with the native as got
> broken compiling libgcc:
>
> /var/gcc/regression/master/11.4-gcc/build/./gcc/xgcc
> -B
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102266
--- Comment #3 from Uroš Bizjak ---
(In reply to Sam James from comment #2)
> It's r15-2213-g062e46a8137996, so let's set the milestone.
Actually, %a always did this on x86_64.
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase:
--cut here--
typedef unsigned long uword __attribute__ ((mode (word)));
struct a { uword arr[30]; };
__seg_gs struct a m;
void fromgs (struct a *dst) { *dst = m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111657
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|REOPENED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111657
Uroš Bizjak changed:
What|Removed |Added
Status|RESOLVED|REOPENED
Resolution|FIXED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111657
--- Comment #10 from Uroš Bizjak ---
Created attachment 61152
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61152&action=edit
Patch that allows rep_prefix_{1,4,8}_byte algorithms from non-default address
space
Attached patch allows rep_p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115568
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596
--- Comment #17 from Uroš Bizjak ---
(In reply to Alexander Monakov from comment #16)
> Mateusz, please have a look at PR 95435 for the previous round of tuning for
> AMD, there's a benchmarking script linked from there in PR 43052.
FYI, this b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119539
--- Comment #3 from Uroš Bizjak ---
Comment on attachment 60925
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60925
Untested fix
>+;; Avoid useless masking of count operand.
>+(define_insn_and_split "*3_mask_nf"
>+ [(set (match_operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119450
--- Comment #4 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 60872 [details]
> gcc15-pr119450.patch
>
> Untested fix.
OK.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386
Uroš Bizjak changed:
What|Removed |Added
CC||hjl.tools at gmail dot com,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119357
--- Comment #9 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #7)
[...]
> Or perhaps better just force it into REG:
Just force it into REG. Combine has more freedom this way.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119071
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119071
--- Comment #13 from Uroš Bizjak ---
(In reply to Sam James from comment #12)
> This works for me on trunk. Did Uros' r15-7793-ga92dc3fe31c95d fix it?
Yes, this is the same issue.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119083
--- Comment #1 from Uroš Bizjak ---
SSE_FIRST_REG is in ix86_class_likely_spilled_p because it is a single-member
class. It is there because of SSE4 pcmpistrm patterns.
%eax (and other single_class) registers are also listed in
CLASS_LIKELY_SPI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118465
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |INVALID
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
--- Comment #17 from Uroš Bizjak ---
V2 patch at [1]:
[1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/676494.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118994
Uroš Bizjak changed:
What|Removed |Added
See Also||https://gcc.gnu.org/bugzill
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118996
--- Comment #4 from Uroš Bizjak ---
(In reply to Hongtao Liu from comment #1)
> Looking at the hook description, it looks like x86 still need nozero return
> values under apx (due to AREG, DREG, CREG, BREG, SIREG, DIREG)
Please note that we also
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936
--- Comment #10 from Uroš Bizjak ---
IMO, the original patch that caused the ICE is not ready to be committed. HJ, can
you please revert the original patch (+ my dependent patch)?
We will try again for gcc-16.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936
--- Comment #9 from Uroš Bizjak ---
Also wrong is this part:
+static void
+ix86_find_all_reg_use_1 (rtx set, HARD_REG_SET &stack_slot_access,
+auto_bitmap &worklist)
+{
+ rtx dest = SET_DEST (set);
+ if (!REG_P (dest))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936
--- Comment #7 from Uroš Bizjak ---
(In reply to H.J. Lu from comment #6)
> This works:
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 560e6525b56..f5d46296570 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/confi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936
--- Comment #5 from Uroš Bizjak ---
For func_40 the new ix86_find_max_used_stack_alignment finds stack_alignment =
256.
The only access with 256 bit alignment in func_40 is:
101: [`g_1679']=xmm0:V2DI
103: [const(`g_1679'+0x10)]=xmm0:V2DI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936
--- Comment #4 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #3)
> It is due to r15-7575.
Eh, r15-7573.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936
Uroš Bizjak changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118288
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
dot gnu.org |ubizjak at gmail dot com
Keywords||patch
Status|NEW |ASSIGNED
--- Comment #10 from Uroš Bizjak ---
Patch at [1].
[1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675917.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
--- Comment #15 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #14)
> However (and as shown in Comment #11) the flags register is far from UNUSED
> (let alone DEAD), because it is used in i3. So, the proposed solution is to
> simply
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
--- Comment #14 from Uroš Bizjak ---
Untested patch:
--cut here--
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 3beeb514b81..99cd64ada1f 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -14559,7 +14559,8 @@ distribute_notes (rtx notes,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
--- Comment #38 from Uroš Bizjak ---
(In reply to Sam James from comment #29)
> $ gcc-14 p.c -o p -O2 -march=znver1 -fno-stack-protector
> -fno-stack-clash-protection && ./p
> Segmentation fault (core dumped)
Adding -mpreferred-stack-boundary=3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
--- Comment #37 from Uroš Bizjak ---
(In reply to H.J. Lu from comment #23)
> Created attachment 55424 [details]
> An updated patch
Is this patch similar to the one in PR109093#c17? As argued in PR109093#c35,
it looks that the current detectio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109093
--- Comment #35 from Uroš Bizjak ---
(In reply to H.J. Lu from comment #17)
> Created attachment 54666 [details]
> A patch
>
> Change ix86_find_max_used_stack_alignment to find alignments of all stack
> slot accesses.
HJ, it looks that the cur
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
--- Comment #36 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #35)
> It is not a good idea to CSE address that refers to virtual stack vars to a
> temporary. This defeats stack/frame pointer detection, mentioned in Comment
> #33, a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
--- Comment #35 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #34)
> The problematic code is expanded from:
>
> ;; Generating RTL for gimple basic block 5
>
> ;; __builtin_memset (&k, 0, 40);
>
> (insn 21 20 22 (parallel [
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
--- Comment #34 from Uroš Bizjak ---
The problematic code is expanded from:
;; Generating RTL for gimple basic block 5
;; __builtin_memset (&k, 0, 40);
(insn 21 20 22 (parallel [
(set (reg:DI 107)
(plus:DI (reg/f:D
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
--- Comment #33 from Uroš Bizjak ---
FTR, ix86_find_max_used_stack_alignment increases alignment only when stack
pointer or frame pointer are explicitly mentioned in :
/* Find the maximum stack alignment. */
sub
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115568
--- Comment #9 from Uroš Bizjak ---
The asm dump from Comment #6 now looks correct:
movl %edi, %r14d # 122 [c=4 l=3] *movsi_internal/0
movl %r14d, -44(%rsp) # 476 [c=4 l=5] *movsi_internal/1
-> movl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
--- Comment #11 from Uroš Bizjak ---
Oh, we have this issue:
Trying 16, 22, 21 -> 23:
16: r106:QI=flags:CCNO>0
22: {r120:QI=r106:QI^0x1;clobber flags:CC;}
REG_UNUSED flags:CC
21: r119:QI=flags:CCNO<=0
REG_DEAD flags:CCNO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
--- Comment #12 from Uroš Bizjak ---
(In reply to Sam James from comment #10)
> r15-268-g9dbff9c05520a7
This commit just prevents the transformation in Comment #11 from happening,
because it skips an early combination of:
Trying 15 -> 16:
1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
--- Comment #9 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #7)
> So r115's value will be 0 or 1 (STORE_FLAG_VALUE) so (gt:QI r115 0) is the
> same as (subreg:QI r115). Unless I am missing something here.
No, you are right. Tak
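The quoted observation can be checked with a tiny sketch (a hypothetical helper, not GCC code): when a value is known to be 0 or 1, which is the assumption the quote attaches to STORE_FLAG_VALUE, comparing it against zero with "greater than" gives back the value itself.

```c
#include <stdbool.h>

/* Hypothetical check: for a flag value v known to be 0 or 1,
   the comparison (v > 0) produces v itself.  */
bool
gt_zero_matches_value (int v)
{
  return (v > 0) == v;
}
```

The identity holds only for 0 and 1; for any other input, such as 2 or -1, the helper returns false.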
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
Uroš Bizjak changed:
What|Removed |Added
Component|target |rtl-optimization
--- Comment #6 from Uroš
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
--- Comment #5 from Uroš Bizjak ---
So, we have:
Trying 15 -> 16:
15: flags:CCNO=cmp(r115:SI,0)
REG_DEAD r115:SI
16: r106:QI=flags:CCNO>0
Successfully matched this instruction:
(set (reg:CCNO 17 flags)
(compare:CCNO (reg:SI 115
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739
--- Comment #4 from Uroš Bizjak ---
This is the difference I get:
--- pass/pr118739.s 2025-02-04 11:08:20.003694978 +0100
+++ fail/pr118739.s 2025-02-04 11:08:32.943651165 +0100
@@ -21,16 +21,11 @@
.cfi_offset 3, -32
mov
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115673
--- Comment #23 from Uroš Bizjak ---
Comment on attachment 60337
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60337
A patch with tests
>@@ -10225,13 +10225,15 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
> fnaddr = g
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115673
--- Comment #22 from Uroš Bizjak ---
Comment on attachment 60337
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60337
A patch with tests
>From 1503e1e1df7a402d4be560fdc446dd6c39127e9c Mon Sep 17 00:00:00 2001
>From: "H.J. Lu"
>Date: Fri,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115673
--- Comment #19 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #18)
> Created attachment 60331 [details]
> Prototype patch
>
> Prototype patch that disables constraints, unwanted with
> flag_force_indirect_call.
HJ, can you please
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115673
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118662
--- Comment #15 from Uroš Bizjak ---
The testcase now generates (-O2 -ftree-slp-vectorize -fno-vect-cost-model
-msse4):
addup:
pmovsxbd (%rdi), %xmm0
movd (%rdi), %xmm1
movdqa %xmm0, %xmm2
pextrb $3,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118662
--- Comment #16 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #15)
> One possible improvement would be to move QImode value to %xmm1 and V4QImode