rez5427 wrote:
I think the register coalescer does pretty much the same thing as GCC's
early_remat. LLVM's register coalescer decides to remat this because the return
register is being used, whereas GCC's early_remat decides not to remat it. I have
put part of GCC's log here:
cast_and_load_1.c.31
rez5427 wrote:
ping
https://github.com/llvm/llvm-project/pull/163047
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
rez5427 wrote:
@arsenm @preames @lukel97 @aengelke ping
https://github.com/llvm/llvm-project/pull/163047
@@ -202,13 +202,13 @@ define { <4 x i8>, <4 x i1> } @always_usub_const_vector() nounwind {
 ; SSE-LABEL: always_usub_const_vector:
 ; SSE:       # %bb.0:
 ; SSE-NEXT:    pcmpeqd %xmm0, %xmm0
-; SSE-NEXT:    pcmpeqd %xmm1, %xmm1
+; SSE-NEXT:    movdqa %xmm0, %xmm1
rez5427 wrote:
ping
https://github.com/llvm/llvm-project/pull/163047
rez5427 wrote:
> From the motivation case, do you know why MachineCSE fails to optimize?
MachineCSE runs before the register coalescer, so maybe adding a CSE pass after
this remat would work.
https://github.com/llvm/llvm-project/pull/163047
rez5427 wrote:
> > Machine CSE is before this register coalescer, Machine CSE will see
> > something like:
> > ```
> > li a2, 42
> > a0 = copy a2
> > ```
> >
> > So Machine CSE will not eliminate this
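
To make the ordering argument concrete, here is a minimal toy model in Python. It is only a sketch: `toy_cse` and `toy_remat` are hypothetical helpers, not LLVM passes or APIs, and instructions are modelled as plain `(dest, opcode, operands)` tuples. It assumes CSE only matches identical non-copy computations, which is a simplification.

```python
# Toy model only: these helpers are hypothetical and do not mirror LLVM's code.
# An instruction is a (dest, opcode, operands) tuple.

def toy_cse(instrs):
    """Replace a repeated non-copy computation with a copy of the first result."""
    seen = {}  # (opcode, operands) -> register that already holds the value
    out = []
    for dest, op, operands in instrs:
        key = (op, operands)
        if op != "copy" and key in seen:
            out.append((dest, "copy", (seen[key],)))
        else:
            if op != "copy":
                seen[key] = dest
            out.append((dest, op, operands))
    return out

def toy_remat(instrs):
    """Replace a copy of a register defined by `li` with a fresh `li`."""
    defs = {}  # register -> (opcode, operands) of its defining instruction
    out = []
    for dest, op, operands in instrs:
        src_def = defs.get(operands[0]) if op == "copy" else None
        if src_def is not None and src_def[0] == "li":
            out.append((dest,) + src_def)
        else:
            out.append((dest, op, operands))
        defs[dest] = out[-1][1:]
    return out

# What a CSE pass running before the coalescer would see (from the thread).
before = [("a2", "li", (42,)), ("a0", "copy", ("a2",))]

# Only one `li` exists at this point, so the toy CSE has nothing to merge.
cse_first = toy_cse(before)

# The duplicate `li` appears only after the remat step.
after_remat = toy_remat(cse_first)

# A CSE pass run afterwards would fold the second `li` back into a copy.
cse_again = toy_cse(after_remat)
```

In this model the pass order matters: running the toy CSE first leaves the sequence unchanged, and the redundant constant materialization exists only after the remat step, which is the situation described above.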