Severity: enhancement
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: dwarak dot rajagopal at amd dot com
GCC build triplet: x86_64
GCC host triplet: x86_64
GCC target triplet: x86_64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27313
--- Comment #3 from dwarak dot rajagopal at amd dot com 2006-04-25 19:07
---
Yes, this is true. The example I posted was the simplest case where it fails.
Below might be a typical case where you have to do two stores.
int cmov(int* A ,int B ,int C ,int* D ,int* E ,int F ,int g) {
int
--- Comment #1 from dwarak dot rajagopal at amd dot com 2008-11-20 16:48
---
1) -msse5 includes the -mfma switch (because FMA is part of the SSE5 instructions).
So having "-msse5 -mfma" is the same as having just "-msse5", though you can
have just -mfma (without -msse5).
2) &
--- Comment #4 from dwarak dot rajagopal at amd dot com 2008-11-20 19:35
---
Yes, you are right. "-mfma -msse5" does not make sense. I mistook -mfma for
-mfused-madd and hence the confusion.
Hence these combinations (1 and 2) do not make sense.
Thanks,
Dwarak
--- Comment #6 from dwarak dot rajagopal at amd dot com 2008-11-20 19:49
---
> Should we disallow such combinations?
>
Yes.
- Dwarak
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38201
--- Comment #12 from dwarak dot rajagopal at amd dot com 2005-11-17 17:30
---
(In reply to comment #9)
> (In reply to comment #8)
> > I got the same ICE with one of the SPEC2006 candidate benchmarks on
> > x86_64-linux-gnu.
>
> Was this before or after my fix
Priority: P3
Component: debug
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: dwarak dot rajagopal at amd dot com
GCC build triplet: x86_64
GCC host triplet: x86_64
GCC target triplet: x86_64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32914
- (match_operand:TI 2 "nonmemory_operand" "xn")))]
+ (match_operand:TI 2 "nonmemory_operand" "xN")))]
"TARGET_SSE2"
"psll\t{%2, %0|%0, %2}"
[(set_attr "type" "sseishft")
Is this ok?
- Dwarak
--
Sum
--- Comment #1 from dwarak dot rajagopal at amd dot com 2008-10-14 15:29
---
Created an attachment (id=16492)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16492&action=view)
Testcase
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37828
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: dwarak dot rajagopal at amd dot com
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu
http://g
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: dwarak dot rajagopal at amd dot com
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target
--- Comment #1 from dwarak dot rajagopal at amd dot com 2008-10-16 15:00
---
Created an attachment (id=16509)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16509&action=view)
Testcase
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37851
--- Comment #13 from dwarak dot rajagopal at amd dot com 2009-02-06 22:35
---
> The patch makes GCC generate a movaps load followed by addps. On Core 2 it
> speeds up the testcase from 7s to 6.2s so I guess it works as expected.
>
> The same however does not reproduc
--- Comment #20 from dwarak dot rajagopal at amd dot com 2009-02-10 16:28
---
Paulo,
(a) movaps (%rax, %rsi), %xmm0
    addps %xmm0, %xmm1
(b) movaps %xmm0, %xmm1
    addps (%rax, %rsi), %xmm1
Yes, case (a) is slightly better than case (b). It shouldn't matter much t
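For illustration only, here is a minimal C sketch of the kind of loop where the two forms above could show up. It is not the testcase from the bug report; the function name `accumulate4` and its parameters are hypothetical. With vectorization enabled, a compiler might keep the four accumulator lanes in one register and either load the memory operand first and then add (as in (a)) or fold the load into the add (as in (b)). Which form is actually emitted depends on the compiler version, flags, and tuning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (an assumption, not the PR's testcase): adds n_vec
   groups of four floats from src into the four accumulator lanes in acc.
   The inner four-lane add is what a vectorizer may turn into a packed
   add, either after a separate load or with a memory operand. */
void accumulate4(float *restrict acc, const float *restrict src, size_t n_vec)
{
    assert(acc != NULL && src != NULL);
    for (size_t i = 0; i < n_vec; i++)
        for (int lane = 0; lane < 4; lane++)
            acc[lane] += src[4 * i + lane];
}
```

Compiling this with optimization and inspecting the assembly output (for example with `-S`) is one way to see which of the two forms a given compiler picks.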