[Bug target/56484] ICE in assign_by_spills, at lra-assigns.c:1268

2013-02-28 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56484 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd

[Bug target/56484] [4.8 Regression] ICE in assign_by_spills, at lra-assigns.c:1268

2013-03-01 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56484 --- Comment #5 from Venkataramanan 2013-03-01 08:42:42 UTC --- -fno-tree-coalesce-vars as a workaround

[Bug tree-optimization/54742] Switch elimination in FSM loop

2013-03-04 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54742 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd

[Bug tree-optimization/54742] Switch elimination in FSM loop

2013-03-04 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54742 --- Comment #6 from Venkataramanan 2013-03-04 08:34:06 UTC --- (In reply to comment #5) > int first; > void thread_backedge (void) > { > int i = 0; > > do > { > if (first ==1) > { > foo (); >

[Bug target/54239] New: Not able to generate "prefetch" (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239 Bug #: 54239 Summary: Not able to generate "prefetch" (prefetch read) instruction using -m3dnow or -mprfchw Classification: Unclassified Product: gcc Version: 4.8.0 S

[Bug target/54239] Not able to generate "prefetch" (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239 --- Comment #2 from Venkataramanan 2012-08-13 13:51:08 UTC --- (In reply to comment #1) > Both in 4.7 (which is before the prfchw changes) and 4.8 with -m32 -m3dnow and > -m32 -m3dnow -mno-sse I get prefetch + prefetchw insn, which looks ok to me

[Bug target/54239] Not able to generate "prefetch" (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239 --- Comment #5 from Venkataramanan 2012-08-13 14:33:14 UTC --- (In reply to comment #4) > BTW, why do you care about the prefetch insn? Isn't it obsoleted by the SSE > ISA prefetches anyway (unlike prefetchw)? Hi Jakub, as far as fam15H proces

[Bug middle-end/53073] [4.8 Regression] 464.h264ref in SPEC CPU 2006 miscompiled

2012-08-20 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53073 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd

[Bug tree-optimization/53397] Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes

2012-10-09 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53397 Venkataramanan changed: What|Removed |Added Status|NEW |RESOLVED Resolution|

[Bug target/52908] xop-mul-1:f9 miscompiled on bulldozer (-mxop)

2012-05-04 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52908 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd

[Bug target/52908] xop-mul-1:f9 miscompiled on bulldozer (-mxop)

2012-05-08 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52908 --- Comment #6 from Venkataramanan 2012-05-09 03:13:01 UTC --- (In reply to comment #5) > (In reply to comment #4) > > A Quick make check on i386.exp result is shown below: > > > > Tests that now fail, but worked before: > > > > gcc.target/i38

[Bug tree-optimization/53290] New: ICE compiling aermod with Ofast

2012-05-09 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53290 Bug #: 53290 Summary: ICE compiling aermod with Ofast Classification: Unclassified Product: gcc Version: tree-ssa Status: UNCONFIRMED Severity: normal Priority: P3

[Bug tree-optimization/53290] ICE compiling aermod with Ofast

2012-05-09 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53290 Venkataramanan changed: What|Removed |Added Target||x86_64-unknown-linux-gnu H

[Bug tree-optimization/53397] New: Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes

2012-05-18 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53397 Bug #: 53397 Summary: Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes Classification: Unclassified Product: gcc Version

[Bug tree-optimization/53397] Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes

2012-05-18 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53397 Venkataramanan changed: What|Removed |Added Target||x86_64-unknown-linux-gnu

[Bug tree-optimization/54073] [4.7/4.8 Regression] SciMark Monte Carlo test performance has seriously decreased in recent GCC releases

2012-07-26 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54073 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd

[Bug tree-optimization/54136] New: Compiling phoronix/dcraw with gcc 4.8 trunk causes infinite execution.

2012-07-31 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54136 Bug #: 54136 Summary: Compiling phoronix/dcraw with gcc 4.8 trunk causes infinite execution. Classification: Unclassified Product: gcc Version: tree-ssa Status: UNCON

[Bug tree-optimization/54136] Compiling phoronix/dcraw with gcc 4.8 trunk causes infinite execution.

2012-07-31 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54136 --- Comment #2 from Venkataramanan 2012-07-31 09:02:34 UTC --- Created attachment 27904 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27904 Simplified test case

[Bug tree-optimization/54136] Compiling phoronix/dcraw with gcc 4.8 trunk causes infinite execution.

2012-07-31 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54136 --- Comment #4 from Venkataramanan 2012-07-31 09:22:47 UTC --- OK, thanks, I will adjust the test case. So the compiler can generate an infinite loop in case of an out-of-bounds array access?

[Bug target/44141] Redundant loads and stores generated for AMD bdver1 target

2012-03-22 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141 --- Comment #4 from Venkataramanan 2012-03-22 13:17:34 UTC --- I don't have permission to confirm this bug. Here is my analysis of the cause. #(insn:TI 4886 4885 4888 132 (set (reg:V2DF 25 xmm4 [8797]) #(mult:V2DF (reg:V2DF 25 xmm4 [879

[Bug target/44141] Redundant loads and stores generated for AMD bdver1 target

2012-03-22 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141 --- Comment #5 from Venkataramanan 2012-03-22 13:23:39 UTC --- Created attachment 26955 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26955 Simple patch to print vmovups at assembly generation stage

[Bug target/44141] Redundant loads and stores generated for AMD bdver1 target

2012-03-22 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141 --- Comment #6 from Venkataramanan 2012-03-22 13:26:16 UTC --- This patch tries to change vmovupd to vmovups during the assembly printing stage when the tuning flag for bdver1 is set. I have yet to test it. Please provide your suggestions.

[Bug target/44141] Redundant loads and stores generated for AMD bdver1 target

2012-03-27 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141 --- Comment #8 from Venkataramanan 2012-03-27 10:46:53 UTC --- Created attachment 27013 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27013 Simplified test case from ac.f90

[Bug target/44141] Redundant loads and stores generated for AMD bdver1 target

2012-03-27 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141 --- Comment #9 from Venkataramanan 2012-03-27 10:51:06 UTC --- (In reply to comment #8) > Created attachment 27013 [details] > Simplified test case from ac.f90 GCC revision: 184502 Command to reproduce: gfortran unoptimal_move.f90 -S -march=bdve

[Bug rtl-optimization/44141] Redundant loads and stores generated for AMD bdver1 target

2012-03-27 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141 --- Comment #11 from Venkataramanan 2012-03-28 03:02:19 UTC --- Uros, can you please assign this bug to me? I will see what is happening at reload.

[Bug rtl-optimization/44141] Redundant loads and stores generated for AMD bdver1 target

2012-03-28 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141 --- Comment #13 from Venkataramanan 2012-03-28 10:32:31 UTC --- (In reply to comment #12) > Having a vector mode changing subreg on the LHS of an instruction is a very > common issue in the i386 backend, and unfortunately e.g. means that lots of

[Bug rtl-optimization/44141] Redundant loads and stores generated for AMD bdver1 target

2012-04-01 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141 --- Comment #15 from Venkataramanan 2012-04-01 07:55:24 UTC --- Hi Uros, I had a look at the reload pass. I have an RTL sequence that looks like this. (insn 32 31 33 2 (set (subreg:V4SF (reg:V2DF 284) 0) <== pseudo register (unspec:V4SF [

[Bug middle-end/51848] New: GCC is not able to vectorize when a constant value is also added to the sum of array expression inside a loop.

2012-01-13 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51848 Bug #: 51848 Summary: GCC is not able to vectorize when a constant value is also added to the sum of array expression inside a loop. Classification: Unclassified Product

[Bug middle-end/55381] [4.8 Regression]: build fails on cris-elf building libgfortran with host-gcc-4.4, ICE compiling matmul_i1.c

2012-11-18 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55381 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd

[Bug rtl-optimization/55845] New: 454.calculix miscompares with -march=btver2 -O3 -ffastmath -fschedule-insns -mvzeroupper for test data run

2013-01-02 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55845 Bug #: 55845 Summary: 454.calculix miscompares with -march=btver2 -O3 -ffastmath -fschedule-insns -mvzeroupper for test data run Classification: Unclassified Pr

[Bug rtl-optimization/55845] 454.calculix miscompares with -march=btver2 -O3 -ffastmath -fschedule-insns -mvzeroupper for test data run

2013-01-04 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55845 --- Comment #5 from Venkataramanan 2013-01-04 09:12:09 UTC --- Hi Uros. Thank you for confirming. I suspected this is a problem with jump optimization. Uros, what flag do we need to pass to bypass the .jump2 pass? Regards, Venkat.

[Bug rtl-optimization/55845] 454.calculix miscompares with -march=btver2 -O3 -ffastmath -fschedule-insns -mvzeroupper for test data run

2013-01-04 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55845 --- Comment #7 from Venkataramanan 2013-01-04 09:24:47 UTC --- Hi Uros, Sorry, my intention was just to know whether we have some flags to disable this pass, not to propose that as a solution :) Regards, Venkat. -Original Message- From:

[Bug rtl-optimization/55845] [4.8 Regression] 454.calculix miscompares with -march=btver2 -O3 -ffastmath -fschedule-insns -mvzeroupper for test data run

2013-01-15 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55845 Venkataramanan changed: What|Removed |Added Status|RESOLVED|VERIFIED --- Comment #13 from

[Bug target/80820] _mm_set_epi64x shouldn't store/reload for -mtune=haswell, Zen should avoid store/reload, and generic should think about it.

2017-08-21 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80820 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd dot co

[Bug target/78090] [x86_64]: GCC allows integer register for inter unit conversion under -mtune-ctrl=^inter_unit_conversions .

2017-04-24 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78090 Venkataramanan changed: What|Removed |Added Status|RESOLVED|VERIFIED --- Comment #6 from Venkataram

[Bug target/80689] New: 128 loads generated for structure copying with gcc 7.1.0 and leads to STLF stalls in avx2 targets.

2017-05-09 Thread venkataramanan.kumar at amd dot com
Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: venkataramanan.kumar at amd dot com Target Milestone: --- For the below test case, GCC 7.1.0 started generating 128 bit loads and stores while copying

[Bug target/80689] 128 loads generated for structure copying with gcc 7.1.0 and leads to STLF stalls in avx2 targets.

2017-05-09 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80689 --- Comment #2 from Venkataramanan --- (In reply to Richard Biener from comment #1) > That you use noinline tells that glibc memcpy has the very same issue. Note > that similarly having bytes/shorts in the structure and using longs or ints > to

[Bug target/80689] 128 loads generated for structure copying with gcc 7.1.0 and leads to STLF stalls in avx2 targets.

2017-05-09 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80689 --- Comment #6 from Venkataramanan --- (In reply to Richard Biener from comment #4) > What does ICC do if you use int and/or short fields in st1? Does it perform > struct copying member-wise? It copies member wise. -O2 /-O2 -march=core-avx2 Fo

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-12 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd dot co

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-21 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #9 from Venkataramanan --- (In reply to Jakub Jelinek from comment #6) > Sure, the question is (raised several times over the last couple of years) > is if the generic tuning should not adjust slightly based on the selected > ISAs. >

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-21 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #12 from Venkataramanan --- (In reply to Allan Jensen from comment #11) > Btw, did you benchmark store splitting on AMD? It is also enabled for BDVER > and ZNVER1. I have not done that. As per SWOG for AMD15h (BDVER) it is advisabl

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-21 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #14 from Venkataramanan --- (In reply to Allan Jensen from comment #13) > The question is if the unaligned store is still slow on Excavator and Ryzen > which support AVX2. As far as I understand the bulldozer architectures just > pref

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-21 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #15 from Venkataramanan --- Considering this PR, removing the tuning (splitting of unaligned avx256 loads) for generic is suggested.

[Bug rtl-optimization/78200] [7 Regression] 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2017-01-10 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 --- Comment #20 from Venkataramanan --- I tried Intel SDE on mcf to get the dynamic execution counts of the hot blocks. .L98: jle .L97 cmpl $2, %r9d jne .L97 .L99: BLOCK: 7 PC: 00403252 ICOUNT: 906

[Bug target/79745] vec_init<> expander misses V2TImode with AVX and V2OImode and V2TImode with AVX512

2017-02-28 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79745 --- Comment #2 from Venkataramanan --- I checked -mprefer-avx128 vs -mno-prefer-avx256. with AVX256 assembly generated with 32 inserts and 28 packs for loading each char type element for forming a vectors with YMM. instead doing loading from m

[Bug target/80313] -march=znver1 produce worse code than -march=haswell

2017-04-06 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80313 --- Comment #3 from Venkataramanan --- Thanks for pointing out. It looks like we need to adjust our branch cost. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 14ac189..8212c56 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/c

[Bug rtl-optimization/78090] New: [x86_64]: GCC allows integer register for inter unit conversion under -mtune-ctrl=^inter_unit_conversions .

2016-10-24 Thread venkataramanan.kumar at amd dot com
: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: venkataramanan.kumar at amd dot com Target Milestone: --- GCC is not honoring -mtune-ctrl=^inter_unit_conversions and allows

[Bug tree-optimization/78200] New: [7 regression]: 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2016-11-03 Thread venkataramanan.kumar at amd dot com
: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: venkataramanan.kumar at amd dot com Target Milestone: --- Noticed 5% regression with 429.mcf of cpu2006 on x86_64 AVX2 (bdver4) with GCC trunk gcc version

[Bug tree-optimization/78200] [7 regression]: 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2016-11-03 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 --- Comment #1 from Venkataramanan --- (In reply to Venkataramanan from comment #0) > Noticed 5% regression with 429.mcf of cpu2006 on x86_64 AVX2 (bdver4) with > GCC trunk gcc version 7.0.0 20161028 (experimental) (GCC). > > Flag used is -O3 -m

[Bug rtl-optimization/78200] [7 Regression] 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2016-11-06 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 --- Comment #4 from Venkataramanan --- Created attachment 39976 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39976&action=edit Test case for noncanonical gimple formation at tree if conversion. The test case is simulated from primal_bea_

[Bug rtl-optimization/78200] [7 Regression] 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2016-11-08 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 --- Comment #7 from Venkataramanan --- Bisecting shows non canonical gimple generation at r238370. --snip-- commit f3dce1cdd016e16cf9dc051d127bdf6eb58430fc Author: rguenth Date: Fri Jul 15 10:53:29 2016 + 2016-07-15 Richard Biener

[Bug rtl-optimization/78200] [7 Regression] 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2016-11-09 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 --- Comment #11 from Venkataramanan --- Hi Richard, on a Haswell machine the original run time for -O3 -max2 -mprefer-avx2 is: real 2m35.325s user 2m35.257s sys 0m0.070s Changing the assembly from .L98: jle .L97 cmpl

[Bug rtl-optimization/78200] [7 Regression] 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2016-11-11 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 --- Comment #14 from Venkataramanan --- Between GCC 6.2.0 and GCC 7 (Nov/10/2016) I see three major differences in gimple opts dump. 1. IPA inline is more aggressive in GCC 7. Looks like it is in-lining more in hot function "primal_bea_mpp". Ho

[Bug rtl-optimization/78200] [7 Regression] 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2016-11-15 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 --- Comment #16 from Venkataramanan --- GCC 7 added an early threading pass and a gimple thread pass before VRP. When I disable these passes, tree-vrp is able to move the true block the same way as GCC 6. It is again tree-if-convert causing the moved b

[Bug rtl-optimization/78200] [7 Regression] 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2016-11-15 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 --- Comment #17 from Venkataramanan --- Looking at the check (red_cost < 0 && arc->ident == AT_LOWER) || (red_cost > 0 && arc->ident == AT_UPPER), the order if-combine created seems to be the best. if (red_cost_86 < 0)

[Bug rtl-optimization/78200] [7 Regression] 429.mcf of cpu2006 regresses in GCC trunk for avx2 target.

2016-11-15 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 --- Comment #19 from Venkataramanan --- (In reply to rguent...@suse.de from comment #18) > On Tue, 15 Nov 2016, venkataramanan.kumar at amd dot com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200 > > > &

[Bug target/60617] [4.8 Regression] unable to find a register to spill in class 'LO_REGS'

2014-03-31 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60617 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd dot co

[Bug target/60617] [4.8 Regression] unable to find a register to spill in class 'LO_REGS'

2014-05-12 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60617 --- Comment #3 from Venkataramanan --- The bug is now hidden on trunk by revision 209897. The patch "Remove PUSH_ARGS_REVERSED from the RTL expander" (reference below) seems to change the way arguments are handled in RTL. Ref: http://gcc.gnu.org

[Bug target/60617] [4.8 Regression] unable to find a register to spill in class 'LO_REGS'

2014-05-12 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60617 --- Comment #4 from Venkataramanan --- After reverting the patch from 209897, the bug still occurs on trunk with -mno-lra. The spill failure occurs for regno 110 ("dst" operand) in the instruction below (insn 634 633 635 27 (parallel [ (set (reg:SI 3 r

[Bug middle-end/61354] New: GCC bootstrap with LTO fails in trunk when built with isl

2014-05-29 Thread venkataramanan.kumar at amd dot com
Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: venkataramanan.kumar at amd dot com Machine: AMD64 Revision tested: trunk 210955 sources/gcc-fsf/gcc/configure --prefix=/work/builds/lto-native-bootstrap-install-trunk --with-build-config

[Bug middle-end/61354] GCC bootstrap with LTO fails in trunk when built with isl

2014-05-30 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61354 --- Comment #2 from Venkataramanan --- Maxim, "sources/gcc-fsf/gcc" is the top level source directory and it contains the contrib folder. gcc compiler sources are in "sources/gcc-fsf/gcc/gcc".

[Bug middle-end/61354] GCC bootstrap with LTO fails in trunk when built with isl

2014-05-30 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61354 Venkataramanan changed: What|Removed |Added CC||hubicka at ucw dot cz --- Comment #3 fr

[Bug bootstrap/61440] New: Bootstrap failure with --with-build-config=bootstrap-lto

2014-06-07 Thread venkataramanan.kumar at amd dot com
Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: venkataramanan.kumar at amd dot com Machine - AMD64 (bdver1) GCC FSF 4.9 branch Configure command /work/sources/gcc/configure --with-build-config=bootstrap-lto --prefix=/work/builds

[Bug bootstrap/61440] Bootstrap failure with --with-build-config=bootstrap-lto

2014-06-07 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61440 --- Comment #1 from Venkataramanan --- Tried objdump -d stage2-gcc/gimple.o > stage2-gcc-gimple.s objdump -d stage3-gcc/gimple.o > stage3-gcc-gimple.s diff -u stage2-gcc-gimple.s stage3-gcc-gimple.s --- stage2-gcc-gimple.s 2014-06-07 1

[Bug bootstrap/61442] New: [Aarch64] ICE while bootstraping GCC with --with-build-config=bootstrap-lto

2014-06-07 Thread venkataramanan.kumar at amd dot com
: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: venkataramanan.kumar at amd dot com Machine: Aarch64 Build Config line: ./gcc/configure --prefix=/work/GCC_Team/vekumar/ltoinstall/ --with-gmp=/work/GCC_Team/vekumar

[Bug target/62308] A bug with aarch64 big-endian

2014-10-08 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62308 --- Comment #6 from Venkataramanan --- A git bisect experiment showed that this is the revision after which the bug disappears: https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=215707

[Bug target/62308] A bug with aarch64 big-endian

2014-10-09 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62308 --- Comment #7 from Venkataramanan --- I tried to look at the RTL and assembly code generated after the patch committed in 215707. The generated code looks good; there is some suboptimal code, but this is at -O0. sub sp, sp, #16 // 15 *addd

[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf

2014-10-13 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173 --- Comment #2 from Venkataramanan --- Changed the test case to work with latest GCC trunk #include <arm_neon.h> int16x4x2_t foo(int16_t * __restrict pDataA, int16_t * __restrict pDataB) { int16x4x2_t DataA, DataB,

[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf

2014-10-20 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173 --- Comment #4 from Venkataramanan --- (In reply to Fei Yang from comment #3) > (In reply to ktkachov from comment #1) > > Confirmed. > > Feel free to propose a patch for them on gcc-patches along the > > lines you described in: > https://gcc.gn

[Bug driver/63687] New: Dumps from RTL passes after LTO optimizations are not generated .

2014-10-30 Thread venkataramanan.kumar at amd dot com
Priority: P3 Component: driver Assignee: unassigned at gcc dot gnu.org Reporter: venkataramanan.kumar at amd dot com I tried to dump RTL passes when compiling aarch64-unknown-linux-gnu compiler with -flto -O3. gcc version 5.0.0 20141030 aarch64-unknown-linux-gnu

[Bug driver/63687] Dumps from RTL passes after LTO optimizations are not generated .

2014-10-30 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63687 Venkataramanan changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug sanitizer/63850] New: Building TSAN for Aarch64 results in assembler

2014-11-13 Thread venkataramanan.kumar at amd dot com
: sanitizer Assignee: unassigned at gcc dot gnu.org Reporter: venkataramanan.kumar at amd dot com CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org After enabling TSAN for Aarch64

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails,

2014-08-11 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #5 from Venkataramanan --- (In reply to Richard Biener from comment #4) > Also fails with the 4.9.0 release on x86_64. Also fails with GCC 4.9 on the Aarch64 target.

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails,

2014-08-12 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #11 from Venkataramanan --- I am also trying to fix LTO bootstrap compare failure in Aarch64. Bootstrap compare failure is not occurring in GCC FSF trunk (tested on aarch64 as well as x86_64 machine). Now I am doing one more round of

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails,

2014-08-12 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #14 from Venkataramanan --- (In reply to Sven C. Dack from comment #6) > It seems the problem is caused by the use of the jobserver. Changing > bootstrap-lto.mk from: > > ... > STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-l

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails

2014-08-13 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #21 from Venkataramanan --- I randomly tried some revisions, and the last one that passed was r209650 on 2014-04-22. I am continuing to narrow this down by testing more revisions.

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails

2014-08-13 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #28 from Venkataramanan --- Richard, I am still not able to understand why this problem is not seen in trunk.

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails

2014-08-13 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #29 from Venkataramanan --- Hi Richard, I tried the patch you posted last on GCC patches, on top of GCC 4.9 on Aarch64. https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01324.html I am still getting same number of compare errors. Now

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails

2014-08-13 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #30 from Venkataramanan --- (In reply to Venkataramanan from comment #29) > Hi Richard, > > I tried the patch you posted last on GCC patches, on top of GCC 4.9 on > Aarch64. > https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01324.html

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails

2014-08-14 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #31 from Venkataramanan --- (In reply to Venkataramanan from comment #30) > (In reply to Venkataramanan from comment #29) > > Hi Richard, > > > > I tried the patch you posted last on GCC patches, on top of GCC 4.9 on > > Aarch64. >

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails

2014-08-14 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #34 from Venkataramanan --- Richard, what I understand is that instead of using tune flags for garbage collection, we need to try to fix the object code differences?

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails

2014-08-18 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #51 from Venkataramanan --- (In reply to rguent...@suse.de from comment #35) > On Thu, 14 Aug 2014, venkataramanan.kumar at amd dot com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 > > > &

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails

2014-08-18 Thread venkataramanan.kumar at amd dot com
onday, August 18, 2014 6:41 PM To: Kumar, Venkataramanan Subject: [Bug bootstrap/62077] --with-build-config=bootstrap-lto fails https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 --- Comment #52 from rguenther at suse dot de --- On Mon, 18 Aug 2014, venkataramanan.kumar at amd dot com wrote

[Bug bootstrap/62077] --with-build-config=bootstrap-lto fails

2014-08-18 Thread venkataramanan.kumar at amd dot com
7 --- Comment #54 from rguenther at suse dot de --- On Mon, 18 Aug 2014, venkataramanan.kumar at amd dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077 > > --- Comment #53 from Venkataramanan com> --- Hi Richard, > > >> Well, it would be a workaround onl

[Bug target/63190] Assembler errors when building md5 code from fbb on aarch64

2014-09-07 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63190 Venkataramanan changed: What|Removed |Added CC||venkataramanan.kumar at amd dot co

[Bug target/63304] New: Aarch64 pc-relative load offset out of range

2014-09-18 Thread venkataramanan.kumar at amd dot com
: target Assignee: unassigned at gcc dot gnu.org Reporter: venkataramanan.kumar at amd dot com Constant literal table is kept at large offset, resulting in pc-relative load offset out of range. aarch64-none-linux-gnu-gcc x.c /tmp/ccrOQLEb.s: Assembler messages: /tmp/ccrOQLEb.s:10

[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-18 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304 --- Comment #1 from Venkataramanan --- Created attachment 33515 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33515&action=edit Attached test case

[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-18 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304 --- Comment #2 from Venkataramanan --- Marcus, can you please assign it to me if it is confirmed.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-19 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304 --- Comment #5 from Venkataramanan --- We got inspired by this bug. https://bugs.linaro.org/show_bug.cgi?id=400 It happens at -O0 now.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-19 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304 Venkataramanan changed: What|Removed |Added Status|NEW |RESOLVED CC|

[Bug bootstrap/61442] [Aarch64] ICE while bootstraping GCC with --with-build-config=bootstrap-lto

2014-10-04 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61442 Venkataramanan changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/62308] A bug with aarch64 big-endian

2014-10-07 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62308 --- Comment #5 from Venkataramanan --- Not able to reproduce with the latest trunk, r215964. Bisecting to find the revision from which the bug disappears.