[Bug rtl-optimization/46585] [4.6 Regression] ICE: SIGSEGV in vinsn_create (sel-sched-ir.c:1189) with -fno-dce -fschedule-insns -fselective-scheduling

2010-11-25 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46585 Alexander Monakov changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|

[Bug rtl-optimization/46649] [4.6 Regression] ICE: in move_bb_info, at sel-sched-ir.c:5080 with -fschedule-insns -fselective-scheduling

2010-11-25 Thread amonakov at gcc dot gnu.org
|unassigned at gcc dot |amonakov at gcc dot gnu.org |gnu.org | --- Comment #5 from Alexander Monakov 2010-11-25 21:37:44 UTC --- Okay, I didn't notice at first that it's a C++ testcase (does not fail as C). Well, this time sel-sched can not deal wit

[Bug rtl-optimization/45354] [4.6 regression] ICE with -fselective-scheduling and -freorder-blocks-and-partition

2010-11-29 Thread amonakov at gcc dot gnu.org
|unassigned at gcc dot |amonakov at gcc dot gnu.org |gnu.org | --- Comment #9 from Alexander Monakov 2010-11-29 13:58:45 UTC --- Thanks. We go over-eager when cleaning up degenerate jumps and don't pay attention to EDGE_CROSSING. diff --git a/gcc/sel-

[Bug tree-optimization/46763] gcc 4.5: missed optimization: copy global to local, prefetch

2010-12-02 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46763 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/45354] [4.6 regression] ICE with -fselective-scheduling and -freorder-blocks-and-partition

2010-12-03 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45354 --- Comment #10 from Alexander Monakov 2010-12-03 12:04:22 UTC --- Author: amonakov Date: Fri Dec 3 12:04:16 2010 New Revision: 167415 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167415 Log: PR rtl-optimization/45354 * sel-sch

[Bug rtl-optimization/45354] [4.6 regression] ICE with -fselective-scheduling and -freorder-blocks-and-partition

2010-12-03 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45354 Alexander Monakov changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|

[Bug middle-end/46758] [4.5/4.6 Regression] -fgraphite-identity produces wrong code when using 64bit constants

2010-12-03 Thread amonakov at gcc dot gnu.org
||2010.12.03 15:27:13 CC||amonakov at gcc dot gnu.org Ever Confirmed|0 |1 --- Comment #1 from Alexander Monakov 2010-12-03 15:27:13 UTC --- In this particular example, the problem is in

[Bug middle-end/46761] [4.6 Regression] -fgraphite-identity produces wrong code for array initialization arr[i] = i

2010-12-03 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46761 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug middle-end/46758] [4.5/4.6 Regression] -fgraphite-identity produces wrong code when using 64bit constants

2010-12-08 Thread amonakov at gcc dot gnu.org
|unassigned at gcc dot |amonakov at gcc dot gnu.org |gnu.org | --- Comment #3 from Alexander Monakov 2010-12-08 18:03:57 UTC --- I'll work on a patch

[Bug middle-end/46761] [4.6 Regression] -fgraphite-identity produces wrong code for array initialization arr[i] = i

2010-12-08 Thread amonakov at gcc dot gnu.org
|unassigned at gcc dot |amonakov at gcc dot gnu.org |gnu.org | --- Comment #5 from Alexander Monakov 2010-12-08 18:05:23 UTC --- Thanks. I'll submit this patch

[Bug rtl-optimization/46875] [4.6 Regression] ICE: verify_flow_info failed: too many outgoing branch edges from bb 3 with -Os -fselective-scheduling2

2010-12-13 Thread amonakov at gcc dot gnu.org
|unassigned at gcc dot |amonakov at gcc dot gnu.org |gnu.org | --- Comment #2 from Alexander Monakov 2010-12-13 13:21:35 UTC --- We try to remove a tablejump, but then the edge is not marked fallthrough because of the jump table that is left between basic

[Bug rtl-optimization/46925] New: Can't optimize degenerate table jumps

2010-12-13 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46925 Summary: Can't optimize degenerate table jumps Product: gcc Version: 4.6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization AssignedTo: u

[Bug middle-end/46758] [4.5/4.6 Regression] -fgraphite-identity produces wrong code when using 64bit constants

2010-12-13 Thread amonakov at gcc dot gnu.org
gcc dot gnu.org |unassigned at gcc dot ||gnu.org --- Comment #4 from Alexander Monakov 2010-12-13 17:24:59 UTC --- Ouch. The "obvious" fix: diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c index 5036fba..8fda288 10

[Bug middle-end/45388] [4.6 Regression] Global constructor not found

2010-12-13 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45388 Alexander Monakov changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED

[Bug rtl-optimization/46875] [4.6 Regression] ICE: verify_flow_info failed: too many outgoing branch edges from bb 3 with -Os -fselective-scheduling2

2010-12-14 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46875 --- Comment #3 from Alexander Monakov 2010-12-14 12:43:50 UTC --- Author: amonakov Date: Tue Dec 14 12:43:47 2010 New Revision: 167794 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167794 Log: PR rtl-optimization/46875 * sched-vi

[Bug rtl-optimization/46875] [4.6 Regression] ICE: verify_flow_info failed: too many outgoing branch edges from bb 3 with -Os -fselective-scheduling2

2010-12-14 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46875 Alexander Monakov changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|

[Bug rtl-optimization/46649] [4.6 Regression] ICE: in move_bb_info, at sel-sched-ir.c:5080 with -fschedule-insns -fselective-scheduling

2010-12-14 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46649 --- Comment #6 from Alexander Monakov 2010-12-14 13:28:06 UTC --- Even though it is possible to unbreak purge_empty_blocks/maybe_tidy_empty_bb/sel_merge_blocks for this case, I think it's not worth it given that sel-sched generally expects somewh

[Bug rtl-optimization/46649] [4.6 Regression] ICE: in move_bb_info, at sel-sched-ir.c:5080 with -fschedule-insns -fselective-scheduling

2010-12-14 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46649 --- Comment #7 from Alexander Monakov 2010-12-14 14:38:38 UTC --- After a discussion with Andrey and refreshing my memory on that code I think it's actually better to unbreak purge_empty_blocks in this case. It used to be correct, i.e. it never

[Bug middle-end/45852] volatile structs are broken!

2010-12-14 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45852 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/46649] [4.6 Regression] ICE: in move_bb_info, at sel-sched-ir.c:5080 with -fschedule-insns -fselective-scheduling

2010-12-15 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46649 --- Comment #8 from Alexander Monakov 2010-12-15 13:08:47 UTC --- Author: amonakov Date: Wed Dec 15 13:08:41 2010 New Revision: 167854 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167854 Log: PR rtl-optimization/46649 * sel-sche

[Bug rtl-optimization/46649] [4.6 Regression] ICE: in move_bb_info, at sel-sched-ir.c:5080 with -fschedule-insns -fselective-scheduling

2010-12-15 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46649 Alexander Monakov changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|

[Bug rtl-optimization/49304] ICE: code_motion_path_driver, at sel-sched.c:6573 w/ -O3 on ia64 in r174558

2011-06-07 Thread amonakov at gcc dot gnu.org
||amonakov at gcc dot gnu.org Resolution||DUPLICATE --- Comment #1 from Alexander Monakov 2011-06-07 09:28:29 UTC --- This is a duplicate indeed. *** This bug has been marked as a duplicate of bug 49303 ***

[Bug rtl-optimization/49303] ICE: vinsn_detach, at sel-sched-ir.c:1277 w/ -O3 on ia64 in r174558

2011-06-07 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49303 --- Comment #2 from Alexander Monakov 2011-06-07 09:28:30 UTC --- *** Bug 49304 has been marked as a duplicate of this bug. ***

[Bug rtl-optimization/49303] ICE: vinsn_detach, at sel-sched-ir.c:1277 w/ -O3 on ia64 in r174558

2011-06-07 Thread amonakov at gcc dot gnu.org
||2011.06.07 09:31:39 CC||amonakov at gcc dot gnu.org AssignedTo|unassigned at gcc dot |amonakov at gcc dot gnu.org |gnu.org | Ever Confirmed|0 |1

[Bug rtl-optimization/49303] ICE: vinsn_detach, at sel-sched-ir.c:1277 w/ -O3 on ia64 in r174558

2011-06-08 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49303 --- Comment #4 from Alexander Monakov 2011-06-08 09:59:26 UTC --- Author: amonakov Date: Wed Jun 8 09:59:23 2011 New Revision: 174801 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=174801 Log: PR rtl-optimization/49303 * sel-sche

[Bug rtl-optimization/49303] ICE: vinsn_detach, at sel-sched-ir.c:1277 w/ -O3 on ia64 in r174558

2011-06-08 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49303 Alexander Monakov changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|

[Bug target/49349] gfortran.dg/char_result_3.f90 fails with -O3

2011-06-10 Thread amonakov at gcc dot gnu.org
||2011.06.10 12:13:45 AssignedTo|unassigned at gcc dot |amonakov at gcc dot gnu.org |gnu.org | Ever Confirmed|0 |1 --- Comment #1 from Alexander Monakov 2011-06-10 12:13:45 UTC --- It's a

[Bug target/49349] gfortran.dg/char_result_3.f90 fails with -O3

2011-06-15 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49349 --- Comment #3 from Alexander Monakov 2011-06-15 08:08:32 UTC --- Author: amonakov Date: Wed Jun 15 08:08:27 2011 New Revision: 175075 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175075 Log: PR target/49349 * sel-sched.c (find_

[Bug target/49349] gfortran.dg/char_result_3.f90 fails with -O3

2011-06-15 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49349 Alexander Monakov changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|

[Bug target/48273] [4.6 Regression] ICE: in create_copy_of_insn_rtx, at sel-sched-ir.c:5604 with -fsel-sched-pipelining -fselective-scheduling2 -march=core2

2011-06-28 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48273 --- Comment #10 from Alexander Monakov 2011-06-28 12:19:23 UTC --- Author: amonakov Date: Tue Jun 28 12:19:18 2011 New Revision: 175581 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175581 Log: Backport from mainline 2011-04-08

[Bug target/48273] [4.6 Regression] ICE: in create_copy_of_insn_rtx, at sel-sched-ir.c:5604 with -fsel-sched-pipelining -fselective-scheduling2 -march=core2

2011-06-28 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48273 Alexander Monakov changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|

[Bug rtl-optimization/48496] [4.7 Regression] 'asm' operand requires impossible reload in libffi/src/ia64/ffi.c

2011-08-03 Thread amonakov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48496 --- Comment #7 from Alexander Monakov 2011-08-03 09:00:34 UTC --- (In reply to comment #6) > Does bootstrap work again? I haven't checked bootstrap, but the reduced testcase still induces the same error, and Andreas' gcc-testresults@ mails sugge

[Bug middle-end/71524] [7 Regression] internal compiler error: in binds_to_current_def_p, at symtab.c:2232

2016-06-26 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71524 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug tree-optimization/71702] New: dr_group_sort_cmp violates transitivity required for qsort

2016-06-29 Thread amonakov at gcc dot gnu.org
Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: --- Created attachment 38793 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38793&

[Bug tree-optimization/71702] dr_group_sort_cmp violates transitivity required for qsort

2016-06-29 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71702 --- Comment #1 from Alexander Monakov --- Created attachment 38794 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38794&action=edit qsortchk.c quick'n'dirty LD_PRELOAD transitivity validator for qsort comparator
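
The idea behind such a validator can be sketched in a few lines of C. This is a minimal standalone illustration, not the attached qsortchk.c: rather than interposing qsort through LD_PRELOAD, it checks a comparator directly on a small array. The names `cmp_is_consistent`, `cmp_int` and `cmp_fuzzy` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*cmp_fn)(const void *, const void *);

/* Reduce a comparator result to -1, 0 or 1. */
static int sign(int v) { return (v > 0) - (v < 0); }

/* Brute-force O(n^3) check of antisymmetry and transitivity.
   Returns 1 if cmp behaves consistently on the array, 0 otherwise. */
static int cmp_is_consistent(const void *base, size_t n, size_t size, cmp_fn cmp)
{
    const char *p = base;
    assert(size > 0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            int ij = sign(cmp(p + i * size, p + j * size));
            /* Antisymmetry: cmp(a, b) must mirror cmp(b, a). */
            if (ij != -sign(cmp(p + j * size, p + i * size)))
                return 0;
            for (size_t k = 0; k < n; k++) {
                int jk = sign(cmp(p + j * size, p + k * size));
                int ik = sign(cmp(p + i * size, p + k * size));
                /* Transitivity of "less than". */
                if (ij < 0 && jk < 0 && ik >= 0)
                    return 0;
                /* Transitivity of "compares equal". */
                if (ij == 0 && jk == 0 && ik != 0)
                    return 0;
            }
        }
    return 1;
}

/* A well-behaved comparator for int. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Deliberately broken: values within 1 of each other compare equal,
   so 1 == 2 and 2 == 3 but 1 < 3. */
static int cmp_fuzzy(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    if (x - y > 1) return 1;
    if (y - x > 1) return -1;
    return 0;
}
```

On the array {1, 2, 3}, `cmp_int` passes the check while `cmp_fuzzy` fails it on the equivalence rule. The cubic cost is acceptable only for small inputs.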

[Bug tree-optimization/71702] dr_group_sort_cmp violates transitivity required for qsort

2016-07-07 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71702 Alexander Monakov changed: What|Removed |Added Attachment #38793|0 |1 is obsolete|

[Bug tree-optimization/71702] dr_group_sort_cmp violates transitivity required for qsort

2016-07-07 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71702 --- Comment #3 from Alexander Monakov --- On 6/trunk, this issue is fixed or made latent by r230667 that added + STRIP_NOPS (t1); + STRIP_NOPS (t2); to tree-vect-data-refs.c:compare_tree (patch submission here: https://gcc.gnu.org/ml/gcc-patc

[Bug libgomp/71844] Data mapping of an array section in the target construct

2016-07-11 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71844 --- Comment #2 from Alexander Monakov --- Jakub, this code is taken verbatim from the openmp-examples document. Should we report it at their github issue tracker? The example in their public git has always been without the A[0:4] map on 'omp targ

[Bug c++/71910] ICE on invalid OpenMP code

2016-07-18 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71910 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c++/71910] ICE on invalid OpenMP code

2016-07-18 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71910 --- Comment #3 from Alexander Monakov --- Created attachment 38922 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38922&action=edit preprocessed testcase I can also reproduce this on 5.4 and trunk with a cross-compiler configured with --ta

[Bug tree-optimization/71702] dr_group_sort_cmp violates transitivity required for qsort

2016-07-22 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71702 Alexander Monakov changed: What|Removed |Added CC||tony at kelman dot net --- Comment #

[Bug tree-optimization/71505] -O3 internal compiler error in vect_analyze_data_ref_accesses, at tree-vect-data-refs.c:2596

2016-07-22 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71505 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug tree-optimization/71980] [5] libraw on x86_64-linux-musl causes ICE in vect_analyze_data_ref_accesses, at tree-vect-data-refs.c:2596

2016-07-22 Thread amonakov at gcc dot gnu.org
||2016-07-23 CC||amonakov at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Alexander Monakov --- This is most likely a duplicate of bug 71702. I've talked with the reporter briefly o

[Bug c++/71971] Destructor of a global static variable in a shared library is not called on dlclose

2016-07-25 Thread amonakov at gcc dot gnu.org
||amonakov at gcc dot gnu.org Resolution|--- |INVALID --- Comment #2 from Alexander Monakov --- The library is not unloaded on glibc due to STB_GNU_UNIQUE binding on get::i. You can opt-out of creating symbols with that binding using -fno-gnu

[Bug middle-end/72776] New: Too large array size not diagnosed properly when inferred from an initializer

2016-08-02 Thread amonakov at gcc dot gnu.org
: diagnostic Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: --- For the following testcase: static char c[] = {[~0ul] = 1}; the array c would
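
When the bound is omitted, the array size is taken from the largest designated index plus one, so an index of ~0ul asks for one more element than the index type can count (assuming, as on typical targets, that unsigned long and size_t have the same width). The check a diagnostic would need can be sketched as follows; `implied_array_elems` is a hypothetical helper, not GCC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the element count implied by the largest designated index.
   Returns false when max_index + 1, or the total size in bytes,
   does not fit in size_t. */
static bool implied_array_elems(size_t max_index, size_t elem_size, size_t *count)
{
    assert(elem_size > 0 && count != NULL);
    if (max_index == SIZE_MAX)
        return false;               /* max_index + 1 would wrap to 0 */
    size_t n = max_index + 1;
    if (n > SIZE_MAX / elem_size)
        return false;               /* n * elem_size would overflow */
    *count = n;
    return true;
}
```

With `max_index` equal to `SIZE_MAX` the helper reports failure instead of silently wrapping to a zero-sized array.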

[Bug middle-end/77399] New: Fails to use native instructions for vector casts

2016-08-29 Thread amonakov at gcc dot gnu.org
Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org CC: nsz at gcc dot gnu.org Target Milestone: --- For the following testcase: typedef int v4si __attribute__((vector_size(16

[Bug middle-end/77399] Fails to use native instructions for vector casts

2016-08-29 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77399 --- Comment #1 from Alexander Monakov --- I should also mention that for scalar code: void f(float *o, int *i) { *o++ = *i++; *o++ = *i++; *o++ = *i++; *o++ = *i++; } where SLP vectorization succeeds, one can see that c-style casts exist for
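
For illustration only, here is how an int-to-float conversion on GCC vector types can be spelled in C. This is an assumed example, not the reporter's testcase (which is cut off above): `convert_by_element` writes the conversion per element, and `convert_builtin` uses `__builtin_convertvector`, which exists in GCC 9 and later and therefore postdates this report.

```c
#include <assert.h>

typedef int   v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));

static_assert(sizeof(v4si) == sizeof(v4sf), "both vectors are 16 bytes wide");

/* Element-by-element conversion written out by hand. */
static v4sf convert_by_element(v4si v)
{
    v4sf r = { (float)v[0], (float)v[1], (float)v[2], (float)v[3] };
    return r;
}

/* The same conversion through the builtin (GCC 9+, also in Clang). */
static v4sf convert_builtin(v4si v)
{
    return __builtin_convertvector(v, v4sf);
}
```

Both functions should produce identical results; whether either one compiles to a single native conversion instruction depends on the compiler version and target options.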

[Bug tree-optimization/77399] Poor code generation for vector casts and loads

2016-08-30 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77399 Alexander Monakov changed: What|Removed |Added Summary|SLP does not handle |Poor code generation for

[Bug tree-optimization/77399] Poor code generation for vector casts and loads

2016-08-30 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77399 --- Comment #6 from Alexander Monakov --- Thanks. Any comment on having gimple lowering emit cleaner code in the first place?

[Bug tree-optimization/77399] Poor code generation for vector casts and loads

2016-08-30 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77399 --- Comment #8 from Alexander Monakov --- > The extension is closely modeled after openCL Hm, that doesn't sound right: gcc had vector types long before OpenCL was even a thing; I believe it's modeled after Altivec actually: the discrepancy betw

[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2021-03-08 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/99462] Enhance scheduling to split instructions

2021-03-08 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99462 --- Comment #3 from Alexander Monakov --- (for context, the above patch was for PR 98856, but it's based on incorrect latency analysis, see bug 98856 comment #38 ) Right now schedulers cannot easily split instructions for that purpose, it would

[Bug rtl-optimization/99469] ICE: qsort checking failed with selective scheduling on aarch64

2021-03-09 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99469 Alexander Monakov changed: What|Removed |Added Blocks||82407 --- Comment #2 from Alexander

[Bug middle-end/99619] New: fails to infer local-dynamic TLS model from hidden visibility

2021-03-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
-optimization Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: --- Thread-local variables with hidden visibility don't need to use the "general-dy
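
Until the compiler infers this on its own, the model can be requested by hand with GCC's `tls_model` variable attribute. The following is a minimal sketch under that assumption; the names `hidden_counter` and `bump_hidden_counter` are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hidden visibility means the variable cannot be preempted from outside
   this module; the explicit tls_model attribute asks for local-dynamic
   instead of leaving the choice to the compiler's default. */
__attribute__((visibility("hidden"), tls_model("local-dynamic")))
__thread int hidden_counter;

/* Increment the calling thread's copy and return the new value. */
int bump_hidden_counter(void)
{
    assert(hidden_counter < INT_MAX);   /* avoid signed overflow */
    return ++hidden_counter;
}
```

Compiling this with -fPIC and inspecting the assembly shows which access sequence was chosen; in a non-PIC executable the linker is free to relax it further.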

[Bug c++/99728] code pessimization when using wrapper classes around SIMD types

2021-03-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/99582] No intrinsics to access rcl or rcr instruction on x86_64

2021-03-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99582 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/100225] [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291

2021-04-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
||amonakov at gcc dot gnu.org, ||zhroma at gcc dot gnu.org --- Comment #1 from Alexander Monakov --- Hi Martin, this is a modulo-scheduling bug; I think you added "Blocks: sel-sched" by mistake — removing, and Cc'ing Roma

[Bug target/97366] [8/9/10/11 Regression] Redundant load with SSE/AVX vector intrinsics

2020-10-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97366 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/97203] [nvptx] 'illegal memory access was encountered' with 'omp simd'/SIMT and cexpf call

2020-10-12 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97203 --- Comment #8 from Alexander Monakov --- No, -msoft-stack-reserve-local is really meant to be in bytes: it may not exceed the amount of .local memory reserved by CUDA driver (which is just 1-2 KB, unless overridden via cuCtxSetLimit, which nvptx

[Bug target/97366] [8/9/10/11 Regression] Redundant load with SSE/AVX vector intrinsics

2020-10-12 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97366 --- Comment #5 from Alexander Monakov --- afaict LRA is just following IRA decisions, and IRA allocates that pseudo to memory due to costs. Not sure where strange cost is coming from, but it depends on x86 tuning options: with -mtune=skylake we

[Bug target/97203] [nvptx] 'illegal memory access was encountered' with 'omp simd'/SIMT and cexpf call

2020-10-12 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97203 --- Comment #11 from Alexander Monakov --- Yes, that.

[Bug inline-asm/97708] Inline asm does not use the local register asm specified with register ... asm() as input

2020-11-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97708 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/97734] GCC using branches when a conditional move would be better

2020-11-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97734 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug inline-asm/97708] Inline asm does not use the local register asm specified with register ... asm() as input

2020-11-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97708 --- Comment #30 from Alexander Monakov --- Asm operand binding should work by looking at bound lvalue: "c"(a) binds an lvalue so if 'a' is a register var the compiler must remember its associated register; "c"(a+0) binds an rvalue, so what kind o

[Bug libstdc++/98226] Slow std::countr_one

2020-12-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98226 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/97127] FMA3 code transformation leads to slowdown on Skylake

2020-09-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127 --- Comment #16 from Alexander Monakov --- Mostly because prior to register allocation the compiler does not naturally see that x = *mem + a*b will need an extra mov when both 'a' and 'b' are live (as in that case registers allocated for them can

[Bug target/97127] FMA3 code transformation leads to slowdown on Skylake

2020-09-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127 --- Comment #17 from Alexander Monakov --- To me this suggests that in fact it's okay to carry the combined form in RTL up to register allocation, but RA should decompose it to load+fma instead of inserting a register copy that preserves the live

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194 --- Comment #9 from Alexander Monakov --- (In reply to Richard Biener from comment #8) > Note that currently RTL expansion forces a local vector typed variable > to the stack (instead of allocating a pseudo) when there are > variable-index access

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194 --- Comment #11 from Alexander Monakov --- Yeah, for inserts such a tactic would be inappropriate due to bad store forwarding stalls anyway. As you've shown in earlier comments, inserts have a very nice generic way to expand them (that does not tou

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194 --- Comment #14 from Alexander Monakov --- I see, there are more weaknesses than I thought. For CSE (or rather fwprop?) I was thinking about a simpler case where the extracted-from value is loaded from memory, but even in trivial cases RTL optimi

[Bug libgomp/97291] [SIMT] Move SIMT_XCHG_* out of non-uniform execution region

2020-10-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291 --- Comment #1 from Alexander Monakov --- Reshuffling statements and piling up extra abstraction doesn't help solve the core issue that GIMPLE passes can duplicate any basic block, but basic blocks of SIMT loop epilogue should be protected from t

[Bug middle-end/95189] [9/10 Regression] memcmp being wrongly stripped like strcmp

2020-10-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95189 Alexander Monakov changed: What|Removed |Added Known to fail||9.3.0 Known to work|9.3.0

[Bug target/97203] [nvptx] 'illegal memory access was encountered' with 'omp simd'/SIMT and cexpf call

2020-10-09 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97203 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug libgomp/98258] Can't compile programs for both OpenMP (CPU) + OpenACC (GPU)

2021-01-04 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98258 --- Comment #5 from Alexander Monakov --- One possible solution is -foffload=-fno-openmp Another possible solution is separate compilation and linking, with only OpenACC enabled at link step (needs explicit -lgomp): gfortran -fopenmp -fopenacc

[Bug libgomp/98258] Can't compile programs for both OpenMP (CPU) + OpenACC (GPU)

2021-01-04 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98258 --- Comment #8 from Alexander Monakov --- (In reply to Chinoune from comment #7) > $ gfortran-10 -O3 -fopenmp -fopenacc -c bug_omp_acc.f90 > $ gfortran-10 bug_omp_acc.o -lgomp -o test.x Contrary to my suggestion, you have omitted -fopenacc from

[Bug libgomp/98258] Can't compile programs for both OpenMP (CPU) + OpenACC (GPU)

2021-01-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98258 --- Comment #10 from Alexander Monakov --- Thanks for checking. As for this: > Please, stop suggesting untested workarounds. Yes, I should have mentioned those are untested. I was typing the response late at night without access to offloading-c

[Bug tree-optimization/98906] New: [8/9/10/11 Regression] Miscompiles code even at -O1

2021-01-31 Thread amonakov at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: --- Created attachment 50097 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50097&action=edit testcase The a

[Bug tree-optimization/98906] [8/9/10/11 Regression] Miscompiles code even at -O1

2021-02-01 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98906 --- Comment #6 from Alexander Monakov --- Ah, -fsanitize=float-cast-overflow catches it, but it needs to be enabled explicitly (not implied by -fsanitize=undefined). Thank you!
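
Converting a floating-point value to an integer type that cannot represent its truncated value is undefined behavior in C, which is what that sanitizer reports at run time. Where the input is untrusted, the range can also be checked explicitly before the cast. A minimal sketch, assuming a 32-bit int so that both bounds are exactly representable as double; `double_to_int_checked` is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Convert d to int only when the truncated value is representable.
   NaN fails both comparisons, so it is rejected as well.
   Assumes int is 32 bits, making the bounds exact in double. */
static bool double_to_int_checked(double d, int *out)
{
    assert(out != NULL);
    if (!(d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0))
        return false;
    *out = (int)d;          /* safe: truncation stays within int range */
    return true;
}
```

The bounds are open on both sides because the cast truncates toward zero, so any value strictly between INT_MIN - 1 and INT_MAX + 1 lands in range.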

[Bug rtl-optimization/86096] [8 Regression] ICE: qsort checking failed (error: qsort comparator non-negative on sorted output: 0)

2021-02-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86096 --- Comment #8 from Alexander Monakov --- It was fixed on the trunk only, so as the title says it remains an issue on the gcc-8 branch (which is still open). Bugzilla doesn't have separate resolutions for different branches, we cannot have this "

[Bug tree-optimization/100363] gcc generating wider load/store than warranted at -O3

2021-05-01 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c/93031] Wish: When the underlying ISA does not force pointer alignment, option to make GCC not assume it

2021-05-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93031 --- Comment #7 from Alexander Monakov --- In comment #2 I touched upon a potentially more practical way to offer -fno-strict-alignment: Run early work with ABI alignments: compute __alignof correctly, lay out composite types as required by ABI,

[Bug other/99903] 32-bit x86 frontends randomly crash while reporting timing on Windows

2021-05-04 Thread amonakov at gcc dot gnu.org via Gcc-bugs
|UNCONFIRMED CC||amonakov at gcc dot gnu.org --- Comment #4 from Alexander Monakov --- 32-bit Linux should also be affected (perhaps with less probability if clock() is more precise). It is surprising we track time in a 'double'

[Bug c/100618] Add a -fno-semantic-interposition variant which allows variable interposition

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100618 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c/100483] Extend -fno-semantic-interposition to global variables

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100483 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c/100618] Add a -fno-semantic-interposition variant which allows variable interposition

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100618 --- Comment #3 from Alexander Monakov --- Furthermore, as discussed in bug 100483, this request appears to be based on a misunderstanding of what the 'semantic-' part of the option is about. It does not affect the assembly/linker-level binding mechanism, so th

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 --- Comment #3 from Alexander Monakov --- I understand what you're saying, but it seems we're talking past each other. I agree that if a library is linked with any -Bsymbolic* flag, the main executable is at risk of broken address uniqueness un

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-17 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 --- Comment #5 from Alexander Monakov --- Hm, I still don't think I'm misunderstanding what you're saying. I'm familiar with the ELF standard (and FWIW I have read your blog posts on related matters). I am responding to this sentiment from the o

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-18 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 --- Comment #7 from Alexander Monakov --- Thanks. I agree that inferring address significance on the linker side is problematic. Thinking about your original request, I was about to say that it would be very reasonable to do under -fno-plt flag

[Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c

2021-05-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573 --- Comment #14 from Alexander Monakov --- I would break in gdb on cuModuleGetFunction and x/s $rdx to print the failing symbol (it's the third argument to the function). It seems the "inner" entrypoint (which your patch attempted to nullif

[Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c

2021-05-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573 --- Comment #17 from Alexander Monakov --- Yes, I'd agree normally it's present in the offload table, but ideally if you're trying to stub out the call, it should not be present in the offload table. I think Tobias is saying that on GIMPLE this

[Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c

2021-05-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573 --- Comment #19 from Alexander Monakov --- Ah, does the issue arise because foo._omp_fn.0 is (before the patch) callable in two contexts, in one it's called from host and should be 'omp target entrypoint', and in the other it's called from offlo

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-27 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 --- Comment #10 from Alexander Monakov --- Is there something wrong or undesirable with making this under -fno-plt (or the noplt attribute as in your example)? (after all, it is a kind of PLT-avoidance transformation, just for addressing rather

[Bug target/108322] Using __restrict parameter with -ftree-vectorize (default with -O2) results in massive code bloat

2023-01-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108322 Alexander Monakov changed: What: CC | Removed: (none) | Added: amonakov at gcc dot gnu.org

[Bug target/108322] Using __restrict parameter with -ftree-vectorize (default with -O2) results in massive code bloat

2023-01-10 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108322 --- Comment #5 from Alexander Monakov --- (In reply to Richard Biener from comment #4) > For the case at hand loading two vectors from the destination and then punpck{h,l}bw and storing them again might be the most efficient thing to do h

[Bug middle-end/108376] TSVC s1279 runs 40% faster with aocc than gcc at zen4

2023-01-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108376 Alexander Monakov changed: What: CC | Removed: (none) | Added: amonakov at gcc dot gnu.org

[Bug target/108401] gcc defeats vector constant generation with intrinsics

2023-01-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108401 Alexander Monakov changed: What: CC | Removed: (none) | Added: amonakov at gcc dot gnu.org

[Bug tree-optimization/108487] [10/11/12/13 Regression] ~20-30x slowdown in populating std::vector from std::ranges::iota_view

2023-01-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs
||std::ranges::iota_view CC||amonakov at gcc dot gnu.org --- Comment #1 from Alexander Monakov --- Regarding fn1, would you mind re-running the test on your Xeon CPU with fn2 removed from the source code and -falign-loops=32 added to gcc

[Bug libstdc++/108487] [10/11/12/13 Regression] ~20-30x slowdown in populating std::vector from std::ranges::iota_view

2023-01-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108487 Alexander Monakov changed: What: Component | Removed: tree-optimization | Added: libstdc++ --- Comment #3 from Alexa
