RE: Help w/ PR61538?
Hi Joshua, I know very little about this area but I'll try and offer some advice anyway... > On 07/05/2014 23:43, Joshua Kinard wrote: > > Hi, > > > > I filed PR61538 about two weeks ago, regarding gcc-4.8.x and up not > > compiling a g++/pthreads-linked app correctly on SGI R1x000-based systems > > (Octane, Onyx2), running Linux. Running the subsequently-compiled > > application simply hangs in a futex syscall until terminated via Ctrl+C. > I > > suspect it's a double-locking bug of some design, as evidenced by strace > > showing two consecutive syscall()'s w/ 0x108e passed as the syscall # > (4238 > > or futex on o32 MIPS), but I am stumped as to what else I can do to debug > it > > and help fix it. > > > [snip] > > Full details: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61538 > > So I've spent the last few weeks bisecting the gcc tree, and I've narrowed > down the set of commits that appear to have introduced this problem: > > 1. 39a8c5eaded1e5771a941c56a49ca0a5e9c5eca0 * config/mips/mips.c > (mips_emit_pre_atomic_barrier_p,) This is the prime candidate for introducing the issue. > 2. 974f0a74e2116143b88d8cea8e1dd5a9c18ef96c * config/mips/constraints.md > (ZR): New constraint. Unlikely > 3. 0f8e46b16a53c02d7255dcd6b6e9b5bc7f8ec953 * config/mips/mips.c > (mips_process_sync_loop): Emit cmp result only if Possible but unlikely still > 4. 30c3c4427521f96fb58b6e1debb86da4f113f06f * emit-rtl.c > (need_atomic_barrier_p): New function. Seems unlikely > > There's a build failure somewhere in the middle of there that is blocking me > from figuring out which specific one is the cause, but they all appear to be > related anyways. All four were added on 2012-06-20. > > When I took a git checkout from 2012-06-26 and reverted those four commits, > I was able to compile glibc-2.19 and get a working "sln" binary. I am > unable to easily test the C++ side because I built the checkouts in my > $HOME, and it's too risky to try and shoehorn one of them in as the system > compiler. However, I think the C++ issue is also fixed by reverting the > four, as that also involved hanging in Linux futex syscalls. Here is a wild guess at the problem... I think the workaround for R1 to use branch likely instead of delay slot branches is ending up annulling an instruction that is required for certain atomic operations. This is an entirely untested theory (and patch) but can you see if this fixes the issue you are seeing: @@ -13014,7 +13023,8 @@ mips_process_sync_loop (rtx insn, rtx *operands) mips_multi_copy_insn (tmp3_insn); mips_multi_set_operand (mips_multi_last_index (), 0, newval); } - else if (!(required_oldval && cmp)) + else if (!(required_oldval && cmp) + || mips_branch_likely) mips_multi_add_insn ("nop", NULL); /* CMP = 1 -- either standalone or in a delay slot. */ I suspect I can weave that in more naturally but can you tell me if that fixes the problem first. Regards, Matthew
Re: Help w/ PR61538?
On 07/28/2014 04:41, Matthew Fortune wrote: > Hi Joshua, > > I know very little about this area but I'll try and offer some advice > anyway... > You know more than I do :) >> On 07/05/2014 23:43, Joshua Kinard wrote: >>> Hi, >>> >>> I filed PR61538 about two weeks ago, regarding gcc-4.8.x and up not >>> compiling a g++/pthreads-linked app correctly on SGI R1x000-based systems >>> (Octane, Onyx2), running Linux. Running the subsequently-compiled >>> application simply hangs in a futex syscall until terminated via Ctrl+C. >> I >>> suspect it's a double-locking bug of some design, as evidenced by strace >>> showing two consecutive syscall()'s w/ 0x108e passed as the syscall # >> (4238 >>> or futex on o32 MIPS), but I am stumped as to what else I can do to debug >> it >>> and help fix it. >>> >> [snip] >>> Full details: >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61538 >> >> So I've spent the last few weeks bisecting the gcc tree, and I've narrowed >> down the set of commits that appear to have introduced this problem: >> >> 1. 39a8c5eaded1e5771a941c56a49ca0a5e9c5eca0 * config/mips/mips.c >> (mips_emit_pre_atomic_barrier_p,) > > This is the prime candidate for introducing the issue. This is my guess, too. However, it appears to tie in w/ the fourth commit because the new mips_emit_{pre,post}_atomic_barrier_p functions added in commit 39a8c5ea are removed by commit 30c3c442 a mere ~7 minutes later (which I find really odd). Commit 974f0a74 is really the only one that seems innocent, but I suspect the other three are linked. If mkuvyrkov is still around, perhaps he could explain better? >> 2. 974f0a74e2116143b88d8cea8e1dd5a9c18ef96c * config/mips/constraints.md >> (ZR): New constraint. > > Unlikely > >> 3. 0f8e46b16a53c02d7255dcd6b6e9b5bc7f8ec953 * config/mips/mips.c >> (mips_process_sync_loop): Emit cmp result only if > > Possible but unlikely still > >> 4. 30c3c4427521f96fb58b6e1debb86da4f113f06f * emit-rtl.c >> (need_atomic_barrier_p): New function. > > Seems unlikely > >> >> There's a build failure somewhere in the middle of there that is blocking me >> from figuring out which specific one is the cause, but they all appear to be >> related anyways. All four were added on 2012-06-20. >> >> When I took a git checkout from 2012-06-26 and reverted those four commits, >> I was able to compile glibc-2.19 and get a working "sln" binary. I am >> unable to easily test the C++ side because I built the checkouts in my >> $HOME, and it's too risky to try and shoehorn one of them in as the system >> compiler. However, I think the C++ issue is also fixed by reverting the >> four, as that also involved hanging in Linux futex syscalls. > > Here is a wild guess at the problem... I think the workaround for R1 to > use branch likely instead of delay slot branches is ending up annulling > an instruction that is required for certain atomic operations. This is an > entirely untested theory (and patch) but can you see if this fixes the issue > you are seeing: Well, the branch-likely thing really only affects a specific revision of the R1 processors. Later R1 revisions (3.1+?) and R12000-R16000 shouldn't be affected. I've been playing with disabling that specific workaround on my Octane's kernel and haven't seen any ill effects yet. Though, I haven't tried rebuilding the userland w/ -mno-fix-r1 just yet. If you want, you can take a look at some of the additional info in the corresponding Gentoo bug that tracks PR61538: https://bugs.gentoo.org/show_bug.cgi?id=516548 I have a gdb run (comment #5) of the several instructions in __lll_lock_wait_private, including register values, as each instruction executes. The hang happens after taking the futex syscall, t0-t3 get set to 0x0, and the following "ll v0,0(s0)" is what hangs. In gcc-4.7 and earlier, that 'll' is actually "li v0,2", though control never passes into __lll_lock_wait_private in the first place. There's also a PNG attached to that bug of the disassembled asm in WinMerge they shows what insns actually changed. Someone who understands MIPS asm ordering might be able to make something of that. > @@ -13014,7 +13023,8 @@ mips_process_sync_loop (rtx insn, rtx *operands) >mips_multi_copy_insn (tmp3_insn); >mips_multi_set_operand (mips_multi_last_index (), 0, newval); > } > - else if (!(required_oldval && cmp)) > + else if (!(required_oldval && cmp) > + || mips_branch_likely) > mips_multi_add_insn ("nop", NULL); > >/* CMP = 1 -- either standalone or in a delay slot. */ > > I suspect I can weave that in more naturally but can you tell me if that > fixes the problem first. Testing a fix takes about 7.5hrs to rebuild, plus another 3.5 to rebuild glibc. So I am a bit hesitant to task the machine to do that w/o having a better idea if that solves it or not. Technically, shouldn't passing -mno-fix-r1 have a similar effect by causing branch-likely insns to not get emitted at all
Re: C as intermediate language, signed integer overflow and -ftrapv
On Sun, Jul 27, 2014 at 9:13 AM, Thomas Mertes wrote: > On Fri, Jul 25, 2014 at 12:35, Richard Biener > wrote: >> On Fri, Jul 25, 2014 at 10:43 AM, Thomas Mertes wrote: >> > On Thu, Jul 24 at 10:36 PM, Richard Biener >> > wrote: >> >> Fact is that if somebody is interested in >> >> -ftrapv he/she is welcome to contribute patches. Especially testing >> >> coverage is poor. >> > >> > As I said I have test programs for integer overflow (not written >> > in C). Now I have converted one test program to C. This program >> > checks if an int64_t overflow raises SIGABRT or SIGILL. The name of >> > the program is chkovf64.c and I have uploaded it to >> > >> > http://sourceforge.net/projects/seed7/files/ >> > >> > It is licenced with GPL. You can use it to improve the testing >> > coverage of gcc. When I compile it with: >> > >> > gcc -ftrapv chkovf64.c -o chkovf64 >> > >> > it writes a lot of warnings about "integer overflow in expression". >> > Running chkovf64 shows that -ftrapv does not work correct. >> >> That's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61893 - basically >> as soon as we can constant-fold we lose the trap. Which is probably >> not important for practical purposes, but you have a point here. > > For human programmers it is probably ok to have a warning instead of > the trap. But when the C program has been generated (C as intermediate > language) it is important that the trap is not lost. > > Isn't possible that constant-fold has the result "it will overflow"? > Hopefully it is not necessary to omit an optimization because of -ftrapv. > For my use-case that would be bad. I need -ftrapv together with all > optimizations for production code and NOT for debugging purposes. Well. The optimization would be to transform known overflows to __builtin_trap (). Not sure what optimization you are thinking of otherwise. >> > Maybe all -ftrapv problems uncovered by chkovf64.c are because >> > of this. Unfortunately there are also other test cases where >> > a signal is not raised although a signed integer overflow occurred. >> > This happens in a much bigger program and until now I was not >> > able to create a simple test case from it. >> >> Yes, not all optimizations may be aware of -ftrapv. > > In the case above (the much bigger program) the error (overflow without > raising SIGABRT) happens when I compile without -O and without -g. > When I use -g the signal SIGABRT is raised. I still don't have a simple > test case, sorry. > > Did you have a look at chkovf64.c? With my latest patches it passes at -O0 -ftrapv but still fails partly with optimization: > ./a.out * Negation -(-9223372036854775808) does not always raise a signal. * Overflow checking of negation does not work correct. * Addition underflow by one does not always raise a signal. (1) * Addition underflow by one does not always raise a signal. (2) * Addition underflow by one does not always raise a signal. (3) * Addition underflow by one does not always raise a signal. (4) * Addition overflow by one does not always raise a signal. (1) * Addition overflow by one does not always raise a signal. (2) * Addition overflow by one does not always raise a signal. (3) * Addition overflow by one does not always raise a signal. (4) * Addition underflow does not always raise a signal. * Addition overflow does not always raise a signal. * Overflow checking of addition does not work correct. * Addition assignment underflow by one does not always raise a signal. (1) * Addition assignment underflow by one does not always raise a signal. (2) * Addition assignment underflow by one does not always raise a signal. (3) * Addition assignment underflow by one does not always raise a signal. (4) * Addition assignment overflow by one does not always raise a signal. (1) * Addition assignment overflow by one does not always raise a signal. (2) * Addition assignment overflow by one does not always raise a signal. (3) * Addition assignment overflow by one does not always raise a signal. (4) * Addition assignment underflow does not always raise a signal. * Addition assignment overflow does not always raise a signal. * Overflow checking of addition assignment does not work correct. * Subtraction underflow by one does not always raise a signal. (1) * Subtraction underflow by one does not always raise a signal. (2) * Subtraction underflow by one does not always raise a signal. (3) * Subtraction underflow by one does not always raise a signal. (4) * Subtraction overflow by one does not always raise a signal. (1) * Subtraction overflow by one does not always raise a signal. (2) * Subtraction overflow by one does not always raise a signal. (3) * Subtraction overflow by one does not always raise a signal. (4) * Subtraction underflow does not always raise a signal. * Subtraction overflow does not a
Re: [GSoC] outer-if expressions
On Sat, Jul 26, 2014 at 11:20 PM, Prathamesh Kulkarni wrote: > Hi, >This patch adds support for outer-if expressions. > > Couple of issues: > a) Doesn't interop with for-pattern, since we don't replace identifier > in c-expr yet, > and this gets more complicated with addition of outer-if. Does it? Well, we'll see. > b) I removed ifexpr-locations for now. I am not sure where to output > if-expr locations > when there are multiple if-exprs (one outer and one inner). I'd say we'd use multiple locations, thus record a vector of if-expr, location tuples. Applied. Thanks, Richard. > * genmatch.c (simplify::ifexpr): Remove. > (simplify::ifexpr_location): Likewise. > (simplify::simplify): Adjust to changes in simplify and >add parameter ifexpr_vec_. > (dt_simplify::gen_gimple): Adjust to changes in simplify. > (dt_simplify::gen_generic): Likewise. > (parse_if): New function. > (parse_pattern): Add call to parse_if. > > * match.pd: Use outer-if for plus-minus patterns. > > Thanks, > Prathamesh
Re: [GSoC] support for literals
On Sun, Jul 27, 2014 at 10:05 PM, Prathamesh Kulkarni wrote: > Hi, > I was wondering if it would be a good idea to have the following syntax > for literals: > (type val) ? > type would be one of the tree-codes representing cst types like > INTEGER_CST, REAL_CST, etc. > > eg: > (negate (integer_cst 3)) > > would be equivalent to the following: > (negate INTEGER_CST_P@0) > if (TREE_INT_CST_LOW (@0) == 3) > { .. } > > Also possibly provide a short-hand for some literals > like integer and floating point, so just writing > 3 would be short-hand for (integer_cst 3) ? Hmm. If then I'd rather support literals "literally", thus (negate 3) because the type is always available from the context. I also think literal constants are more useful in the transform context... But really I'd like to defer this for now - literals would be treated quite specially (for example should '0' be emitted as integer_zerop () handling many types or should it be treated as integral-type-only?). Should we only allow a specific set of literals (-1, 0, 1)? How do we write literals of complex or vector type? So yes, literals would be nice (sometimes), but universal handling of literals is hard. How would people feel about only handling a very small set of literals in a subset of all contexts (only in matching or only in code-generation)? > Many patterns from [1] have integral constants to match. > eg: (a >> 2) >=3 -> a >= (3 << 2) > so this could be written as: > (gte (rshift @0 2) 3) > (gte @0 (lshift 3 2)) > > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14753 I'm quite sure this bug wants more general treatment of these patterns, not only handling the special combinations of constants... Richard. > > Thanks, > Prathamesh
[GSoC] replacing op in c_expr
I am having few issues replacing op in c_expr. I thought of following possibilities: a) create a new vec vector new_code. for each token in code { if token.type is not CPP_NAME new_code.safe_push (token); else { cpp_token new_token = ??? create new token of type CPP_NAME with contents as name of operator ??? } } I tried to go this way, but am stuck with creating a new token type. i started by: cpp_token new_token = token; // get same attrs as token. CPP_HASHNODE (new_token.val.node.node)->ident.str = name of operator. CPP_HASHNODE (new_token.val.node.node)->ident.len = len of operator name. name of operator is obtained from opers[i] in parse_for. however this does not work because I guess new_token = token, shallow copies the token (default assignment operator, i didn't find an overloaded version). b) create new struct c_expr_elem and use vec code, instead of vec code; sth like: struct c_expr_elem { enum c_expr_elem_type { ID, TOKEN }; enum c_expr_elem_type type; union { cpp_token token; const char *id; }; }; while replacing op, compare token with op, and if it matches, create a new c_expr_elem with type = ID, and id = name of operator. This shall probably work, but shall require many changes to other parts since we change c_expr::code. I would like to hear any other suggestions. Thanks, Prathamesh.
RE: Help w/ PR61538?
I'll switch to replying on PR61538. I had not read all the ticket previously and although I may have found a problem it seems it may not be the cause of this failure. The generated code differences after the patches seem significant but I may not get chance to look at the differences in detail for a little while. Matthew > -Original Message- > From: Joshua Kinard [mailto:ku...@gentoo.org] > Sent: 28 July 2014 10:40 > To: Matthew Fortune; gcc@gcc.gnu.org > Subject: Re: Help w/ PR61538? > > On 07/28/2014 04:41, Matthew Fortune wrote: > > Hi Joshua, > > > > I know very little about this area but I'll try and offer some advice > anyway... > > > > You know more than I do :) > > > >> On 07/05/2014 23:43, Joshua Kinard wrote: > >>> Hi, > >>> > >>> I filed PR61538 about two weeks ago, regarding gcc-4.8.x and up not > >>> compiling a g++/pthreads-linked app correctly on SGI R1x000-based > systems > >>> (Octane, Onyx2), running Linux. Running the subsequently-compiled > >>> application simply hangs in a futex syscall until terminated via Ctrl+C. > >> I > >>> suspect it's a double-locking bug of some design, as evidenced by strace > >>> showing two consecutive syscall()'s w/ 0x108e passed as the syscall # > >> (4238 > >>> or futex on o32 MIPS), but I am stumped as to what else I can do to > debug > >> it > >>> and help fix it. > >>> > >> [snip] > >>> Full details: > >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61538 > >> > >> So I've spent the last few weeks bisecting the gcc tree, and I've > narrowed > >> down the set of commits that appear to have introduced this problem: > >> > >> 1. 39a8c5eaded1e5771a941c56a49ca0a5e9c5eca0 * config/mips/mips.c > >> (mips_emit_pre_atomic_barrier_p,) > > > > This is the prime candidate for introducing the issue. > > This is my guess, too. However, it appears to tie in w/ the fourth commit > because the new mips_emit_{pre,post}_atomic_barrier_p functions added in > commit 39a8c5ea are removed by commit 30c3c442 a mere ~7 minutes later > (which I find really odd). Commit 974f0a74 is really the only one that > seems innocent, but I suspect the other three are linked. If mkuvyrkov is > still around, perhaps he could explain better? > > > >> 2. 974f0a74e2116143b88d8cea8e1dd5a9c18ef96c * config/mips/constraints.md > >> (ZR): New constraint. > > > > Unlikely > > > >> 3. 0f8e46b16a53c02d7255dcd6b6e9b5bc7f8ec953 * config/mips/mips.c > >> (mips_process_sync_loop): Emit cmp result only if > > > > Possible but unlikely still > > > >> 4. 30c3c4427521f96fb58b6e1debb86da4f113f06f * emit-rtl.c > >> (need_atomic_barrier_p): New function. > > > > Seems unlikely > > > >> > >> There's a build failure somewhere in the middle of there that is blocking > me > >> from figuring out which specific one is the cause, but they all appear to > be > >> related anyways. All four were added on 2012-06-20. > >> > >> When I took a git checkout from 2012-06-26 and reverted those four > commits, > >> I was able to compile glibc-2.19 and get a working "sln" binary. I am > >> unable to easily test the C++ side because I built the checkouts in my > >> $HOME, and it's too risky to try and shoehorn one of them in as the > system > >> compiler. However, I think the C++ issue is also fixed by reverting the > >> four, as that also involved hanging in Linux futex syscalls. > > > > Here is a wild guess at the problem... I think the workaround for R1 > to > > use branch likely instead of delay slot branches is ending up annulling > > an instruction that is required for certain atomic operations. This is an > > entirely untested theory (and patch) but can you see if this fixes the > issue > > you are seeing: > > Well, the branch-likely thing really only affects a specific revision of the > R1 processors. Later R1 revisions (3.1+?) and R12000-R16000 > shouldn't be affected. I've been playing with disabling that specific > workaround on my Octane's kernel and haven't seen any ill effects yet. > Though, I haven't tried rebuilding the userland w/ -mno-fix-r1 just yet. > > If you want, you can take a look at some of the additional info in the > corresponding Gentoo bug that tracks PR61538: > > https://bugs.gentoo.org/show_bug.cgi?id=516548 > > I have a gdb run (comment #5) of the several instructions in > __lll_lock_wait_private, including register values, as each instruction > executes. The hang happens after taking the futex syscall, t0-t3 get set to > 0x0, and the following "ll v0,0(s0)" is what hangs. In gcc-4.7 and earlier, > that 'll' is actually "li v0,2", though control never passes into > __lll_lock_wait_private in the first place. > > There's also a PNG attached to that bug of the disassembled asm in WinMerge > they shows what insns actually changed. Someone who understands MIPS asm > ordering might be able to make something of that. > > > > @@ -13014,7 +13023,8 @@ mips_process_sync_loop (rtx insn, rtx *operands) > >mips_multi_copy_insn (tmp3_insn); > >
Reload generate invalid instruction on ppc64
Hi Vlad When I use ppc64 gcc4.9 to compile an internal application I got an ICE due to an invalid instruction generated by reload. Before IRA, I have following insns: (insn 139 136 581 10 (set (reg:DI 567) (const_int 0 [0])) ./strings/stringpiece.h:205 discrim 1 520 {*movdi_internal64} (expr_list:REG_EQUIV (const_int 0 [0]) (nil))) ... (insn 231 1062 237 24 (set (reg:V2DI 401 [ vect_cst_.7586 ]) (vec_concat:V2DI (reg:DI 235 [ fprint$lo_ ]) (reg:DI 567))) 1066 {vsx_concat_v2di} (expr_list:REG_DEAD (reg:DI 235 [ fprint$lo_ ]) (expr_list:REG_EQUAL (vec_concat:V2DI (reg:DI 235 [ fprint$lo_ ]) (const_int 0 [0])) (nil IRA decides register r567 should be spilled into memory a48(r567,l0) -- assign memory 48:r567 l0 mem Later reload pass reload constant 0 into hard floating point r42 directly (insn 1086 1062 231 24 (set (reg:DI 42 10) (const_int 0 [0])) 520 {*movdi_internal64} (nil)) (insn 231 1086 237 24 (set (reg:V2DI 32 0 [orig:401 vect_cst_.7586 ] [401]) (vec_concat:V2DI (reg:DI 32 0 [orig:235 fprint$lo_ ] [235]) (reg:DI 42 10))) 1066 {vsx_concat_v2di} (expr_list:REG_EQUAL (vec_concat:V2DI (reg:DI 32 0 [orig:235 fprint$lo_ ] [235]) (const_int 0 [0])) (nil))) Unfortunately insn 1086 doesn't satisfy the constraints of insn pattern movdi_internal64, so causing an ICE in later pass. The reload insn 1086 is generated by following code in function reload1.c:gen_reload /* If IN is a simple operand, use gen_move_insn. */ else if (OBJECT_P (in) || GET_CODE (in) == SUBREG) { tem = emit_insn (gen_move_insn (out, in)); /* IN may contain a LABEL_REF, if so add a REG_LABEL_OPERAND note. */ mark_jump_label (in, tem, 0); } After generating the move insn, reload doesn't check if the insn is valid or if it can satisfy constraints. On ppc64 we can't load a constant into fp/vsx register directly. Is it a bug here? thanks a lot Guozhi Wei