[ANN] C++Now 2014: 5 Days to Submissions Deadline
Hi,

Only 5 days left before the submissions deadline for C++Now 2014!

C++Now is a general C++ conference for C++ experts and enthusiasts. It is not specific to any library/framework or compiler vendor and has three tracks with presentations ranging from hands-on, practical tutorials to advanced C++ design and development techniques. For more information about C++Now, see the conference's website:

http://cppnow.org/about/

Have you learned something interesting about C++ (e.g., a new technique possible in C++11)? Or maybe you have implemented something cool related to C++ (e.g., a C++ library)? If so, consider sharing it with other C++ enthusiasts by giving a talk at C++Now 2014. For more information on possible topics, formats, etc., see the call for submissions:

http://cppnow.org/2013/10/21/2014-call-for-submissions/

Boris
Rust front-end to GCC
Hey all,

Some of you may have noticed the gccrs branch on the git mirror. At PyCon IE 2013 I gave a talk on my Python front-end pet project and heard about Rust from a few people. I had never really looked at it before then, but I've kind of been hooked since.

So to learn the language I've been writing this front-end to GCC. It has only really been a month or so of on-and-off work in between work. It already compiles a lot of Rust with fairly little effort on my side, since GCC is doing loads of the heavy lifting.

Currently it compiles most of the basic stuff, such as a struct, an impl block, while loops, functions, expressions, calling methods, passing arguments, etc. I am currently focusing on getting the typing working correctly to support & and ~, and looking at how templates might work. I also still need to implement break and return.

There is still a lot of work, but I would really like to share it and see what people think. Personally I think Rust will target GCC very well and be a good addition (if / when it works). I really want to try and give back to this community, who have been very good to me in learning over the last few years with GSoC.

To give the gist, what I am compiling in my tests is something like:

    fn fib1 (n:int) -> int {
        if (n <= 1) { 1 }
        else { n * fib1 (n - 1) }
    }

    fn fib2 (n:int) -> int {
        let mut i = 1;
        let mut result = 1;
        while (i <= n) {
            result = result * i;
            i = i + 1;
        }
        result
    }

    fn main () {
        fib1 (10);
        fib2 (10);
    }

Or

    struct mytype {
        x : int
    }

    impl mytype {
        fn test (self) -> int {
            println ("yyoyoyo");
            test2 (1)
        }
    }

    fn main () {
        let x = mytype { x : 1 };
        let z = x.x;
        let y = x.test ();
        let a = test2 (y);
    }

    fn test2 (x : int) -> int {
        let z = x;
        1 + z
    }

These are both pretty abstract test cases, but they are the ones I just got working a while ago. There is lots more work to do on it, but I feel these 2 test cases working is kind of a milestone for me.

I will start a wiki page on the project. The code I work on is at

http://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/gccrs

and I have it on GitHub first, mostly for Travis CI and so I can do a bunch of commits and rebase etc.:

http://github.com/redbrain/gccrs

Thanks

--Phil
Re: [rust-dev] Rust front-end to GCC
On Tue, Dec 3, 2013 at 12:22 PM, Philip Herron wrote:
> Hey all
>
> Some of you may have noticed the gccrs branch on the git mirror. Since PyCon
> IE 2013 i gave a talk on my Python Front-end pet project and heard about
> rust by a few people and i never really looked at it before until then but
> i've kind of been hooked since.
>
> So to learn the language i've been writing this front-end to GCC. Only
> really a a month or so on and off work in between work. Currently it
> compiles alot of rust already in fairly little effort on my side GCC is
> doing loads of the heavy lifting.
>
> Currently it compiles most of the basic stuff such as a struct an impl block
> while loop, functions expressions calling methods passing arguments etc.
> Currently focusing on getting the typing working correctly to support & and
> ~ and look at how templates might work as well as need to implement break
> and return.
>
> There is still a lot of work but i would really like to share it and see
> what people think. Personally i think rust will target GCC very well and be
> a good addition (if / when it works). I really want to try and give back to
> this community who have been very good to me in learning over the last few
> years with GSOC.
>
> To get a jist of what i am compiling in my tests are something like:
>
> fn fib1 (n:int) -> int {
>     if (n <= 1) { 1 }
>     else { n * fib1 (n - 1) }
> }
>
> fn fib2 (n:int) -> int {
>     let mut i = 1;
>     let mut result = 1;
>     while (i <= n) {
>         result = result * i;
>         i = i + 1;
>     }
>     result
> }
>
> fn main () {
>     fib1 (10);
>     fib2 (10);
> }
>
> Or
>
> struct mytype {
>     x : int
> }
>
> impl mytype {
>     fn test (self) -> int {
>         println ("yyoyoyo");
>         test2 (1)
>     }
> }
>
> fn main () {
>     let x = mytype { x : 1 };
>     let z = x.x;
>     let y = x.test ();
>     let a = test2 (y);
> }
>
> fn test2 (x : int) -> int {
>     let z = x;
>     1 + z
> }
>
> Theses are both pretty abstract test cases but were the ones i just made
> work a while ago. Lots more work to do on it but i feel these 2 test cases
> working is kind of a mile stone for me.
>
> I will start a wiki page on the project and the code i work on is at
> http://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/gccrs and i have
> it on github first mostly for travis CI and so i can do a bunch of commits
> and rebase etc http://github.com/redbrain/gccrs
>
> Thanks
>
> --Phil

I think another backend would be very valuable, but not another implementation of the rest of the compiler. I would expect a gcc backend to be based on libsyntax/librustc with an alternate backend for `trans` and `back`.

Implementing a fully compatible compiler with a full parser/macros, metadata, type-checking, borrow checking, liveness checking, type inference, privacy/reachability, etc. doesn't seem feasible. The upstream compiler already has hundreds of issues to fix in these areas.
Truncate optimisation question
Hi all,

I'm investigating a testsuite failure on arm: gcc.target/arm/unsigned-extend-1.c

For code:

unsigned char foo (unsigned char c)
{
   return (c >= '0') && (c <= '9');
}

we end up generating:

        sub     r0, r0, #48
        uxtb    r0, r0
        cmp     r0, #9
        movhi   r0, #0
        movls   r0, #1
        bx      lr

The extra uxtb (extend) is causing the test to fail. We started generating the extra extend when a particular optimisation went in with (revision r191928).

The comment in simplify-rtx.c says it transforms

(truncate:SI (op:DI (x:DI) (y:DI)))

into

(op:SI (truncate:SI (x:DI)) (truncate:SI (x:DI)))

but from what I can see it also transforms

(truncate:QI (op:SI (x:SI) (y:SI)))

into

(op:QI (truncate:QI (x:SI)) (truncate:QI (x:SI)))

From the combine dump I see that the sub and extend operations come from the RTL:

(insn 6 3 7 2 (set (reg:SI 116)
        (plus:SI (reg:SI 0 r0 [ c ])
            (const_int -48 [0xffd0])))

(insn 7 6 8 2 (set (reg:SI 117)
        (zero_extend:SI (subreg:QI (reg:SI 116) 0)))

If I add a QImode compare pattern to the arm backend it gets matched and the extra extend goes away, but it seems to me that that's not the correct solution. Ideally, if a QImode operation is performed as an SImode operation on a target (like the sub and compare operations on arm) then we should not be doing this optimisation?

My question is, how does one express that information in the simplify-rtx.c code? It seems that the PR that optimisation fixed (54457) only cared about DI -> SI truncations, so perhaps we should disable it for conversions between other modes where it's not beneficial altogether?

Thanks,
Kyrill
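To see why the uxtb is redundant for this particular test case, here is a minimal C sketch, not taken from the testsuite. It models the range check three ways: the original source, the SImode sequence without the extend (sub, cmp), and the sequence with the extend (sub, uxtb, cmp). It assumes, as the message implies, that the incoming argument is already zero-extended in r0. The names foo_wide, foo_narrow, count_disagreements and check_range_test are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The test case from the message, unchanged. */
unsigned char foo(unsigned char c)
{
    return (c >= '0') && (c <= '9');
}

/* SImode view: subtract 48 in a full 32-bit register and compare unsigned.
   For c < '0' the subtraction wraps to a huge value, so the test fails. */
unsigned char foo_wide(unsigned char c)
{
    uint32_t r = (uint32_t)c - 48u;              /* sub r0, r0, #48 */
    return r <= 9u;                              /* cmp r0, #9      */
}

/* QImode view: the same subtraction, truncated to 8 bits and zero-extended
   before the compare, which is what the extra uxtb does. */
unsigned char foo_narrow(unsigned char c)
{
    uint32_t r = (uint8_t)((uint32_t)c - 48u);   /* sub, then uxtb  */
    return r <= 9u;                              /* cmp r0, #9      */
}

/* Count the inputs on which either model disagrees with the source. */
int count_disagreements(void)
{
    int bad = 0;
    for (unsigned c = 0; c <= 255; c++) {
        unsigned char ref = foo((unsigned char)c);
        if (foo_wide((unsigned char)c) != ref)
            bad++;
        if (foo_narrow((unsigned char)c) != ref)
            bad++;
    }
    return bad;
}

/* All 256 possible inputs give the same answer with or without the extend. */
void check_range_test(void)
{
    assert(count_disagreements() == 0);
}
```

Calling check_range_test() from a small main() exercises every input. The two models agree only because 9 is far below 208, the smallest value the wrapped byte can take for c < '0'; the sketch says nothing about other constants or other comparisons.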
Re: Truncate optimisation question
> For code:
>
> unsigned char foo (unsigned char c)
> {
>    return (c >= '0') && (c <= '9');
> }
>
> we end up generating:
>
>         sub     r0, r0, #48
>         uxtb    r0, r0
>         cmp     r0, #9
>         movhi   r0, #0
>         movls   r0, #1
>         bx      lr
>
> The extra uxtb (extend) is causing the test to fail. We started generating
> the extra extend when a particular optimisation went in with (revision
> r191928).

That's PR rtl-optimization/58295. We have pessimizations on the SPARC too.

> The comment in simplify-rtx.c says it transforms
> (truncate:SI (op:DI (x:DI) (y:DI)))
>
> into
>
> (op:SI (truncate:SI (x:DI)) (truncate:SI (x:DI)))
>
> but from what I can see it also transforms
>
> (truncate:QI (op:SI (x:SI) (y:SI)))
>
> into
>
> (op:QI (truncate:QI (x:SI)) (truncate:QI (x:SI)))
>
> From the combine dump I see that the sub and extend operations come from
> the RTL:
>
> (insn 6 3 7 2 (set (reg:SI 116)
>         (plus:SI (reg:SI 0 r0 [ c ])
>             (const_int -48 [0xffd0])))
>
> (insn 7 6 8 2 (set (reg:SI 117)
>         (zero_extend:SI (subreg:QI (reg:SI 116) 0)))
>
>
> If I add a QImode compare pattern to the arm backend it gets matched and the
> extra extend goes away, but it seems to me that that's not the correct
> solution. Ideally, if a QImode operation is performed as an SImode
> operation on a target (like the sub and compare operations on arm) then we
> should not be doing this optimisation?

Yes, that's my opinion as well, but Richard (CCed) seemed to disagree. In any case, that's certainly not a simplification.

> My question is, how does one express that information in the simplify-rtx.c
> code? It seems that the PR that optimisation fixed (54457) only cared about
> DI -> SI truncations, so perhaps we should disable it for conversions
> between other modes where it's not beneficial altogether?

I can think of 3 possible solutions:
 - WORD_REGISTER_OPERATIONS,
 - promote_mode,
 - optabs.

The 3rd solution seems to be the most straightforward, but this would be the first time that we test optabs from simplify-rtx.c.

--
Eric Botcazou
Re: Truncate optimisation question
Eric Botcazou writes:
>> For code:
>>
>> unsigned char foo (unsigned char c)
>> {
>>    return (c >= '0') && (c <= '9');
>> }
>>
>> we end up generating:
>>
>>         sub     r0, r0, #48
>>         uxtb    r0, r0
>>         cmp     r0, #9
>>         movhi   r0, #0
>>         movls   r0, #1
>>         bx      lr
>>
>> The extra uxtb (extend) is causing the test to fail. We started generating
>> the extra extend when a particular optimisation went in with (revision
>> r191928).
>
> That's PR rtl-optimization/58295. We have pessimizations on the SPARC too.
>
>> The comment in simplify-rtx.c says it transforms
>> (truncate:SI (op:DI (x:DI) (y:DI)))
>>
>> into
>>
>> (op:SI (truncate:SI (x:DI)) (truncate:SI (x:DI)))
>>
>> but from what I can see it also transforms
>>
>> (truncate:QI (op:SI (x:SI) (y:SI)))
>>
>> into
>>
>> (op:QI (truncate:QI (x:SI)) (truncate:QI (x:SI)))
>>
>> From the combine dump I see that the sub and extend operations come from
>> the RTL:
>>
>> (insn 6 3 7 2 (set (reg:SI 116)
>>         (plus:SI (reg:SI 0 r0 [ c ])
>>             (const_int -48 [0xffd0])))
>>
>> (insn 7 6 8 2 (set (reg:SI 117)
>>         (zero_extend:SI (subreg:QI (reg:SI 116) 0)))
>>
>>
>> If I add a QImode compare pattern to the arm backend it gets matched and the
>> extra extend goes away, but it seems to me that that's not the correct
>> solution. Ideally, if a QImode operation is performed as an SImode
>> operation on a target (like the sub and compare operations on arm) then we
>> should not be doing this optimisation?
>
> Yes, that's my opinion as well, but Richard (CCed) seemed to disagree. In any
> case, that's certainly not a simplification.
>
>> My question is, how does one express that information in the simplify-rtx.c
>> code? It seems that the PR that optimisation fixed (54457) only cared about
>> DI -> SI truncations, so perhaps we should disable it for conversions
>> between other modes where it's not beneficial altogether?
>
> I can think of 3 possible solutions:
>  - WORD_REGISTER_OPERATIONS,
>  - promote_mode,
>  - optabs.
>
> The 3rd solution seems to be the most straightforward, but this would be the
> first time that we test optabs from simplify-rtx.c.

I don't think this is the way to go. AIUI the problem here isn't that RISC architectures don't have QImode adds as such. If we were going to combine insn 6 and insn 7 _in isolation_ then we would have either:

   (zero_extend:SI (subreg:QI (plus:SI (subreg:QI (reg:SI R))
                                       (const_int X))))

before the patch or:

   (zero_extend:SI (plus:QI (reg:QI R)
                            (const_int X)))

after the patch. And IMO the second form is a simplification over the first. (Despite being a RISC port, MIPS Octeon does have a pattern for zero-extending byte addition, so this isn't entirely academic.)

It sounds like the problem is instead that we're relying on combine to remove redundant zero extensions and that combine's no longer able to do that when folding insns 6 and 7 into the comparison. Is that right? I think it would be better to use a dedicated global pass to remove redundant extensions instead. IIRC there were various attempts to do that.

IMO combine should just be about instruction selection. The patch still seems good to me from that POV.

Thanks,
Richard
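The two forms quoted above denote the same value, which is what makes the rewrite a pure change of representation. A minimal C sketch of that equivalence follows, reading the "before" form as a 32-bit add whose low byte is then zero-extended, and the "after" form as a byte-wide add of already-narrowed operands. The function names and the upper-bit patterns are arbitrary choices for illustration, and the sweep is a spot check rather than a proof.

```c
#include <assert.h>
#include <stdint.h>

/* Form before the patch: add in 32 bits, then keep the low byte and
   zero-extend it, i.e. (zero_extend:SI (subreg:QI (plus:SI ...))). */
uint32_t form_before(uint32_t r, uint32_t x)
{
    return (uint8_t)(r + x);
}

/* Form after the patch: narrow both operands first and add modulo 256,
   i.e. (zero_extend:SI (plus:QI (reg:QI R) (const_int X))). */
uint32_t form_after(uint32_t r, uint32_t x)
{
    uint8_t rq = (uint8_t)r;
    uint8_t xq = (uint8_t)x;
    return (uint8_t)(rq + xq);
}

/* Try every pair of low bytes under a few upper-bit patterns and count
   the cases where the two forms differ. */
int count_mismatches(void)
{
    static const uint32_t high[] = {
        0x00000000u, 0x0000ff00u, 0xffffff00u, 0x12345600u
    };
    int bad = 0;
    for (unsigned h = 0; h < sizeof high / sizeof high[0]; h++)
        for (uint32_t r = 0; r < 256; r++)
            for (uint32_t x = 0; x < 256; x++)
                if (form_before(high[h] | r, high[h] | x)
                    != form_after(high[h] | r, high[h] | x))
                    bad++;
    return bad;
}

/* The upper bits of either operand never influence the low byte of a sum. */
void check_forms(void)
{
    assert(count_mismatches() == 0);
}
```

Neither form says anything about cost on a given target; the sketch only shows that they are interchangeable as descriptions of the result.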
Re: Truncate optimisation question
Eric Botcazou wrote:
>> For code:
>>
>> unsigned char foo (unsigned char c)
>> {
>>    return (c >= '0') && (c <= '9');
>> }
>>
>> we end up generating:
>>
>>         sub     r0, r0, #48
>>         uxtb    r0, r0
>>         cmp     r0, #9
>>         movhi   r0, #0
>>         movls   r0, #1
>>         bx      lr
>>
>> The extra uxtb (extend) is causing the test to fail. We started generating
>> the extra extend when a particular optimisation went in with (revision
>> r191928).
>
> That's PR rtl-optimization/58295. We have pessimizations on the SPARC too.
>
>> The comment in simplify-rtx.c says it transforms
>> (truncate:SI (op:DI (x:DI) (y:DI)))
>>
>> into
>>
>> (op:SI (truncate:SI (x:DI)) (truncate:SI (x:DI)))
>>
>> but from what I can see it also transforms
>>
>> (truncate:QI (op:SI (x:SI) (y:SI)))
>>
>> into
>>
>> (op:QI (truncate:QI (x:SI)) (truncate:QI (x:SI)))
>>
>> From the combine dump I see that the sub and extend operations come from
>> the RTL:
>>
>> (insn 6 3 7 2 (set (reg:SI 116)
>>         (plus:SI (reg:SI 0 r0 [ c ])
>>             (const_int -48 [0xffd0])))
>>
>> (insn 7 6 8 2 (set (reg:SI 117)
>>         (zero_extend:SI (subreg:QI (reg:SI 116) 0)))
>>
>>
>> If I add a QImode compare pattern to the arm backend it gets matched and the
>> extra extend goes away, but it seems to me that that's not the correct
>> solution. Ideally, if a QImode operation is performed as an SImode
>> operation on a target (like the sub and compare operations on arm) then we
>> should not be doing this optimisation?
>
> Yes, that's my opinion as well, but Richard (CCed) seemed to disagree. In any
> case, that's certainly not a simplification.
>
>> My question is, how does one express that information in the simplify-rtx.c
>> code? It seems that the PR that optimisation fixed (54457) only cared about
>> DI -> SI truncations, so perhaps we should disable it for conversions
>> between other modes where it's not beneficial altogether?
>
> I can think of 3 possible solutions:
>  - WORD_REGISTER_OPERATIONS,
>  - promote_mode,
>  - optabs.
>
> The 3rd solution seems to be the most straightforward, but this would be the
> first time that we test optabs from simplify-rtx.c.

To me promote_mode sounds like the best fit. But doesn't combine do instruction validation? So in this case the target claims to support the narrow operation?

Richard.
Appointment as SLSR maintainer
Sorry for the delay on this, it fell off my radar as we wrapped up stage1 development.

--

I am pleased to announce that the GCC Steering Committee has accepted you as the maintainer for the SLSR optimization pass.

Please update your listing in the MAINTAINERS file and congratulations on the new role.

Thanks,
Jeff
Re: Truncate optimisation question
> I don't think this is the way to go. AIUI the problem here isn't that
> RISC architectures don't have QImode adds as such. If we were going
> to combine insn 6 and insn 7 _in isolation_ then we would have either:
>
>    (zero_extend:SI (subreg:QI (plus:SI (subreg:QI (reg:SI R))
>                                        (const_int X))))
>
> before the patch or:
>
>    (zero_extend:SI (plus:QI (reg:QI R)
>                             (const_int X)))
>
> after the patch. And IMO the second form is a simplification over the
> first. (Despite being a RISC port, MIPS Octeon does have a pattern for
> zero-extending byte addition, so this isn't entirely academic.)

Well, if you look at the transformation in isolation, you cannot reasonably say that it's a simplification for most RISC architectures either. That it happens to help zero-extensions on MIPS is fine, but it pessimizes other cases on other architectures (and maybe on MIPS as well).

> It sounds like the problem is instead that we're relying on combine
> to remove redundant zero extensions and that combine's no longer able
> to do that when folding insns 6 and 7 into the comparison. Is that right?
> I think it would better to use a dedicated global pass to remove redundant
> extensions instead. IIRC there were various attempts to do that.

The regressions affect 4.8 and mainline and it's probably too late to add new passes in order to mitigate them.

> IMO combine should just be about instruction selection. The patch
> still seems good to me from that POV.

The patch is in simplify-rtx.c though, not in combine.c, so it's more general.

--
Eric Botcazou
Re: Truncate optimisation question
> To me promote_mode sounds like the best fit. But doesn't combine do
> instruction validation? So in this case the target claims to support the
> narrow operation?

Part of the problem is that it's not in the combiner, it's in simplify-rtx.c, so it's applied liberally when you're manipulating the RTL.

--
Eric Botcazou
Re: Truncate optimisation question
Eric Botcazou writes:
>> I don't think this is the way to go. AIUI the problem here isn't that
>> RISC architectures don't have QImode adds as such. If we were going
>> to combine insn 6 and insn 7 _in isolation_ then we would have either:
>>
>>    (zero_extend:SI (subreg:QI (plus:SI (subreg:QI (reg:SI R))
>>                                        (const_int X))))
>>
>> before the patch or:
>>
>>    (zero_extend:SI (plus:QI (reg:QI R)
>>                             (const_int X)))
>>
>> after the patch. And IMO the second form is a simplification over the
>> first. (Despite being a RISC port, MIPS Octeon does have a pattern for
>> zero-extending byte addition, so this isn't entirely academic.)
>
> Well, if you look at the transformation in isolation, you cannot
> reasonably say that it's a simplification for most RISC architectures
> either. That it happens to help zero-extensions on MIPS is fine, but
> it pessimizes other cases on other architectures (and maybe on MIPS as
> well).

I think we're talking about different meanings of "simplification". simplify-rtx.c can't and IMO shouldn't be making decisions about simplicity in the sense of "X is cheaper on the target than Y" or even "the target has an optab for X but not Y". But it should try wherever possible to (a) allow recursive simplification and (b) reduce the number of ways that the same thing can be represented. And the patch did that. E.g. the old representation quoted above allowed the constant to have redundant upper bits. The new representation causes them to be removed, so we have a predictable constant.

Combine is asking simplify-rtx.c to truncate an addition to QImode and simplify-rtx.c is providing a reasonable representation of that. It's the representation we should use when matching against .md patterns, for example. The problem is that combine doesn't want to keep the truncation in this case, but doesn't know that yet.

>> It sounds like the problem is instead that we're relying on combine
>> to remove redundant zero extensions and that combine's no longer able
>> to do that when folding insns 6 and 7 into the comparison. Is that right?
>> I think it would better to use a dedicated global pass to remove redundant
>> extensions instead. IIRC there were various attempts to do that.
>
> The regressions affect 4.8 and mainline and it's probably too late to add new
> passes in order to mitigate them.
>
>> IMO combine should just be about instruction selection. The patch
>> still seems good to me from that POV.
>
> The patch is in simplify-rtx.c though, not in combine.c, so it's more general.

Right, but the only complaint I know of is about its effect on combine. And my point is that that complaint isn't about combine failing to combine instructions per se. It's that combine is failing to remove a redundant operation.

With the right input, the same rtl sequence could conceivably be generated on a CISC target like x86_64, since it defines all the required patterns (SImode addition, QI->SI zero extension, SImode comparison). It could also occur with a sequence that starts out as a QImode addition. So trying to make the simplification depend on CISCness seems like papering over the problem.

If we want to keep this as a combine optimisation for 4.9 then I think we should teach it to handle zero-extended arithmetic too. But there again, I stood down as an rtl maintainer for a reason. :-) If you think the patch was wrong or if you feel the fallout is too great then please feel free to revert it.

Thanks,
Richard
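The point about redundant upper bits in the constant can be illustrated with a small C sketch. It assumes that a byte-mode constant is identified by its low 8 bits and that the canonical spelling is the sign-extended one (my understanding of how GCC stores integer constants for a mode; treat it as an assumption here). The helpers canonical_qi, add_qi and check_constants are hypothetical names, not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a constant to a single byte-mode spelling: keep the low 8 bits
   and sign-extend them, written without implementation-defined casts. */
int canonical_qi(int64_t x)
{
    int v = (int)(x & 0xff);
    return v >= 0x80 ? v - 0x100 : v;
}

/* Byte addition only ever looks at the low 8 bits of the constant. */
uint8_t add_qi(uint8_t r, int64_t x)
{
    return (uint8_t)(r + (unsigned)(x & 0xff));
}

void check_constants(void)
{
    /* Three spellings of the same byte constant collapse to one. */
    assert(canonical_qi(-48) == -48);
    assert(canonical_qi(0xd0) == -48);
    assert(canonical_qi(0xffd0) == -48);

    /* And they all behave identically in a byte-wide add. */
    for (unsigned r = 0; r <= 255; r++) {
        assert(add_qi((uint8_t)r, -48) == add_qi((uint8_t)r, 0xd0));
        assert(add_qi((uint8_t)r, -48) == add_qi((uint8_t)r, 0xffd0));
    }
}
```

With the wider SImode addition, -48, 0xd0 and 0xffd0 are three different operands that happen to give the same low byte. Once the addition itself is expressed in QImode there is only one constant left to match against, which is the "predictable constant" referred to above.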