Re: Hints for backporting gcc 4.5 powerpc fix to gcc 4.4.3?
On 22 March 2011 14:56, David Edelsohn wrote:
>
> On Tue, Mar 22, 2011 at 9:25 AM, Simon Baldwin wrote:
> > I'm currently trying to backport a small part of gcc 4.5 r151729 to
> > gcc 4.4.3. This revision fixes a problem in powerpc code generation
> > that leads to gcc not using lmw/stmw instructions in function prologue
> > and epilogues, where it could otherwise validly use them.
> >
> > On the face of things, the central piece of r151729 I seem to want is just
> > this:
> >
> > Index: gcc/config/rs6000/rs6000.c
> > ===
> > --- gcc/config/rs6000/rs6000.c (revision 151728)
> > +++ gcc/config/rs6000/rs6000.c (revision 151729)
> > @@ -18033,7 +18033,8 @@ static bool
> >  no_global_regs_above (int first, bool gpr)
> >  {
> >    int i;
> > -  for (i = first; i < gpr ? 32 : 64 ; i++)
> > +  int last = gpr ? 32 : 64;
> > +  for (i = first; i < last; i++)
> >      if (global_regs[i])
> >        return false;
> >    return true;
> >
> > Taking only that and leaving out all of the rest of r151729 lets me
> > build a powerpc gcc that does use lmw/stmw instructions in function
> > prologue and epilogues as hoped. Unfortunately it also has bad
> > codegen elsewhere. So it seems I need more than just this little
> > piece of r151729. Unfortunately, r151729 is a fairly large patch that
> > seems to do a number of jobs and which does not apply readily to gcc
> > 4.4. At the moment it's not clear to me what other parts of it I
> > might need.
> >
> > Can anyone here offer any hints or pointers on how to extract from the
> > r151729 diff just the few pieces needed to fix this single powerpc
> > codegen bug in gcc 4.4.3? Anyone recognize this issue and already
> > dealt with it in isolation?
>
> The change to no_global_regs_above() is one of the key pieces, but
> that change exposed other latent bugs, as you have encountered. One
> needs the additional patches to the save/restore strategy routines and
> prologue/epilogue. This is why the entire patch was committed in one
> piece.
Thanks for the reply, David. I'll take another look and see if I can abstract out just the required pieces. In practice, though, it looks like it may be easier for me to just upgrade to gcc 4.5 or 4.6. Certainly safer.
Modifying instruction flow during scheduling
Hi,

I would like to experiment with modifications to the instruction flow during scheduling. One motivation for doing that is the combining of contiguous loads, as was discussed here: http://gcc.gnu.org/ml/gcc/2010-12/msg00153.html

I've seen that the scheduler itself makes some modifications to the instruction flow to introduce the speculative form of the instructions, but is it prepared for modifications to the instructions made from the target hooks (TARGET_SCHED_REORDER)?

If yes, what are the primitives that one should use to notify the scheduler of modifications?
If no, what would it take to make that possible?

Thanks,
Fred
Complex vectorization
Hi,

I'm currently working on a way to use the SIMD instructions of the SSEx family when computing on a vector of complex numbers.

I have to say that I have never worked on compilation techniques before, and that I have only a little understanding of the vectorization problems.

I've spent a fair amount of time reading documentation and code, and I came to the conclusion that, at least for the multiplication and division of complex numbers, I have to implement them as functions in libgcc, like their scalar counterparts __mul*c3 and __div*c3.

I face a couple of issues here: what are the C types corresponding to the vector types, assuming they exist?

Also important: I understand that the processor has to be in a certain state to use the SIMD instructions. Will it stay that way when calling a function, which partially changes the environment?

Have a nice day,
Simon Chopin
Re: Complex vectorization
On Thu, Mar 24, 2011 at 12:41 PM, Simon Chopin wrote:
> Hi,
>
> I'm currently working on trying to implement a way to use the SIMD
> instructions of the SSEx family when computing a vector of complex
> numbers.
>
> I have to say that I have never worked on compilation techniques before,
> and that I only have little understanding of the vectorization problems.
>
> I've spent a fair amount of time reading documentation and code, and I
> came to the conclusion that, at least for the multiplication and
> division of complex numbers, I had to implement them as functions in the
> libgcc as their scalar counterpart, __mul*c3 and __div*c3.
>
> I face a couple of issues here : what are the C types corresponding to
> the vector types, assuming they exist ?

There are no vector-of-complex types, and GCC internally does not handle this case either. Instead, GCC lowers complex operations to piecewise scalar operations, so vectorization would operate on vectors of the complex components. There are a number of bugs in Bugzilla for complex vectorization, such as PR37021 and PR40770.

Richard.
gcno file question
Hi,

RTEMS has been using simulators and some programs we wrote for coverage analysis for a while now. I am looking into writing a converter which takes coverage data from simulators and produces .gcno files.

The coverage data is often just a bitmap of which addresses were executed. There is no frequency, just yes/no. We can already map that information back to file/line.

+ Is this enough to produce a .gcno file from?
+ What records need to be generated as a minimum?

As a technical sanity question, the RTEMS code is in a library and we are merging coverage data from multiple executables to get unified coverage data. We abstract away physical address into offsets into methods and file/line. Does generating a .gcno from this merged data sound feasible?

Thoughts, insights, comments appreciated.

Thanks.

--
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985
Re: inline assembly vs. intrinsic functions
2011/3/22 Ian Lance Taylor :
> roy rosen writes:
>
>> 2010/10/26 Ian Lance Taylor :
>>> roy rosen writes:
>>>
>>>> I am trying to demonstrate my port capabilities.
>>>> I am writing an application which needs to use instructions like
>>>> max a,b,c,d,e,f
>>>> where a,b,c are inputs and d,e,f are outputs.
>>>> Is that possible to write an intrinsic function for that?
>>>> I think not because that means that I need to pass d,e,f by reference
>>>> which means that they would be in memory and not in a register as
>>>> meant by the instruction.
>>>
>>> That is correct. An intrinsic function is a normal function. If you
>>> want it to have multiple outputs, you need to pass in addresses, or you
>>> need to have it return a struct.
>>>
>>> I'm a bit curious as to why a function named max would have multiple
>>> outputs.
>>>
>>>> Is there any port with such an example?
>>>
>>> Not to my knowledge. I wrote a private port in which some intrinsics
>>> returned a struct, and to keep everything out of memory I added
>>> additional intrinsics to retrieve elements of the struct. It's awkward
>>> to use but the resulting code is fine.
>>>
>> Can you please explain how this solution should work?
>> First a code with memory accesses would be generated and then
>> optimizations would optimize it to use registers directly?
>
> You build a RECORD_TYPE holding the fields you want to return. You
> define the appropriate builtin functions to return that record type.

How is that done? using define_insn? How do I tell it to return a struct?
Is there an example I can look at?

Roy.

> You define another builtin function for each field, which takes the
> RECORD_TYPE as its argument and returns the type of the field. In
> TARGET_FOLD_BUILTIN you convert the per-field functions into
> COMPONENT_REFs.
>
> Ian
>
Re: Complex vectorization
On Thu, Mar 24, 2011 at 12:55:44PM +0100, Richard Guenther wrote:
> There are no vector of complex types and GCC internally does not handle
> this case as well. Instead GCC lowers complex operations to

Yep, sorry, my mistake. I meant array of complex.

> piecewise scalar operations, thus vectorization would have vectors
> of the complex components. There are a number of bugs in bugzilla
> for complex vectorization, like PR37021 or PR40770.

And yet, when trying to multiply numbers, gcc says that complex isn't a supported type. From the links you provided, part of the solution would be to associate the complex type with the vector type of its scalar type. While that should work for most operations, given support for IMAGPART_EXPR and REALPART_EXPR, the multiplication and division operations are implemented as separate functions in libgcc. Because of that, they wouldn't gain from the vectorization, or am I mistaken (again)?

Cheers,

Simon

P.S. I am not aware of the list policy regarding CCs, but I assumed you were already subscribed.
Re: Complex vectorization
On Thu, Mar 24, 2011 at 2:50 PM, Simon Chopin wrote:
> On Thu, Mar 24, 2011 at 12:55:44PM +0100, Richard Guenther wrote:
>> There are no vector of complex types and GCC internally does not handle
>> this case as well. Instead GCC lowers complex operations to
> Yep, sorry, my mistake. I meant array of complex.
>> piecewise scalar operations, thus vectorization would have vectors
>> of the complex components. There are a number of bugs in bugzilla
>> for complex vectorization, like PR37021 or PR40770.
>
> And yet, when trying to multiply numbers, gcc says that complex isn't a
> supported type.

_Complex float should work. Complex is a C99 feature and requires you to include complex.h.

> From the links you provided, part of the solution would
> be to associate the complex type and the vector type of its scalar type.
> While it should work for most operations, providing support for the
> IMAGPART_EXPR and REALPART_EXPR, the multiplication and division
> operations are implemented as separated functions of libgcc. Because of
> that, they wouldn't gain from the vectorization, or I am mistaken
> (again) ?

Multiplication is inlined with -fcx-fortran-rules, for example. Yes, division is always out-of-line.

Richard.

> Cheers,
>
> Simon
>
> P.S. I am not aware of the list policy regarding the CCs, but I assumed
> you were already subscribed.
>
Re: mov arguments are still the same
Let me revive this thread and ask for suggestions/tips on the issue below.

Cheers,
PMatos

On 16/03/11 18:19, Paulo J. Matos wrote:

Hi,

I have touched on this subject before: http://thread.gmane.org/gmane.comp.gcc.devel/116198/focus=116200

At the time I didn't pursue the issue, but with 4.4.4 it keeps happening, and I traced it to cprop_hardreg replacing a register, which leaves the set insn with the same source and dest.

Here's insn 32 from pass 183:ce3

(insn 32 31 33 4 h.c:51 (set (reg:QI 1 AL)
        (reg/f:QI 8 @H'fff9 [33])) 4 {*movqi} (expr_list:REG_DEAD (reg/f:QI 8 @H'fff9 [33])
        (expr_list:REG_EQUAL (plus:QI (reg/f:QI 6 Y)
                (const_int 1 [0x1]))
            (nil

Now the same insn after the following pass, 185:cprop_hardreg

insn 32: replaced reg 8 with 1
...
(insn 32 31 33 4 h.c:51 (set (reg:QI 1 AL)
        (reg/f:QI 1 AL [33])) 4 {*movqi} (expr_list:REG_DEAD (reg/f:QI 8 @H'fff9 [33])
        (expr_list:REG_EQUAL (plus:QI (reg/f:QI 6 Y)
                (const_int 1 [0x1]))
            (nil

This stays as is until assembler generation, which is really annoying because it generates an instruction (which is basically a nop) like:

...
mov AL,AL
...

Is this a known issue? I can't see how this is a problem with my backend, but it might well be. Maybe cprop_regmove should check whether it is making the src equal to the dest and, if it is, remove the insn?

Any suggestions?

--
PMatos
Re: inline assembly vs. intrinsic functions
roy rosen writes:

>> You build a RECORD_TYPE holding the fields you want to return. You
>> define the appropriate builtin functions to return that record type.
>
> How is that done? using define_insn? How do I tell it to return a struct?
> Is there an example I can look at?

A RECORD_TYPE is what gcc generates when you define a struct in your source code. For an example of a backend building a struct, see, e.g., ix86_build_builtin_va_list_abi.

When you define your builtin functions in TARGET_INIT_BUILTINS you specify the argument types and the return type, typically by building a FUNCTION_TYPE and passing it to add_builtin_function. To define a builtin which returns a struct, just arrange for the return type of the FUNCTION_TYPE that you pass to add_builtin_function to be the RECORD_TYPE that you built.

Ian
mt-ospace usage in m32r and fr30
These targets are using -Os to build target libraries. Perhaps the right thing to do instead would be to disable some optimizations selectively in the compiler? Thanks! Paolo
Re: Complex vectorization
On 03/24/2011 07:47 AM, Richard Guenther wrote: > Multiplication is inlined for -fcx-fortran-rules for example. Yes, division > is always out-of-line. Division is inlined with -fcx-limited-range. r~
Re: Second GCC 4.6.0 release candidate is now available
On Tue, Mar 22, 2011 at 11:12 AM, Jakub Jelinek wrote:
> A second GCC 4.6.0 release candidate is available at:
>
> ftp://gcc.gnu.org/pub/gcc/snapshots/4.6.0-RC-20110321/
>
> Please test the tarballs and report any problems to Bugzilla.
> CC me on the bugs if you believe they are regressions from
> previous releases severe enough to block the 4.6.0 release.
>
> If no more blockers appear I'd like to release GCC 4.6.0
> early next week.

The RC bootstraps C, C++, Fortran, Obj-C, and Obj-C++ on ARMv7/Cortex-A9/Thumb-2/NEON, ARMv5T/ARM/softfp, ARMv5T/Thumb/softfp, and ARMv4T/ARM/softfp. I'm afraid I haven't reviewed the test results (Richard? Ramana?).

See:
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02298.html
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02391.html
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02394.html
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02393.html

and:
 http://builds.linaro.org/toolchain/gcc-4.6.0-RC-20110321/logs/

-- Michael
gcc-4.5-20110324 is now available
Snapshot gcc-4.5-20110324 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20110324/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch revision 171425

You'll find:

gcc-4.5-20110324.tar.bz2              Complete GCC (includes all of below)
  MD5=9e8dfb8a5e75b885337699c79b0ed1ba
  SHA1=2a2b6fdbd610d6c8915bdc24f42839a6df8efb13

gcc-core-4.5-20110324.tar.bz2         C front end and core compiler
  MD5=ed1a6ad884f7650ec7de81c6956744dc
  SHA1=e72e476df8bb83bd0406ece833ecb327631aa0c8

gcc-ada-4.5-20110324.tar.bz2          Ada front end and runtime
  MD5=5fe17340ca5d91afc07b572d23b83615
  SHA1=1b26b2ac9c489bdfa1be96cd4469201633e1ae96

gcc-fortran-4.5-20110324.tar.bz2      Fortran front end and runtime
  MD5=9bac9af4671ccd9720f6abd11e4e983e
  SHA1=912e0628587622284fe69274c6dc0fec7a97224e

gcc-g++-4.5-20110324.tar.bz2          C++ front end and runtime
  MD5=221d157533f5fb7fdc82e0acc30a2013
  SHA1=4720c3e002441f5c59998ee39db2e3ac613bdd8b

gcc-go-4.5-20110324.tar.bz2           Go front end and runtime
  MD5=d6d5d5c37ac87a240109966f68066a74
  SHA1=71fac6499f5cffa48c63420dee6497f1346a36a5

gcc-java-4.5-20110324.tar.bz2         Java front end and runtime
  MD5=83e088b55efaac658ecf8a2a1bbe2ed1
  SHA1=02ed9eed4f68d97ccad18194db11eecb94c14d6e

gcc-objc-4.5-20110324.tar.bz2         Objective-C front end and runtime
  MD5=232225b6f202f6356e7046cca5ec19c8
  SHA1=413266384c9e677e86a0fb0b7f55267cdfcffd5b

gcc-testsuite-4.5-20110324.tar.bz2    The GCC testsuite
  MD5=0c2c790e7a59dbb25117dca6927e7666
  SHA1=e61bd8ece43001d57f60eb6fb6e7a95e09768fee

Diffs from 4.5-20110317 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.