RE : [trans-mem] optimization problem with ITM functions
Hello,

Here is the asm of the previous example:

        .globl  _ZN5bench10LinkedList6insertEi
        .type   _ZN5bench10LinkedList6insertEi, @function
_ZN5bench10LinkedList6insertEi:
.LFB46:
        .loc 1 24 0
        .cfi_startproc
        pushq   %r12
        .cfi_def_cfa_offset 16
        pushq   %rbp
        .cfi_def_cfa_offset 24
        pushq   %rbx
        .cfi_def_cfa_offset 32
        subq    $16, %rsp
        .cfi_def_cfa_offset 48
        ...
        .cfi_offset 3, -32
        .cfi_offset 6, -24
        .cfi_offset 12, -16
        call    _ITM_beginTransaction
.L8:
        ...
        call    _ITM_WU8
        .loc 1 46 0 discriminator 4
        addq    $16, %rsp
        .cfi_remember_state
        .cfi_def_cfa_offset 32
        popq    %rbx
        .cfi_def_cfa_offset 24
        popq    %rbp
        .cfi_def_cfa_offset 16
        popq    %r12
        .cfi_def_cfa_offset 8
        .loc 2 56 0 discriminator 4
        jmp     _ITM_commitTransaction

Since _ITM_commitTransaction can jump back to the label .L8, the tail call optimization must not be allowed. Is that right? Can other optimizations cause problems with this longjmp-like behavior of _ITM_commitTransaction? Is the ECF_RETURNS_TWICE flag the right way to solve that? Should I file a bug report?

Thanks.
Patrick Marlier.

From: Patrick Marlier [patrick.marl...@unine.ch]
Sent: Thursday, 20 January 2011 20:42
To: gcc@gcc.gnu.org
Cc: r...@redhat.com; al...@redhat.com; gokcen.kes...@bsc.es
Subject: [trans-mem] optimization problem with ITM functions

Hello,

Attached is the cpp example. While I was trying to understand the problem (a segfault), I found this: in the special_function_p function (calls.c), the ECF_TM_OPS flag is returned for all TM builtin calls except BUILT_IN_TM_START. Question: is this really intentional, or is it an omission? Moreover, since BUILT_IN_TM_START does a setjmp, I suppose it should also add the ECF_RETURNS_TWICE flag. If I add this, the generated code is a bit different (more things happen on the stack, which I suppose is right). BUILT_IN_TM_ABORT is a kind of longjmp, so it should add ECF_NORETURN, right?

Otherwise I have a strange bug with the attached cpp file when _ITM_commitTransaction is the last call of a function at optimization level >= 2. This call is optimized as a tail call, so the epilogue comes before the jmp. But in this specific case, if _ITM_commitTransaction aborts and rolls back, it seems to create a problem (a corrupted stack), though I have not figured out the real reason. To avoid this problem I have added ECF_RETURNS_TWICE for the transaction commit, which prevents this tail call optimization, but I am sure this is not the right way to fix it.

Attached is the patch for these problems.

Thanks for any help.
Patrick Marlier.
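For readers following the setjmp/longjmp analogy in the message above, here is a minimal sketch in plain C. It is not libitm code: `toy_tx`, `toy_commit`, `toy_insert` and `toy_attempts` are hypothetical names, and `setjmp`/`longjmp` only stand in for what _ITM_beginTransaction and an aborting _ITM_commitTransaction are described as doing. It shows why the caller's frame has to stay live until the commit call has returned.

```c
#include <setjmp.h>

/* Hypothetical toy descriptor; libitm's real data structures differ. */
struct toy_tx {
    jmp_buf checkpoint;  /* saved by the "begin" step */
    int attempts;        /* number of times the body was started */
};

/* File scope, so its value stays well defined across longjmp. */
static struct toy_tx tx;

/* Toy commit: the first attempt "conflicts" and rolls back by jumping
   to the checkpoint, much as an aborting commit re-enters the body. */
static void toy_commit(void)
{
    if (tx.attempts < 2)
        longjmp(tx.checkpoint, 1);
}

int toy_insert(int value)
{
    int result;

    tx.attempts = 0;
    /* Plays the role of _ITM_beginTransaction: control can arrive here
       a second time, which is what "returns twice" means. */
    if (setjmp(tx.checkpoint) != 0) {
        /* Re-entered after a rollback: fall through and retry. */
    }
    tx.attempts++;
    result = value + 16;  /* recomputed on every attempt */

    /* toy_commit may jump back to the checkpoint, so this frame has to
       stay live until it returns. Popping the frame first and then
       jumping to toy_commit would leave the checkpoint dangling. */
    toy_commit();
    return result;
}

/* Reports how many times the body of toy_insert was started. */
int toy_attempts(void)
{
    return tx.attempts;
}
```

In this sketch the body runs twice for one call to `toy_insert`, which is only safe because the frame that called `setjmp` still exists when `toy_commit` jumps back into it.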
Why doesn't the vectorizer skip loop peeling/versioning for targets that support hardware misaligned access?
Hello,

Some of our target processors support complete hardware misaligned memory access. I implemented movmisalignm patterns, and found that the TARGET_SUPPORT_VECTOR_MISALIGNMENT (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT on 4.6) hook is based on checking these patterns. Somehow this hook doesn't seem to be used: vect_enhance_data_refs_alignment is called regardless of whether the target has HW misaligned support or not.

Shouldn't using HW misaligned memory access be better than generating extra code for loop peeling/versioning? Or at least, if that is not the case for some architectures, we should have a compiler hook to choose between them. BTW, I mainly work on 4.5; maybe 4.6 has changed.

Thanks,
Bingfeng Mei
Re: Plugin that parse tree
Hello!

I am discovering GCC and its plugin system. I have tried MELT. I would like to say that the lispy syntax is not so difficult. It might look unattractive to have so many parentheses, but we quickly get used to the structure. The harder part for me is getting a good view of the GCC internals, and that's not mainly a question of MELT or not (we could even say that the abstraction of MELT can be a help).

In reality I think the most important feature of MELT is the pattern matching system. I have mainly used it at the gimple level, and it is easy to retrieve an element which has one property or another. For example, it is pretty easy to get a gimple_cond which compares a pointer to null, or such things. It looks interesting for writing tests on the code, adding new warnings... On the other hand, it is not ready for directly modifying the representation.

I can understand that people who are used to C plugins don't want to learn MELT, but I think it might be a good way in for newcomers and the curious.

Best regards
Pierre Vittet

> On Sun, 23 Jan 2011 15:49:48 +0100
> Daniel Marjamäki wrote:
>
>> GCC-MELT is an interesting project. But it seems to be very difficult
>> to write lisp scripts. You don't have a C interface also, do you?
>
> The few people who tried writing MELT code are founding on the contrary
> that coding in MELT is easier than coding in C (even if I agree that
> MELT is not very well documented).
>
> The major MELT idea is on the contrary that coding in MELT, using the
> powerful features provided by the MELT language (pattern matching,
> functional/applicative & object programming styles), is much easier
> than coding in C. And MELT philosophy is indeed that it is a
> higher-level language than C (you don't have any pattern matching in C
> for instance).
>
> So I disagree with the idea that writing MELT code is difficult (and
> harder than writing in C). But it is also matter of taste.
>
>> I would like to see how I can use plain C.
>
> In that case, MELT is not really for you. The selling point of MELT is
> precisely to avoid the low level details of C and to give something
> higher-level to you. If you really want to code in C, don't use MELT.
>
> There is no interface from C to MELT, and there cannot be one... The
> purpose of MELT is to avoid coding plugins in C!
>
>
>>> I want to write a plugin that parse the AST. Could I get some hint
>>> about how to do it?
>
> The major issue is to understand all the details of GCC internal
> representations (i.e. Trees, Gimples). Did you understand them?
>
> Regards
>
>
> --
> Basile STARYNKEVITCH http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mine, sont seulement les miennes} ***
Re: Why doesn't the vectorizer skip loop peeling/versioning for targets that support hardware misaligned access?
Hi,

gcc-ow...@gcc.gnu.org wrote on 24/01/2011 03:21:51 PM:

> Hello,
> Some of our target processors support complete hardware misaligned
> memory access. I implemented movmisalignm patterns, and found
> TARGET_SUPPORT_VECTOR_MISALIGNMENT
> (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
> On 4.6) hook is based on checking these patterns. Somehow this
> hook doesn't seem to be used. vect_enhance_data_refs_alignment
> is called regardless whether the target has HW misaligned support
> or not.

targetm.vectorize.support_vector_misalignment is used in vect_supportable_dr_alignment to decide whether a specific misaligned access is supported.

>
> Shouldn't using HW misaligned memory access be better than
> generating extra code for loop peeling/versioning? Or at least
> if for some architectures it is not the case, we should have
> a compiler hook to choose between them. BTW, I mainly work
> on 4.5, maybe 4.6 has changed.

Right. And we have that implemented in 4.6 at least partially: for known misalignment and for peeling for loads. Maybe this part needs to be enhanced; concrete testcases could help.

Ira

>
> Thanks,
> Bingfeng Mei
>
Re: Why doesn't the vectorizer skip loop peeling/versioning for targets that support hardware misaligned access?
On 1/24/2011 5:21 AM, Bingfeng Mei wrote:

> Hello,
> Some of our target processors support complete hardware misaligned memory access. I implemented movmisalignm patterns, and found TARGET_SUPPORT_VECTOR_MISALIGNMENT (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT On 4.6) hook is based on checking these patterns. Somehow this hook doesn't seem to be used. vect_enhance_data_refs_alignment is called regardless whether the target has HW misaligned support or not.
>
> Shouldn't using HW misaligned memory access be better than generating extra code for loop peeling/versioning? Or at least if for some architectures it is not the case, we should have a compiler hook to choose between them. BTW, I mainly work on 4.5, maybe 4.6 has changed.
>
> Thanks,
> Bingfeng Mei

Peeling for alignment still presents a performance advantage on longer loops for the most common current CPUs. Skipping the peeling is likely to be advantageous for short loops.

I've noticed that 4.6 can vectorize loops with multiple assignments, presumably taking advantage of misalignment support. There's even a better performing choice of instructions for -march=corei7 misaligned access than is taken by other compilers, but that could be an accident.

At this point, I'd like to congratulate the developers for the progress already evident in 4.6.

--
Tim Prince
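To make the trade-off in this thread concrete, here is a hand-written sketch of what "peeling for alignment" amounts to. It is not vectorizer output: `add_one_peeled` and `VEC_BYTES` are hypothetical names, the 16-byte vector width is an assumption, and the unrolled group of four scalar operations only stands in for one aligned vector operation.

```c
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 16u  /* assumed vector width: four 4-byte floats */

/* Adds 1.0f to each of the n elements of a, peeled by hand for alignment. */
void add_one_peeled(float *a, size_t n)
{
    size_t i = 0;

    /* Peeled prologue: scalar iterations until a + i is aligned. */
    while (i < n && ((uintptr_t)(a + i) % VEC_BYTES) != 0) {
        a[i] += 1.0f;
        i++;
    }

    /* Main loop: every group of four starts at an aligned address, so a
       vectorized version could use aligned loads and stores here. */
    for (; i + 4 <= n; i += 4) {
        a[i]     += 1.0f;
        a[i + 1] += 1.0f;
        a[i + 2] += 1.0f;
        a[i + 3] += 1.0f;
    }

    /* Epilogue: leftover elements that do not fill a whole group. */
    for (; i < n; i++)
        a[i] += 1.0f;
}
```

The prologue and epilogue are the extra code the original question refers to. On a target with cheap misaligned accesses, the main loop alone (plus the epilogue) could start at `i = 0`, which is the alternative being discussed; for short loops the prologue can take up a large share of the iterations.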
Re: TREE_LIST removals and cleanups for 4.7
On Sat, Jan 22, 2011 at 08:02:33PM +0100, Michael Matz wrote:
> On Sat, 22 Jan 2011, Nathan Froyd wrote:
> > - Similarly to the work I did for s/TREE_CHAIN/DECL_CHAIN/, I'd like to
> > replace TREE_TYPE for things like {POINTER,FUNCTION,ARRAY}_TYPE, etc.
> > This work would be a good step towards both staticizing trees and
> > tuplification of types.
>
> I don't see the advantage in the accessors to that type be named
> differently according to context compared to simply TREE_TYPE.

Well, documentation for one. TREE_TYPE (TREE_TYPE (t)) looks better if you wrote it as RETURN_TYPE (DECL_TYPE (t)). Maybe it's slightly more obvious if the variable wasn't named `t' and from the surrounding context; from the conversion for RETURN_TYPE, though, I don't think it's obvious. And triply-nested TREE_TYPEs are confusing regardless. :)

I admit that this introduces unnecessary tests to satisfy --enable-checking builds; I haven't looked whether GCC will optimize out the extra checks for --disable-checking.

Not all types have a {sub,element}type either. You'd like to be able to split out those types to make them smaller (tree_type is huge), and that's hard to do otherwise--you can't use the tree_typed .type member. (This is the tuplification part.)

If you have statically typed trees, you're also going to have separate accessors for type of types (see above), type of exprs, type of decls, etc. even if they share a common base class (tree_base) for lightweight RTTI. This goal is farther off, even if the proposal is eight years old at this point.

> If your goal is to make tree_common smaller, introduce a tree_typed
> structure (consisting of tree_base + type member), and use that instead of
> tree_common in all tree structures needing to have a type.

I think that's a good idea, too. But orthogonal to the above.

-Nathan
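The readability argument above can be illustrated with a toy model. This is not GCC's tree representation: `toy_tree`, `toy_code` and the three accessor functions are hypothetical, and `assert` merely stands in for the kind of check an --enable-checking build performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds; GCC has many more tree codes. */
enum toy_code { TOY_INTEGER_TYPE, TOY_FUNCTION_TYPE, TOY_FUNCTION_DECL };

struct toy_tree {
    enum toy_code code;
    struct toy_tree *type;  /* meaning depends on code, like TREE_TYPE */
};

/* Generic accessor: the name says nothing about what comes back. */
struct toy_tree *toy_tree_type(struct toy_tree *t)
{
    return t->type;
}

/* Context-specific accessors: same field, but the name documents the
   intent and the assert rejects nodes of the wrong kind. */
struct toy_tree *toy_decl_type(struct toy_tree *t)
{
    assert(t->code == TOY_FUNCTION_DECL);
    return t->type;
}

struct toy_tree *toy_return_type(struct toy_tree *t)
{
    assert(t->code == TOY_FUNCTION_TYPE);
    return t->type;
}
```

With these, `toy_return_type(toy_decl_type(d))` and `toy_tree_type(toy_tree_type(d))` return the same node for a function declaration `d`, but only the first form says what is being asked for, and only the first form traps when `d` is not a declaration.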
Re: Plugin that parse tree
Do you have any opinion about adding a warning for:

int f(char c)
{
    return 10 * (c == 13) ? 1 : 2;
}

The multiplication has no effect. The function returns either 1 or 2.

It would be interesting to know what a MELT script could look like for such a case.

As far as I can see, the multiplication doesn't exist in the gimple format (looking at a.c.004t.gimple generated by -fdump-tree-all).
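The reason the multiplication has no effect is operator precedence: `*` binds more tightly than `?:`. A small sketch, where `f` is the function from the message and `f_as_parsed` and `f_maybe_intended` are names made up here for comparison (what the author of such code actually intended is of course a guess):

```c
/* As posted: the condition of ?: is the whole product 10 * (c == 13). */
int f(char c)
{
    return 10 * (c == 13) ? 1 : 2;
}

/* The same expression with the implicit grouping written out. */
int f_as_parsed(char c)
{
    return (10 * (c == 13)) ? 1 : 2;
}

/* One plausible intent: scale the selected value instead. */
int f_maybe_intended(char c)
{
    return 10 * ((c == 13) ? 1 : 2);
}
```

`f` and `f_as_parsed` agree for every input, since the product is nonzero exactly when `c == 13`; `f_maybe_intended` returns 10 or 20 instead of 1 or 2.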
Re: Plugin that parse tree
Daniel Marjamäki writes:

> Do you have any opinion about adding a warning for:
>
> int f(char c)
> {
> return 10 * (c == 13) ? 1 : 2;
> }
>
> The multiplication has no effect. The function returns either 1 or 2.
>
> It would be interesting to know how a MELT script could look like for
> such a case.

The problem with warnings for this kind of code in C/C++ is that it often arises in macro expansions. I think it would be necessary to first develop a scheme which lets us determine whether code resulted from a macro expansion or not, which I think would be quite useful in a number of different cases.

> As far as I see the multiplication doesn't exist in the gimple format
> (looking at a.c.004t.gimple generated by -fdump-tree-all).

It gets constant folded in the front end.

Ian
Re: Plugin that parse tree
2011/1/24 Ian Lance Taylor :

> The problem with warnings for this kind of code in C/C++ is that it
> often arises in macro expansions.

I see... so it won't be included in gcc. :-(

It was my goal to get it into GCC. But I still think it's an interesting idea that I'll look into.

Regards,
Daniel
Re: Find a new maintainer for option handling?
On Mon, 17 Jan 2011, Gerald Pfeifer wrote:

> On Wed, 12 Jan 2011, Jie Zhang wrote:
> > I agree. I think Joseph is the best candidate for the maintainer of the
> > option handling since he made the most changes of gcc/opts-common.c. He
> > is already the maintainer of the driver. If we unify these two
> > maintainerships, we save one line of MAINTAINERS. :-)
>
> I am not so much concerned about that one line in MAINTAINERS, more
> finding someone who is willing to take on the role. I, too, think
> Joseph would be a great candidate, but it's his call whether he wants
> to. ;-) (I'll be happy to raise it on the SC in case.)

I am willing to be considered for option handling maintainership or reviewership.

--
Joseph S. Myers
jos...@codesourcery.com
Re: IA64: short data segment overflowed issue
On Sat, 2011-01-22 at 12:26 -0800, Richard Henderson wrote:
> On 01/22/2011 10:48 AM, Sergei Trofimovich wrote:
> > I've attached dirty patch. It has not very nice comments, tabs and spaces
> > yet.
>
> Steve perhaps should weigh in here...

I am not very familiar with AUTO_PIC and NO_PIC. It would be nice to implement the TODO: parts as part of this implementation to make it more complete. Would it make sense for TARGET_AUTO_PIC to automatically set ia64_cmodel to CMODEL_LARGE? It looks like everywhere we check for ia64_cmodel equal to CMODEL_LARGE, we also check for TARGET_AUTO_PIC, so this could simplify the logic.

Steve Ellcey
s...@cup.hp.com
Re: IA64: short data segment overflowed issue
On 01/24/2011 11:40 AM, Steve Ellcey wrote:
> On Sat, 2011-01-22 at 12:26 -0800, Richard Henderson wrote:
>> On 01/22/2011 10:48 AM, Sergei Trofimovich wrote:
>>> I've attached dirty patch. It has not very nice comments, tabs and spaces
>>> yet.
>>
>> Steve perhaps should weigh in here...
>
> I am not very familiar with AUTO_PIC and NO_PIC. It would be nice to
> implement the TODO: parts as part of this implementation to make it more
> complete. Would it make sense for TARGET_AUTO_PIC to automatically set
> ia64_cmodel to CMODEL_LARGE? It looks like everywhere we check for
> ia64_cmodel equal to CMODEL_LARGE, we also check for TARGET_AUTO_PIC so
> this could simplify the logic.

Yes, that's what I suggested later in the message.

r~
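The simplification agreed on here can be sketched outside GCC. Everything below is hypothetical: `toy_options`, `toy_cmodel` and the three functions are stand-ins, and the assumption that the paired checks are combined with a logical OR is mine, since the patch itself is not shown in the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for ia64_cmodel and TARGET_AUTO_PIC. */
enum toy_cmodel { TOY_CMODEL_SMALL, TOY_CMODEL_LARGE };

struct toy_options {
    enum toy_cmodel cmodel;
    bool auto_pic;
};

/* Assumed current shape: every use site tests both conditions. */
bool toy_needs_large_before(const struct toy_options *o)
{
    return o->cmodel == TOY_CMODEL_LARGE || o->auto_pic;
}

/* Proposed shape: fold auto_pic into cmodel once, when the options
   are processed. */
void toy_override_options(struct toy_options *o)
{
    if (o->auto_pic)
        o->cmodel = TOY_CMODEL_LARGE;
}

/* After normalisation, each use site needs a single test. */
bool toy_needs_large_after(const struct toy_options *o)
{
    return o->cmodel == TOY_CMODEL_LARGE;
}
```

As long as `toy_override_options` runs before any use site, the two predicates give the same answer, and the second leaves only one condition to keep in sync across the backend.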
Re: Plugin that parse tree
Daniel Marjamäki writes:

> 2011/1/24 Ian Lance Taylor :
>
>> The problem with warnings for this kind of code in C/C++ is that it
>> often arises in macro expansions.
>
> I see... so it won't be included in gcc. :-(

Actually, I think it could be included in gcc, provided you (or somebody) first implements a way to not warn in a macro expansion.

Ian
Re: PATCH: 2 stage BFD linker for LTO plugin
Ian Lance Taylor writes:

> 2011-01-18 Ian Lance Taylor
>
>         * plugin.cc (class Plugin_rescan): Define new class.
>         (Plugin_manager::claim_file): Set any_claimed_.
>         (Plugin_manager::save_archive): New function.
>         (Plugin_manager::save_input_group): New function.
>         (Plugin_manager::all_symbols_read): Create Plugin_rescan task if
>         necessary.
>         (Plugin_manager::new_undefined_symbol): New function.
>         (Plugin_manager::rescan): New function.
>         (Plugin_manager::rescannable_defines): New function.
>         (Plugin_manager::add_input_file): Set any_added_.
>         * plugin.h (class Plugin_manager): define new fields rescannable_,
>         undefined_symbols_, any_claimed_, and any_added_. Declare
>         Plugin_rescan as friend. Declare new functions.
>         (Plugin_manager::Rescannable): Define type.
>         (Plugin_manager::Rescannable_list): Define type.
>         (Plugin_manager::Undefined_symbol_list): Define type.
>         (Plugin_manager::Plugin_manager): Initialize new fields.
>         * archive.cc (Archive::defines_symbol): New function.
>         (Add_archive_symbols::run): Pass archive to plugins if any.
>         * archive.h (class Archive): Declare defines_symbol.
>         * readsyms.cc (Input_group::~Input_group): New function.
>         (Finish_group::run): Pass input_group to plugins if any.
>         * readsyms.h (class Input_group): Declare destructor.
>         * symtab.cc (add_from_object): Pass undefined symbol to plugins if
>         any.

I have committed this gold patch to binutils mainline and binutils 2.21 branch.

Ian
gold patch committed: Bump gold version number to 1.11
I've committed this patch to bump the gold version number to 1.11, to both mainline and the binutils 2.21 branch. This is so that gcc's LTO plugin can detect the changed behaviour concerning static archives, so that the plugin knows that it need not honor the -pass-through option. The -pass-through option existed to pass static archives to the linker via the plugin; that is unnecessary when the plugin researches static archives anyhow.

Ian

2011-01-24 Ian Lance Taylor

        * version.cc (version_string): Bump to 1.11.

Index: version.cc
===
RCS file: /cvs/src/src/gold/version.cc,v
retrieving revision 1.22
diff -p -u -r1.22 version.cc
--- version.cc 1 Jan 2011 21:42:17 - 1.22
+++ version.cc 24 Jan 2011 22:26:52 -
@@ -37,7 +37,7 @@ namespace gold
 // version number from configure.ac. But it's easier to just change
 // this file for now.

-static const char* version_string = "1.10";
+static const char* version_string = "1.11";

 // Report version information.
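The message does not say how the plugin compares versions, so the following is only an illustration of why a numeric comparison is needed for strings like "1.10" and "1.11". `toy_version_number` and `toy_archives_rescanned` are hypothetical helpers, and encoding a version as major * 100 + minor is an assumption made for this sketch.

```c
#include <stdio.h>

/* Hypothetical helper: turns "MAJOR.MINOR" into MAJOR * 100 + MINOR,
   or returns -1 if the string does not have that shape. */
int toy_version_number(const char *s)
{
    int major, minor;

    if (sscanf(s, "%d.%d", &major, &minor) != 2)
        return -1;
    if (major < 0 || minor < 0 || minor > 99)
        return -1;
    return major * 100 + minor;
}

/* Hypothetical check: nonzero if this version is at least 1.11, the
   version the message associates with the new archive behaviour. */
int toy_archives_rescanned(const char *gold_version)
{
    return toy_version_number(gold_version) >= toy_version_number("1.11");
}
```

Comparing the components as integers matters because a plain string comparison would order "1.9" after "1.11".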
Error building gcc 4.5.2 for AVR
Hello,

I am creating a script for building GCC 4.5.2 for the AVR target:
http://www.cl.cam.ac.uk/~osc22/files/install_avr_tools.sh

I am having some trouble building GCC 4.5.2 (see below); maybe you can help me. Thanks:

...
make[3]: Nothing to be done for `all'.
make[3]: Leaving directory `/local/scratch/osc22/temp/build-avr/gcc-4.5.2/buildavr/libiberty/testsuite'
/bin/bash ../../libiberty/../mkinstalldirs /usr/local/avr/lib/`gcc -g -O2 -print-multi-os-directory`
/usr/bin/install -c -m 644 ./libiberty.a /usr/local/avr/lib/`gcc -g -O2 -print-multi-os-directory`/./libiberty.an
( cd /usr/local/avr/lib/`gcc -g -O2 -print-multi-os-directory` ; chmod 644 ./libiberty.an ;ranlib ./libiberty.an )
mv -f /usr/local/avr/lib/`gcc -g -O2 -print-multi-os-directory`/./libiberty.an /usr/local/avr/lib/`gcc -g -O2 -print-multi-os-directory`/./libiberty.a
if test -n ""; then \
  case "" in \
    /*) thd=;; \
    *) thd=/usr/local/avr/include/;; \
  esac; \
  /bin/bash ../../libiberty/../mkinstalldirs ${thd}; \
  for h in ../../libiberty/../include/ansidecl.h ../../libiberty/../include/demangle.h ../../libiberty/../include/dyn-string.h ../../libiberty/../include/fibheap.h ../../libiberty/../include/floatformat.h ../../libiberty/../include/hashtab.h ../../libiberty/../include/libiberty.h ../../libiberty/../include/objalloc.h ../../libiberty/../include/partition.h ../../libiberty/../include/safe-ctype.h ../../libiberty/../include/sort.h ../../libiberty/../include/splay-tree.h; do \
    /usr/bin/install -c -m 644 $h ${thd}; \
  done; \
fi
make[3]: Entering directory `/local/scratch/osc22/temp/build-avr/gcc-4.5.2/buildavr/libiberty/testsuite'
make[3]: Nothing to be done for `install'.
make[3]: Leaving directory `/local/scratch/osc22/temp/build-avr/gcc-4.5.2/buildavr/libiberty/testsuite'
make[2]: Leaving directory `/local/scratch/osc22/temp/build-avr/gcc-4.5.2/buildavr/libiberty'
/bin/bash: line 3: cd: avr/libgcc: No such file or directory
make[1]: *** [install-target-libgcc] Error 1
make[1]: Leaving directory `/local/scratch/osc22/temp/build-avr/gcc-4.5.2/buildavr'
make: *** [install] Error 2

It seems the build cannot find the folder avr/libgcc, but why? When I then try to link a program, the libgcc problem shows up:

avr-gcc -mmcu=at90usb1287 -Wl,-Map=SCD.map SCD.o EMV.o halSCD.o scdIO.o utils.o terminal.o halSCD.S SCD.S -o SCD.elf
/usr/local/avr/lib/gcc/avr/4.5.2/../../../../avr/bin/ld: cannot find -lgcc
/usr/local/avr/lib/gcc/avr/4.5.2/../../../../avr/bin/ld: cannot find -lgcc
collect2: ld returned 1 exit status

Info about the avr-gcc:

avr-gcc -v
Using built-in specs.
COLLECT_GCC=avr-gcc
COLLECT_LTO_WRAPPER=/usr/local/avr/libexec/gcc/avr/4.5.2/lto-wrapper
Target: avr
Configured with: ../configure --target=avr --prefix=/usr/local/avr -v --program-prefix=avr- --with-gcc --with-gnu-ld --with-gnu-as --with-dwarf2 --disable-libssp --enable-languages=c,c++ --disable-werror --disable-nls
Thread model: single
gcc version 4.5.2 (GCC)

And verbose output of running the command:

avr-gcc -v -mmcu=at90usb1287 -Wl,-Map=SCD.map SCD.o EMV.o halSCD.o scdIO.o utils.o terminal.o halSCD.S SCD.S -o SCD.elf
Using built-in specs.
COLLECT_GCC=avr-gcc
COLLECT_LTO_WRAPPER=/usr/local/avr/libexec/gcc/avr/4.5.2/lto-wrapper
Target: avr
Configured with: ../configure --target=avr --prefix=/usr/local/avr -v --program-prefix=avr- --with-gcc --with-gnu-ld --with-gnu-as --with-dwarf2 --disable-libssp --enable-languages=c,c++ --disable-werror --disable-nls
Thread model: single
gcc version 4.5.2 (GCC)
COLLECT_GCC_OPTIONS='-v' '-mmcu=at90usb1287' '-o' 'SCD.elf'
/usr/local/avr/libexec/gcc/avr/4.5.2/cc1 -E -lang-asm -quiet -v -imultilib avr51 halSCD.S -mmcu=at90usb1287 -fno-directives-only -o /tmp/ccPZdrIF.s
ignoring nonexistent directory "/usr/local/avr/lib/gcc/avr/4.5.2/../../../../avr/sys-include"
#include "..." search starts here:
#include <...> search starts here:
/usr/local/avr/lib/gcc/avr/4.5.2/include
/usr/local/avr/lib/gcc/avr/4.5.2/include-fixed
/usr/local/avr/lib/gcc/avr/4.5.2/../../../../avr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-mmcu=at90usb1287' '-o' 'SCD.elf'
/usr/local/avr/lib/gcc/avr/4.5.2/../../../../avr/bin/as -v -mmcu=at90usb1287 -o /tmp/ccAv4Kbo.o /tmp/ccPZdrIF.s
GNU assembler version 2.21 (avr) using BFD version (GNU Binutils) 2.21
COLLECT_GCC_OPTIONS='-v' '-mmcu=at90usb1287' '-o' 'SCD.elf'
/usr/local/avr/libexec/gcc/avr/4.5.2/cc1 -E -lang-asm -quiet -v -imultilib avr51 SCD.S -mmcu=at90usb1287 -fno-directives-only -o /tmp/ccPZdrIF.s
ignoring nonexistent directory "/usr/local/avr/lib/gcc/avr/4.5.2/../../../../avr/sys-include"
#include "..." search starts here:
#include <...> search starts here:
/usr/local/avr/lib/gcc/avr/4.5.2/include
/usr/local/avr/lib/gcc/avr/4.5.2/include-fixed
/usr/local/avr/lib/gcc/avr/4.5.2/../../../../avr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-mmcu=at90usb1287' '-o' 'SCD.elf'
/usr/local/avr/lib/gcc/avr/4.5.2
Re: Error building gcc 4.5.2 for AVR
On 24 January 2011 22:49, Omar Choudary wrote:
>
> I am creating a script for building GCC 4.5.2 for the AVR target:
> http://www.cl.cam.ac.uk/~osc22/files/install_avr_tools.sh
>
> I have some troubles when building GCC-4.5.2, see below, maybe you can
> help me; thanks:

This question is off-topic on this mailing list; please send such questions to the gcc-h...@gcc.gnu.org list, as described at http://gcc.gnu.org/lists.html

Since you're using a third-party's build script, not the GCC installation instructions, you might be able to ask the author of that script for support.
Re: TREE_LIST removals and cleanups for 4.7
On Sat, Jan 22, 2011 at 10:52, Nathan Froyd wrote:

> Comments? Concerns?

Only one: thanks! They all look very useful to me.

Diego.