Re: gen_lowpart called where 'truncate' needed?
Adam Nemet writes: > > > I think the right fix is to call convert_to_mode or convert_move in the > > > expansion code which ensure the proper truncation. > > > > That would yield correct code, but wouldn't it throw away the fact > > that the high bits are already known to be zero, and yield redundant > > zero-extension on some platforms? I'm guessing that's why the code was > > originally written to call convert_lowpart rather than convert_to_mode. > > convert_to_mode uses gen_lowpart for truncation if TRULY_NOOP_TRUNCATION. I was concerned about the !TRULY_NOOP_TRUNCATION case; I didn't want to harm someone else's performance to benefit my chip. But, looking around, it looks like (almost?) every chip either #defines TRULY_NOOP_TRUNCATION or wants this fix. So your suggestion sounds good to me, and in any case using straightforward type conversions should surely be the default choice. -Mat
Contributing
Hello GCC, This is Kishore. I am very much interested in contributing to GCC. I want to contribute to the DOCUMENTATION of the project. Can you please assign me some work and let me know the process involved? Eagerly waiting for your reply! Thanks & Regards, Kishore. S
Re: Contributing
On Fri, Feb 5, 2010 at 10:40, krishna kishore wrote: > i want to contribute for DOCUMENTATION of the project. Thank you for the offer. It's actually pretty brave, since documentation is one of the sore points of the project. Are you interested in contributing user documentation or internals? For user-level documentation, I'm not sure what is currently needed. Others may be able to give you better information. For internals documentation, I could give you a very long list of things we need, but first I'd like to know whether you'd be interested in that. Alternatively, you can search the wiki (http://gcc.gnu.org/wiki). Diego.
Re: Unwanted IRA copies via process_reg_shuffles
Jeff Law wrote: I was looking at a regression caused by having ira-reload utilize the existing copy detection code in IRA rather than my own and stumbled upon this... Consider this insn prior to IRA: (insn 72 56 126 8 j.c:744 (parallel [ (set (reg:SI 110) (minus:SI (reg:SI 69 [ ew_u$parts$lsw ]) (reg:SI 68 [ ew_u$parts$lsw ]))) (clobber (reg:CC 17 flags)) ]) 290 {*subsi_1} (expr_list:REG_DEAD (reg:SI 69 [ ew_u$parts$lsw ]) (expr_list:REG_DEAD (reg:SI 68 [ ew_u$parts$lsw ]) (expr_list:REG_UNUSED (reg:CC 17 flags) (nil) Which matches this pattern in the x86 backend: (define_insn "*sub_1" [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,") (minus:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,0") (match_operand:SWI 2 "" ",m"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (MINUS, mode, operands)" "sub{}\t{%2, %0|%0, %2}" [(set_attr "type" "alu") (set_attr "mode" "")]) Note carefully that the constraints require operands 0 and 1 to match. Operand 2 is not tied to any other operand. In fact, if operand 2 is tied to operand0, then we are guaranteed to generate a reload and muck up the code pretty badly. Now looking at the copies recorded by IRA we have: cp0:a0(r95)<->a1(r70)@248:move cp1:a4(r69)<->a6(r110)@178:constraint cp2:a3(r68)<->a6(r110)@22:shuffle cp3:a9(r66)<->a10(r92)@11:shuffle cp4:a11(r79)<->a12(r93)@89:move cp5:a1(r70)<->a14(r96)@114:constraint cp6:a1(r70)<->a13(r97)@114:constraint Note carefully cp2 which claims a shuffle-copy between r68 and r110. ISTM that when trying to assign a hard reg to pseudo r110 that if r68 has a hard reg, but r69 does not, then pseudo r110 will show a cost savings if it is allocated into the same hard reg as pseudo r68. The problematic code is add_insn_allocno_copies: { extract_insn (insn); for (i = 0; i < recog_data.n_operands; i++) { operand = recog_data.operand[i]; if (REG_SUBREG_P (operand) && find_reg_note (insn, REG_DEAD, REG_P (operand) ? 
operand : SUBREG_REG (operand)) != NULL_RTX) { str = recog_data.constraints[i]; while (*str == ' ' || *str == '\t') str++; bound_p = false; for (j = 0, commut_p = false; j < 2; j++, commut_p = true) if ((dup = get_dup (i, commut_p)) != NULL_RTX && REG_SUBREG_P (dup) && process_regs_for_copy (operand, dup, true, NULL_RTX, freq)) bound_p = true; if (bound_p) continue; /* If an operand dies, prefer its hard register for the output operands by decreasing the hard register cost or creating the corresponding allocno copies. The cost will not correspond to a real move insn cost, so make the frequency smaller. */ process_reg_shuffles (operand, i, freq < 8 ? 1 : freq / 8); } } With r68 dying and not bound to another operand, we create a reg-shuffle copy to encourage tying r68 to the output operand. Not good. ISTM that if an output is already bound to some input that we should not be recording a copy between an unbound dying input and the bound output. You can play with the attached (meaningless) testcase -O2 -m32 -fPIC. You won't see code quality regressions due to this problem on this testcase with the mainline sources, but it should be enough to trigger the bogus copy. Thoughts? Yes, Jeff, you are probably right that we should not make the shuffle copy when there is already a constraint copy to the operand. There is no regression on the test because a shuffle copy always has the smaller cost. I introduced the shuffle copies because they improved SPEC2000 rates for x86/x86_64, but I did not try to write code to avoid the situation you found. I think we should try to remove the shuffle copy (or even discourage such a shuffle through a negative copy cost) and see what happens to SPEC rates.
Re: Exception handling information in the macintosh
Jack Howarth wrote: On Thu, Feb 04, 2010 at 08:12:10PM +0100, jacob navia wrote: Hi I have developed a JIT for linux 64 bits. It generates exception handling information according to DWARF under linux and it works with gcc 4.2.1. I have recompiled the same code under the Macintosh and something has changed, apparently, because now any throw that passes through my code crashes. Are there any differences in the exception info format between the Macintosh and linux? The st...@the moment of the throw looks like this: CPP code compiled with gcc 4.2.1 calls JIT code generated on the fly by my JIT compiler that calls CPP code compiled with gcc 4.2.1 that throws. The catch is in the CPP code. The throw must go through the JIT code, so it needs the DWARF frame descriptions that I generate. Apparently there is a difference. Thanks in advance for any information. jacob Jacob, Are you compiling on darwin10 and using the Apple or FSF gcc compilers? If you are using Apple's, this question should be on the darwin-devel mailing list instead. I did that. I was compiling with Apple's gcc. Now, I downloaded the source code of gcc 4.2.1 and compiled that on my Mac. The build crashed in the java section by the way: there was a script that assumed the object files were in a .libs directory, but the objects were in the same directory as the source code. This happened several times, so in the end I stopped since I am not interested in Java. I installed gcc, everything went OK, and I recompiled the source code with the new gcc. Then, in the new executable, the normal throws that had been working under Apple's gcc do not work anymore and any throw (not only those that go through the JIT) fails. I do not understand what is going on. I would mention though that darwin10 is problematic in that the libgcc and its unwinder calls are now subsumed into libSystem. This means that regardless of how you try to link in libgcc, the new code in libSystem will always be used.
For darwin10, Apple decided to default their linker over to compact unwind which causes problems with some of the java testcases on gcc 4.4.x. This is fixed for FSF gcc 4.5 by forcing the compiler to always link with the -no_compact_unwind option. If you use that option I get ld: symbol dyld_stub_binding_helper not defined (usually in crt1.o/dylib1.o/bundle1.o) and Apple's linker refuses to go on. Another complexity is that Apple decided to silently abort some of the libgcc calls (now in libSystem) that require access to FDEs like _Unwind_FindEnclosingFunction(). The reasoning was that the default behavior (compact unwind info) doesn't use FDEs. This is fixed for gcc 4.5 by http://gcc.gnu.org/ml/gcc-patches/2009-12/msg00998.html. If you are using any other unwinder call that is now silently aborting, let me know as it may be another that we need to re-export under a different name from libgcc_ext. Alternatively, you may be able to work around this issue by using -mmacosx-version-min=10.5 under darwin10. Jack OK, now, what would be the procedure for getting to avoid Apple's modifications to the exception handling stuff? Pleeeze :-) P.S. If this discussion does not belong in this list please send me just an email. Thanks for your answers. jacob
_cpp_line_note structure in front end
Hello all, Could anybody please answer one question related to the _cpp_line_note structure (GCC front end)? I am currently working on modifying the FE to "swallow" a piece of code similar to this one: _Asm void DoSomething(some_parameters) { mov r1, r2 mov r2, r3 ... and similar assembler code } The idea is to just take the assembler body and parse it as if it were an asm("") statement. I've managed to lex/parse all of this. However, when the FE lexes the assembler body of this function, my code fails starting from here: skipped_white: if (buffer->cur >= buffer->notes[buffer->cur_note].pos && !pfile->overlaid_buffer) { _cpp_process_line_notes (pfile, false); result->src_loc = pfile->line_table->highest_line; } It goes into _cpp_process_line_notes and then into abort(). The code quoted above is located in _cpp_lex_direct() in the lex.c file. In short, (buffer->cur >= buffer->notes[buffer->cur_note].pos) is true. I understand that I need to reposition the pos field of the _cpp_line_note structure, but I don't know where to position it. At the end of the assembler body? At the closing parenthesis of the DoSomething() function? Where should it go? What is the purpose of the _cpp_line_note structure and of this field at all? Any suggestion/help/advice would be much appreciated. Thanks in advance! Best regards, Nikola
Re: Long paths with ../../../../ throughout
Hello Ian Ian Lance Taylor wrote: [.] I've attached collect2 patch. Let me know what you think of it. There is actually a GNU standard for --help output, and collect2 might as well follow it. http://www.gnu.org/prep/standards/html_node/_002d_002dhelp.html Ok, looks good, I've updated the changes, please find attached revised patch. Do you have a copyright assignment/disclaimer with the FSF? I asked FSF this week, I'm just waiting for the snail mail to arrive. Will post it back as soon as it does. Cheers, Jon Index: collect2.c === --- collect2.c (revision 156482) +++ collect2.c (working copy) @@ -174,7 +174,7 @@ int number; }; -int vflag;/* true if -v */ +bool vflag;/* true if -v or --version */ static int rflag; /* true if -r */ static int strip_flag; /* true if -s */ static const char *demangle_flag; @@ -193,7 +193,8 @@ /* Current LTO mode. */ static enum lto_mode_d lto_mode = LTO_MODE_NONE; -int debug;/* true if -debug */ +bool debug;/* true if -debug */ +bool helpflag; /* true if --help */ static int shared_obj; /* true if -shared */ @@ -1228,7 +1229,7 @@ for (i = 1; argv[i] != NULL; i ++) { if (! strcmp (argv[i], "-debug")) - debug = 1; + debug = true; else if (! strcmp (argv[i], "-flto") && ! use_plugin) { use_verbose = true; @@ -1458,7 +1459,7 @@ if (use_verbose && *q == '-' && q[1] == 'v' && q[2] == 0) { /* Turn on trace in collect2 if needed. 
*/ - vflag = 1; + vflag = true; } } obstack_free (&temporary_obstack, temporary_firstobj); @@ -1588,7 +1589,7 @@ case 'v': if (arg[2] == '\0') - vflag = 1; + vflag = true; break; case '-': @@ -1619,6 +1620,10 @@ } else if (strncmp (arg, "--sysroot=", 10) == 0) target_system_root = arg + 10; + else if (strncmp (arg, "--version", 9) == 0) + vflag = true; + else if (strncmp (arg, "--help", 9) == 0) + helpflag = true; break; } } @@ -1720,6 +1725,20 @@ fprintf (stderr, "\n"); } + if (helpflag) +{ + fprintf (stderr, "Usage: collect2 [options]\n"); + fprintf (stderr, " Wrap linker and generate constructor code if needed.\n"); + fprintf (stderr, " Options:\n"); + fprintf (stderr, " -debug Enable debug output\n"); + fprintf (stderr, " --help Display this information\n"); + fprintf (stderr, " -v, --version Display this program's version number\n"); + fprintf (stderr, "Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n"); + fprintf (stderr, "Report bugs: http://gcc.gnu.org/\n"); + + collect_exit (0); +} + if (debug) { const char *ptr; Index: collect2.h === --- collect2.h (revision 156482) +++ collect2.h (working copy) @@ -38,7 +38,7 @@ extern const char *c_file_name; extern struct obstack temporary_obstack; extern char *temporary_firstobj; -extern int vflag, debug; +extern bool vflag, debug; extern void error (const char *, ...) ATTRIBUTE_PRINTF_1; extern void notice (const char *, ...) ATTRIBUTE_PRINTF_1;
Re: Exception handling information in the macintosh
On 05/02/2010 18:46, jacob navia wrote: > The build crashed in the java section by the way, there was a script that > supposed the object files in a .libs directory but the objects were in the > same directory as the source code. This happened several times, so at the > end I stopped since I am not interested in Java. Then this is just FYI: there should have been matching sets of .o files in both the main directory and the .libs subdirectory; the ones in the .libs subdir are/should be compiled as position-independent code. That's how it usually works anyway; Darwin may do things differently. You also shouldn't build in the source directory, it's not guaranteed to work and occasionally fails in peculiar ways. cheers, DaveK
Re: Exception handling information in the macintosh
On Fri, Feb 05, 2010 at 09:06:56PM +, Dave Korn wrote: > On 05/02/2010 18:46, jacob navia wrote: > > > The build crashed in the java section by the way, there was a script that > > supposed the object files in a .libs directory but the objects were in the > > same directory as the source code. This happened several times, so at the > > end I stopped since I am not interested in Java. > > Then this is just FYI: there should have been matching sets of .o files in > both the main directory and the .libs subdirectory; the ones in the .libs > subdir are/should be compiled as position-independent code. That's how it > usually works anyway; Darwin may do things differently. > > You also shouldn't build in the source directory, it's not guaranteed to > work and occasionally fails in peculiar ways. > > cheers, > DaveK Dave, I suspect one major problem is that he is building FSF gcc 4.2.1 on darwin10. The first FSF gcc validated on darwin10 was 4.4. Also, it wasn't until gcc 4.5 that -no_compact_unwind was passed to the linker on darwin10. Thus, gcc 4.4.x will be using the new compact unwinder on darwin10 which is the origin of some of the x86_64-apple-darwin10 java testsuite failures (now absent in FSF gcc trunk). Jack
aliases without a _DECL?
To fix PR 12909 with minimal ABI breakage, I'd like to be able to just hang extra symbols off the cgraph node for a variable or function and have them all emitted together. I could do this with the existing alias mechanisms, but they involve additional DECLs for the aliases. Does it seem reasonable to add extra symbol names to cgraph, or should I make dummy DECLs for them? Jason
Re: aliases without a _DECL?
Hi, I am not sure what you would like to achieve by this. I assume that you want to add aliases to a given declaration without actually creating alias DECLs, just assembler symbol names. But without the DECLs there would be absolutely no way to refer to these within the current unit, so I guess cgraph doesn't need to care about them much (i.e. they can just be some list assigned to the node or decl). But then I don't see how this will work with LTO, so it seems that creating real aliases should work better? Honza > To fix PR 12909 with minimal ABI breakage, I'd like to be able to just > hang extra symbols off the cgraph node for a variable or function and > have them all emitted together. I could do this with the existing alias > mechanisms, but they involve additional DECLs for the aliases. Does it > seem reasonable to add extra symbol names to cgraph, or should I make > dummy DECLs for them? > > Jason
Re: aliases without a _DECL?
On 02/05/2010 05:56 PM, Jan Hubicka wrote: > But without the DECLs there would be absolutely no way to reffer to these > within current unit, so I guess cgraph don't need to care about them much > (i.e. they can just be some list assigned to node or decl). Right, they would just be for binary compatibility. > But then I don't see how this will work with LTO, so it seems that creating > real aliases should work better? Yeah, I suppose it's a rare enough issue that I might as well use the normal process. Jason
Re: aliases without a _DECL?
> On 02/05/2010 05:56 PM, Jan Hubicka wrote: >> But without the DECLs there would be absolutely no way to reffer to these >> within current unit, so I guess cgraph don't need to care about them much >> (i.e. they can just be some list assigned to node or decl). > > Right, they would just be for binary compatibility. > >> But then I don't see how this will work with LTO, so it seems that creating >> real aliases should work better? > > Yeah, I suppose it's a rare enough issue that I might as well use the > normal process. Java is producing aliases for pretty much everything, so the normal process should be ready to scale to quite a few aliases ;) Honza > > Jason
Re: _cpp_line_note structure in front end
Nikola Ikonic writes: > I am currently working on modifying FE to "swallow" piece of code > similar to this one: > > _Asm void DoSomething(some_parameters) { > mov r1, r2 > mov r2, r3 > ... and similar assembler code > } > > The idea is to just take assembler body and parse it as there was > asm("") statement. > > I've managed to lex/parse whole this. However, when FE lexes assembler > body of this function, my code fails starting from here: > > skipped_white: > if (buffer->cur >= buffer->notes[buffer->cur_note].pos > && !pfile->overlaid_buffer) > { > _cpp_process_line_notes (pfile, false); > result->src_loc = pfile->line_table->highest_line; > } > > It goes into _cpp_process_line_notes and then into abort() > > Above quoted code is located in _cpp_lex_direct() in lex.c file. > > In short, (buffer->cur>= buffer->notes[buffer->cur_note].pos) is true. > I understand that I need to reposition > pos field of _cpp_line_note structure, but I don't know where to > position it. At the end of assembler body? > At the closing parenthesis of DoSomething() function? Where should it > go? What is the purpose of _cpp_line_note structure > and this field at all? The add_line_note function is used to record locations where a warning should be issued. I don't see how any change you describe could cause _cpp_process_line_notes to abort. _cpp_process_line_notes just looks at the type of the note added by a call to add_line_note. You shouldn't need to change anything. Ian
Problem with stores and loads from unions and proposed fix
This problem showed up in a PDP10 C version of GCC I'm responsible for and took a good while to track down. The fix is in generic gcc code so even though my PDP10 compiler is not an official gcc version and I haven't been successful at creating a failing program on the Intel compiler it seems like it should cause problems elsewhere so I figured I should pass it on. Here's a union that allows referencing bits in a word in different ways (the PDP10 has a 36 bit word, but that doesn't seem to be an issue here) union { int word; struct { unsigned long w0 : 32; unsigned long pad : 4; } i32; struct { unsigned long s0 : 16; unsigned long s1 : 16; unsigned long pad : 4; } i16; struct { unsigned long b0 : 8; unsigned long b1 : 8; unsigned long b2 : 8; unsigned long b3 : 8; unsigned long pad : 4; } i8; } u32; u32.word = ; /* in a subsequent different basic block which is guaranteed to be reached with u32 unchanged */ u32.i8.b1 = ... ; ... = u32.word ; CSE detects that the same subexpression is used in two places and substitutes a reaching register for the reference to u32.word without noticing that the memory has been modified by the bit field reference. Adding a call to invalidate_any_buried_refs(dest) flags the memory reference in such a way that it is not ignored and the erroneous CSE optimization is not done. --- gcse.c (revision 156482) +++ gcse.c (working copy) @@ -4783,7 +4783,12 @@ compute_ld_motion_mems (void) else ptr->invalid = 1; } - } + else +{ + /* Make sure there isn't a buried store somewhere. */ + invalidate_any_buried_refs (dest); +} + } else invalidate_any_buried_refs (PATTERN (insn)); } Thanks to anyone who can help determine whether this is a problem for other gcc versions and getting a fix into the gcc source. Martin Chaney XKL, LLC