Re: Linemap and pph
Gabriel Charette wrote:

> From what I understand, the source_locations allocated for
> everything in a given set of headers (from the LC_ENTER for the
> header in the line_table up to, and including everything in between,
> the corresponding LC_LEAVE) is dependent on only one thing; the
> value of line_table->highest_location when the header was inserted

The source_location handed out by a given line map is dependent on
three things:

1/ The line_map::start_location. For a given map, this one equals
line_table->highest_location + 1, because at the time the line map is
created, its line_map::start_location must be greater than the highest
source_location handed out by any line map previously created. At any
point in time, line_table->highest_location equals the
line_map::start_location of the most recently created line_map. This is
part of the monotonically increasing property of source_location we
were talking about earlier.

2/ The line number of the location

3/ The column number of the location

> (i.e. if in two different contexts we happen to have the same
> line_table->highest_location when inserting header A.h, all of the
> tokens for A.h in each context will have the same source_location).

Each token coming from a given A.h will have a different
source_location, as they will presumably have a different {line,column}
pair.

If in two different contexts we happen to have the same
line_table->highest_location, the line_map::start_location of the two
different line maps of the two different A.hs will be equal, though.

> If the previous statement is true, then we calculate an offset
> between the line_table->highest_location as it was in the serialized
> version of the header and as it is in the C file in which we are
> about to de-serialize the line table.
>
> We then need to update some things based on that offset:

[...]
It seems to me that you could just set the line_map::start_location of
the de-serialized map for the current portion of A.h to the current
line_table->highest_location of the main CU you are currently parsing,
plus 1. In other words, just don't take the serialized
line_table->highest_location into account. Actually I would think that
it's unnecessary to serialize line_table->highest_location at all.

Then you should also set the de-serialized line_map::to_line to the
current line in your context. Then you should add that line map to
the current line_table and set line_table->highest_location to the
line_map::start_location you have just computed.

Then, assign the source_location of each token that belongs to that
de-serialized map to a source_location that is handed out by your new
linemap_position_for_line_and_column, with the de-serialized map passed
as argument. This is where you would progress "backward" in the token
stream, for that given map.

Then somehow, when you need to suck in a new pph (so this time you are
going downward in the stream again), just start this dance of
de-serializing the map for that new pph, updating its properties,
adding it to line_table, and setting the source_location of the tokens
coming from that pph again.

--
Dodji
Question about SMS scheduling windows
I've been looking at SMS, and have a question about get_sched_window.
When there are previously-scheduled predecessors, we use:

  if (e->data_type == MEM_DEP)
    end = MIN (end, SCHED_TIME (v_node) + ii - 1);

to get an upper bound on the scheduling window that is permitted
by memory dependencies. I think this:

  SCHED_TIME (v_node) + ii - 1

is an inclusive bound, in that scheduling the node at that time
would not break the memory dependence, whereas scheduling at
SCHED_TIME (v_node) would. Is that right?

I ask because in the final range:

  start = early_start;
  end = MIN (end, early_start + ii);
  /* Schedule the node close to it's predecessors.  */
  step = 1;

END is an exclusive bound. It seems like we might be double-counting here,
and effectively limiting the schedule to SCHED_TIME (v_node) + ii - 2.

While I'm here, I was also curious about:

  /* If there are more successors than predecessors schedule the
     node close to it's successors.  */
  if (count_succs >= count_preds)
    {
      int old_start = start;

      start = end - 1;
      end = old_start - 1;
      step = -1;
    }

This doesn't seem to be in the paper, and the comment suggests
"count_succs > count_preds" rather than "count_succs >= count_preds".
Is the ">=" vs ">" important?

Thanks,
Richard
Re: Question about SMS scheduling windows
Richard Sandiford writes:
> I've been looking at SMS, and have a question about get_sched_window.
> When there are previously-scheduled predecessors, we use:
>
>   if (e->data_type == MEM_DEP)
>     end = MIN (end, SCHED_TIME (v_node) + ii - 1);
>
> to get an upper bound on the scheduling window that is permitted
> by memory dependencies. I think this:
>
>   SCHED_TIME (v_node) + ii - 1
>
> is an inclusive bound, in that scheduling the node at that time
> would not break the memory dependence, whereas scheduling at
> SCHED_TIME (v_node) would. Is that right?

I meant of course "scheduling at SCHED_TIME (v_node) + ii would".

Richard
Re: Question about SMS scheduling windows
Hello Richard,

> I ask because in the final range:
>
>   start = early_start;
>   end = MIN (end, early_start + ii);
>   /* Schedule the node close to it's predecessors.  */
>   step = 1;
>
> END is an exclusive bound. It seems like we might be double-counting here,
> and effectively limiting the schedule to SCHED_TIME (v_node) + ii - 2.

Yes, I think it indeed should be fixed. Thanks for reporting on this.

Revital
Re: Linemap and pph
I think I wasn't clear in the way I expressed my assumptions in my last
email:

On Wed, Jul 27, 2011 at 1:11 AM, Dodji Seketeli wrote:
>
> Gabriel Charette wrote:
>
> > From what I understand, the source_locations allocated for
> > everything in a given set of headers (from the LC_ENTER for the
> > header in the line_table up to, and including everything in between,
> > the corresponding LC_LEAVE) is dependent on only one thing; the
> > value of line_table->highest_location when the header was inserted
>
> The source_location handed out by a given line map is dependent on
> three things:
>
> 1/ The line_map::start_location. For a given map, this one equals
> line_table->highest_location + 1, because at the time the line map is
> created, its line_map::start_location must be greater than the highest
> source_location handed out by any line map previously created. At any
> point in time, line_table->highest_location equals the
> line_map::start_location of the lastly created line_map. This is part
> of the monotonically increasing property of source_location we were
> talking about earlier.
>

Actually from my understanding, highest_location is not equal to the
last start_location, but to the last source_location returned by either
linemap_line_start or linemap_position_for_column (which is >= the
start_location of the current line_map).

> 2/ The line number of the location
>
> 3/ The column number of the location
>

Right, that's what I mean, I would not actually stream
highest_location. What I meant by they "all depend on only
highest_location" is that IF highest_location is the same in file B.c
and in a different compiled file C.c when they happen to include A.h,
then all of the source_locations for the tokens in A.h (in both
compilations) would be identical (i.e. token1's loc in B == token1's
loc in C, etc.), as 2/ and 3/ are always the same since we're talking
about the same file A.h in both compilations; hence if 1/ also holds, we
get the same result.

> > (i.e. if in two different contexts we happen to have the same
> > line_table->highest_location when inserting header A.h, all of the
> > tokens for A.h in each context will have the same source_location).
>
> Each token coming from a given A.h will have a different
> source_location, as they will presumably have a different {line,column}
> pair.
>

What I meant is that all of the source locations handed out in the
first compilation will be the same as all of the source locations
handed out in the second compilation, pairwise (not that ALL tokens'
source locations themselves will be the same within a single
compilation of course!).

> If in two different contexts we happen to have the same
> line_table->highest_location, the line_map::start_location of the two
> different line maps of the two different A.hs will be equal, though.
>
> > If the previous statement is true, then we calculate an offset
> > between the line_table->highest_location as it was in the serialized
> > version of the header and as it is in the C file in which we are
> > about to de-serialize the line table.
> >
> > We then need to update some things based on that offset:
>
> [...]
>

Hence, given that they only depend on start_location, I just have to
calculate an offset between the serialized start_location and the
start_location as it would be (highest_location + 1) in the C file
including the header, and offset all of the source_locations on each
token coming from the pph (without even needing to recalculate them!).

> It seems to me that you could just set the line_map::start_location of
> the de-serialized map for the current portion of A.h to the current
> line_table->highest_location of the main CU your are currently parsing
> (i.e, just forget about the serialized line_table->highest_location
> into account. Actually I would think that it's unnecessary to
> serialize the line_table->highest_location) + 1.
>
> Then you should also set the de-serialized line_map::to_line to the
> current line in your context. Then you should add that line map to
> the current line_table and set line_table->highest_location to the
> line_map::start_location you have just computed.
>
> Then, assign the source_location of each token that belongs to that
> de-serialized map to a source_location that is handed out by your new
> linemap_position_for_line_and_column, with de-serialized map passed in
> argument. This is where you would progress "backward" in the token
> stream, for that given map.
>
> Then somehow, when you need to suck in a new pph (so this time you are
> going downward in the stream again), just start this dance of
> de-serializing the map for that new pph, updating its properties,
> adding it to line_table, and setting the source_location of the tokens
> coming for that pph again.
>

Doing it this way (with the offset) I would read in all the tokens and
linemap entries inherited from that header and its underlying include
tree, thus no need to be tricky about inserting li
Re: Linemap and pph
Gabriel Charette a écrit: > Actually from my understanding, highest_location is not equal to the > last start_location, but to the last source_location returned by > either linemap_line_start or linemap_positition_for_column (which is > >>= to the start_location of the current line_map). Right, it equals the highest location yielded by any map in the system. Sorry. > >> 2/ The line number of the location >> >> 3/ The column number of the location >> > > Right, that's what I mean, I would not actually stream > highest_location. What I meant by they "all depend on only > highest_location" is that IF highest_location is the same in file B.c > and in a different compiled file C.c when they happen to include A.h, > then all of the source_locations for the tokens in A.h (in both > compilation) would be identical (i.e. token1's loc in B == token1's > loc in C, etc.) (as 2/3 always is the same since we're talking about > the same file A.h in both compilation, hence if 1 also holds, we get > the same result). > Oh, OK. >> > (i.e. if in two different contexts we happen to have the same >> > line_table->highest_location when inserting header A.h, all of the >> > tokens for A.h in each context will have the same source_location). >> >> Each token coming from a given A.h will have a different >> source_location, as they will presumably have a different {line,column} >> pair. >> > > What I meant is that all of the source locations handed out in the > first compilation will be the same as all of the source locations > handed out in the second compilation, pairwise (not that ALL token's > source locations themselves will be the same within a single > compilation of course!). > OK. >> If in two different contexts we happen to have the same >> line_table->highest_location, the line_map::start_location of the two >> different line maps of the two different A.hs will be equal, though. 
>> >> > If the previous statement is true, then we calculate an offset >> > between the line_table->highest_location as it was in the serialized >> > version of the header and as it is in the C file in which we are >> > about to de-serialize the line table. >> > >> > We then need to update some things based on that offset: >> >> [...] >> > > Hence, given that they only depend on start_location, I just have to > calculate an offset between the serialized start_location and the > start_location as it would be (highest_location + 1) in the C file > including the header, and offset all of the source_locations on each > token coming from the pph (without even needing to recalculate them!). > That could work. But then you'd need to do something for a map encoding the locations of tokens coming from the pph to appear in line_table, right? Otherwise, at lookup time, (when you want to find the map that matches the source_location of a token coming for that pph), you'll be in trouble. I am saying this b/c you are not calling linemap_line_start anymore. And that function was the one that was including the said map to line_table. And the map still must be inserted into line_table somehow. >> It seems to me that you could just set the line_map::start_location of >> the de-serialized map for the current portion of A.h to the current >> line_table->highest_location of the main CU your are currently parsing >> (i.e, just forget about the serialized line_table->highest_location >> into account. Actually I would think that it's unnecessary to >> serialize the line_table->highest_location) + 1. >> >> Then you should also set the de-serialized line_map::to_line to the >> current line in your context. Then you should add that line map to >> the current line_table and set line_table->highest_location to the >> line_map::start_location you have just computed. 
>> >> Then, assign the source_location of each token that belongs to that >> de-serialized map to a source_location that is handed out by your new >> linemap_position_for_line_and_column, with de-serialized map passed in >> argument. This is where you would progress "backward" in the token >> stream, for that given map. >> >> Then somehow, when you need to suck in a new pph (so this time you are >> going downward in the stream again), just start this dance of >> de-serializing the map for that new pph, updating its properties, >> adding it to line_table, and setting the source_location of the tokens >> coming for that pph again. >> > > Doing it this way (with the offset) I would read in all the tokens and > linemap entries inherited from that header and it's underlying include > tree, thus no need to be tricky about inserting line tables for the > header's included file, as they are part of the header's serialized > line_table by recursion (a pph'ed header can include other pph'ed > header), This is what I am not sure to understand. There is only one line table per CU. The headers included by the CU generate instances of struct line map that are inserted into the line table of the CU. So I
Register pressure analysis
Hello! Is there any scripts/tools that parse GCC dumps and report register pressure in the dump - either overall, or in different parts of the dump? regards, Sergos
Re: Linemap and pph
>> >> Hence, given that they only depend on start_location, I just have to >> calculate an offset between the serialized start_location and the >> start_location as it would be (highest_location + 1) in the C file >> including the header, and offset all of the source_locations on each >> token coming from the pph (without even needing to recalculate them!). >> > > That could work. But then you'd need to do something for a map encoding > the locations of tokens coming from the pph to appear in line_table, > right? Otherwise, at lookup time, (when you want to find the map that > matches the source_location of a token coming for that pph), you'll be > in trouble. I am saying this b/c you are not calling linemap_line_start > anymore. And that function was the one that was including the said map > to line_table. And the map still must be inserted into line_table > somehow. > The lookup, from what I understand, only depends on the line_map entries for the header to be present, and the same as they would be *had* the functions been called (not that each token actually called the linemap getters to get its location), and also of course depends on that the tokens have the correct source_locations *as if* obtained from the linemap getters. >> Doing it this way (with the offset) I would read in all the tokens and >> linemap entries inherited from that header and it's underlying include >> tree, thus no need to be tricky about inserting line tables for the >> header's included file, as they are part of the header's serialized >> line_table by recursion (a pph'ed header can include other pph'ed >> header), > > This is what I am not sure to understand. There is only one line table > per CU. The headers included by the CU generate instances of struct > line map that are inserted into the line table of the CU. So I don't > understand what you mean by "header's serialized line_table", as I don't > think there is such a thing as a header's line_table. 
What I mean by "serialized header line_table" is the serialized version
of the line_table as it was when we were done parsing the header being
pph'ed.

I would then de-serialize that and insert its line_map entries in the
C file's line_table, doing the necessary offset adjustments in the
process (and updating all other line_table variables like
highest_location that would have changed if we had actually called the
linemap functions).

Best,
Gabriel
Re: Register pressure analysis
On 07/27/2011 10:26 AM, Sergey Ostanevich wrote:
> Hello!
>
> Is there any scripts/tools that parse GCC dumps and report register
> pressure in the dump - either overall, or in different parts of the
> dump?

-fdump-rtl-ira creates an info dump of IRA. It contains information
about the maximal pressure for each region (currently loops) and each
register pressure class. Loop 0 corresponds to the whole program.

By default, a loop region is created when register pressure is high. If
you need to see register pressure for all loops you can use the option
-fira-region=all.
Re: Linemap and pph
Gabriel Charette wrote:

>>> Hence, given that they only depend on start_location, I just have to
>>> calculate an offset between the serialized start_location and the
>>> start_location as it would be (highest_location + 1) in the C file
>>> including the header, and offset all of the source_locations on each
>>> token coming from the pph (without even needing to recalculate them!).
>>>
>>
>> That could work. But then you'd need to do something for a map encoding
>> the locations of tokens coming from the pph to appear in line_table,
>> right? Otherwise, at lookup time, (when you want to find the map that
>> matches the source_location of a token coming for that pph), you'll be
>> in trouble. I am saying this b/c you are not calling linemap_line_start
>> anymore. And that function was the one that was including the said map
>> to line_table. And the map still must be inserted into line_table
>> somehow.
>>
>
> The lookup, from what I understand, only depends on the line_map
> entries for the header to be present,

Exactly. The line_map needs to be present inside line_table->maps, so
you need to insert it in there somehow. It wasn't clear to me from my
reading of your previous messages.

[...]

>>> Doing it this way (with the offset) I would read in all the tokens and
>>> linemap entries inherited from that header and it's underlying include
>>> tree, thus no need to be tricky about inserting line tables for the
>>> header's included file, as they are part of the header's serialized
>>> line_table by recursion (a pph'ed header can include other pph'ed
>>> header),
>>
>> This is what I am not sure to understand. There is only one line table
>> per CU. The headers included by the CU generate instances of struct
>> line map that are inserted into the line table of the CU. So I don't
>> understand what you mean by "header's serialized line_table", as I don't
>> think there is such a thing as a header's line_table.
>
> What I mean by "serialized header line_table" is the serialized
> version of the line_table as it was when were done parsing the header
> being pph'ed.

OK. Please note that you don't necessarily need to serialize the
entire line_map at the parsing point of the pph. I believe that just
serializing the line maps of the pph'ed header should suffice. How to
determine them is another question. :-)

> I would then de-serialize that and insert it's line_map entries in the
> C file's line_table, doing the necessary offset adjustments in the
> process (and updating all other line_table variables like
> highest_location that would have changed if we had actually called the
> linemap functions)

OK. We are on the same page now. Thanks.

--
Dodji
Re: Question about SMS scheduling windows
(sorry for replicated submissions, had to convert to plain text)

> 2011/7/27 Revital1 Eres
>
> Hello Richard,
>
>> I ask because in the final range:
>>
>>   start = early_start;
>>   end = MIN (end, early_start + ii);
>>   /* Schedule the node close to it's predecessors.  */
>>   step = 1;
>>
>> END is an exclusive bound. It seems like we might be double-counting here,
>> and effectively limiting the schedule to SCHED_TIME (v_node) + ii - 2.
>
> Yes, I think it indeed should be fixed. Thanks for reporting on this.
>
> Revital

Agreed;

  if (e->data_type == MEM_DEP)
    end = MIN (end, SCHED_TIME (v_node) + ii - 1);

should be replaced with

  if (e->data_type == MEM_DEP)
    end = MIN (end, p_st + ii);

also for the (3rd) case when there are both previously-scheduled
predecessors and previously-scheduled successors. The range is inclusive
of start and exclusive of end: for (c = start; c != end; c += step)...

> This doesn't seem to be in the paper, and the comment suggests
> "count_succs > count_preds" rather than "count_succs >= count_preds".
> Is the ">=" vs ">" important?

I think not: in both cases you'll be trying to minimize the same number
of live ranges. But indeed it's good to be consistent with the comment.

Thanks,
Ayal.
RFC: PATCH: Require and use int64 for x86 options
On Wed, Jul 13, 2011 at 6:22 AM, Ian Lance Taylor wrote:
> Igor Zamyatin writes:
>
>> As you may see pta_flags enum in i386.c is almost full. So there is a
>> risk of overflow in quite near future. Comment in source code advises
>> "widen struct pta flags" which is now defined as unsigned. But it
>> looks not optimal.
>>
>> What will be the most proper solution for this problem?
>
> Why is widening pta_flags "not optimal?"
>
> It's hard for me to believe that we still care about bootstrapping a
> i386-*-* compiler with a compiler which doesn't support any 64-bit type.
> So I don't see any problem with setting need_64bit_hwint=yes in
> config.gcc for i386-*-*, changing pta_flags to be unsigned
> HOST_WIDE_INT, and letting pta_flags go up to (unsigned HOST_WIDE_INT) 1
> << 63.
>
> If anybody doesn't like that idea, we can simply add a flags2 field and
> a pta_flags2 enum with PTA2_xxx constants.
>

Hi,

We are also running out of bits in ix86_isa_flags. This patch uses
int64 on both ix86_isa_flags and PTA. I added a new option to opt:

; Maximum number of mask bits in a variable.
MaxMaskBits
ix86_isa_flags = 64

It marks ix86_isa_flags as 64bit. Any comments?

Thanks.

--
H.J.
---
gcc/

2011-07-27  H.J. Lu

	* config.gcc: Set need_64bit_hwint to yes for x86 targets.

	* opt-read.awk (BEGIN): Set max_mask_bits[var] and var_mask_1[var].

	* opth-gen.awk: Use var_mask_1[var] instead of 1.  Check
	max_mask_bits[var] instead of 31.

	* config/i386/i386.c (pta): Use HOST_WIDE_INT on flags.
	(builtin_isa): Use HOST_WIDE_INT on isa.
	(def_builtin): Use HOST_WIDE_INT on mask.
	(def_builtin_const): Likewise.
	(builtin_description): Likewise.

	* config/i386/i386.opt (MaxMaskBits): New.
	(ix86_isa_flags): Replace int with HOST_WIDE_INT.
	(ix86_isa_flags_explicit): Likewise.
	(x_ix86_isa_flags_explicit): Likewise.

libcpp/

2011-07-27  H.J. Lu

	* configure.ac: Set need_64bit_hwint to yes for x86 targets.
	* configure: Regenerated.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d7cf895..54ac985 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -345,6 +345,7 @@ i[34567]86-*-*)
 	cpu_type=i386
 	c_target_objs="i386-c.o"
 	cxx_target_objs="i386-c.o"
+	need_64bit_hwint=yes
 	extra_options="${extra_options} fused-madd.opt"
 	extra_headers="cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
 		       pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 96263ed..c5dd881 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2918,7 +2918,7 @@ ix86_option_override_internal (bool main_args_p)
       PTA_F16C = 1 << 26,
       PTA_BMI = 1 << 27,
       PTA_TBM = 1 << 28
-      /* if this reaches 32, need to widen struct pta flags below */
+      /* if this reaches 64, need to widen struct pta flags below */
     };
 
   static struct pta
@@ -2926,7 +2926,7 @@ ix86_option_override_internal (bool main_args_p)
       const char *const name;		/* processor name or nickname.  */
       const enum processor_type processor;
       const enum attr_cpu schedule;
-      const unsigned /*enum pta_flags*/ flags;
+      const unsigned HOST_WIDE_INT /*enum pta_flags*/ flags;
     }
   const processor_alias_table[] =
     {
@@ -24016,7 +24016,7 @@ static GTY(()) tree ix86_builtins[(int) IX86_BUILTIN_MAX];
 struct builtin_isa {
   const char *name;		/* function name */
   enum ix86_builtin_func_type tcode; /* type to use in the declaration */
-  int isa;			/* isa_flags this builtin is defined for */
+  HOST_WIDE_INT isa;		/* isa_flags this builtin is defined for */
   bool const_p;			/* true if the declaration is constant */
   bool set_and_not_built_p;
 };
@@ -24041,7 +24041,8 @@ static struct builtin_isa ix86_builtins_isa[(int) IX86_BUILTIN_MAX];
    errors if a builtin is added in the middle of a function scope.  */
 
 static inline tree
-def_builtin (int mask, const char *name, enum ix86_builtin_func_type tcode,
+def_builtin (HOST_WIDE_INT mask, const char *name,
+	     enum ix86_builtin_func_type tcode,
 	     enum ix86_builtins code)
 {
   tree decl = NULL_TREE;
@@ -24079,7 +24080,7 @@ def_bu
Re: RFC: PATCH: Require and use int64 for x86 options
On Wed, Jul 27, 2011 at 6:42 PM, H.J. Lu wrote: >>> As you may see pta_flags enum in i386.c is almost full. So there is a >>> risk of overflow in quite near future. Comment in source code advises >>> "widen struct pta flags" which is now defined as unsigned. But it >>> looks not optimal. >>> >>> What will be the most proper solution for this problem? >> >> Why is widening pta_flags "not optimal?" >> >> It's hard for me to believe that we still care about bootstrapping a >> i386-*-* compiler with a compiler which doesn't support any 64-bit type. >> So I don't see any problem with setting need_64bit_hwint=yes in >> config.gcc for i386-*-*, changing pta_flags to be unsigned >> HOST_WIDE_INT, and letting pta_flags go up to (unsigned HOST_WIDE_INT) 1 >> << 63. >> >> If anybody doesn't like that idea, we can simply add a flags2 field and >> a pta_flags2 enum with PTA2_xxx constants. >> > > Hi, > > We are also running out of bits in ix86_isa_flags. This patch uses > int64 on both ix86_isa_flags and PTA. I added a new option to opt: > > ; Maximum number of mask bits in a variable. > MaxMaskBits > ix86_isa_flags = 64 > > It mark ix86_isa_flags as 64bit. Any comments? We should just introduce ix86_isa_flags2. We shouldn't stop at 128 flags. ;) Uros.
Re: RFC: PATCH: Require and use int64 for x86 options
On Wed, Jul 27, 2011 at 10:00 AM, Uros Bizjak wrote:
> On Wed, Jul 27, 2011 at 6:42 PM, H.J. Lu wrote:
>
>>>> As you may see pta_flags enum in i386.c is almost full. So there is a
>>>> risk of overflow in quite near future. Comment in source code advises
>>>> "widen struct pta flags" which is now defined as unsigned. But it
>>>> looks not optimal.
>>>>
>>>> What will be the most proper solution for this problem?
>>>
>>> Why is widening pta_flags "not optimal?"
>>>
>>> It's hard for me to believe that we still care about bootstrapping a
>>> i386-*-* compiler with a compiler which doesn't support any 64-bit type.
>>> So I don't see any problem with setting need_64bit_hwint=yes in
>>> config.gcc for i386-*-*, changing pta_flags to be unsigned
>>> HOST_WIDE_INT, and letting pta_flags go up to (unsigned HOST_WIDE_INT) 1
>>> << 63.
>>>
>>> If anybody doesn't like that idea, we can simply add a flags2 field and
>>> a pta_flags2 enum with PTA2_xxx constants.
>>>
>>
>> Hi,
>>
>> We are also running out of bits in ix86_isa_flags. This patch uses
>> int64 on both ix86_isa_flags and PTA. I added a new option to opt:
>>
>> ; Maximum number of mask bits in a variable.
>> MaxMaskBits
>> ix86_isa_flags = 64
>>
>> It mark ix86_isa_flags as 64bit. Any comments?
>
> We should just introduce ix86_isa_flags2. We shouldn't stop at 128 flags. ;)
>

ix86_isa_flags is used to control which insns are available. See
OPTION_MASK_ISA_XXX_SET and OPTION_MASK_ISA_XXX_UNSET in
common/config/i386/i386-common.c. Adding ix86_isa_flags2 makes it very
complicated:

1. We need to turn on a set of ISAs for -mXXX.
2. We need to turn off a set of ISAs for -mno-XXX.
3. We need to check 2 fields in def_builtin to see if an insn is
   available.

As a side benefit, need_64bit_hwint=yes will resolve many 32bit code
generation differences on ia32 and x86-64 hosts. We can close a bunch
of bugs, like

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43226

--
H.J.
Re: Romain Geissler copyright assignment
On 26 Jul 2011, at 16:45, Yvan ROUX wrote:

> Hi,
>
> Romain is doing an internship at STMicroelectronics on GCC plugins, and
> as his mentor, I confirm and/or inform you that he is covered by the
> copyright assignment RT 211150 between ST and FSF.
>
> Regards.
>
> --
> Yvan ROUX
> STMicroelectronics

Hi,

As an intern, I'm already covered by a copyright assignment from
STMicroelectronics, but do I also need one from my school (Ensimag:
École nationale supérieure d’informatique et de mathématiques appliquées
de Grenoble (France))? If yes, how do I proceed to get one?

Regards

Romain Geissler.
[gnu.org #702521] Re: Romain Geissler copyright assignment
> [romain.geiss...@gmail.com - Wed Jul 27 14:05:48 2011]:
>
> On 26 Jul 2011, at 16:45, Yvan ROUX wrote:
> > Hi,
> >
> > Romain is doing an internship at STMicroelectronics on GCC plugins, and
> > as his mentor, I confirm and/or inform you that he is covered by the
> > copyright assignment RT 211150 between ST and FSF.
> >
> > Regards.
> >
> > --
> > Yvan ROUX
> > STMicroelectronics
>
> Hi,
>
> As an intern, I'm already covered by a copyright assignment from
> STMicroelectronics, but do I also need one from my school (Ensimag: École
> nationale supérieure d’informatique et de mathématiques appliquées de
> Grenoble (France))? If yes, how do I proceed to get one?

I would assume that at most we would need just a disclaimer from the
university (included in text at the bottom of this email). If the
school would have a claim to owning your work (perhaps because they are
paying for it or it is being done as part of a course) then we would
need an assignment. Otherwise, you can just print out the disclaimer,
give it to the appropriate person at the school to fill out and sign,
and then send a scanned copy back to me. If you are not sure who the
proper person would be to ask to sign it, talk to your department
chair; s/he should know. Thank you so much for contributing, and I hope
to hear from you soon.

> Regards
>
> Romain Geissler.

--
Sincerely,

Donald R. Robertson, III, J.D.
Assignment Administrator
Free Software Foundation
51 Franklin Street, Fifth Floor
Boston, MA 02110
Phone +1-617-542-5942
Fax +1-617-542-2652

---

If you are a student at a college or university where, due to general
policy or a specific agreement which you signed, a claim might be made
by the institution to your work or to copyrights or patents arising
from it, then you and we need a signed piece of paper from the school
disclaiming rights to your changes.

Here is a disclaimer you can get signed to cover your future changes to
GNU software, as well as your past changes.
The disclaimer should be signed by someone authorized to license patents and copyrights for the school; you should find out who has that authorization. It may be a specific office that deals with patent, copyright, and trademark issues, it may be the school's lawyer or president, or (hopefully) some more accessible official. If you can't get at them, anyone else authorized to issue licenses for software produced there will do.

Much of this disclaimer consists of a description of which kinds of work it applies to. This description is just a suggestion--it can be replaced with any description that says clearly which kinds of work they want to disclaim and which kinds they don't. Paragraph (b) is optional; they can delete it if they wish.

If the school says they do have a claim that could conflict with the use of the program, then please put us in touch with a suitable representative of the school, so that we can negotiate what to do about it. Send a note about the issue to the school representative and to ass...@gnu.org.

IMPORTANT: When you talk to the school representative, *no matter what instructions they have given you*, don't fail to show them the sample disclaimer below, or a disclaimer with the details filled in for your specific case. Schools are often willing to sign a disclaimer without any fuss. If you make your request less specific, you may cloud the issue and cause a long and unnecessary delay.

Please keep a copy of the school's signed disclaimer, and snail the original to us at

    Attn: Disclaimer Clerk
    Free Software Foundation
    51 Franklin Street, Fifth Floor
    Boston, MA 02110
    USA

DISCLAIMER OF RIGHTS BY A COLLEGE OR UNIVERSITY

We agree that software and other authored works of the 'Released Category' (defined below), made by _, a student or graduate student at this school, prior to the date of this document, and for _ years thereafter, are freely assignable by said student to Free Software Foundation (FSF) for distribution and sharing under its free software policies. We disclaim any status as the author or owner of such works; we do not consider them as works made for hire for us.

The Released Category comprises

(a) changes and enhancements to software already (as of the time such change or enhancement is made) freely circulating under stated terms permitting public redistribution, whether in the public domain, or under the FSF's GNU General Public License, or under the FSF's GNU Lesser General Public License (a.k.a. the GNU Library General Public License), or under other such terms; and

(b) operating system components for operating systems providing substantially the same functionality as portions of UNIX, BSD, Microsoft Windows, or other popular operating systems.

The Released Category excludes __ [if 'none', please so state; thank you--FSF].

We affirm that we will do nothing in the future to undermine this release. If we have or acquire
ANN: gcc-python-plugin 0.5
gcc-python-plugin is a plugin for GCC 4.6 onwards which embeds the CPython interpreter within GCC, allowing you to write new compiler warnings in Python, generate code visualizations, etc.

Tarball releases are available at:
  https://fedorahosted.org/releases/g/c/gcc-python-plugin/

Prebuilt-documentation can be seen at:
  http://readthedocs.org/docs/gcc-python-plugin/en/latest/index.html

Project homepage:
  https://fedorahosted.org/gcc-python-plugin/

High level summary of the changes since the initial announcement:

  - new contributors

  - lots of bug fixes and compatibility fixes (e.g. for Python 3): the selftest suite now works for me with all eight different combinations of:
      - optimized vs debug builds of Python
      - Python 2.7 vs Python 3.2
      - i686 and x86_64
    building against gcc-4.6.1 (also tested with gcc-4.6.0)

  - new example scripts; see:
    http://readthedocs.org/docs/gcc-python-plugin/en/latest/examples.html

  - if PLUGIN_PYTHONPATH is defined at build time, hardcode the value into the plugin's sys.path, allowing multiple builds to be independently packaged

  - more documentation

  - work-in-progress on detecting reference-count errors in C Python extension code. Although this usage example can now detect errors, it isn't yet ready for general use. It can generate HTML visualizations of those errors; see http://dmalcolm.livejournal.com/6560.html for examples.

  - numerous other improvements (see below)

  - new dependency: the "six" module is required at both build time and run-time, to smooth over Python 2 vs Python 3 differences:
    http://pypi.python.org/pypi/six/

I've also packaged the plugin in RPM form for Fedora 16 onwards; see:
  https://fedoraproject.org/wiki/Features/GccPythonPlugin

Enjoy!
Dave Detailed change notes follow Version 0.5 === David Malcolm (7): Override all locale information with LC_ALL=C when running selftests Revamp support for options in selftests Add note about ccache Improvements to the example scripts Split up examples within the docs Fix gcc.Pass.__repr__ Version 0.4 === David Malcolm (10): add explicit BR on gmp-devel (rhbz#725569) Make the test suites be locale-independent Suppress buffering of output in Python 3 run-test-suite.py: support excluding tests from a run Python 3 fixes to testcpychecker.py Add 'str_no_uid' attribute to gcc.Tree and gcc.Gimple; use it to fix a selftest Fix segfault seen on i686 due to erroneous implementation of 'pointer' attribute of gcc.TypeDecl Selftest fixes and exclusions for 32-bit builds Fix the test for 32/64-bit in selftests so that it works with both Python 2 and 3 Version 0.3 === David Malcolm (3): If PLUGIN_PYTHONPATH is defined at build time, hardcode it into the plugin's sys.path Python 3 fixes Add the beginnings of a manpage for gcc-with-python Version 0.2 === Alexandre Lissy (2): fix: permerror() misusage Using Freedesktop standard for image viewing David Malcolm (98): Introduce gcc.Parameter and gcc.get_parameters() Introduce a compatibility header file Document the 'basic_blocks' attribute of gcc.Cfg Fix a mismatch between gccutils.pformat() and the API docs Add note about debugging Move the debugging information to be more prominent, and reword Automatically supply the correct header search directory for selftests that #include Set up various things in sys, including sys.path Cope with calls to function pointers in the arg checker Remove stray import Fix issue with PyArg_ParseTuple("K") seen compiling gdb Fix erroneous error messages for the various "s" and "z" format codes Format codes "U" and "S" can support several different argument types Add a way of turning of const-correctness for "const char*" checking Fix breakage of the various "es" and "et" format codes introduced in 
last commit Implement verification of the "O&" format code (converter callback, followed by appropriate arg) Use newlines and indentation to try to make the PyArg_ error messages more readable Add the example from my blog post ( http://dmalcolm.livejournal.com/6364.html ) Remove redundant (and non-functioning) selftest for "O&" format code Add 'local_decls', 'start', 'end', 'funcdef_no' to gcc.Function Add 'arguments' and 'result' to gcc.FunctionDecl Fix typos in docs Add 'operand' to gcc.Unary; check that the keywords table to PyArg_ has a NULL terminator Add 'location' to more tcc types; fill out more documentation Start building out examples of C syntax vs how it's seen by the Python API Fix the behavior of the various "#" format codes. Add Alexandre Lissy to contributors The various "e" codes can accept NULL as the encoding Add s
Re: RFC: PATCH: Require and use int64 for x86 options
On Wed, 27 Jul 2011, H.J. Lu wrote:

> ; Maximum number of mask bits in a variable.
> MaxMaskBits
> ix86_isa_flags = 64
>
> It mark ix86_isa_flags as 64bit. Any comments?

The patch won't work as is. set_option, for example, casts a pointer to (int *), and stores a mask that came from option->var_value, which is an int, so this won't work with option fields not of type int or values that don't fit in int; you'd need to check all uses of CLVC_BIT_CLEAR and CLVC_BIT_SET in the source tree to adapt things for the possibility of wider mask fields, and track the type of each such field.

Independently, I approve of setting need_64bit_hwint for all x86 targets, but your patch doesn't achieve the expected simplification. In config.gcc, there are settings for various individual targets that should be removed once it's set in one place for all x86 targets. In libcpp/configure.ac, similarly the cases for i[34567]86-*-darwin* i[34567]86-*-solaris2.1[0-9]* x86_64-*-solaris2.1[0-9]* i[34567]86-w64-mingw* i[34567]86-*-linux* (the last only if --enable-targets=all) should all be removed as obsolete once i[34567]86-*-* is there along with x86_64-*-*.

--
Joseph S. Myers
jos...@codesourcery.com
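The truncation problem described in this message can be illustrated with a small C sketch. All names here (`option_entry`, `set_mask32`, `set_mask64`, `mask_fits_32`) are hypothetical stand-ins, not GCC's actual option-handling code; the 32-bit `var_value` plays the role of the int-typed mask, and fixed-width unsigned types are used so that the narrowing is well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option record: the mask is stored in a 32-bit field,
   standing in for an int-typed var_value.  */
struct option_entry
{
  uint32_t var_value;
};

/* Setter that assumes every flag variable is 32 bits wide, in the
   spirit of storing a mask through an (int *).  */
static void
set_mask32 (uint32_t *field, const struct option_entry *opt)
{
  assert (field != NULL && opt != NULL);
  *field |= opt->var_value;
}

/* Setter that carries a 64-bit mask through unchanged.  */
static void
set_mask64 (uint64_t *field, uint64_t mask)
{
  assert (field != NULL);
  *field |= mask;
}

/* Return 1 if MASK survives a round trip through a 32-bit field.  */
static int
mask_fits_32 (uint64_t mask)
{
  return (uint64_t) (uint32_t) mask == mask;
}
```

Under these assumptions, a mask with only bit 40 set narrows to zero when it is placed in `var_value`, so `set_mask32` stores nothing, while `set_mask64` keeps the bit. A check along the lines of `mask_fits_32` is one way to find the uses that would need a wider type.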
Re: RFC: PATCH: Require and use int64 for x86 options
On Wed, Jul 27, 2011 at 2:23 PM, Joseph S. Myers wrote:
> On Wed, 27 Jul 2011, H.J. Lu wrote:
>
>> ; Maximum number of mask bits in a variable.
>> MaxMaskBits
>> ix86_isa_flags = 64
>>
>> It mark ix86_isa_flags as 64bit. Any comments?
>
> The patch won't work as is. set_option, for example, casts a pointer to
> (int *), and stores a mask that came from option->var_value, which is an
> int, so this won't work with option fields not of type int or values that
> don't fit in int; you'd need to check all uses of CLVC_BIT_CLEAR and
> CLVC_BIT_SET in the source tree to adapt things for the possibility of
> wider mask fields, and track the type of each such field.

We will prepare a separate patch.

> Independently, I approve of setting need_64bit_hwint for all x86 targets,
> but your patch doesn't achieve the expected simplification. In
> config.gcc, there are settings for various individual targets that should
> be removed once it's set in one place for all x86 targets. In
> libcpp/configure.ac, similarly the cases for i[34567]86-*-darwin*
> i[34567]86-*-solaris2.1[0-9]* x86_64-*-solaris2.1[0-9]*
> i[34567]86-w64-mingw* i[34567]86-*-linux* (the last only if
> --enable-targets=all) should all be removed as obsolete once
> i[34567]86-*-* is there along with x86_64-*-*.
>

Is this patch OK for trunk?

Thanks.

H.J.

gcc/

2011-07-27  H.J. Lu

    * config.gcc: Set need_64bit_hwint to yes for x86 targets.

libcpp/

2011-07-27  H.J. Lu

    * configure.ac: Set need_64bit_hwint to yes for x86 targets.
    * configure: Regenerated.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d7cf895..02cc556 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -345,6 +345,7 @@ i[34567]86-*-*)
     cpu_type=i386
     c_target_objs="i386-c.o"
     cxx_target_objs="i386-c.o"
+    need_64bit_hwint=yes
     extra_options="${extra_options} fused-madd.opt"
     extra_headers="cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
                    pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
@@ -1211,7 +1212,6 @@ hppa[12]*-*-hpux11*)
     fi
     ;;
 i[34567]86-*-darwin*)
-    need_64bit_hwint=yes
     need_64bit_isa=yes
     # Baseline choice for a machine that allows m64 support.
     with_cpu=${with_cpu:-core2}
@@ -1293,7 +1293,6 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i
         esac
         done
         TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
-        need_64bit_hwint=yes
         need_64bit_isa=yes
         case X"${with_cpu}" in
         Xgeneric|Xatom|Xcore2|Xcorei7|Xcorei7-avx|Xnocona|Xx86-64|Xbdver2|Xbdver1|Xbtver1|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx|Xathlon64-sse3|Xk8-sse3|Xopteron-sse3)
@@ -1415,7 +1414,6 @@ i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*)
         tm_file="${tm_file} i386/x86-64.h i386/sol2-bi.h sol2-bi.h"
         tm_defines="${tm_defines} TARGET_BI_ARCH=1"
         tmake_file="$tmake_file i386/t-sol2-64"
-        need_64bit_hwint=yes
         need_64bit_isa=yes
         case X"${with_cpu}" in
         Xgeneric|Xatom|Xcore2|Xcorei7|Xcorei7-avx|Xnocona|Xx86-64|Xbdver2|Xbdver1|Xbtver1|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx|Xathlon64-sse3|Xk8-sse3|Xopteron-sse3)
@@ -1478,7 +1476,6 @@ i[34567]86-*-mingw* | x86_64-*-mingw*)
     xm_file=i386/xm-mingw32.h
     case ${target} in
         x86_64-*-* | *-w64-*)
-            need_64bit_hwint=yes
             need_64bit_isa=yes
             ;;
         *)
diff --git a/libcpp/configure b/libcpp/configure
index b453a7b..c400d23 100755
--- a/libcpp/configure
+++ b/libcpp/configure
@@ -7312,9 +7312,7 @@ case $target in
     x86_64-*-* | \
     ia64-*-* | \
     hppa*64*-*-* | \
-    i[34567]86-*-darwin* | \
-    i[34567]86-*-solaris2.1[0-9]* | x86_64-*-solaris2.1[0-9]* | \
-    i[34567]86-w64-mingw* | \
+    i[34567]86-*-* | x86_64-*-solaris2.1[0-9]* | \
     mips*-*-* | \
     mmix-*-* | \
     powerpc*-*-* | \
@@ -7324,13 +7322,6 @@ case $target in
     spu-*-* | \
     sh[123456789lbe]*-*-* | sh-*-*)
         need_64bit_hwint=yes ;;
-    i[34567]86-*-linux*)
-        if test "x$enable_targets" = xall; then
-            need_64bit_hwint=yes
-        else
-            need_64bit_hwint=no
-        fi
-        ;;
     *)
         need_64bit_hwint=no ;;
 esac
diff --git a/libcpp/configure.ac b/libcpp/configure.ac
index 170932c..e1d8851 100644
--- a/libcpp/configure.ac
+++ b/libcpp/configure.ac
@@ -150,9 +150,7 @@ case $target in
     x86_64-*-* | \
     ia64-*-* | \
     hppa*64*-*-* | \
-    i[34567]86-*-darwin* | \
-    i[34567]86-*-solaris2.1[0-9]* | x86_64-*-solaris2.1[0-9]* | \
-    i[34567]86-w64-mingw* | \
+    i[34567]86-*-* | x86_64-*-solaris2.1[0-9]* | \
     mips*-*-* | \
     mmix-*-* | \
     powerpc*-*-* | \
@@ -162,13 +160,6 @@ case $target in
     spu-*-* | \
     sh[123456789lbe]*-*-* | sh-*-*)
         need_64bit_hwint=yes ;;
-    i[34567]86-*-linux*)
-        if test "x$enable_targets" = xall; then
-            need_64bit_hwint=yes
-        else
-            need_64bit_hwint=no
-        fi
-        ;;
     *)
         need_64bit_hwint=no ;;
 esac
IRA vs CANNOT_CHANGE_MODE_CLASS, + 4.7 IRA regressions?
Consider this bit of code:

extern double a[20];

double test1 (int n)
{
  double accum = 0.0;
  int i;

  for (i=0; i [...]

mipsisa32r2-sde-elf-gcc -O3 -fno-inline -fno-unroll-loops -march=74kf1_1 -S abstest.c

With a GCC 4.6 compiler, this produces:

        ...
.L3:
        mtc1    $3,$f2
        ldc1    $f0,0($5)
        addiu   $5,$5,8
        mtc1    $2,$f3
        sub.d   $f2,$f2,$f0
        mfc1    $3,$f2
        bne     $5,$4,.L3
        mfc1    $2,$f3

        ext     $5,$2,0,31
        move    $4,$3
.L2:
        mtc1    $4,$f0
        j       $31
        mtc1    $5,$f1
        ...

This is terrible code, with all that pointless register-shuffling inside the loop -- what's gone wrong? Well, the bit-twiddling expansion of "fabs" produced by optabs.c uses subreg expressions, and on MIPS CANNOT_CHANGE_MODE_CLASS disallows use of FP registers for integer operations. And, when IRA sees that, it decides it cannot alloc "accum" to a FP reg at all, even if it obviously makes sense to put it there for the rest of its lifetime.

On mainline trunk, things are even worse as it's spilling to memory, not just shuffling between registers:

.L3:
        ldc1    $f0,0($2)
        addiu   $2,$2,8
        sub.d   $f2,$f2,$f0
        bne     $2,$3,.L3
        sdc1    $f2,0($sp)

        lw      $2,0($sp)
        ext     $3,$2,0,31
        lw      $2,4($sp)
.L2:
        sw      $2,4($sp)
        sw      $3,0($sp)
        lw      $3,4($sp)
        lw      $2,0($sp)
        addiu   $sp,$sp,8
        mtc1    $3,$f0
        j       $31
        mtc1    $2,$f1

I've been experimenting with a patch to the MIPS backend to add define_insn_and_split patterns for floating-point abs -- the idea is to attach some constraints to the insns to tell IRA it needs a GP reg for these operations, so it can apply its usual cost analysis and reload logic instead of giving up. Then the split to introduce the subreg expansion happens after reload when we already have the right register class.

This seems to work well enough on 4.6; for this particular example, I'm getting:

.L3:
        ldc1    $f2,0($2)
        addiu   $2,$2,8
        bne     $2,$4,.L3
        sub.d   $f0,$f0,$f2

        mfc1    $2,$f1
        ext     $2,$2,0,31
        j       $31
        mtc1    $2,$f1

However, same patch on mainline is still giving spills to memory. :-(

So, here's my question. Is it worthwhile for me to continue this approach of trying to make the MIPS backend smarter?
Or is the way IRA deals with CANNOT_CHANGE_MODE_CLASS fundamentally broken and in need of fixing in a target-independent way? And/or is there some other regression in IRA on mainline that's causing it to spill to memory when it didn't in 4.6?

BTW, the unary "neg" operator has the same problem as "abs" on MIPS: we can't use the hardware instruction because it does the wrong thing with NaNs, and we can't twiddle the sign bit directly in a FP register. With both abs/neg now generating unnecessary memory spills, this seems like a fairly important performance regression.

-Sandra
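The bit-twiddling form of abs and neg discussed in this message can be sketched in C as integer operations on the sign bit. This is only an illustration, assuming `double` uses the IEEE 754 binary64 layout; the function names are hypothetical, and this is not the code in optabs.c or in the MIPS backend.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sign bit of an IEEE 754 binary64 value (assumed layout of double).  */
#define SIGN_BIT (UINT64_C (1) << 63)

/* Copy the object representation of X into an integer.  */
static uint64_t
double_bits (double x)
{
  uint64_t bits;
  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&bits, &x, sizeof bits);
  return bits;
}

/* Rebuild a double from its integer representation.  */
static double
bits_double (uint64_t bits)
{
  double x;
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* abs: clear the sign bit; all other bits, including any NaN payload,
   are left untouched.  */
static double
abs_by_bits (double x)
{
  return bits_double (double_bits (x) & ~SIGN_BIT);
}

/* neg: flip the sign bit and nothing else.  */
static double
neg_by_bits (double x)
{
  return bits_double (double_bits (x) ^ SIGN_BIT);
}
```

Because both operations are plain integer AND/XOR, the value has to be in an integer register at that point. In the listings above, `ext ...,0,31` does the sign-bit clear on the word that holds the sign, which is why the register class of "accum" matters to the allocator.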