SSA Question related to Dominator Trees
Greetings,

Sorry if this question has been asked before, but do we extend the core tree type for SSA, or is there an actual dominator tree type? It seems to me that we just extend or override the core tree type parameters, but I was unable to verify this by looking in the manual.

Thanks,
Nick
Re: GCC GSoC 2020: Call for mentors and project ideas
On 1/15/20 11:45 PM, Martin Jambor wrote:
> Therefore, first and foremost, I would like to ask all (moderately) seasoned GCC contributors to consider mentoring a student this year and ideally also come up with a project that they would like to lead. I'm collecting proposal on our wiki page

@David, would you be interested in an analyzer topic? It seems to me ideal for newcomers to come up with a static analyzer check.

Martin
Re: SSA Question related to Dominator Trees
On Mon, 2020-01-27 at 10:18 -0500, Nicholas Krause wrote:
> Greetings,
>
> Sorry if this question has been asked before but do we extend out the
> core tree type for SSA or
> is there a actual dominator tree type. It seems to be we just extend or
> override the core tree
> type parameters but was unable to verify it by looking in the manual.

There is no type or class for the dominator tree. Having one would be useful.

jeff
Re: SSA Question related to Dominator Trees
On 1/27/20 10:46 AM, Jeff Law wrote:
> On Mon, 2020-01-27 at 10:18 -0500, Nicholas Krause wrote:
>> Greetings,
>>
>> Sorry if this question has been asked before but do we extend out the core tree type for SSA or is there a actual dominator tree type. It seems to be we just extend or override the core tree type parameters but was unable to verify it by looking in the manual.
>
> There is no type or class for the dominator tree. Having one would be useful.
>
> jeff

Jeff,

Thought so, and the manual isn't very clear on all of the state used by it and where, so implementing a class may be tricky. After looking in ssa.h, it seems that there are four main header files for definitions:

gimple-ssa.h
tree-ssanames.h
tree-phinodes.h
ssa-iterators.h

I'm not sure if there are other files lying around related to dom tree walking, but I'm CCing Richard as he would know better than me where the other parts are. It also makes sense to contain it if we want to make the domtrees and SSA passes multi-threading aware.

Nick
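To make the idea of a dedicated dominator tree type concrete, here is a minimal sketch in plain C. It is not GCC code: `struct dom_tree`, `MAX_BLOCKS`, and `dom_tree_dominates` are hypothetical names, and the sketch assumes basic blocks are numbered from 0 and that the entry block is recorded as its own immediate dominator.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 64   /* hypothetical fixed capacity for the sketch */

/* Hypothetical dominator tree: idom[b] is the immediate dominator of
   basic block b.  The entry block is its own immediate dominator.  */
struct dom_tree
{
  int n_blocks;
  int idom[MAX_BLOCKS];
};

/* Return true if block A dominates block B, i.e. A lies on the path
   from B up to the root of the tree.  Every block dominates itself.  */
bool
dom_tree_dominates (const struct dom_tree *t, int a, int b)
{
  assert (a >= 0 && a < t->n_blocks);
  assert (b >= 0 && b < t->n_blocks);
  for (;;)
    {
      if (a == b)
        return true;
      if (t->idom[b] == b)   /* reached the entry block without meeting A */
        return false;
      b = t->idom[b];
    }
}
```

For a diamond-shaped CFG (0 branches to 1 and 2, which both reach 3), every block has block 0 as its immediate dominator, so `dom_tree_dominates (&t, 0, 3)` holds while `dom_tree_dominates (&t, 1, 3)` does not. Keeping the state in one object like this is also what would make it easier to reason about per-thread ownership.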
Re: Aliasing rules for unannotated SYMBOL_REFs
On Sat, 2020-01-25 at 09:31 +0000, Richard Sandiford wrote:
> TL;DR: if we have two bare SYMBOL_REFs X and Y, neither of which have an
> associated source-level decl and neither of which are in an anchor block:
>
> (Q1) can a valid byte access at X+C alias a valid byte access at Y+C?
>
> (Q2) can a valid byte access at X+C1 alias a valid byte access at Y+C2,
>      C1 != C2?
>
> Also:
>
> (Q3) If X has a source-level decl and Y doesn't, and neither of them are
>      in an anchor block, can valid accesses based on X alias valid accesses
>      based on Y?

So what are the cases where Y won't have a source level decl but we have a decl in RTL? anchors, other cases?

>
> (well, OK, that wasn't too short either...)

I would have thought the answer would be "no" across the board. But the code clearly indicates otherwise. Interposition clearly complicates things as do explicit aliases though.

>
> This part seems obvious enough. But then, apart from the special case of
> forced address alignment, we use an offset-based check even for cmp==-1:
>
>       /* Assume a potential overlap for symbolic addresses that went
>          through alignment adjustments (i.e., that have negative
>          sizes), because we can't know how far they are from each
>          other.  */
>       if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>         return -1;
>       /* If decls are different or we know by offsets that there is no
>          overlap, we win.  */
>       if (!cmp || !offset_overlap_p (c, xsize, ysize))
>         return 0;
>
> So we seem to be taking cmp==-1 to mean that although we don't know
> the relationship between the symbols, it must be the case that either
> (a) the symbols are equal (e.g. via aliasing) or (b) the accesses are
> to non-overlapping objects. In other words, one of the situations
> described by cmp==1 or cmp==0 must be true, but we don't know which
> at compile time.

Right. That was the conclusion I came to. If a SYMBOL_REF has an alias, the alias must have the same value as the SYMBOL_REF. So they're either equal or there's no valid case for overlap.

>
> This means that in practice, the answer to (Q1) appears to be "yes"
> but the answer to (Q2) appears to be "no".

That would be my understanding once aliases/interpositioning come into play.

>
> This somewhat contradicts:
>
>   /* In general we assume that memory locations pointed to by different labels
>      may overlap in undefined ways.  */
>   return -1;
>
> at the end of compare_base_symbol_refs, which seems to be saying
> that the answer to (Q2) ought to be "yes" instead. Which is right?

I'm not sure how we could get to yes in that case. A symbol alias or interposition ultimately still results in two symbols having the same final address. Thus for a byte access, if C1 != C2, then we can't have an overlap.

>
> In PR92294 we have a symbol X at ANCHOR+OFFSET that's preemptible.
> Under the (Q1)==yes/(Q2)==no assumption, cmp==-1 means that either
> (a) X = ANCHOR+OFFSET or (b) X and ANCHOR reference non-overlapping
> objects. So we should take the offset into account when doing:
>
>       if (!cmp || !offset_overlap_p (c, xsize, ysize))
>         return 0;
>
> Let's call this FIX1.

So this is a really interesting wrinkle. Doesn't this change Q2 to a yes? In particular, it changes the "invariant" that the symbols have the same address in the event of a symbol alias or interposition. Of course one could ask the question of whether or not we should handle cases with anchors specially.

>
> But that then brings us to: why does memrefs_conflict_p return -1
> when one symbol X has a decl and the other symbol Y doesn't, and neither
> of them are block symbols? Is the answer to (Q3) that we allow equality
> but not overlap here too? E.g. a linker script could define Y to X but
> not to a region that contains X at a nonzero offset?

Does digging into the history provide any insights here? Given the issues you've introduced, I'm not sure I could actually fill out the matrix of answers without more underlying information, i.e., when can we get symbols without source level decls, anchors+interposition issues, etc.

Jeff
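The reading of cmp==-1 discussed above ("either the symbols are equal or the objects do not overlap") can be written down as a small standalone sketch. This is not the code in GCC's alias analysis: `sketch_offset_overlap` and `sketch_conflict` are hypothetical names, plain `long` stands in for GCC's polynomial integers, and the negative-size (alignment-adjusted) case is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if the byte ranges [0, xsize) and [c, c + ysize) overlap.
   C is the offset of the second access relative to the first; both
   sizes are assumed to be positive in this sketch.  */
bool
sketch_offset_overlap (long c, long xsize, long ysize)
{
  assert (xsize > 0 && ysize > 0);
  return c >= 0 ? c < xsize : -c < ysize;
}

/* CMP follows the convention in the message: 1 means the base symbols
   are known to be equal, 0 means they are known to be different
   objects, -1 means the relationship is unknown.
   Return 0 for "cannot conflict", 1 for "conflicts", and -1 for
   "may conflict".  */
int
sketch_conflict (int cmp, long c, long xsize, long ysize)
{
  /* If cmp is -1, the symbols are either equal (so the offsets decide)
     or the objects are disjoint (so there is no conflict at all).
     In both cases non-overlapping offsets rule out a conflict.  */
  if (cmp == 0 || !sketch_offset_overlap (c, xsize, ysize))
    return 0;
  return cmp == 1 ? 1 : -1;
}
```

Under this model, (Q1) corresponds to `sketch_conflict (-1, 0, 1, 1)`, which reports a possible conflict, and (Q2) corresponds to a nonzero offset such as `sketch_conflict (-1, 3, 1, 1)`, which reports none. The anchor case in PR92294 is exactly where the assumption "equal or disjoint" needs the extra offset to be folded into `c`.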
Question about changing {machine,type} modes during LTO
Hello,

I have a problem with a transformation I'm working on and I would appreciate some help. The transformation I am working on removes fields in structs early during link-time. For the purposes of development and this example, my transformation deletes the field identified as "delete_me" from the struct identified as "astruct_s". These identifiers are hard coded in the transformation at the moment.

For example:

```c
int
main ()
{
  struct astruct_s { _Bool a; _Bool delete_me; _Bool c; };
  // more
}
```

should be equivalent to

```c
int
main ()
{
  struct astruct_s { _Bool a; _Bool c; };
  // more
}
```

as long as no instruction accesses field "delete_me".

I have succeeded in eliminating field "delete_me" from struct "astruct_s" and at the same time successfully calculating field offsets and array offsets for a subset of the C syntax. I am working on expanding the allowed syntax and at the same time creating tests to verify my assumptions/work is still producing correct results.

I was starting work on supporting arrays of multiple dimensions, when I found an interesting edge case in my transformation. I was able to transform structs of size 2, 3, (but not 4), 5, 6, 7, (but not 8), 9, 10...
This was the stack trace when the error was triggered:

```
a.c: In function ‘main’:
a.c:11:19: internal compiler error: in convert_move, at expr.c:219
   11 |   struct astruct_s b = a[argc][argc];
      |                   ^
0xb8bac3 convert_move(rtx_def*, rtx_def*, int)
        /home/eochoa/code/gcc/gcc/expr.c:219
0xb9f5cf store_expr(tree_node*, rtx_def*, int, bool, bool)
        /home/eochoa/code/gcc/gcc/expr.c:5825
0xb9d913 expand_assignment(tree_node*, tree_node*, bool)
        /home/eochoa/code/gcc/gcc/expr.c:5509
0xa08bfb expand_gimple_stmt_1
        /home/eochoa/code/gcc/gcc/cfgexpand.c:3746
0xa09047 expand_gimple_stmt
        /home/eochoa/code/gcc/gcc/cfgexpand.c:3844
0xa1170f expand_gimple_basic_block
        /home/eochoa/code/gcc/gcc/cfgexpand.c:5884
0xa134b7 execute
        /home/eochoa/code/gcc/gcc/cfgexpand.c:6539
Please submit a full bug report,
```

Looking at expr.c:219 I found the following assertions

```c
/* Copy data from FROM to TO, where the machine modes are not the same.
   Both modes may be integer, or both may be floating, or both may be
   fixed-point.
   UNSIGNEDP should be nonzero if FROM is an unsigned type.
   This causes zero-extension instead of sign-extension.  */

void
convert_move (rtx to, rtx from, int unsignedp)
{
  machine_mode to_mode = GET_MODE (to);
  machine_mode from_mode = GET_MODE (from);

  gcc_assert (to_mode != BLKmode);
  gcc_assert (from_mode != BLKmode);  /* <-- crashes here */
```

I started reading the gcc internals around machine modes: https://gcc.gnu.org/onlinedocs/gccint/Machine-Modes.html and tried the experiment where I first compiled a struct of size 2 (and deleted field "delete_me"), then of size 3, and so on. I noticed that the TYPE_MODE for matches the machine mode, and that it varies with the size of the struct. (Which agrees with the definition of machine mode.)
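The dependence of the mode on the struct size can be sketched in a few lines of standalone C. Everything here is an assumption made for illustration: `sketch_mode_class_for_size` and `sketch_scalar_became_blk` are hypothetical names, and the rule that sizes 1, 2, 4 and 8 get a scalar mode while every other size gets BLKmode is a guess that fits the failing sizes reported above (4 and 8 before deletion). It is not the logic of layout_type, which also depends on the target, alignment and field types.

```c
#include <stdbool.h>

/* Two coarse classes of machine mode for this sketch: a block mode
   (BLKmode in GCC) and "some scalar mode".  */
enum sketch_mode_class { SKETCH_BLK, SKETCH_SCALAR };

/* Assumed rule: only power-of-two sizes up to 8 bytes get a scalar mode.  */
enum sketch_mode_class
sketch_mode_class_for_size (unsigned size_in_bytes)
{
  switch (size_in_bytes)
    {
    case 1:
    case 2:
    case 4:
    case 8:
      return SKETCH_SCALAR;
    default:
      return SKETCH_BLK;
    }
}

/* Return true if removing fields moved the struct from a scalar mode
   to the block mode, i.e. anything still carrying the old mode would
   now disagree with the re-laid-out type.  */
bool
sketch_scalar_became_blk (unsigned size_before, unsigned size_after)
{
  return sketch_mode_class_for_size (size_before) == SKETCH_SCALAR
         && sketch_mode_class_for_size (size_after) == SKETCH_BLK;
}
```

With one `_Bool` removed, `sketch_scalar_became_blk (4, 3)` and `sketch_scalar_became_blk (8, 7)` are true while `(2, 1)`, `(3, 2)` and `(5, 4)` are false, which matches the sizes that fail. That is only a correlation; it may hint that something created before the transformation still carries the old mode, but the sketch does not show where.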
I originally thought that I needed to set TYPE_MODE myself, but if layout_type is called after deleting the field (which it is), then TYPE_MODE is correctly set somewhere within layout_type: https://github.com/gcc-mirror/gcc/blob/68697710fdd35077e8617f493044b0ea717fc01a/gcc/stor-layout.c#L2203

I verified that layout_type is setting the correct values for TYPE_MODE when transforming struct "astruct_s" by comparing the TYPE_MODE of different sizes without the transformation applied. When transforming structs, layout_type always returned a TYPE_MODE which matched the TYPE_MODE for unmodified structs with the same size as the transformed struct (post transformation).

In other words: For variable "struct not_transformed b" without transformation I obtain the following relationship.

Without transformation:

| size | typemode |
|--|--|
| 1 | 13 |
| 2 | 14 |
| 3 | 1 |
| 4 | 15 |
| 5 | 1 |
| 6 | 1 |
| 7 | 1 |
| 8 | 16 |
| 9 | 1 |

With transformation (i.e. astruct_s b with a field named "delete_me")

| size before | size after | typemode |
|--|--|--|
| 2 | 1 | 13 |
| 3 | 2 | 14 |
| 4 | 3 | 1 |
| 5 | 4 | 15 |
| 6 | 5 | 1 |
| 7 | 6 | 1 |
| 8 | 7 | 1 |
| 9 | 8 | 16 |

I have a similar result for variable "struct astructs b[]".

Without modifications:

| size | type_mode |
|--|--|
| 1 | 14 |
| 2 | 15 |
| 3 | 1 |
| 4 | 16 |
| 5 | 1 |
| 6 | 1 |

With deletion of a field:

| old size | size | type_mode |
|--|--|--|
| 2 | 1 | 14 |
| 3 | 2 | 15 |
| 4
Re: fast_math_flags_set_p vs. set_fast_math_flags inconsistency?
Joseph Myers wrote:
> On Tue, 21 Jan 2020, Ulrich Weigand wrote:
>
> > It looks like there's multiple cases here. For the two flags
> > -fassociative-math and -freciprocal-math, it seems to have happened just as
> > you describe: they were created (split out of -funsafe-math-optimizations)
> > in commit a1a826110720eda37c73f829daa4ee243ee953f5, which however did not
> > update fast_math_flags_set_p.
>
> So that's a bug.

OK, agreed.

> > For the other three flags, -fsignaling-nans, -frounding-math, and
> > -fcx-limited-range, the story appears to be a bit different: from the
>
> The first two of those are disabled by default as well as disabled by
> -ffast-math, so it seems right that -fno-fast-math does nothing with them
> and that they aren't checked by fast_math_flags_set_p.

I see. I guess that makes me wonder what -fno-fast-math *ever* does (except canceling a -ffast-math earlier on the command line). Looking at the current code, -fno-fast-math (just like -ffast-math) only ever sets flags whose default is not overridden on the command line, but then it always sets them to their default value! Am I missing something here? If that's the intent, it might be cleaner to write set_fast_math_flags as just one big

  if (set)
    {
    }

> The last one is disabled by default but enabled by -ffast-math. So it
> would seem appropriate to handle it like other such options, disable it
> with -fno-fast-math, and check it in fast_math_flags_set_p.

OK.

> > Finally, there is one "mixed" flag, -fexcess-precision, which is handled
> > like the above three in that its default is only modified as a result of
> > -ffast-math, not -fno-fast-math; but nevertheless this flag *is* checked
> > in fast_math_flags_set_p.
>
> That one's trickier because the default depends on whether a C standards
> conformance mode is specified.

This also makes sense if we consider the semantics of -fno-fast-math to just leave all component flags at their default, as above ...

(As an aside, the current code is even more confusing as it has a dead condition:

  if (set)
    {
      if (opts->frontend_set_flag_excess_precision == EXCESS_PRECISION_DEFAULT)
        opts->x_flag_excess_precision
          = set ? EXCESS_PRECISION_FAST : EXCESS_PRECISION_DEFAULT;

The second test of "set" must always be true here, so this will never actually actively set the flag to EXCESS_PRECISION_DEFAULT.)

Bye,
Ulrich

--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
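The dead condition can be checked in isolation. The sketch below is standalone C, not the GCC source: `struct sketch_opts` is a hypothetical stand-in for the real options structure, the enum is declared locally, and `sketch_as_quoted` closes the braces that the excerpt leaves open. `sketch_simplified` is one possible equivalent form, not a proposed patch.

```c
/* Local stand-in for the enum used in the excerpt.  */
enum excess_precision
{
  EXCESS_PRECISION_DEFAULT,
  EXCESS_PRECISION_FAST,
  EXCESS_PRECISION_STANDARD
};

/* Hypothetical stand-in for the two option fields in the excerpt.  */
struct sketch_opts
{
  enum excess_precision frontend_set_flag_excess_precision;
  enum excess_precision x_flag_excess_precision;
};

/* The shape of the quoted code: the inner test of SET sits inside
   "if (set)", so its false arm can never be taken.  */
void
sketch_as_quoted (struct sketch_opts *opts, int set)
{
  if (set)
    {
      if (opts->frontend_set_flag_excess_precision == EXCESS_PRECISION_DEFAULT)
        opts->x_flag_excess_precision
          = set ? EXCESS_PRECISION_FAST : EXCESS_PRECISION_DEFAULT;
    }
}

/* Same behavior with the dead arm removed.  */
void
sketch_simplified (struct sketch_opts *opts, int set)
{
  if (set
      && opts->frontend_set_flag_excess_precision == EXCESS_PRECISION_DEFAULT)
    opts->x_flag_excess_precision = EXCESS_PRECISION_FAST;
}
```

Calling both functions on identical inputs leaves `x_flag_excess_precision` identical for `set` equal to 0 and to 1, which is the point of the aside: with `set` equal to 0 neither version touches the flag.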
Re: fast_math_flags_set_p vs. set_fast_math_flags inconsistency?
On Mon, 27 Jan 2020, Ulrich Weigand wrote:

> I see. I guess that makes me wonder what -fno-fast-math *ever* does
> (except canceling a -ffast-math earlier on the command line). Looking
> at the current code, -fno-fast-math (just like -ffast-math) only ever
> sets flags whose default is not overridden on the command line, but
> then it always sets them to their default value!

As a general principle, more specific flags take precedence over less specific ones, regardless of the command-line order. So it's correct for -ffast-math and -fno-fast-math not to do anything with a flag that was explicitly overridden by the user (modulo any issues where a particular combination of flags is unsupported by GCC, as with the "%<-fassociative-math%> disabled; other options take precedence" case in toplev.c).

--
Joseph S. Myers
jos...@codesourcery.com
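The precedence rule described here can be sketched as follows. This is an illustration only: `struct sketch_flag`, `struct sketch_math_opts`, `sketch_set_if_unset` and `sketch_set_fast_math` are hypothetical names, and just two component flags are modeled (-ffinite-math-only, which -ffast-math turns on, and -ftrapping-math, which it turns off).

```c
#include <stdbool.h>

/* A flag value plus a record of whether the user set it explicitly.  */
struct sketch_flag
{
  int value;
  bool user_set;
};

/* Two of the component flags affected by -ffast-math.  */
struct sketch_math_opts
{
  struct sketch_flag finite_math_only;  /* off by default */
  struct sketch_flag trapping_math;     /* on by default */
};

/* The umbrella option only touches flags the user did not set.  */
void
sketch_set_if_unset (struct sketch_flag *flag, int value)
{
  if (!flag->user_set)
    flag->value = value;
}

/* SET is 1 for -ffast-math and 0 for -fno-fast-math.  */
void
sketch_set_fast_math (struct sketch_math_opts *opts, int set)
{
  sketch_set_if_unset (&opts->finite_math_only, set);
  sketch_set_if_unset (&opts->trapping_math, !set);
}
```

If the user passed an explicit -fno-finite-math-only (value 0, `user_set` true), then `sketch_set_fast_math (&opts, 1)` leaves that flag at 0 but still clears `trapping_math`, whichever order the options appeared in. Calling it with 0 afterwards restores `trapping_math` to its default of 1, which is the "sets them to their default value" behavior noted in the quoted text.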
Re: GCC Multi-Threading Ideas
On 1/24/20, Richard Earnshaw (lists) wrote:
> On 24/01/2020 10:27, Jonathan Wakely wrote:
>> On Fri, 24 Jan 2020 at 03:39, Nicholas Krause
>> wrote:
>>> Sorry for the second message Allan but make -j does not scale well
>>> beyond 4 or
>>> 8 threads and that's considering a 4 core or 8 machine. The problem has
>>> to
>>> do with large build machines with CPUs with more cores than this or as
>>> is becoming
>>> more common on mainstream systems.
>>
>> And make scales well beyond 8 processes (not threads) on such machines.
>>
>
> The problem isn't make, per se, or even gcc. It's the build system as a
> whole.
>
> On a highly multi-core machine, gcc itself hits the bottle-neck called
> configure. That's serial, run *many* times (especially when there are
> many multilibs) and dominates build time.
>
> On high multi-core machines, gcc's 15-minute system load gets no-where
> near to the number of threads on the machine because of this.
>
> R.
>

It would be great if we could get some new autotools releases some time to help with this; autoconf in particular hasn't had an update in several years now AFAIK. While automake has had updates more recently than autoconf, they've mostly just been to the automake part itself and not to the aclocal program that comes with it, and aclocal in particular is another bottleneck for people who regenerate the build system files (although it could just be that way in my case because I have so many m4 macro files installed on my system for it to search thru for macros every time...)