SSA Question related to Dominator Trees

2020-01-27 Thread Nicholas Krause

Greetings,

Sorry if this question has been asked before, but do we extend the core tree
type for SSA, or is there an actual dominator tree type? It seems to me that we
just extend or override the core tree type parameters, but I was unable to
verify this by looking in the manual.

Thanks,

Nick


Re: GCC GSoC 2020: Call for mentors and project ideas

2020-01-27 Thread Martin Liška

On 1/15/20 11:45 PM, Martin Jambor wrote:

> Therefore, first and foremost, I would like to ask all (moderately)
> seasoned GCC contributors to consider mentoring a student this year and
> ideally also come up with a project that they would like to lead.  I'm
> collecting proposal on our wiki page


@David would you be interested in an analyzer topic? It seems to me that
coming up with a static analyzer check would be ideal for newcomers.

Martin


Re: SSA Question related to Dominator Trees

2020-01-27 Thread Jeff Law
On Mon, 2020-01-27 at 10:18 -0500, Nicholas Krause wrote:
> Greetings,
> 
> Sorry if this question has been asked before but do we extend out the 
> core tree type for SSA or
> is there a actual dominator tree type. It seems to be we just extend or 
> override the core tree
> type parameters but was unable to verify it by looking in the manual.
There is no type or class for the dominator tree.  Having one would be
useful.


jeff
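
As an illustration of what a dedicated type could encapsulate, here is a minimal standalone sketch in plain C. It is not GCC code: `dom_tree`, `dom_tree_build` and `dom_tree_dominates` are hypothetical names, the CFG is a small fixed-size adjacency matrix, every block is assumed reachable from block 0, and dominators are computed with the simple iterative set-intersection scheme rather than GCC's own implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 8

/* Hypothetical dominator tree type: one dominator bitset and one
   immediate dominator per basic block.  Block 0 is the entry.  */
struct dom_tree
{
  int n_blocks;
  unsigned dom[MAX_BLOCKS];  /* Bit j of dom[i] set: block j dominates i.  */
  int idom[MAX_BLOCKS];      /* Immediate dominator, -1 for the entry.  */
};

/* Build the tree from an adjacency matrix: edge[p][s] means p -> s.
   Assumes every block is reachable from block 0.  */
static void
dom_tree_build (struct dom_tree *dt, int n, bool edge[MAX_BLOCKS][MAX_BLOCKS])
{
  assert (n >= 1 && n <= MAX_BLOCKS);
  unsigned all = (1u << n) - 1;
  bool changed;

  dt->n_blocks = n;
  dt->dom[0] = 1u;
  for (int i = 1; i < n; i++)
    dt->dom[i] = all;

  /* Iterate dom(b) = {b} | intersection of dom(p) over all
     predecessors p, until nothing changes.  */
  do
    {
      changed = false;
      for (int b = 1; b < n; b++)
        {
          unsigned meet = all;
          for (int p = 0; p < n; p++)
            if (edge[p][b])
              meet &= dt->dom[p];
          meet |= 1u << b;
          if (meet != dt->dom[b])
            {
              dt->dom[b] = meet;
              changed = true;
            }
        }
    }
  while (changed);

  /* The immediate dominator of b is the strict dominator of b that is
     itself dominated by every strict dominator of b.  */
  dt->idom[0] = -1;
  for (int b = 1; b < n; b++)
    {
      unsigned strict = dt->dom[b] & ~(1u << b);
      dt->idom[b] = -1;
      for (int d = 0; d < n; d++)
        if (((strict >> d) & 1u) && (dt->dom[d] & strict) == strict)
          dt->idom[b] = d;
    }
}

/* Return true if block A dominates block B.  */
static bool
dom_tree_dominates (const struct dom_tree *dt, int a, int b)
{
  assert (a >= 0 && a < dt->n_blocks && b >= 0 && b < dt->n_blocks);
  return (dt->dom[b] >> a) & 1u;
}
```

Keeping the bitsets and the `idom` array inside one object is the point of the sketch: all dominance state travels together instead of being spread over global or per-function data.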



Re: SSA Question related to Dominator Trees

2020-01-27 Thread Nicholas Krause




On 1/27/20 10:46 AM, Jeff Law wrote:

> On Mon, 2020-01-27 at 10:18 -0500, Nicholas Krause wrote:
>> Greetings,
>>
>> Sorry if this question has been asked before but do we extend out the
>> core tree type for SSA or
>> is there a actual dominator tree type. It seems to be we just extend or
>> override the core tree
>> type parameters but was unable to verify it by looking in the manual.
>
> There is no type or class for the dominator tree.  Having one would be
> useful.
>
> jeff

Jeff,

Thought so. The manual isn't very clear on all of the state it uses and where,
so implementing a class may be tricky. After looking in ssa.h, it seems that
there are four main header files for definitions:

gimple-ssa.h
tree-ssanames.h
tree-phinodes.h
ssa-iterators.h

I'm not sure if there are other files lying around related to dom tree
walking, but I'm CCing Richard as he would know better than me where the other
parts are.

It also makes sense to contain it if we want to make the domtrees and SSA
passes multi-threading aware.

Nick






Re: Aliasing rules for unannotated SYMBOL_REFs

2020-01-27 Thread Jeff Law
On Sat, 2020-01-25 at 09:31 +, Richard Sandiford wrote:
> TL;DR: if we have two bare SYMBOL_REFs X and Y, neither of which have an
> associated source-level decl and neither of which are in an anchor block:
> 
> (Q1) can a valid byte access at X+C alias a valid byte access at Y+C?
> 
> (Q2) can a valid byte access at X+C1 alias a valid byte access at Y+C2,
>  C1 != C2?
> 
> Also:
> 
> (Q3) If X has a source-level decl and Y doesn't, and neither of them are
>  in an anchor block, can valid accesses based on X alias valid accesses
>  based on Y?
So what are the cases where Y won't have a source level decl but we
have a decl in RTL?  Anchors, other cases?


> 
> (well, OK, that wasn't too short either...)
I would have thought the answer would be "no" across the board.  But
the code clearly indicates otherwise.

Interposition clearly complicates things as do explicit aliases though.



> 
> This part seems obvious enough.  But then, apart from the special case of
> forced address alignment, we use an offset-based check even for cmp==-1:
> 
>   /* Assume a potential overlap for symbolic addresses that went
>      through alignment adjustments (i.e., that have negative
>      sizes), because we can't know how far they are from each
>      other.  */
>   if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>     return -1;
>   /* If decls are different or we know by offsets that there is no overlap,
>      we win.  */
>   if (!cmp || !offset_overlap_p (c, xsize, ysize))
>     return 0;
> 
> So we seem to be taking cmp==-1 to mean that although we don't know
> the relationship between the symbols, it must be the case that either
> (a) the symbols are equal (e.g. via aliasing) or (b) the accesses are
> to non-overlapping objects.  In other words, one of the situations
> described by cmp==1 or cmp==0 must be true, but we don't know which
> at compile time.
Right.  That was the conclusion I came to.  If a SYMBOL_REF has an
alias, the alias must have the same value as the SYMBOL_REF.  So they're
either equal or there's no valid case for overlap.

> 
> This means that in practice, the answer to (Q1) appears to be "yes"
> but the answer to (Q2) appears to be "no".
That would be my understanding once aliases/interpositioning come into
play.

> 
> This somewhat contradicts:
> 
>   /* In general we assume that memory locations pointed to by different labels
>  may overlap in undefined ways.  */
>   return -1;
> 
> at the end of compare_base_symbol_refs, which seems to be saying
> that the answer to (Q2) ought to be "yes" instead.  Which is right?
I'm not sure how we could get to yes in that case.  A symbol alias or
interposition ultimately still results in two symbols having the same
final address.  Thus for a byte access if C1 != C2, then we can't have
an overlap.
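
The reasoning in this exchange can be modelled with a few lines of standalone C. This is only a sketch under the (Q1)==yes/(Q2)==no reading: `byte_accesses_may_alias` is a hypothetical helper, not a GCC function, and `cmp` follows the convention used in the thread (1: symbols known equal, 0: known different, -1: unknown, taken to mean "either equal or non-overlapping").

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC code.  Decide whether a byte access at
   X+C1 may alias a byte access at Y+C2.
   CMP is 1 if X and Y are known to be the same symbol, 0 if they are
   known to be different objects, and -1 if the relationship is unknown
   but assumed to be one of those two cases.  */
static bool
byte_accesses_may_alias (int cmp, long c1, long c2)
{
  assert (cmp == 1 || cmp == 0 || cmp == -1);

  /* Known-different objects never overlap.  */
  if (cmp == 0)
    return false;

  /* The symbols are equal (cmp == 1) or might be equal (cmp == -1).
     Equal symbols have equal addresses, so two byte accesses can only
     coincide when the offsets are equal.  */
  return c1 == c2;
}
```

Under this model `byte_accesses_may_alias (-1, 4, 4)` is true (Q1) and `byte_accesses_may_alias (-1, 4, 8)` is false (Q2). The anchor case from PR92294 is deliberately not covered.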


> 
> In PR92294 we have a symbol X at ANCHOR+OFFSET that's preemptible.
> Under the (Q1)==yes/(Q2)==no assumption, cmp==-1 means that either
> (a) X = ANCHOR+OFFSET or (b) X and ANCHOR reference non-overlapping
> objects.  So we should take the offset into account when doing:
> 
>   if (!cmp || !offset_overlap_p (c, xsize, ysize))
>     return 0;
> 
> Let's call this FIX1.
So this is a really interesting wrinkle.  Doesn't this change Q2 to a
yes?  In particular it changes the "invariant" that the symbols have
the same address in the event of a symbol alias or interposition.  Of
course one could ask the question of whether or not we should handle
cases with anchors specially.


> 
> But that then brings us to: why does memrefs_conflict_p return -1
> when one symbol X has a decl and the other symbol Y doesn't, and neither
> of them are block symbols?  Is the answer to (Q3) that we allow equality
> but not overlap here too?  E.g. a linker script could define Y to X but
> not to a region that contains X at a nonzero offset?
Does digging into the history provide any insights here?

I'm not sure, given the issues you've introduced, whether I could actually
fill out the matrix of answers without more underlying information,
i.e., when can we get symbols without source level decls,
anchors+interposition issues, etc.

Jeff



Question about changing {machine,type} modes during LTO

2020-01-27 Thread Erick Ochoa
Hello,

I have a problem with a transformation I'm working on and I would appreciate
some help. The transformation I am working on removes fields in structs early
during link-time. For the purposes of development and this example, my
transformation deletes the field identified as "delete_me" from the struct
identified as "astruct_s". These identifiers are hard coded in the
transformation at the moment.

For example:

```c
int
main()
{
   struct astruct_s { _Bool a; _Bool delete_me; _Bool c;};
   // more
}
```

should be equivalent to

```c
int
main()
{
   struct astruct_s { _Bool a; _Bool c;};
   // more
}
```

as long as no instruction accesses field "delete_me".
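
A quick way to see what the transformation has to get right is to compare the two layouts directly. The sketch below is standalone C, not part of the transformation; the struct names `astruct_before` and `astruct_after` and the two helper functions are made up for the comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Layout before the transformation.  */
struct astruct_before { _Bool a; _Bool delete_me; _Bool c; };

/* Layout the transformation is expected to produce.  */
struct astruct_after { _Bool a; _Bool c; };

/* Removing a field must never make the struct larger.  */
static_assert (sizeof (struct astruct_after) <= sizeof (struct astruct_before),
               "field removal must not grow the struct");

/* Offset of field "c" in the original layout.  */
static size_t
old_offset_of_c (void)
{
  return offsetof (struct astruct_before, c);
}

/* Offset of field "c" after the removal: it moves into the slot that
   "delete_me" used to occupy, so every access to "c" and every array
   index computation based on the struct size has to be rewritten.  */
static size_t
new_offset_of_c (void)
{
  return offsetof (struct astruct_after, c);
}
```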

I have succeeded in eliminating field "delete_me" from struct "astruct_s" and
at the same time successfully calculating field offsets and array offsets for
a subset of the C syntax. I am working on expanding the allowed syntax and at
the same time creating tests to verify my assumptions/work is still producing
correct results.

I was starting work on supporting arrays of multiple dimensions, when I found
an interesting edge case in my transformation. I was able to transform structs
of size 2, 3, (but not 4), 5, 6, 7, (but not 8), 9, 10... This was the stack
trace when the error was triggered:

```
a.c: In function ‘main’:
a.c:11:19: internal compiler error: in convert_move, at expr.c:219
   11 |  struct astruct_s b = a[argc][argc];
  |   ^
0xb8bac3 convert_move(rtx_def*, rtx_def*, int)
/home/eochoa/code/gcc/gcc/expr.c:219
0xb9f5cf store_expr(tree_node*, rtx_def*, int, bool, bool)
/home/eochoa/code/gcc/gcc/expr.c:5825
0xb9d913 expand_assignment(tree_node*, tree_node*, bool)
/home/eochoa/code/gcc/gcc/expr.c:5509
0xa08bfb expand_gimple_stmt_1
/home/eochoa/code/gcc/gcc/cfgexpand.c:3746
0xa09047 expand_gimple_stmt
/home/eochoa/code/gcc/gcc/cfgexpand.c:3844
0xa1170f expand_gimple_basic_block
/home/eochoa/code/gcc/gcc/cfgexpand.c:5884
0xa134b7 execute
/home/eochoa/code/gcc/gcc/cfgexpand.c:6539
Please submit a full bug report,
```

Looking at expr.c:219 I found the following assertions

```c
/* Copy data from FROM to TO, where the machine modes are not the same.
   Both modes may be integer, or both may be floating, or both may be
   fixed-point.
   UNSIGNEDP should be nonzero if FROM is an unsigned type.
   This causes zero-extension instead of sign-extension.  */

void
convert_move (rtx to, rtx from, int unsignedp)
{
  machine_mode to_mode = GET_MODE (to);
  machine_mode from_mode = GET_MODE (from);

  gcc_assert (to_mode != BLKmode);
  gcc_assert (from_mode != BLKmode); /* <-- crashes here */
```

I started reading the gcc internals around machine modes:
https://gcc.gnu.org/onlinedocs/gccint/Machine-Modes.html
and tried an experiment where I first compiled a struct of size 2 (deleting
field "delete_me"), then of size 3, and so on. I noticed that the
TYPE_MODE for the struct matches the machine mode, and that it varies with the
size of the struct (which agrees with the definition of machine mode).

I originally thought that I needed to set TYPE_MODE myself, but if layout_type
is called after deleting the field (which it is), then TYPE_MODE is correctly
set somewhere within layout_type:
https://github.com/gcc-mirror/gcc/blob/68697710fdd35077e8617f493044b0ea717fc01a/gcc/stor-layout.c#L2203
I verified that layout_type is setting the correct values for TYPE_MODE when
transforming struct "astruct_s" by comparing the TYPE_MODE of different sizes
without the transformation applied. When transforming structs, layout_type
always returned a TYPE_MODE which matched the TYPE_MODE for unmodified structs
with the same size as the transformed struct (post transformation).

In other words:

For variable "struct not_transformed b", without the transformation, I obtain
the following relationship:

| size | typemode |
|------|----------|
| 1    | 13       |
| 2    | 14       |
| 3    | 1        |
| 4    | 15       |
| 5    | 1        |
| 6    | 1        |
| 7    | 1        |
| 8    | 16       |
| 9    | 1        |
With transformation (i.e. astruct_s b with a field named "delete_me")

| size before | size after | typemode |
|-------------|------------|----------|
| 2           | 1          | 13       |
| 3           | 2          | 14       |
| 4           | 3          | 1        |
| 5           | 4          | 15       |
| 6           | 5          | 1        |
| 7           | 6          | 1        |
| 8           | 7          | 1        |
| 9           | 8          | 16       |
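
The pattern in the tables above can be captured in a small standalone model. This is a sketch built only from the numbers reported here, not from GCC's sources: it assumes that mode number 1 is BLKmode (which the failing `from_mode != BLKmode` assertion suggests), that sizes 1, 2, 4 and 8 are the only ones given a non-BLK mode on this target, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Mode numbers as printed in the tables; 1 is assumed to be BLKmode.  */
enum { MODEL_BLKMODE = 1 };

/* TYPE_MODE observed for a struct of SIZE bytes.  Sizes 1..9 come from
   the tables; larger sizes are assumed to behave like size 9.  */
static int
model_type_mode (unsigned size)
{
  switch (size)
    {
    case 1: return 13;
    case 2: return 14;
    case 4: return 15;
    case 8: return 16;
    default: return MODEL_BLKMODE;
    }
}

/* True when deleting a one-byte field takes the struct from a non-BLK
   mode to BLKmode.  */
static bool
mode_becomes_blk (unsigned size_before)
{
  assert (size_before >= 2);
  return model_type_mode (size_before) != MODEL_BLKMODE
         && model_type_mode (size_before - 1) == MODEL_BLKMODE;
}
```

For original sizes 2 through 10 this predicate is true only for 4 and 8, the two sizes reported as failing. That is an observation about the data, not a diagnosis of the ICE.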

I have a similar result for variable 
"struct astructs b[]". Without modifications:

| size | type_mode |
|------|-----------|
| 1    | 14        |
| 2    | 15        |
| 3    | 1         |
| 4    | 16        |
| 5    | 1         |
| 6    | 1         |

With deletion of a field:

| old size | size | type_mode |
|----------|------|-----------|
| 2        | 1    | 14        |
| 3        | 2    | 15        |
| 4

Re: fast_math_flags_set_p vs. set_fast_math_flags inconsistency?

2020-01-27 Thread Ulrich Weigand
Joseph Myers wrote:
> On Tue, 21 Jan 2020, Ulrich Weigand wrote:
> 
> > It looks like there's multiple cases here.  For the two flags
> > -fassociative-math and -freciprocal-math, it seems to have happened just as
> > you describe: they were created (split out of -funsafe-math-optimizations)
> > in commit a1a826110720eda37c73f829daa4ee243ee953f5, which however did not
> > update fast_math_flags_set_p.
> 
> So that's a bug.

OK, agreed.

> > For the other three flags, -fsignaling-nans, -frounding-math, and
> > -fcx-limited-range, the story appears to be a bit different: from the
> 
> The first two of those are disabled by default as well as disabled by 
> -ffast-math, so it seems right that -fno-fast-math does nothing with them 
> and that they aren't checked by fast_math_flags_set_p.

I see.  I guess that makes me wonder what -fno-fast-math *ever* does
(except canceling a -ffast-math earlier on the command line).  Looking
at the current code, -fno-fast-math (just like -ffast-math) only ever
sets flags whose default is not overridden on the command line, but
then it always sets them to their default value!

Am I missing something here?  If that's the intent, it might be cleaner
to write set_fast_math_flags as just one big
  if (set)
    {
    }

> The last one is disabled by default but enabled by -ffast-math.  So it 
> would seem appropriate to handle it like other such options, disable it 
> with -fno-fast-math, and check it in fast_math_flags_set_p.

OK.

> > Finally, there is one "mixed" flag, -fexcess-precision, which is handled
> > like the above three in that its default is only modified as a result of
> > -ffast-math, not -fno-fast-math; but nevertheless this flag *is* checked
> > in fast_math_flags_set_p.
> 
> That one's trickier because the default depends on whether a C standards 
> conformance mode is specified.

This also makes sense if we consider the semantics of -fno-fast-math to
just leave all component flags at their default, as above ...

(As an aside, the current code is even more confusing as it has a dead
condition:

  if (set)
    {
      if (opts->frontend_set_flag_excess_precision == EXCESS_PRECISION_DEFAULT)
        opts->x_flag_excess_precision
          = set ? EXCESS_PRECISION_FAST : EXCESS_PRECISION_DEFAULT;

The second test of "set" must always be true here, so this will never actually
actively set the flag to EXCESS_PRECISION_DEFAULT.)

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: fast_math_flags_set_p vs. set_fast_math_flags inconsistency?

2020-01-27 Thread Joseph Myers
On Mon, 27 Jan 2020, Ulrich Weigand wrote:

> I see.  I guess that makes me wonder what -fno-fast-math *ever* does
> (except canceling a -ffast-math earlier on the command line).  Looking
> at the current code, -fno-fast-math (just like -ffast-math) only ever
> sets flags whose default is not overridden on the command line, but
> then it always sets them to their default value!

As a general principle, more specific flags take precedence over less 
specific ones, regardless of the command-line order.  So it's correct for 
-ffast-math and -fno-fast-math not to do anything with a flag that was 
explicitly overridden by the user (modulo any issues where a particular 
combination of flags is unsupported by GCC, as with the 
"%<-fassociative-math%> disabled; other options take precedence" case in 
toplev.c).
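
The precedence rule can be sketched in standalone C. The struct and function names below are hypothetical and only two flags are modelled; this is an illustration of the principle, not a copy of `set_fast_math_flags`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one option: its value and whether the user
   gave the specific flag explicitly on the command line.  */
struct model_flag
{
  bool value;
  bool explicitly_set;
};

struct model_opts
{
  struct model_flag associative_math;
  struct model_flag reciprocal_math;
};

/* Record an explicit, specific flag such as -fno-reciprocal-math.  */
static void
set_specific (struct model_flag *flag, bool value)
{
  assert (flag);
  flag->value = value;
  flag->explicitly_set = true;
}

/* Apply -ffast-math (SET true) or -fno-fast-math (SET false): only
   flags the user did not give explicitly are touched.  */
static void
apply_fast_math (struct model_opts *opts, bool set)
{
  assert (opts);
  if (!opts->associative_math.explicitly_set)
    opts->associative_math.value = set;
  if (!opts->reciprocal_math.explicitly_set)
    opts->reciprocal_math.value = set;
}
```

In this model, applying `-ffast-math` before or after an explicit `-fno-reciprocal-math` gives the same final state, which is the order independence described above.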

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Multi-Threading Ideas

2020-01-27 Thread Eric Gallager
On 1/24/20, Richard Earnshaw (lists) wrote:
> On 24/01/2020 10:27, Jonathan Wakely wrote:
>> On Fri, 24 Jan 2020 at 03:39, Nicholas Krause wrote:
>>> Sorry for the second message Allan but make -j does not scale well
>>> beyond 4 or
>>> 8 threads and that's considering a 4 core or 8 machine. The problem has
>>> to
>>> do with large build machines with CPUs with more cores than this or as
>>> is becoming
>>> more common on mainstream systems.
>>
>> And make scales well beyond 8 processes (not threads) on such machines.
>>
>
> The problem isn't make, per se, or even gcc.  It's the build system as a
> whole.
>
> On a highly multi-core machine, gcc itself hits the bottle-neck called
> configure.  That's serial, run *many* times (especially when there are
> many multilibs) and dominates build time.
>
> On high multi-core machines, gcc's 15-minute system load gets no-where
> near to the number of threads on the machine because of this.
>
> R.
>

It would be great if we could get some new autotools releases some
time to help with this; autoconf in particular hasn't had an update in
several years now AFAIK. While automake has had updates more recently
than autoconf, they've mostly just been to the automake part itself
and not to the aclocal program that comes with it, and aclocal in
particular is another bottleneck for people who regenerate the build
system files (although it could just be that way in my case because I
have so many m4 macro files installed on my system for it to search
thru for macros every time...)