Re: GCC documentation: porting to Sphinx

2021-06-24 Thread Martin Liška

On 6/23/21 6:00 PM, Joseph Myers wrote:

On Wed, 23 Jun 2021, Martin Liška wrote:


@Joseph: Can you share your thoughts about the Makefile integration used here?
What do you suggest for 2)
(note that explicitly listing all the .rst files would be crazy)?


You can write dependencies on e.g. doc/gcc/*.rst (which might be more
files than actually are relevant in some cases, if the directory includes
some common files shared by some but not all manuals, but should be
conservatively safe if you list appropriate directories there), rather
than needing to name all the individual files.  Doing things with makefile
dependencies seems better than relying on what sphinx-build does when
rerun unnecessarily (if sphinx-build avoids rebuilding in some cases where
the makefiles think a rebuild is needed, that's fine as an optimization).


All right. I've just done that and it was easier than I expected. Now the
dependencies are tracked properly.
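
For readers following along, a rule along the lines Joseph suggests might look
roughly like this (directory, variable, and stamp names here are illustrative,
not the actual gcc/Makefile.in contents):

```make
# Illustrative only: depend on every .rst under the manual's source
# directory, so editing any of them triggers a sphinx-build rerun.
GCC_RST_FILES = $(wildcard $(srcdir)/doc/gcc/*.rst)

doc/gcc/html.stamp: $(GCC_RST_FILES)
	sphinx-build -b html $(srcdir)/doc/gcc doc/gcc/html
	touch $@
```

As Joseph notes, this may list more files than a given manual really uses, but
that only risks a harmless extra rebuild, never a missed one.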



It looks like this makefile integration loses some of the srcinfo / srcman
support.  That support should stay (be updated for the use of Sphinx) so
that release tarballs (as generated by maintainer-scripts/gcc_release,
which uses --enable-generated-files-in-srcdir) continue to include man
pages / info files (and make sure that, if those files are present in the
source directory, then building and installing GCC does install them even
when sphinx-build is absent at build/install time).



Oh, and I've just restored that as well. I've pushed the changes to the me/sphinx-v2
branch and I'm waiting for more feedback.

In the meantime, I'm going to prepare further integration of other manuals and
targets (PDF, HTML).

Martin


daily report on extending static analyzer project [GSoC]

2021-06-24 Thread Ankur Saini via Gcc
CURRENT STATUS :

analyzer is now splitting nodes even at call sites which don't have a 
cgraph_edge. But since the call and return nodes are now not connected, the part 
of the function after such calls becomes unreachable, making it impossible to 
properly analyse them.

AIM for today : 

- try to create an intra-procedural link between the calling and 
returning snodes 
- find the place where the exploded nodes and edges are being formed 
- figure out the program point where exploded graph would know about the 
function calls

—

PROGRESS :

- I initially tried to connect the calling and returning snodes with an 
intraprocedural sedge, but it looks like only nodes which have a 
cgraph_edge or a CFG edge are connected in the supergraph. I tried a few ways 
to connect them, but in the end thought I would be better off leaving them like 
this and connecting them during the creation of the exploded graph itself.

- As the exploded graph is created while building and processing the 
worklist, “build_initial_worklist ()” and “process_worklist ()” should be the 
interesting areas to analyse, especially the processing part.

- “build_initial_worklist ()” just creates enodes for functions that can be 
called explicitly ( possible entry points ), so I guess the better place to 
investigate is the “process_worklist ()” function.

—

STATUS AT THE END OF THE DAY :- 

- try to create an intra-procedural link between the calling and 
returning snodes ( Abandoned )
- find the place where the exploded nodes and edges are being formed ( Done )
- figure out the program point where exploded graph knows about the function 
call ( Pending )


Thank you
- Ankur

Re: replacing the backwards threader and more

2021-06-24 Thread Jeff Law via Gcc




On 6/21/2021 8:40 AM, Aldy Hernandez wrote:



On 6/9/21 2:09 PM, Richard Biener wrote:
On Wed, Jun 9, 2021 at 1:50 PM Aldy Hernandez via Gcc wrote:


Hi Jeff.  Hi folks.

What started as a foray into severing the old (forward) threader's
dependency on evrp, turned into a rewrite of the backwards threader
code.  I'd like to discuss the possibility of replacing the current
backwards threader with a new one that gets far more threads and can
potentially subsume all threaders in the future.

I won't include code here, as it will just detract from the high level
discussion.  But if it helps, I could post what I have, which just needs
some cleanups and porting to the latest trunk changes Andrew has made.

Currently the backwards threader works by traversing DEF chains through
PHIs leading to possible paths that start in a constant.  When such a
path is found, it is checked to see if it is profitable, and if so, the
constant path is threaded.  The current implementation is rather limited
since backwards paths must end in a constant.  For example, the
backwards threader can't get any of the tests in
gcc.dg/tree-ssa/ssa-thread-14.c:

    if (a && b)
      foo ();
    if (!b && c)
      bar ();

etc.

After my refactoring patches to the threading code, it is now possible
to drop in an alternate implementation that shares the profitability
code (is this path profitable?), the jump registry, and the actual jump
threading code.  I have leveraged this to write a ranger-based threader
that gets every single thread the current code gets, plus 90-130% more.

Here are the details from the branch, which should be very similar to
trunk.  I'm presenting the branch numbers because they contain Andrew's
upcoming relational query which significantly juices up the results.

New threader:
   ethread:65043    (+3.06%)
   dom:32450  (-13.3%)
   backwards threader:72482   (+89.6%)
   vrp:40532  (-30.7%)
    Total threaded:  210507 (+6.70%)

This means that the new code gets 89.6% more jump threading
opportunities than the code I want to replace.  In doing so, it reduces
the amount of DOM threading opportunities by 13.3% and by 30.7% from the
VRP jump threader.  The total improvement across the jump threading
opportunities in the compiler is 6.70%.

However, these are pessimistic numbers...

I have noticed that some of the threading opportunities that DOM and VRP
now get are not because they're smarter, but because they're picking up
opportunities that the new code exposes.  I experimented with running an
iterative threader, and then seeing what VRP and DOM could actually get.
This is too expensive to do in real life, but it at least shows what
the effect of the new code is on DOM/VRP's abilities:

    Iterative threader:
  ethread:65043    (+3.06%)
  dom:31170    (-16.7%)
  thread:86717    (+127%)
  vrp:33851    (-42.2%)
    Total threaded:  216781 (+9.90%)

This means that the new code not only gets 127% more cases, but it
reduces the DOM and VRP opportunities considerably (16.7% and 42.2%
respectively).   The end result is that we have the possibility of
getting almost 10% more jump threading opportunities in the entire
compilation run.


Yeah, DOM once was iterating ...

You probably have noticed that we have very many (way too many)
'thread' passes, often in close succession with each other or
DOM or VRP.  So in the above numbers I wonder if you can break
down the numbers individually for the actual passes (in their order)?


As promised.

*** LEGACY:
ethread42:61152 30.1369% (61152 threads for 30.1% of total)
thread117:29646 14.6101%
vrp118:62088 30.5982%
thread132:2232 1.09997%
dom133:31116 15.3346%
thread197:1950 0.960998%
dom198:10661 5.25395%
thread200:587 0.289285%
vrp201:3482 1.716%
Total:  202914


The above is from current trunk with my patches applied, defaulting to 
legacy mode.  It follows the pass number nomenclature in the 
*.statistics files.


New threader code (this is what I envision current trunk to look like with 
my patchset):


*** RANGER:
ethread42:64389 30.2242%
thread117:49449 23.2114%
vrp118:46118 21.6478%
thread132:8153 3.82702%
dom133:27168 12.7527%
thread197:5542 2.60141%
dom198:8191 3.84485%
thread200:1038 0.487237%
vrp201:2990 1.40351%
Total:  213038
So this makes me think we should focus on dropping thread197, thread200, 
& vrp201 and I'd probably focus on vrp201 first since we know we want to 
get rid of it anyway and that may change the data for thread???.  Then 
I'd be looking at thread200 and thread197 in that order.  I suspect that 
at least some of the cases in thread200 and vrp201 are exposed by dom198.



Jeff


Re: daily report on extending static analyzer project [GSoC]

2021-06-24 Thread David Malcolm via Gcc
On Thu, 2021-06-24 at 19:59 +0530, Ankur Saini wrote:
> CURRENT STATUS :
> 
> analyzer is now splitting nodes even at call sites which don't have
> a cgraph_edge. But since the call and return nodes are now not
> connected, the part of the function after such calls becomes
> unreachable, making it impossible to properly analyse them.
> 
> AIM for today : 
> 
> - try to create an intra-procedural link between the
> calling and returning snodes 
> - find the place where the exploded nodes and edges are being formed 
> - figure out the program point where exploded graph would know about
> the function calls
> 
> —
> 
> PROGRESS :
> 
> - I initially tried to connect the calling and returning snodes with
> an intraprocedural sedge but looks like for that only nodes which
> have a cgraph_edge or a CFG edge are connected in the supergraph. I
> tried a few ways to connect them but at the end thought I would be
> better off leaving them like this and connecting them during the
> creation of exploded graph itself.
> 
> - As the exploded graph is created during building and processing of
> the worklist, "build_initial_worklist ()” and “process_worklist()”
> should be the interesting areas to analyse, especially the processing
> part.
> 
> - “build_initial_worklist()” is just creating enodes for functions
> that can be called explicitly ( possible entry points ) so I guess
> the better place to investigate is “process_worklist ()” function.

Yes.

Have a look at exploded_graph::process_node (which is called by
process_worklist).
The eedges for calls with supergraph edges are created there, in
the "case PK_AFTER_SUPERNODE:", which looks at the outgoing superedges
from that supernode and calls node->on_edge on them, creating an
exploded node/exploded edge for each outgoing superedge.

So you'll need to make some changes there, I think.

> 
> —
> 
> STATUS AT THE END OF THE DAY :- 
> 
> - try to create an intra-procedural link between the
> calling and returning snodes ( Abandoned )

You may find the above useful if you're going to do it based on the
code I mentioned above.

> - find the place where the exploded nodes and edges are being formed
> ( Done )
> - figure out the program point where exploded graph knows about the
> function call ( Pending )
> 

Thanks for the update.
Hope the above is helpful.

Dave



RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-06-24 Thread Eugene Rozenfeld via Gcc
Hi Andy,

I'm trying to revive autofdo testing. One of the issues I'm running into with 
my setup is that PEBS doesn't work with perf record even though PEBS is 
enabled.
I'm running Ubuntu 20.04 in a Hyper-V virtual machine; the processor is Icelake 
(GenuineIntel-6-7E).

I did the following:

1. Enabled pmu, lbr, and pebs in my Hyper-V virtual machine as described in 
https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/manage/performance-monitoring-hardware
2. Verified that pmu, lbr, and pebs are enabled in the vm by running 
erozen@erozen-Virtual-Machine:~/objdir/gcc$ dmesg | egrep -i 'pmu'
[0.266474] Performance Events: PEBS fmt4+, Icelake events, 32-deep 
LBR, full-width counters, Intel PMU driver.
3. Ran
erozen@erozen-Virtual-Machine:~/objdir/gcc$ perf record -e 
cpu/event=0xc4,umask=0x20/pu -b -m8 true -v
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) 
for event (cpu/event=0xc4,umask=0x20/pu).
/bin/dmesg | grep -i perf may provide additional information.

Omitting the 'p' (precise) modifier works fine:
erozen@erozen-Virtual-Machine:~/objdir/gcc$ perf record -e 
cpu/event=0xc4,umask=0x20/u -b -m8 true -v
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 0.007 MB perf.data (11 samples) ]

Is there a way to get PEBS working with perf record in a vm? I would appreciate 
any pointers on how to investigate this.

The version of perf I'm using is 5.8.18.

Thanks,

Eugene

-Original Message-
From: Andi Kleen  
Sent: Friday, April 30, 2021 2:46 PM
To: Eugene Rozenfeld via Gcc 
Cc: Xinliang David Li ; Richard Biener 
; Eugene Rozenfeld 
; Jan Hubicka 
Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC

Eugene Rozenfeld via Gcc  writes:

> Is the format produced by create_gcov and expected by GCC under 
> -fauto-profile documented somewhere? How is it different from .gcda 
> used in FDO, e.g., as described here:
> http://src.gnu-darwin.org/src/contrib/gcc/gcov-io.h.html ?

I believe it's very similar.

> I would prefer that AutoFDO is not removed from GCC and it would be 
> helpful if create_gcov were restored in google/autofdo. I checked out 
> a revision before the recent merge and tried it on a simple example 
> and it seems to work.
> I'm also interested in contributing improvements for AutoFDO so will 
> try to investigate 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71672 and
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81379

That would be great.

-Andi
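
For readers following the thread, the usual AutoFDO pipeline that create_gcov
sits in looks roughly like this (event name and flags as given in GCC's
-fauto-profile documentation; adjust the event, paths, and binary names for
your own setup):

```shell
# Illustrative AutoFDO workflow.
# 1. Collect an LBR profile from a representative run:
perf record -e br_inst_retired:near_taken -b -o perf.data -- ./a.out
# 2. Convert it to the gcov-like format GCC consumes:
create_gcov --binary=./a.out --profile=perf.data \
            --gcov=a.gcov -gcov_version=1
# 3. Rebuild with the profile:
gcc -O2 -fauto-profile=a.gcov foo.c -o a.out
```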


gcc-9-20210624 is now available

2021-06-24 Thread GCC Administrator via Gcc
Snapshot gcc-9-20210624 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/9-20210624/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 9 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-9 
revision 9b997caa72498bc3a14a064648b721fe0f11945e

You'll find:

 gcc-9-20210624.tar.xzComplete GCC

  SHA256=eeb8581533b18381da806203b6cde8c114b87a918a47da8ea8053cbbc3548925
  SHA1=74ba9599eb5bf5ef8e7c353d209cd2519c9f5ebd

Diffs from 9-20210617 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


__fp16 is ambiguous error in C++

2021-06-24 Thread ALO via Gcc
#include <math.h>

__fp16 foo (__fp16 a, __fp16 b)
{
return a + std::exp(b);
}

compiler options:
=================

riscv64-unknown-linux-gnu-g++ foo.c -march=rv64gc_zfh -mabi=lp64

error:
======

foo.c: In function '__fp16 foo(__fp16, __fp16)':
foo.c:6:23: error: call of overloaded 'exp(__fp16&)' is ambiguous
6 | return a + std::exp(b);
| ^
In file included from $INSTALL/sysroot/usr/include/features.h:465,
from 
$INSTALL/riscv64-unknown-linux-gnu/include/c++/10.2.0/riscv64-unknown-linux-gnu/bits/os_defines.h:39,
from 
$INSTALL/riscv64-unknown-linux-gnu/include/c++/10.2.0/riscv64-unknown-linux-gnu/bits/c++config.h:518,
from $INSTALL/riscv64-unknown-linux-gnu/include/c++/10.2.0/cmath:41,
from $INSTALL/riscv64-unknown-linux-gnu/include/c++/10.2.0/math.h:36,
from foo.c:2:
$INSTALL/sysroot/usr/include/bits/mathcalls.h:95:1: note: candidate: 'double 
exp(double)'
95 | __MATHCALL_VEC (exp,, (Mdouble __x));
| ^~
In file included from 
$INSTALL/riscv64-unknown-linux-gnu/include/c++/10.2.0/math.h:36,
from foo.c:2:
$INSTALL/riscv64-unknown-linux-gnu/include/c++/10.2.0/cmath:222:3: note: 
candidate: 'constexpr float std::exp(float)'
222 | exp(float __x)
| ^~~
$INSTALL/riscv64-unknown-linux-gnu/include/c++/10.2.0/cmath:226:3: note: 
candidate: 'constexpr long double std::exp(long double)'
226 | exp(long double __x)
| ^~~

I think there is no prototype for __fp16 in glibc's math library.
I could cast '__fp16' to 'float' or 'double' to fix this issue by modifying the 
code, but that change is not invisible to developers :(

Is there any other method to fix this?

Maybe there is some C++ compiler option for this?
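
One possible workaround (a sketch only; it needs a target with __fp16 support,
such as RISC-V with the Zfh extension, so it is not tested here) is to make
the promotion explicit so a single overload is selected:

```cpp
// Untested sketch for a __fp16-capable target: promote the argument
// to float explicitly so std::exp(float) is chosen unambiguously,
// then narrow the result back to half precision.
#include <math.h>

__fp16 foo (__fp16 a, __fp16 b)
{
  return a + static_cast<__fp16>(std::exp(static_cast<float>(b)));
}
```

A small wrapper overload of exp taking __fp16 would achieve the same without
touching every call site.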

— Jojo