Re: Project Ranger

2018-06-01 Thread Eric Botcazou
> First, we'll collect the cases that demonstrate a unique situation we
> care about.  I have 4 very specific cases that show current
> shortcomings, not just with the Ranger, but a couple we don't handle
> with VRP today.  I'll eventually get those put onto the wiki so the
> list can be updated.

I'm not specifically worried here; there are already a couple of testcases in 
the testsuite that require symbolic information to be properly optimized.

> I think most of these cases that care about symbolics are not so much
> range related, but rather an algorithmic layer on top.  Any follow-on
> optimization to either enhance or replace VRP or anything similar will
> simply use the Ranger as a client.  If it turns out there are cases
> where we *have* to remember the end point of a range as a symbolic, then
> the algorithm can track that symbolic along with the range, and request a
> re-evaluation of the range when the value of that symbolic changes.
>
> [...]
>
> Does that help?   If it does, I'll add this to the coverage in the wiki
> page.

Yes, thanks for the detailed explanation.  This approach of setting aside the 
handling of symbolic information might end up being a good compromise between 
the necessary minimal[*] handling of this information and the complexity of 
doing it directly in the Ranger.

[*] The implicit assumption here is that a VRP implementation with full-blown 
support of symbolic ranges is not worth the hassle in practice.

-- 
Eric Botcazou


Re: RISC-V ELF multilibs

2018-06-01 Thread Michael Clark



> On 1/06/2018, at 6:16 PM, Sebastian Huber 
>  wrote:
> 
> On 29/05/18 20:02, Jim Wilson wrote:
>>> Most variants include the C extension. Would it be possible to add 
>>> -march=rv32g and -march=rv64g variants?
>> 
>> The expectation is that most implementations will include the C extension.  
>> It reduces code size, improves performance, and I think I read somewhere 
>> that it takes only 400 gates to implement.
>> 
>> It isn't practical to try to support every possible combination of 
>> architecture and ABI here, as there are too many possible combinations. But 
>> if there is a major RISC-V target that is rv32g or rv64g then we should 
>> consider it.  You can of course define your own set of multilibs.
> 
> I am not a hardware developer, but I heard that the C extension is not easy 
> to implement in out of order machines.

Easy is relative.

The RISC-V ISA encoding has been designed so that wide fetch is relatively 
easy, possibly an order of magnitude easier than an ISA with a complex variable 
length encoding like x86. 

RV64GC instruction lengths can be derived from the 2 least significant bits in 
the first half-word packet of each instruction for mixed 16/32 bit 
instructions. It does not require an attempt to parse multiple prefixes with 
instructions ranging from 1 byte up to 15 bytes, before arriving at an 
instruction length (x86). From that perspective RV64GC is decidedly simple. 
That said, RV64GC is a little more complex than regular 32-bit encodings like 
RV64G, PowerPC, AArch64, or SPARC.

I don’t think the paper you have linked to draws the conclusion you are 
arriving at.  I spoke to Chris Celio about RVC and he said he just didn’t 
get around to implementing RVC in BOOM; its absence is not a statement of 
its difficulty so much as of the academic objectives of BOOM, one of a 
very small number of Open Source OoO processors.  Sure, if it were “trivial” 
then BOOM would include it, so it’s certainly non-trivial.

For BOOM, RVC requires an instruction buffer that handles unaligned fetches, 
and the way the PC is derived further down the pipeline needs to be modified, 
as the same micro-op may have a different width depending on whether it was 
expanded from an RVC encoding.  It may or may not need an extra pre-decoder 
pipe stage.  I have a fair understanding of what’s required.  The misaligned 
fetch is probably the most complex part, but BOOM already needs to handle 
complex cases like the fetch buffer spanning cache lines and page boundaries.  
Unaligned instruction fetches add to this.

> For example:
> 
> https://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-157.html
> 
> -- 
> Sebastian Huber, embedded brains GmbH
> 
> Address : Dornierstr. 4, D-82178 Puchheim, Germany
> Phone   : +49 89 189 47 41-16
> Fax : +49 89 189 47 41-09
> E-Mail  : sebastian.hu...@embedded-brains.de
> PGP : Public key available on request.
> 
> Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.
> 


Re: Project Ranger

2018-06-01 Thread Richard Biener
On Wed, May 30, 2018 at 1:53 AM Andrew MacLeod  wrote:
>
> I'd like to introduce a project we've been working on for the past year
> and a half.
>
> The original project goal was to see if we could derive accurate range
> information from the IL without requiring much effort on the client
> side. The idea being that a pass could simply ask "what is the range of
> this ssa_name on this statement?" and the compiler would go figure it out.
>
> After lots of experimenting and prototyping the project evolved into
> what we are introducing here. I call it the Ranger.
>
> Existing range infrastructure in the compiler works from the top down.
> It walks through the IL computing all ranges and propagates these values
> forward in case they are needed.  For the most part, other passes are
> required to either use global information, or process things in
> dominator order and work lockstep with EVRP to get more context
> sensitive ranges.
>
> The Ranger's model is purely on-demand, and designed to have minimal
> overhead.  When a range is requested, the Ranger walks backwards
> through use-def chains to determine what ranges it can find relative to
> the name being requested.  This means it only looks at statements which
> are deemed necessary to evaluate a range.  This can result in some
> significant speedups when a pass is only interested in a few specific
> cases, as is demonstrated in some of the pass conversions we have
> performed. We have also implemented a "quick and dirty" vrp-like pass
> using the ranger to demonstrate that it can also handle much heavier
> duty range work and still perform well.
>
> The code is located on an svn branch *ssa-range*.  It is based on trunk
> at revision 259405, circa mid-April 2018.  The branch currently
> bootstraps with no regressions.  The top level ranger class is called
> 'path_ranger' and is found in the file ssa-range.h.  It has 4 primary APIs:
>   * bool path_range_edge (irange& r, tree name, edge e);
>   * bool path_range_entry (irange& r, tree name, basic_block bb);
>   * bool path_range_on_stmt (irange&r, tree name, gimple *g);
>   * bool path_range_stmt (irange& r, gimple *g);
>
> This allows queries for a range on an edge, on entry to a block, as an
> operand on a specific statement, or to calculate the range of the
> result of a statement.  There are no prerequisites to use it, simply
> create a path ranger and start using the API.   There is even an
> available function which can be lightly called and doesn't require
> knowledge of the ranger:

So what the wiki pages do not show is how contextual information
is handled (and how that's done w/o dominators as the wiki page
claims).  That is, path_range_on_edge suggests that range information
provided by the (all) controlling stmts of that edge are used when
determining the range for 'name'.

That's probably true for the 'stmt' variant as well.

This leads me to guess that either the cache has to be
O(number of SSA uses) size or a simple VRP pass implemented
using Ranger querying ranges of all SSA ops for each stmt
will require O(number of SSA uses) times analysis?

One of the advantages of a DOM-style or SSA-with-ASSERT_EXPR
pass is that the number of evaluations and the size of the cache
isn't that way "unbound".  On the WIKI page you mention edge
info isn't cached yet - whatever that means ;)

So I wonder what's the speed for doing

  FOR_EACH_BB_FN (bb)
 FOR_EACH_STMT (stmt)
   FOR_EACH_SSA_USE (stmt)
   path_range_on_stmt (..., SSA-USE, stmt);

assuming that it takes into account controlling predicates.
That's actually what EVRP does (within a DOM walk) given
it tries to simplify each stmt based on range info of the ops.

> static inline bool
> on_demand_get_range_on_stmt (irange &r, tree ssa, gimple *stmt)
> {
> path_ranger ranger;
> return ranger.path_range_on_stmt (r, ssa, stmt);
> }
>
> The Ranger consists of 3 primary components:
>
>   * range.[ch] - A new range representation purely based on wide-int ,
> and allows ranges to consist of multiple non-overlapping sub-ranges.
>   * range-op.[ch] - Implements centralized tree-code operations on the
> irange class (allowing adding, masking, multiplying, etc).
>   * ssa-range*.[ch]  - Files containing a set of classes which implement
> the Ranger.
>
> We have set up a project page on the wiki which contains documentation
> for the approach as well as some converted pass info and a to-do list here:
>
> https://gcc.gnu.org/wiki/AndrewMacLeod/Ranger
>
> We would like to include the ranger in GCC for this release, and realize
> there are still numerous things to be done before it's ready for
> integration.  It has been in prototype mode until now, so we have not
> prepared the code for a merge yet.  No real performance analysis has
> been done on it either, but there is an integration page where you will
> 

Re: RISC-V ELF multilibs

2018-06-01 Thread Sebastian Huber




On 01/06/18 10:23, Michael Clark wrote:
> 
>> On 1/06/2018, at 6:16 PM, Sebastian Huber  
>> wrote:
>> 
>> On 29/05/18 20:02, Jim Wilson wrote:
>>>> Most variants include the C extension. Would it be possible to add 
>>>> -march=rv32g and -march=rv64g variants?
>>> 
>>> The expectation is that most implementations will include the C extension.  It 
>>> reduces code size, improves performance, and I think I read somewhere that it 
>>> takes only 400 gates to implement.
>>> 
>>> It isn't practical to try to support every possible combination of architecture 
>>> and ABI here, as there are too many possible combinations. But if there is a 
>>> major RISC-V target that is rv32g or rv64g then we should consider it.  You can 
>>> of course define your own set of multilibs.
>> 
>> I am not a hardware developer, but I heard that the C extension is not easy to 
>> implement in out of order machines.
> 
> Easy is relative.
> 
> The RISC-V ISA encoding has been designed so that wide fetch is relatively 
> easy, possibly an order of magnitude easier than an ISA with a complex variable 
> length encoding like x86.
> 
> RV64GC instruction lengths can be derived from the 2 least significant bits in 
> the first half-word packet of each instruction for mixed 16/32 bit 
> instructions. It does not require an attempt to parse multiple prefixes with 
> instructions ranging from 1 byte up to 15 bytes, before arriving at an 
> instruction length (x86). From that perspective RV64GC is decidedly simple. 
> That said, RV64GC is a little more complex than regular 32-bit encodings like 
> RV64G, PowerPC, AArch64, or SPARC.
> 
> I don’t think the paper you have linked to draws the conclusion you are 
> arriving at.


Yes, please use my words with care.  I am not a hardware developer. 
Reading "A.3 The Compressed (“C”) ISA Extension" in

https://github.com/ccelio/riscv-boom-doc/

led me to the statement that it is not easy, from my inexperienced point 
of view.


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.



Copyright assignment form

2018-06-01 Thread Andrew Benson
Hello,

I'd like to contribute a patch to GCC, and was told I need to get and sign a 
copyright assignment form before I can do so.  Could you send me the form?  
If there's a separate form needed for any release of copyright by my employer 
(as the patch was developed as part of my job), I would need that also.

Thank you,
Andrew Benson


Fwd: Is there a pass in GCC that provide us may alias information

2018-06-01 Thread vineet singh
Hi,

I would like to know whether GCC provides a pass that gives us "may alias"
information, and how we can access it.

If yes, is this may-alias information available before and after each
pass of GCC?

-- 
Regards
Vineet Singh
Master of Technology
Department of Computer Science & Technology
Indian Institute of Technology Bombay


Doing pdp11 cc0->CC conversion, running into ICE related to compare-elim

2018-06-01 Thread Paul Koning
Gentlepeople,

I'm using Eric Botcazou's recipe #2 for the CCmode version of pdp11 -- where 
most instructions step on the condition codes so the CC references are inserted 
post-reload.  As part of that, the compare-elim pass gets rid of compares that 
are redundant given that the instruction before it already set the CC correctly.

If I do that I get an ICE in cselib that I can't figure out.  If I turn off the 
post-reload compare elimination optimization then everything works.  

The error is this:

during RTL pass: dse2
dump file: unwind-dw2-fde.c.288r.dse2
../../../../gcc/libgcc/unwind-dw2-fde.c: In function ‘get_cie_encoding’:
../../../../gcc/libgcc/unwind-dw2-fde.c:342:1: internal compiler error: in 
cselib_record_set, at cselib.c:2409
 }
 ^

I don't understand what it's looking for or why it is unhappy.

The RTL that blows up, as reported in the 288r.dse2 dump file, looks like this:

(insn 195 194 197 11 (parallel [
(set (reg:CC 16 cc)
(compare:CC (mem:QI (post_inc:HI (reg/v/f:HI 0 r0 [orig:41 p ] 
[41])) [0 MEM[base: p_38, offset: 65535B]+0 S1 A8])
(const_int 0 [0])))
(set (reg:QI 2 r2 [orig:38 byte.10_40 ] [38])
(mem:QI (post_inc:HI (reg/v/f:HI 0 r0 [orig:41 p ] [41])) [0 
MEM[base: p_38, offset: 65535B]+0 S1 A8]))
]) "../../../../gcc/libgcc/unwind-pe.h":166 20 {*mov_ccqi}
 (expr_list:REG_UNUSED (reg:QI 2 r2 [orig:38 byte.10_40 ] [38])
(expr_list:REG_UNUSED (reg/v/f:HI 0 r0 [orig:41 p ] [41])
(expr_list:REG_INC (reg/v/f:HI 0 r0 [orig:41 p ] [41])
(nil)

I wonder if the REG_UNUSED is the issue?  The input to the compare elimination 
pass looks like this:

(insn 195 194 196 11 (parallel [
(set (reg:QI 2 r2 [orig:38 byte.10_40 ] [38])
(mem:QI (post_inc:HI (reg/v/f:HI 0 r0 [orig:41 p ] [41])) [0 
MEM[base: p_38, offset: 65535B]+0 S1 A8]))
(clobber (reg:CC 16 cc))
]) "../../../../gcc/libgcc/unwind-pe.h":166 18 {*mov_noccqi}
 (expr_list:REG_INC (reg/v/f:HI 0 r0 [orig:41 p ] [41])
(nil)))
(insn 196 195 197 11 (set (reg:CC 16 cc)
(compare:CC (reg:QI 2 r2 [orig:38 byte.10_40 ] [38])
(const_int 0 [0]))) "../../../../gcc/libgcc/unwind-pe.h":166 7 
{*cmpqi}
 (nil))

and the compare (insn 196) was eliminated because there is a pattern for that 
load operation that sets CC the right way.  However... the only reason the load 
was done was to provide the input to the compare.  So by eliminating the 
compare, the load no longer serves a purpose, as indicated by the REG_UNUSED 
note in the DSE output.  Is that why it complained? 

If that's the issue, what can I do to make it go away?  One answer would be not 
to have that load; the compare can accept a memory operand just fine, 
presumably the costs need adjusting.  But it seems a bit odd to rely on costs 
to avoid an ICE.

paul



Re: Doing pdp11 cc0->CC conversion, running into ICE related to compare-elim

2018-06-01 Thread Jakub Jelinek
On Fri, Jun 01, 2018 at 02:35:41PM -0400, Paul Koning wrote:
> during RTL pass: dse2
> dump file: unwind-dw2-fde.c.288r.dse2
> ../../../../gcc/libgcc/unwind-dw2-fde.c: In function ‘get_cie_encoding’:
> ../../../../gcc/libgcc/unwind-dw2-fde.c:342:1: internal compiler error: in 
> cselib_record_set, at cselib.c:2409
>  }
>  ^
> 
> I don't understand what it's looking for or why it is unhappy.
> 
> The RTL that blows up, as reported in the 288r.dse dump file, looks like this:

I'd say it is upset by multiple post_inc in the same instruction, that is I
believe invalid.

> (insn 195 194 197 11 (parallel [
> (set (reg:CC 16 cc)
> (compare:CC (mem:QI (post_inc:HI (reg/v/f:HI 0 r0 [orig:41 p 
> ] [41])) [0 MEM[base: p_38, offset: 65535B]+0 S1 A8])
> (const_int 0 [0])))
> (set (reg:QI 2 r2 [orig:38 byte.10_40 ] [38])
> (mem:QI (post_inc:HI (reg/v/f:HI 0 r0 [orig:41 p ] [41])) [0 
> MEM[base: p_38, offset: 65535B]+0 S1 A8]))
> ]) "../../../../gcc/libgcc/unwind-pe.h":166 20 {*mov_ccqi}
>  (expr_list:REG_UNUSED (reg:QI 2 r2 [orig:38 byte.10_40 ] [38])
> (expr_list:REG_UNUSED (reg/v/f:HI 0 r0 [orig:41 p ] [41])
> (expr_list:REG_INC (reg/v/f:HI 0 r0 [orig:41 p ] [41])
> (nil)

Jakub


Re: Doing pdp11 cc0->CC conversion, running into ICE related to compare-elim

2018-06-01 Thread Paul Koning



> On Jun 1, 2018, at 2:40 PM, Jakub Jelinek  wrote:
> 
> On Fri, Jun 01, 2018 at 02:35:41PM -0400, Paul Koning wrote:
>> during RTL pass: dse2
>> dump file: unwind-dw2-fde.c.288r.dse2
>> ../../../../gcc/libgcc/unwind-dw2-fde.c: In function ‘get_cie_encoding’:
>> ../../../../gcc/libgcc/unwind-dw2-fde.c:342:1: internal compiler error: in 
>> cselib_record_set, at cselib.c:2409
>> }
>> ^
>> 
>> I don't understand what it's looking for or why it is unhappy.
>> 
>> The RTL that blows up, as reported in the 288r.dse dump file, looks like 
>> this:
> 
> I'd say it is upset by multiple post_inc in the same instruction, that is I
> believe invalid.
> 
>> (insn 195 194 197 11 (parallel [
>>(set (reg:CC 16 cc)
>>(compare:CC (mem:QI (post_inc:HI (reg/v/f:HI 0 r0 [orig:41 p 
>> ] [41])) [0 MEM[base: p_38, offset: 65535B]+0 S1 A8])
>>(const_int 0 [0])))
>>(set (reg:QI 2 r2 [orig:38 byte.10_40 ] [38])
>>(mem:QI (post_inc:HI (reg/v/f:HI 0 r0 [orig:41 p ] [41])) [0 
>> MEM[base: p_38, offset: 65535B]+0 S1 A8]))
>>]) "../../../../gcc/libgcc/unwind-pe.h":166 20 {*mov_ccqi}
>> (expr_list:REG_UNUSED (reg:QI 2 r2 [orig:38 byte.10_40 ] [38])
>>(expr_list:REG_UNUSED (reg/v/f:HI 0 r0 [orig:41 p ] [41])
>>(expr_list:REG_INC (reg/v/f:HI 0 r0 [orig:41 p ] [41])
>>(nil)

I was wondering about that.  compare-elim.c is generating that.  It looks for 
the general pattern of
(parallel [(load dst src) (clobber cc)])
(compare dst val)
and converts that to
(parallel [ (compare src val) (load dst src)])

where there is a insn pattern of that form, such as this one:

(define_insn "*mov<mode>_cc"
  [(set (reg:CC CC_REGNUM)
(compare:CC
 (match_operand:PDPint 1 "general_operand" "rRN,Qi,rRN,Qi")
 (const_int 0)))
   (set (match_operand:PDPint 0 "nonimmediate_operand" "=rR,rR,Q,Q")
(match_dup 1))]
  "reload_completed"

Given that the starting insn had a post_inc in it, what would be a proper 
parallel... construct?  If the post_inc only appears in one of the two mentions 
of the source operand, then the match_dup is going to fail.  I suppose I can 
disallow pre and post_inc addressing modes, but that would be a lost 
optimization opportunity.

paul



Re: Doing pdp11 cc0->CC conversion, running into ICE related to compare-elim

2018-06-01 Thread Jakub Jelinek
On Fri, Jun 01, 2018 at 02:49:51PM -0400, Paul Koning wrote:
> Given that the starting insn had a post_inc in it, what would be a proper
> parallel...  construct?  If the post_inc only appears in one of the two
> mentions of the source operand, then the match_dup is going to fail.  I
> suppose I can disallow pre and post_inc addressing modes, but that would
> be a lost optimization opportunity.

E.g. compare-elim could punt on instructions with the auto-incs.

Jakub


Re: [GSOC] LTO dump tool project

2018-06-01 Thread Hrishikesh Kulkarni
Hi,
I have pushed the changes to github
(https://github.com/hrisearch/gcc). Added a command line option for
specific dumps of variables and functions used in IL e.g.
-fdump-lto-list=foo will dump:
Call Graph:

foo/1 (foo)
  Type: function
 visibility: default

Regards,
Hrishikesh

On Tue, May 29, 2018 at 11:13 PM, Martin Liška  wrote:
> On 05/29/2018 07:38 PM, Martin Liška wrote:
>>
>> $ nm main.o
>>  T main
>>  T mystring
>>  C pole
>
>
> Or we can be inspired by readelf:
>
> $ readelf -s a.out
> [snip]
> Symbol table '.symtab' contains 74 entries:
> Num:    Value     Size Type    Bind   Vis      Ndx Name
>  66: 00601250     0 NOTYPE  GLOBAL DEFAULT   24 _end
>  67: 004004b0    43 FUNC    GLOBAL DEFAULT   13 _start
>  68: 00601038     0 NOTYPE  GLOBAL DEFAULT   24 __bss_start
>  69: 00400582    70 FUNC    GLOBAL DEFAULT   13 main
>  70: 00000000     0 FUNC    GLOBAL DEFAULT  UND fwrite@@GLIBC_2.2.5
>
> Martin
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 9a7d54d..b868695 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2234,6 +2234,7 @@ cgraph_node::dump (FILE *f)
 fprintf (f, "  Is instrumented version.\n");
   else if (instrumented_version)
 fprintf (f, "  Has instrumented version.\n");
+
 }
 
 /* Dump call graph node NODE to stderr.  */
diff --git a/gcc/lto/Make-lang.in b/gcc/lto/Make-lang.in
index 4695077..465662e 100644
--- a/gcc/lto/Make-lang.in
+++ b/gcc/lto/Make-lang.in
@@ -22,7 +22,7 @@
 # The name of the LTO compiler.
 LTO_EXE = lto1$(exeext)
 # The LTO-specific object files inclued in $(LTO_EXE).
-LTO_OBJS = lto/lto-lang.o lto/lto.o lto/lto-object.o attribs.o lto/lto-partition.o lto/lto-symtab.o
+LTO_OBJS = lto/lto-lang.o lto/lto.o lto/lto-object.o attribs.o lto/lto-partition.o lto/lto-symtab.o lto/lto-dump.o
 lto_OBJS = $(LTO_OBJS)
 
 # this is only useful in a LTO bootstrap, but this does not work right
diff --git a/gcc/lto/lang.opt b/gcc/lto/lang.opt
index 0a408d3..7600840 100644
--- a/gcc/lto/lang.opt
+++ b/gcc/lto/lang.opt
@@ -63,6 +63,18 @@ fwpa=
 LTO Driver RejectNegative Joined Var(flag_wpa)
 Whole program analysis (WPA) mode with number of parallel jobs specified.
 
+fdump
+LTO Var(flag_lto_dump)
+Call the dump function.
+
+fdump-lto-list
+LTO Var(flag_lto_dump_list)
+Call the dump function for variables and functions in IL.
+
+fdump-lto-list=
+LTO Driver RejectNegative Joined Var(flag_lto_dump_list2)
+
+
 fresolution=
 LTO Joined
 The resolution file.
diff --git a/gcc/lto/lto-dump.c b/gcc/lto/lto-dump.c
new file mode 100644
index 000..90976cb
--- /dev/null
+++ b/gcc/lto/lto-dump.c
@@ -0,0 +1,105 @@
+/* LTO dump tool
+   Copyright (C) 2009-2018 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "target.h"
+#include "function.h"
+#include "basic-block.h"
+#include "tree.h"
+#include "gimple.h"
+#include "cgraph.h"
+#include "lto-streamer.h"
+#include "ipa-utils.h"
+#include "builtins.h"
+#include "alias.h"
+#include "lto-symtab.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "stdio.h"
+
+/* Dump everything.  */
+void dump ()
+{
+  fprintf (stderr, "\nHello World!\n");
+}
+
+/* Dump variables and functions used in IL.  */
+void dump_list ()
+{
+  fprintf (stderr, "Call Graph:\n");
+  cgraph_node *cnode;
+
+  static const char * const symtab_type_names[]
+    = {"symbol", "function", "variable"};
+  static const char * const visibility_types[]
+    = {"default", "protected", "hidden", "internal"};
+  FOR_EACH_FUNCTION (cnode)
+    {
+      fprintf (stderr, "\n%s (%s)", cnode->dump_asm_name (), cnode->name ());
+      fprintf (stderr, "\n  Type: %s", symtab_type_names[cnode->type]);
+      fprintf (stderr, "\n visibility: %s\n",
+	       visibility_types[DECL_VISIBILITY (cnode->decl)]);
+    }
+
+  fprintf (stderr, "\nVarpool:\n");
+  varpool_node *vnode;
+  FOR_EACH_VARIABLE (vnode)
+    {
+      fprintf (stderr, "\n%s (%s)", vnode->dump_asm_name (), vnode->name ());
+      fprintf (stderr, "\n  Type: %s", symtab_type_names[vnode->type]);
+      fprintf (stderr, "\n visibility:%s\n",
+	       visibility_types[DECL_VISIBILITY (vnode->decl)]);
+    }
+}
+
+/* Dump specific variables and functions used in IL.  */
+void dump_list2 ()
+{
+  fprintf (stderr, "Call Graph:\n");
+  cgraph_node *cnode;
+
+  static const char * const symtab_type_names[] = {"symbol",

Re: Fwd: Is there a pass in GCC that provide us may alias information

2018-06-01 Thread Richard Biener
On June 1, 2018 7:34:25 PM GMT+02:00, vineet singh 
 wrote:
>Hi,
>
>I want to know that whether GCC provides a pass that gives us "may
>alias"
>information and how we can access it?
>
>If yes, then is this may alias information available before and after
>each
>pass of GCC?

May-alias info is available everywhere via the API in tree-ssa-alias.h.

Richard. 



Re: Project Ranger

2018-06-01 Thread Andrew MacLeod

On 06/01/2018 05:48 AM, Richard Biener wrote:

> On Wed, May 30, 2018 at 1:53 AM Andrew MacLeod  wrote:
>>
>> This allows queries for a range on an edge, on entry to a block, as an
>> operand on a specific statement, or to calculate the range of the
>> result of a statement.  There are no prerequisites to use it, simply
>> create a path ranger and start using the API.  There is even an
>> available function which can be lightly called and doesn't require
>> knowledge of the ranger:
>
> So what the wiki pages do not show is how contextual information
> is handled (and how that's done w/o dominators as the wiki page
> claims).  That is, path_range_on_edge suggests that range information
> provided by the (all) controlling stmts of that edge are used when
> determining the range for 'name'.

The wiki doesn't talk about it, but the document itself goes into great 
detail about how the dynamic process works.



> That's probably true for the 'stmt' variant as well.
>
> This leads me to guess that either the cache has to be
> O(number of SSA uses) size or a simple VRP pass implemented
> using Ranger querying ranges of all SSA ops for each stmt
> will require O(number of SSA uses) times analysis?

The current cache is not particularly efficient; its size is #ssa_names * 
#BB, assuming you actually go and look at every basic block, which many 
passes do not.  It also only puts a range in there when it has to.  I'm 
not sure why it really matters unless we find a pathological case that 
kills compile time which cannot be addressed.  This would all fall under 
performance analysis and tweaking.  It's the end performance that matters.


When a range request is made for NAME in a block, it requests a range 
on each incoming edge, then unions those ranges and caches the result.  This 
is the range-on-entry cache.  The next time a range request for NAME 
is made on entry to that block, it is simply picked up from the cache.


The requests for range on an edge uses the gori_map summary to decide 
whether that block creates a range for NAME or not.
If the block does not generate a range, it simply requests the 
range-on-entry to that block.
If it does generate a range, then it examines the statements defining 
the condition, gets ranges for those,  and calculates the range that 
would be generated on that edge.  It also requests ranges for any of 
those names in the defining chain which originate outside this block.


So it queries all the names which feed other names and so on. The 
queries end when a statement is found that doesn't generate a useful 
range. Sometimes it has to go hunt quite a way, but usually they are 
nearby.  And caching the range-on-entry values prevents the lookups from 
doing the same work over and over.


Once a range has been calculated, we don't spend much more time 
calculating it again.  Any other ssa-name which uses that name also 
doesn't need to recalculate it.  For the most part, we walk the def 
chains for any given ssa-name and its feeding names once, and future 
requests pretty much come from the cache.


So yes, there is also a lot of recursion in here at the moment. calls 
for ranges spawning other calls before they can be resolved. I suspect 
sometimes this call chain is deep. I don't know, we haven't performed 
any analysis on it yet.



> One of the advantages of a DOM-style or SSA-with-ASSERT_EXPR
> pass is that the number of evaluations and the size of the cache
> isn't that way "unbound".  On the WIKI page you mention edge
> info isn't cached yet - whatever that means ;)
>
> So I wonder what's the speed for doing

The on-entry caches prevent it from being "unbound", and the 
intersection of incoming edge calculations performs the equivalent 
merging of ranges that DOM-based processing gives you.


   if (x > 10)
   A
   else
   B
   C

block A has x [11, MAX]
block B has x [MIN, 10]
In block C, at the bottom of the diamond, the 2 incoming edges union those 2 
ranges back to [MIN, MAX] and we know nothing about the range of X 
again.  This accomplishes the same thing dominators would tell you.


The edge info that "isn't cached yet" (and may not ever be) is simply 
the 'static' info calculated by evaluating the few statements at the 
bottom of the block feeding the condition.  Any values coming into the 
block are cached.  In this case it's simply looking at x > 10 and 
calculating the range on whichever edge.  One of the original design 
elements called for this to be cached, but in practice I don't think it 
needs to be.  But we haven't analyzed the performance yet, so it can 
only get quicker :-)



  


> FOR_EACH_BB_FN (bb)
>   FOR_EACH_STMT (stmt)
>     FOR_EACH_SSA_USE (stmt)
>       path_range_on_stmt (..., SSA-USE, stmt);
>
> assuming that it takes into account controlling predicates.
> That's actually what EVRP does (within a DOM walk) given
> it tries to simplify each stmt based on range info of the ops.


  FOR_EACH_BB_FN (bb)
 FOR_EACH_STMT (stmt)
path_range_on_stmt (..., stmt);

would be sufficient.  

gcc-8-20180601 is now available

2018-06-01 Thread gccadmin
Snapshot gcc-8-20180601 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/8-20180601/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-8-branch 
revision 261089

You'll find:

 gcc-8-20180601.tar.xz   Complete GCC

  SHA256=f348ab66acf84f8c8cc28bb7fdea131b9016c1a1d9b7e23e93127c13062400de
  SHA1=64bf18cb23ca7e9663fe3e8a0dbbb456db2f9039

Diffs from 8-20180525 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.