Re: [GSoC] questions about graphite_clast_to_gimple.c

2014-04-27 Thread Tobias Grosser



On 27/04/2014 21:52, Roman Gareev wrote:

Hi Tobias,

I'm considering the current code in graphite_clast_to_gimple.c, and
want to ask a few questions about it.

1. Where can I find the definitions of basic_block_def and edge from
coretypes.h?


gcc/basic-block.h


2. Why are recompute_all_dominators() and graphite_verify() called
before initialization of if_region in gloog? Does it happen just in
case (since the SSA form hasn't been modified)?


I am unsure. Somewhere in the back of my head, I remember that some 
preceding passes did not update the dominators properly and this caused 
issues in graphite. This may or may not be true today, so verifying this 
might be a good thing at some point. We could replace this with an
assert that verifies the dominator information and run a couple of tests 
with this. Maybe this gives us a test case. (So this is probably not 
high priority).



3. Why doesn't clast_guard have “else” statements?


CLooG does not compute else statements. Instead of having

if (i > 0)
 ...
else
 ...

it will emit

if (i > 0)
  ...
if (i <= 0)
  ...

The isl code generation can emit else statements and can thus generate the 
first, more efficient form.



4. Why is compute_bounds_for_param (it's from
add_names_to_union_domain) called after save_clast_name_index?


No idea. This looks suspicious. Good catch.

A comment on all this 'type_for_clast_name' infrastructure. It tries to 
address the problem of what type to use when code generating loop ivs or 
index expressions. However, it does it very poorly and has little chance 
to be correct. For the new code generation
we want to do this right. This means we do not add any infrastructure 
ourselves to derive the correct types. Instead, we 
ask the isl code generation for the minimal bitwidth/signedness 
required. This will give us a precise and always correct answer. I have 
a preliminary patch in my repo. We can look at this later on. To start 
with, I propose to always go for a signed 64 (or for testing even 128 
bit type). This has proven to be very robust.



5. I've looked through practically all the code of CLAST generation
through GLooG and started examining translate_clast. However, I
don't understand the idea behind if_region creation. Do you know
something about it?


The idea is that we generate a construct:

if (executeNewCode)
  newly_generated_code
else
  old_code

In the trivial case executeNewCode is true and gcc will dead-code 
eliminate old_code. However, we can also replace executeNewCode with a 
run-time check, e.g. for the absence of aliasing or other assumptions 
we make. In this case, the newly generated code will only be executed if 
our assumptions hold. Otherwise, we will automatically fall back to the 
old code.


Tobias


[GSoC] How to get started with the isl code generation

2014-05-06 Thread Tobias Grosser

Hi Roman,

as you already look very actively through the graphite code, I was 
wondering if you already have an idea how to start with the 
implementation? It is obviously a good idea to first browse the code
and get a general picture of the existing implementation, as that helps 
clarify what is needed and avoids writing unneeded code.
Judging from the questions you ask, you seem to understand precisely what 
is going on (you spot many bugs just by inspection), so this is good.


On the other hand, I think it is a good idea to simultaneously keep 
track of the design you have in mind and the first steps you are 
planning to take. Even though the full design may still need some time,
some basic decisions can probably already be taken and maybe even 
implemented. Staying close to the coding you will do may help direct
your code inspections to the areas of code that will be important for
your implementation.

E.g. just setting up a second code generation in parallel that does
minimal work (generates no code at all), but that can be enabled by a 
command line flag might be a first start.


Cheers,
Tobias

P.S.: If your wiki account is still not writeable, could you ask on the 
mailing list for a fix?


Re: [GSoC] questions about graphite_clast_to_gimple.c

2014-05-06 Thread Tobias Grosser

On 06/05/2014 10:19, Richard Biener wrote:

Hi Richi,

thanks for the comments.


On Tue, May 6, 2014 at 8:57 AM, Tobias Grosser  wrote:

On 05/05/2014 21:11, Roman Gareev wrote:


Hi Tobias,

thank you for your reply! I have questions about types. Could you
please answer them?



I looked through them and most seem to be related to how we derive types in
graphite. As I said before, this is a _very_ fragile hack
that works surprisingly well, but which is both too complex and
in the end still incorrect. Sebastian wrote this code, so I am not familiar
with the details. I also don't think it is necessary to
understand the details. Instead of using any code, we should start
implementing the new code using 64 bit signed integers. This
should be correct in 99.9% of the cases.


Of course compilers have to work correctly in 100% of the cases, so
if you choose an approach that will be incorrect in > 0% of the cases
then you should make sure to detect those and not apply any transform.


I agree we want to get to 100%. Just the way how to get there needs to 
be chosen.


Detecting broken cases does not work. During code generation we generate
new expressions, e.g. i + j + 200 * b. To code generate them we need to
choose a type for the computation.

cloog has zero knowledge about possible types, that's why graphite tries 
to derive types by estimating the minimal/maximal value of
an expression i + j from the knowledge it has about i and j. This 
estimate is very imprecise especially as the initial knowledge we have 
is incomplete. As Roman pointed out, several of the 'estimates' just 
don't make sense at all.


To get it 100% right we need to derive the minimal/maximal value a 
subexpression i + j can take and to use this to find a type that is 
large enough and also fast on our target platform. The best solution I 
see is to compute this information within the isl code generation, where 
we have all necessary information available.


Unfortunately, this patch is not finished yet. There are two ways to 
proceed.


1) finish the patch as the very first step

2) Go for 64bits and plug in the patch later

I would obviously prefer to get 1) done as soon as possible, but in case 
it still needs more time, defaulting to 64/128 bit types allows Roman to 
proceed. In the end, 64 bits is almost always large enough.


I am busy for the next 6 weeks, but am planning to work on the isl patch 
after. Sven, do you happen to have any time to work on the isl patch?



One of the selling points for the new isl code generation was however,
that it will be possible to get precise information about the types
needed for code generation. There existed already a patch for an older
isl version and there is a partial patch for newer versions that Sven and I
have been working on. It is not yet stable enough to be tested, but I
attached it anyway for illustration. The idea is to
introduce a set of functions

+   int isl_ast_expr_has_known_size(
+   __isl_keep isl_ast_expr *expr);
+   int isl_ast_expr_is_bounded(
+   __isl_keep isl_ast_expr *expr);
+   int isl_ast_expr_is_signed(
+   __isl_keep isl_ast_expr *expr);
+   size_t isl_ast_expr_size_in_bits(
+   __isl_keep isl_ast_expr *expr);

in isl, where we can precisely compute the minimal legal type. We can then
use this during code generation to derive good types.


You should be able to do this for all types you need up-front and check
if there is a suitable GIMPLE type available.  For example by using
lang_hooks.types.type_for_size () which will return NULL_TREE if there
isn't one.


How could we do this upfront? For each subexpression, we need to know 
what is the minimal legal type. Only after we know this can we derive a 
type for it.



2. Why do we want to generate signed types as much as possible?



Because in the code cloog generates, negative values are common. To be safe
we generate unsigned code.


That should have been _signed_ code.


Again, I do not think spending time to understand the heuristics in
type_for_clast is worth it. Some are rather complex and work well, some
are just buggy but have never been triggered, and a small percentage actually
might be reusable later (pointer handling). As the approach
has generally too little information to work reliably, I would not spend
any time on it. I pointed out the correct approach above. Going with 64bit
types will bring us a very long way, and we can finish the isl patch to get
it 100% right.


If ISL can give you for each expression a type precision and signedness
then I'd stick to that if it is available (or else punt).


Not yet, but hopefully soon.

At the moment, we have zero information about the types (the same holds 
for cloog).


I see only three choices:

1) Finish this feature of the isl code generation first
2) Try to 'estimate' the types from the graphite side as
   we did it b

Re: [GSoC] questions about graphite_clast_to_gimple.c

2014-05-06 Thread Tobias Grosser

On 06/05/2014 13:52, Richard Biener wrote:

On Tue, May 6, 2014 at 1:02 PM, Tobias Grosser  wrote:

On 06/05/2014 10:19, Richard Biener wrote:

Hi Richi,

thanks for the comments.



On Tue, May 6, 2014 at 8:57 AM, Tobias Grosser  wrote:


On 05/05/2014 21:11, Roman Gareev wrote:



Hi Tobias,

thank you for your reply! I have questions about types. Could you
please answer them?




I looked through them and most seem to be related to how we derive types
in
graphite. As I said before, this is a _very_ fragile hack
that works surprisingly well, but which is both too complex and
in the end still incorrect. Sebastian wrote this code, so I am not
familiar
with the details. I also don't think it is necessary to
understand the details. Instead of using any code, we should start
implementing the new code using 64 bit signed integers. This
should be correct in 99.9% of the cases.



Of course compilers have to work correctly in 100% of the cases, so
if you choose an approach that will be incorrect in > 0% of the cases
then you should make sure to detect those and not apply any transform.



I agree we want to get to 100%. Just the way how to get there needs to be
chosen.

Detecting broken cases does not work. During code generation we generate
new expressions, e.g. i + j + 200 * b. To code generate them we need to
choose a type for the computation.

cloog has zero knowledge about possible types, that's why graphite tries to
derive types by estimating the minimal/maximal value of
an expression i + j from the knowledge it has about i and j. This estimate
is very imprecise especially as the initial knowledge we have is incomplete.
As Roman pointed out, several of the 'estimates' just don't make sense at
all.

To get it 100% right we need to derive the minimal/maximal value a
subexpression i + j can take and to use this to find a type that is large
enough and also fast on our target platform. The best solution I see is to
compute this information within the isl code generation, where we have all
necessary information available.

Unfortunately, this patch is not finished yet. There are two ways to
proceed.

1) finish the patch as the very first step

2) Go for 64bits and plug in the patch later

I would obviously prefer to get 1) done as soon as possible, but in case it
still needs more time, defaulting to 64/128 bit types allows Roman to
proceed. In the end, 64 bits is almost always large enough.

I am busy for the next 6 weeks, but am planning to work on the isl patch
after. Sven, do you happen to have any time to work on the isl patch?



One of the selling points for the new isl code generation was however,
that it will be possible to get precise information about the types
needed for code generation. There existed already a patch for an older
isl version and there is a partial patch for newer versions that Sven and
I
have been working on. It is not yet stable enough to be tested, but I
attached it anyway for illustration. The idea is to
introduce a set of functions

+   int isl_ast_expr_has_known_size(
+   __isl_keep isl_ast_expr *expr);
+   int isl_ast_expr_is_bounded(
+   __isl_keep isl_ast_expr *expr);
+   int isl_ast_expr_is_signed(
+   __isl_keep isl_ast_expr *expr);
+   size_t isl_ast_expr_size_in_bits(
+   __isl_keep isl_ast_expr *expr);

in isl, where we can precisely compute the minimal legal type. We can
then
use this during code generation to derive good types.



You should be able to do this for all types you need up-front and check
if there is a suitable GIMPLE type available.  For example by using
lang_hooks.types.type_for_size () which will return NULL_TREE if there
isn't one.



How could we do this upfront? For each subexpression, we need to know what
is the minimal legal type. Only after we know this can we derive a type for
it.


I thought that ISL gives you this information.  If not, then of course there
is no way - but then there is no way at any point.


Not yet. We are close, but it is not finished enough to use it in 
production.



2. Why do we want to generate signed types as much as possible?




Because in the code cloog generates, negative values are common. To be
safe
we generate unsigned code.



That should have been _signed_ code.



Again, I do not think spending time to understand the heuristics in
type_for_clast is worth it. Some are rather complex and work well, some
are just buggy but have never been triggered and a small percentage
actually
might be reusable later (pointer handling). As the approach
has generally too little information to work reliably, I would not spend
any time on it. I pointed out the correct approach above. Going with
64bit
types will bring us a very long way, and we can finish the isl patch to
get
it 100% right.



If ISL can give you for each expression a type precision and signedness
then I'd stick to that if it is available (or else punt).



Not y

Re: [GSoC] How to get started with the isl code generation

2014-05-12 Thread Tobias Grosser

On 12/05/2014 20:35, Roman Gareev wrote:

Hi Tobias,

thank you for your advice!


On the other side, I think it is a good idea to simultaneously keep track of
the design you have in mind and the first steps you are planning to take.
Even though the full design may still need some time,
some basic decisions can probably already be taken and maybe even
implemented. Staying in some way close to the coding you will do,
may be helpful to direct your code inspections to areas of code that
will be important for your implementation.

E.g. just setting up a second code generation in parallel that does
minimal work (generates no code at all), but that can be enabled by a
command line flag might be a first start.


Hi Roman,


I agree with this. I'll start setting up a second code generation as
soon as I achieve success with separate ISL AST generation.


what is the difference you see between ISL AST generation and code 
generation?


What are your plans to separate the ISL AST generation? Do you foresee 
any difficulties/problems?


Cheers,
Tobias


Re: [GSoC] Status - 20140516

2014-05-16 Thread Tobias Grosser



On 17/05/2014 00:27, Maxim Kuvyrkov wrote:

Hi Community,

The community bonding period is coming to a close, students can officially 
start coding on Monday, May 19th.

In the past month the student should have applied for FSF copyright assignment 
and, hopefully, executed on a couple of test tasks to get a feel for GCC 
development.


In the last mail, I got the impression that you will keep track of the 
copyright assignments. Is this the case?


Cheers,
Tobias


Re: [GSoC] Status - 20140516

2014-05-16 Thread Tobias Grosser

On 17/05/2014 00:43, Maxim Kuvyrkov wrote:

On May 17, 2014, at 10:41 AM, Tobias Grosser  wrote:




On 17/05/2014 00:27, Maxim Kuvyrkov wrote:

Hi Community,

The community bonding period is coming to a close, students can officially 
start coding on Monday, May 19th.

In the past month the student should have applied for FSF copyright assignment 
and, hopefully, executed on a couple of test tasks to get a feel for GCC 
development.


In the last mail, I got the impression that you will keep track of the 
copyright assignments. Is this the case?


Yes.  Two of the students already have copyright assignment in place, and I 
have asked the other 3 about their assignment progress today.


Great. Could you let me know when Roman's copyright assignment is in?

Thanks,
Tobias


Re: [GSoC] How to get started with the isl code generation

2014-05-17 Thread Tobias Grosser

On 16/05/2014 21:49, Roman Gareev wrote:

Hi Tobias,


what is the difference you see between ISL AST generation and code
generation?


By “ISL AST generation”, I mean generating the ISL AST without
generating GIMPLE code.


Alright.


What are your plans to separate the ISL AST generation? Do you foresee any
difficulties/problems?


According to the plan mentioned in my proposal, I wanted to get more
familiar with ISL AST generation by generating an ISL AST in a file
separate from the GCC sources. This could help to avoid problems with
interpreting and verifying the results, because I would work with my own
input to the ISL AST generator instead of the input built by Graphite
from GIMPLE code. It could also avoid rebuilding GCC while debugging.
However, I've come to the conclusion that the way you advised me is
better, because it saves the time needed to integrate ISL AST generation
into GCC.

I've set up a second code generation in parallel that generates ISL
AST and can be enabled by a command line flag. Could you please advise
me how to verify the results of this generation?

Below is the code of this generation.


This looks in general good. A couple of comments:

1. Could you send the file as patch to the gcc repository
   (e.g. created by git)

2. You seem to miss the build system integration

3. Several lines cross 80 columns

4. We also need a place to dump the isl output.

To output the isl_ast you can use the following pattern:

  if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file,
 "Loops at depths %d and %d will be interchanged.\n",
 depth1, depth2);

It will then show up in the graphite tree dump file (enabled by
 -fdump-tree-graphite-all ). You can then hopefully match the output in 
the dejagnu tests, but I am unsure how exactly multi-line matching 
works. If you don't find a solution easily it may be worth asking on the 
gcc mailing list for this feature.
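For what it's worth, multi-line matching can often be expressed by escaping the newlines inside a single scan-tree-dump pattern. A hypothetical sketch (the directive names are standard GCC testsuite syntax; the pattern itself is illustrative and its escaping may need adjusting):

```
/* { dg-options "-O2 -fgraphite-identity -fdump-tree-graphite-all" } */
/* { dg-final { scan-tree-dump "for \\(int c1 .*\\)\n  for \\(int c3 .*\\)" "graphite" } } */
```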


Cheers,
Tobias


Re: [GSoC] How to get started with the isl code generation

2014-05-25 Thread Tobias Grosser



On 25/05/2014 13:12, Roman Gareev wrote:

Hi Tobias,

I tried to incorporate all your comments in the following patch. It
also contains traversing of ISL AST and its dump to a file. You can
find out more about this at the following link
http://romangareev.blogspot.ru/2014/05/gsoc-report-i.html


Hi Roman,

thanks for this update (and especially the nice blog post).
Also the size of the patch is great. Small enough to be easy to review
and still adding good features.


diff --git a/gcc/common.opt b/gcc/common.opt
index 5c3f834..005042d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1250,6 +1250,10 @@ fgraphite-identity
 Common Report Var(flag_graphite_identity) Optimization
 Enable Graphite Identity transformation

+fgraphite-isl-code-gen
+Common Report Var(flag_graphite_isl_code_gen) Optimization
+Enable isl code generator in Graphite


Can we use something like:

-fgraphite-code-generator=[isl|cloog]

This allows us to start with cloog as default and then flip the switch
as soon as we are confident enough.


diff --git a/gcc/graphite-clast-to-gimple.c b/gcc/graphite-clast-to-gimple.c
index 9ac9b67..d16439d 100644
--- a/gcc/graphite-clast-to-gimple.c
+++ b/gcc/graphite-clast-to-gimple.c
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.  If not see
 #include 
 #include 
 #include 
+#include 


Can we put this into a new file? clast stands for CLOOG-AST. I would
prefer to have a new file ala graphite-isl-ast-to-gimple.c or
graphite-ast-generation.c



 #include 
 #include 
 #endif
@@ -1657,6 +1658,221 @@ debug_generated_program (scop_p scop)
   print_generated_program (stderr, scop);
 }

+
+/* Integration of ISL code generator. */
+
+/* TODO: eliminate this function as soon as possible */
+
+void
+print_indent (int indent, isl_printer *prn)
+{
+  int i;
+  for (i = 0; i < indent; i++)
+{
+  prn = isl_printer_print_str (prn, " ");
+}
+}


Why is this function necessary in the first place? Are you aware of
isl_printer_set_indent() and isl_printer_set_indent_prefix()?


+static void
+translate_isl_ast (int indent, __isl_keep isl_ast_node *node,
+  isl_printer *prn);
+
+/* Translates an isl_ast_node_for NODE_FOR to gimple.
+   TODO: replace output to FILE with translation
+   to GCC representation. */
+
+static void
+translate_isl_ast_node_for (int indent, __isl_keep isl_ast_node *node_for,
+   isl_printer *prn)
+{
+  isl_id *id;
+  const char *name;
+  const char *type;
+  type = isl_options_get_ast_iterator_type (isl_printer_get_ctx (prn));
+  isl_ast_expr *for_iterator = isl_ast_node_for_get_iterator (node_for);
+  id = isl_ast_expr_get_id (for_iterator);
+  isl_ast_expr_free (for_iterator);
+  name = isl_id_get_name (id);
+  isl_id_free (id);
+  prn = isl_printer_print_str (prn, "\n");
+  print_indent (indent, prn);
+  prn = isl_printer_print_str (prn, "for (");
+  prn = isl_printer_print_str (prn, type);
+  prn = isl_printer_print_str (prn, " ");
+  prn = isl_printer_print_str (prn, name);
+  prn = isl_printer_print_str (prn, " = ");
+  isl_ast_expr *for_init = isl_ast_node_for_get_init (node_for);
+  prn = isl_printer_print_ast_expr (prn, for_init);
+  isl_ast_expr_free (for_init);
+  prn = isl_printer_print_str (prn, "; ");
+  isl_ast_expr *for_cond = isl_ast_node_for_get_cond (node_for);
+  prn = isl_printer_print_ast_expr (prn, for_cond);
+  isl_ast_expr_free (for_cond);
+  prn = isl_printer_print_str (prn, "; ");
+  prn = isl_printer_print_str (prn, name);
+  prn = isl_printer_print_str (prn, " += ");
+  isl_ast_expr *for_inc = isl_ast_node_for_get_inc (node_for);
+  prn = isl_printer_print_ast_expr (prn, for_inc);
+  isl_ast_expr_free (for_inc);
+  prn = isl_printer_print_str (prn, ")");
+  isl_ast_node *for_body = isl_ast_node_for_get_body (node_for);
+  translate_isl_ast (indent + 2, for_body, prn);
+  isl_ast_node_free (for_body);
+}
+
+/* Translates an isl_ast_node_if NODE_IF to gimple.
+   TODO: replace output to FILE with translation
+   to GCC representation. */
+
+static void
+translate_isl_ast_node_if (int indent, __isl_keep isl_ast_node *node_if,
+  isl_printer *prn)
+{
+  prn = isl_printer_print_str (prn, "\n");
+  print_indent (indent, prn);
+  prn = isl_printer_print_str (prn, "if (");
+  isl_ast_expr *if_cond = isl_ast_node_if_get_cond (node_if);
+  prn = isl_printer_print_ast_expr (prn, if_cond);
+  isl_ast_expr_free (if_cond);
+  prn = isl_printer_print_str (prn, ")");
+  isl_ast_node *if_then = isl_ast_node_if_get_then (node_if);
+  translate_isl_ast (indent + 2, if_then, prn);
+  isl_ast_node_free (if_then);
+  if (isl_ast_node_if_has_else (node_if))
+{
+  prn = isl_printer_print_str (prn, "else");
+  isl_ast_node *if_else = isl_ast_node_if_get_else(node_if);
+  translate_isl_ast (indent + 2, if_else, prn);
+  isl_ast_node_free (if_else);
+}
+}
+
+/* Translates a clast guard statement STMT to gimple.
+   TODO: replace output to FILE with translation
+   to GCC representati

Re: [GSoC] How to get started with the isl code generation

2014-06-03 Thread Tobias Grosser



On 01/06/2014 01:21, Roman Gareev wrote:

Hi Tobias,


Alright. It seems you understood the tree traversal. For the actual
patch that we want to commit, let's leave it out for now.  Instead,
let's try to get a minimal patch that only adds the flag and the new
file for the isl_ast stuff. In case isl is chosen as the code generation
option, the cloog pass is disabled (they would otherwise get in each
other's way later) and we dump the ast. We also need a simple test case
that checks that the dump is generated.


What do you mean by simple test case that checks that the dump is
generated? Is it checking of the dump_flags?


I mean that we verify that the dump that is created contains a printed
AST that matches the one we expect.

for (int c1 = 0; c1 < n - 1; c1 += 1)
  for (int c3 = 0; c3 < n; c3 += 1)
S_5(c1, c3);


Instead of adding this functionality to gloog() I would create a
similar function that provides gloog's functionality for the isl
codegen counterpart. -fgraphite-code-generator then switches between
these two functions. To start with, the isl counterpart is almost empty.
It just contains the parts of this patch needed for printing. When we
add more functionality and we encounter reuse opportunities, we can move
parts of graphite-clast-to-gimple.c into a generic codegen file.

What do you think?


I think that it's good. It would help to begin removing the CLooG
dependency and make the file that generates the ISL AST easier to
maintain. I implemented it in the patch, which can be found below. You
can also find my post about my second task of the proposal at the
following link http://romangareev.blogspot.com/2014/05/gsoc-report-ii.html


Nice post.


diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index e74bb67..39cb7bd 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1250,6 +1250,7 @@ OBJS = \
graphite.o \
graphite-blocking.o \
graphite-clast-to-gimple.o \
+   graphite-isl-ast-to-gimple.o \
graphite-dependences.o \
graphite-interchange.o \
graphite-optimize-isl.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 5c3f834..731aaf5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1250,6 +1250,19 @@ fgraphite-identity
 Common Report Var(flag_graphite_identity) Optimization
 Enable Graphite Identity transformation

+fgraphite-code-generator=
+Common Report RejectNegative Joined Optimization Enum(fgraphite_generator) 
Var(flag_graphite_code_gen) Init(FGRAPHITE_CODE_GEN_CLOOG)
+Choose code generator of Graphite
+
+Enum
+Name(fgraphite_generator) Type(enum fgraphite_generator) UnknownError(unknown 
code generator of graphite %qs)
+
+EnumValue
+Enum(fgraphite_generator) String(isl) Value(FGRAPHITE_CODE_GEN_ISL)
+
+EnumValue
+Enum(fgraphite_generator) String(cloog) Value(FGRAPHITE_CODE_GEN_CLOOG)


Nice.


+#include "config.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic-core.h"
+#include "tree.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimplify-me.h"
+#include "gimple-ssa.h"
+#include "tree-ssa-loop-manip.h"
+#include "tree-ssa-loop.h"
+#include "tree-into-ssa.h"
+#include "tree-pass.h"
+#include "cfgloop.h"
+#include "tree-chrec.h"
+#include "tree-data-ref.h"
+#include "tree-scalar-evolution.h"
+#include "sese.h"
+
+#include "graphite-poly.h"
+#include "graphite-isl-ast-to-gimple.h"
+#include "graphite-htab.h"


Are these the minimal includes necessary?


+
+/* This flag is set when an error occurred during the translation of
+   ISL AST to Gimple.  */
+
+static bool isl_gloog_error;
+
+/* Prints NODE to FILE.  */
+
+void
+print_isl_ast_node (FILE *file, __isl_keep isl_ast_node *node,
+   __isl_keep isl_ctx *ctx)
+{
+  isl_printer *prn = isl_printer_to_file (ctx, file);
+  prn = isl_printer_set_output_format (prn, ISL_FORMAT_C);
+  prn = isl_printer_print_ast_node (prn, node);
+  prn = isl_printer_print_str (prn, "\n");
+  isl_printer_free (prn);
+}
+
+/* Generates a build, which specifies the constraints on the parameters. */
+
+static isl_ast_build *
+generate_isl_context (scop_p scop)
+{
+  isl_set * context_isl = isl_set_params (isl_set_copy (scop->context));
+  return isl_ast_build_from_context (context_isl);
+}
+
+/* Generates a schedule, which specifies an order used to
+   visit elements in a domain. */
+
+static isl_union_map *
+generate_isl_schedule (scop_p scop)
+{
+  int i;
+  poly_bb_p pbb;
+  isl_union_map *schedule_isl =
+isl_union_map_empty (isl_set_get_space (scop->context));
+
+  FOR_EACH_VEC_ELT (SCOP_BBS (scop), i, pbb)
+{
+  /* Dead code elimination: when the domain of a PBB is empty,
+don't generate code for the PBB.  */
+  if (isl_set_is_empty (pbb->domain))
+   continue;
+
+  isl_map *bb_schedule = isl_map_

Re: [GSoC] How to get started with the isl code generation

2014-06-08 Thread Tobias Grosser

On 08/06/2014 19:43, Roman Gareev wrote:

Hi Tobias,


This file is empty. It seems to be the perfect place for gloog_isl,
maybe give it a more descriptive name. E.g.,

graphite_regenerate_ast_isl()

We could then rename gloog, to graphite_regenerate_ast_cloog().

gloog comes from graphite + cloog and does not make sense in the context
of isl.


Would it be better to rename gloog in a separate patch (because it
could cause renaming of gloog_error, for example, and increase the
size of the patch)?


Yes, you are right. Please make this a separate patch and rebase this
patch on the gloog patch.


I tried to incorporate all your comments in the following patch. I
also attach the change log to this message.


Comments inline.


 #endif /* ! GCC_FLAG_TYPES_H */
diff --git a/gcc/graphite-clast-to-gimple.h b/gcc/graphite-clast-to-gimple.h
index fc5a679..5e23d94 100644
--- a/gcc/graphite-clast-to-gimple.h
+++ b/gcc/graphite-clast-to-gimple.h
@@ -21,6 +21,8 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_GRAPHITE_CLAST_TO_GIMPLE_H
 #define GCC_GRAPHITE_CLAST_TO_GIMPLE_H

+#include "graphite-htab.h"
+
 extern CloogState *cloog_state;

 /* Data structure for CLooG program representation.  */
@@ -30,14 +32,7 @@ struct cloog_prog_clast {
   struct clast_stmt *stmt;
 };

-/* Stores BB's related PBB.  */
-
-struct bb_pbb_def
-{
-  basic_block bb;
-  poly_bb_p pbb;
-};
-
+extern bool gloog (scop_p, bb_pbb_htab_type);
 extern void debug_clast_stmt (struct clast_stmt *);
 extern void print_clast_stmt (FILE *, struct clast_stmt *);

diff --git a/gcc/graphite-htab.h b/gcc/graphite-htab.h
index d67dd0c..9f31fac 100644
--- a/gcc/graphite-htab.h
+++ b/gcc/graphite-htab.h
@@ -22,7 +22,14 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_GRAPHITE_HTAB_H

 #include "hash-table.h"
-#include "graphite-clast-to-gimple.h"
+
+/* Stores BB's related PBB.  */
+
+struct bb_pbb_def
+{
+  basic_block bb;
+  poly_bb_p pbb;
+};

 /* Hashtable helpers.  */

@@ -52,7 +59,6 @@ bb_pbb_hasher::equal (const value_type *bp1, const 
compare_type *bp2)

 typedef hash_table  bb_pbb_htab_type;

-extern bool gloog (scop_p, bb_pbb_htab_type);
 poly_bb_p find_pbb_via_hash (bb_pbb_htab_type, basic_block);
 bool loop_is_parallel_p (loop_p, bb_pbb_htab_type, int);
 scop_p get_loop_body_pbbs (loop_p, bb_pbb_htab_type, vec *);


I assume all this could be part of the gloog patch?


diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
new file mode 100644
index 000..309ba28
--- /dev/null
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -0,0 +1,132 @@
+/* Translation of ISL AST to Gimple.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   Contributed by Roman Gareev .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+
+#include 
+#include 
+#include 
+#include 
+
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree-ssa-loop.h"
+#include "tree-pass.h"
+#include "cfgloop.h"
+#include "tree-data-ref.h"
+#include "sese.h"


I am surprised. Are all these includes really needed to get _this_ patch
to compile? (I asked this before.)


diff --git a/gcc/testsuite/gcc.dg/graphite/scop-23.c b/gcc/testsuite/gcc.dg/graphite/scop-23.c
new file mode 100644
index 000..868222a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/scop-23.c


Maybe a more descriptive name:

isl-codegen-loop-dumping.c


@@ -0,0 +1,32 @@
+/* { dg-additional-options "-floop-parallelize-all -O2 -fgraphite-code-generator=isl" } */


Why do we need -floop-parallelize-all? -fgraphite-identity should be
enough.


+#include 
+
+#define N 16

No need for these defines.

+
+int
+main1 (int n, int *a)
+{
+  int i, j;
+
+  for (i = 0; i < n - 1; i++)
+    for (j = 0; j < n; j++)
+      a[j] = i + n;

And just a single loop.


+
+  for (j = 0; j < n; j++)
+    if (a[j] != i + n - 1)
+      __builtin_abort ();

No need for the check.


+int
+main ()
+{
+  int a[N];
+  main1 (N, a);
+  return 0;
+}

No need for main.

And we have a minimal test case.


+/* { dg-final { scan-tree-dump-times "ISL AST generated by ISL: \nfor \\(int c1 = 0; c1 < n - 1; c1 \\+= 1\\)\n  for \\(int c3 = 0; c3 < n; c3 \\+= 1\\)\nS_5\\(c1, c3\\);" 1 "graphite"} } */

Re: [GSoC] How to get started with the isl code generation

2014-06-18 Thread Tobias Grosser

On 18/06/2014 15:22, Roman Gareev wrote:

Hi Tobias,

I made a separate patch and rebased the previous one. They are
attached to this letter.



I am surprised. Are all these includes really needed to get _this_ patch
to compile? (I asked this before.)


I saw your previous comment related to this and the following includes
were removed: isl/list.h, isl/constraint.h, isl/ilp.h, isl/aff.h,
diagnostic-core.h, tree-ssa-loop-manip.h, tree-into-ssa.h,
tree-chrec.h, tree-scalar-evolution.h.

However, it seems that if we want to use the type scop_p from
graphite-poly.h, we have to include sese.h, which requires
tree-data-ref.h, cfgloop.h, tree-ssa-loop.h, gimple-iterator.h,
gimple.h, is-a.h, gimple-expr.h, internal-fn.h, tree-ssa-alias.h,
basic-block.h, tree.h, coretypes.h, system.h, and config.h.

We also have to include tree-pass.h if we want to use timevar_push and
timevar_pop to measure elapsed time (they were used previously by
graphite-clast-to-gimple.c).


This is ugly, but there is little we can do about it. Maybe you can ask
on the mailing list if there is a way to write this in multiple lines?


The patch was reduced, but I didn't find out how to make the regexp
easier to read. I wrote about this to the community.


Nice. These patches look good to me.

One last question. How did you test them? Specifically, do they compile 
with the oldest isl version supported by gcc trunk? Or do we need to 
update isl as well?


Cheers,
Tobias



Re: [GSoC] How to get started with the isl code generation

2014-06-18 Thread Tobias Grosser

On Jun 18, 2014 6:03 PM, Roman Gareev wrote:
>
> I used trunk and compiled these patches only with isl 0.12 and ClooG 
> 0.18.1. Which versions of these libraries need to be checked for 
> compatibility?

That's fine. Please post the patches for wider review at gcc patches. Also 
mention that your copyright assignment has been processed.

Thanks, Tobias

>
> -- 
>    Cheers, Roman Gareev 


Re: [GSoC] generation of GCC expression trees from isl ast expressions

2014-06-23 Thread Tobias Grosser

On 23/06/2014 08:34, Roman Gareev wrote:

Hi Tobias,

I'm currently working on the generation of GCC expression trees from isl
ast expressions. Could you please answer a few questions about it?

1. What is the best way to generate a tree from an isl_ast_expr_int? In
the temporary variant I call isl_ast_expr_get_int to get an isl_int from
the isl_ast_expr. After this, gmp_cst_to_tree (it was used in
graphite-clast-to-gimple.c) is called to generate a tree from the isl_int.
This is possible now, because isl_int is mpz_t. However, it may be
changed in the future, according to comments in its source code.


I think a better approach is to use isl_ast_expr_get_val() and to then
use isl_val_get_num_gmp() (see val_gmp.h). isl_int will be removed in
newer versions of isl and the above functions are the new interface to
get gmp values.


2. As you said in previous messages, we can always use signed 64/128
until isl is able to give information about types. I haven't found
such types among the GENERIC types
(https://gcc.gnu.org/onlinedocs/gccint/Types.html#Types). Could they
be declared using build_nonstandard_integer_type (128, 1)?


I assume so. However, we always want signed types, so the second
argument should be zero, no?


3. If I am not mistaken, the structure ivs_params from
graphite-clast-to-gimple.c is used to store induction variables and
parameters and to rename them according to SSA form. Could it be used
in graphite-isl-ast-to-gimple.c, too?


May be. Feel free to reuse it if you feel you need it, but don't worry
if you need different data structures. I will review its use in case you
add it.


4. Should we transform all isl_ast_expr_op types of isl ast
expressions to GCC expression tree?
For example, the following types
are not transformed at all in Polly project: isl_ast_op_cond,
isl_ast_op_and_then, isl_ast_op_or_else, isl_ast_op_call,
isl_ast_op_member, isl_ast_op_access, isl_ast_op_pdiv_q,
isl_ast_op_pdiv_r, isl_ast_op_div, isl_ast_op_fdiv_q.


We should support all that can occur. isl_ast_op_access won't ever be
built for the input we provide, similarly isl_ast_op_call, and possibly
isl_ast_op_member. The others may just be missing in Polly or may not be
needed for certain options.

For now, let's start with a basic set that we can test and extend later.


The first draft of generation of GCC expression trees from isl ast
expressions can be found below:


Comments inline.


+/* Converts a GMP constant VAL to a tree and returns it.  */
+
+static tree
+gmp_cst_to_tree (tree type, mpz_t val)
+{
+  tree t = type ? type : integer_type_node;
+  mpz_t tmp;
+
+  mpz_init (tmp);
+  mpz_set (tmp, val);
+  wide_int wi = wi::from_mpz (t, tmp, true);
+  mpz_clear (tmp);
+
+  return wide_int_to_tree (t, wi);
+}


OK.


+static tree
+gcc_expression_from_isl_expression (tree type, __isl_keep isl_ast_expr *);
+
+/* Converts an isl_ast_expr_int expression EXPR to a GCC expression tree of
+   type TYPE.  */
+
+static tree
+gcc_expression_from_isl_expr_int (tree type, __isl_keep isl_ast_expr *expr)
+{
+  gcc_assert (isl_ast_expr_get_type (expr) == isl_ast_expr_int);
+  isl_int val;
+  isl_int_init (val);
+  if (isl_ast_expr_get_int (expr, &val) == -1)
+{
+  isl_int_clear (val);
+  return NULL_TREE;
+}
+  else
+return gmp_cst_to_tree (type, val);
+}


As discussed above avoid the use of isl_int.


+/* Converts a binary isl_ast_expr_op expression EXPR to a GCC expression tree
+   of type TYPE.  */
+
+static tree
+binary_op_to_tree (tree type, __isl_keep isl_ast_expr *expr)
+{
+  isl_ast_expr *arg_expr = isl_ast_expr_get_op_arg (expr, 0);
+  tree tree_lhs_expr = gcc_expression_from_isl_expression (type, arg_expr);
+  isl_ast_expr_free (arg_expr);
+  arg_expr = isl_ast_expr_get_op_arg (expr, 1);
+  tree tree_rhs_expr = gcc_expression_from_isl_expression (type, arg_expr);
+  isl_ast_expr_free (arg_expr);
+  switch (isl_ast_expr_get_op_type (expr))
+{
+case isl_ast_op_add:
+  return fold_build2 (PLUS_EXPR, type, tree_lhs_expr, tree_rhs_expr);
+
+case isl_ast_op_sub:
+  return fold_build2 (MINUS_EXPR, type, tree_lhs_expr, tree_rhs_expr);
+
+case isl_ast_op_mul:
+  return fold_build2 (MULT_EXPR, type, tree_lhs_expr, tree_rhs_expr);
+
+case isl_ast_op_div:
+  return fold_build2 (EXACT_DIV_EXPR, type, tree_lhs_expr, tree_rhs_expr);
+
+case isl_ast_op_fdiv_q:
+  return fold_build2 (FLOOR_DIV_EXPR, type, tree_lhs_expr, tree_rhs_expr);
+
+case isl_ast_op_and:
+  return fold_build2 (TRUTH_ANDIF_EXPR, type,
+ tree_lhs_expr, tree_rhs_expr);
+
+case isl_ast_op_or:
+  return fold_build2 (TRUTH_ORIF_EXPR, type, tree_lhs_expr, tree_rhs_expr);


How did you verify that the semantics of the GCC and isl expressions are
identical?


+case isl_ast_op_eq:
+  return fold_build2 (EQ_EXPR, type, tree_lhs_expr, tree_rhs_expr);
+
+case isl_ast_op_le:
+  return fold_build2 (LE_EXPR, type, tree_lhs_expr, tree_rhs_expr);
+
+case isl_ast_op_lt:
+case isl_ast_op_lt:
+  return fold_build2 (LT_EXPR, type, tree_lhs_expr, tree_rhs_expr);

Re: [GSoC] generation of GCC expression trees from isl ast expressions

2014-06-26 Thread Tobias Grosser

On 27/06/2014 07:31, Roman Gareev wrote:

Are you saying we should better not do unit testing at the moment? (This is
perfectly fine with me, I am just verifying what you said)


Yes, I think we'd better not do it. It seems that unit testing
isn't supported in gcc.


If we don't have a convenient way to do unit-testing, we need to do testing
for larger programs (at least empty loop bodies). This is what we did
previously and even though not ideal should work as well.


I agree that generation of loops with empty bodies is required
functionality for testing (I wrote about this in a previous message;
sorry for being unclear).


I did not fully understand them. Are you comparing the resulting GIMPLE
trees or are you statically evaluating the isl and gimple trees to check
if the computed value is the same?


I'm statically evaluating gimple tree expressions that are generated
from isl ast expressions. After this, the result of the evaluation is
compared with the expected result. I haven't found functions that
evaluate isl ast expressions; that is why the results of evaluation
are compared with an expected result. Maybe there is a better way to
compare the semantics of the GCC and isl expressions.


Just leave it out for now.

I originally hoped we could just generate an isl_ast_expr and then
compare whether the resulting gimple code is identical to a
hand-written/provided one (just comparing the individual instructions).


Tobias



Re: [GSoC] generation of GCC expression trees from isl ast expressions

2014-07-01 Thread Tobias Grosser

On 01/07/2014 14:53, Roman Gareev wrote:

Hi Tobias,

could you please advise me how to verify the results of gimple code
generation?


More comments inline, but here something on a very high level.

I personally like testing already on the GIMPLE level and could see us
matching for certain expressions in the dumped gimple output.
Unfortunately this kind of testing may be a little fragile depending on how
often gcc changes its internal dumping (hopefully not too often). On the
other side, in gcc testing is commonly done by compiling and executing
files. For this to work, we would need at least a simple implementation
of body statements before we can get anything tested and checked in.


I've written the first draft of the generation of loops
with empty bodies and tried to verify the gimple code using the
representation that is dumped into the dump_file at the end of
generation. If we consider the following example, we'll see that the
CLooG and isl code generators generate similar representations (the
representation generated by the isl code generator doesn't have the
body of the loop, as was expected).

int
main (int n, int *a)
{
  int i;

  for (i = 0; i < 100; i++)
    a[i] = i;

  return 0;
}

gcc/graphite-isl-ast-to-gimple.c

loop_0 (header = 0, latch = 1, niter = )
{
  bb_2 (preds = {bb_0 }, succs = {bb_3 })
  {
    <bb 2>:

  }
  bb_5 (preds = {bb_3 }, succs = {bb_1 })
  {
    <bb 5>:
    # .MEM_10 = PHI <.MEM_3(D)(3)>
    # VUSE <.MEM_10>
    return 0;

  }
  loop_2 (header = 3, latch = 4, niter = )
  {
    bb_3 (preds = {bb_2 bb_4 }, succs = {bb_4 bb_5 })
    {
      <bb 3>:
      # graphite_IV.3_1 = PHI <0(2), graphite_IV.3_14(4)>
      graphite_IV.3_14 = graphite_IV.3_1 + 1;
      if (graphite_IV.3_1 < 99)
        goto <bb 4>;
      else
        goto <bb 5>;

    }
    bb_4 (preds = {bb_3 }, succs = {bb_3 })
    {
      <bb 4>:
      goto <bb 3>;

    }
  }
}

graphite-clast-to-gimple.c

loop_0 (header = 0, latch = 1, niter = )
{
  bb_2 (preds = {bb_0 }, succs = {bb_3 })
  {
    <bb 2>:

  }
  bb_5 (preds = {bb_3 }, succs = {bb_1 })
  {
    <bb 5>:
    # .MEM_18 = PHI <.MEM_11(3)>
    # VUSE <.MEM_18>
    return 0;

  }
  loop_2 (header = 3, latch = 4, niter = )
  {
    bb_3 (preds = {bb_2 bb_4 }, succs = {bb_4 bb_5 })
    {
      <bb 3>:
      # graphite_IV.3_1 = PHI <0(2), graphite_IV.3_14(4)>
      # .MEM_19 = PHI <.MEM_3(D)(2), .MEM_11(4)>
      _2 = (sizetype) graphite_IV.3_1;
      _15 = _2 * 4;
      _16 = a_6(D) + _15;
      _17 = (int) graphite_IV.3_1;
      # .MEM_11 = VDEF <.MEM_19>
      *_16 = _17;
      graphite_IV.3_14 = graphite_IV.3_1 + 1;
      if (graphite_IV.3_1 < 99)
        goto <bb 4>;
      else
        goto <bb 5>;

    }
    bb_4 (preds = {bb_3 }, succs = {bb_3 })
    {
      <bb 4>:
      goto <bb 3>;

    }
  }
}

However, this form doesn't have loop guards which are generated by
graphite_create_new_loop_guard in gcc/graphite-isl-ast-to-gimple.c and
by graphite_create_new_loop_guard in graphite-clast-to-gimple.c.


Maybe the guards are directly constant folded? Can you try with:

int
main (int n, int *a)
{
  int i;

  for (i = 0; i < n; i++)
    a[i] = i;

  return 0;
}


Below is the code of this generation (it still uses isl_int for
generation of isl_expr_int, because the error related to isl/val_gmp.h
still arises; I've tried to use isl 0.12.2 and 0.13, but got the
same error).


Did using 'extern "C"' around the include statement not help?



+/* Stores the INDEX in a vector and the loop nesting LEVEL for a given
+   isl_id NAME.  BOUND_ONE and BOUND_TWO represent the exact lower and
+   upper bounds that can be inferred from the polyhedral representation.  */


Why do you mention BOUND_ONE & BOUND_TWO? I do not see any use of them?


+typedef struct ast_isl_name_index {
+  int index;
+  int level;
+  const char *name;
+  /* If free_name is set, the content of name was allocated by us and needs
+ to be freed.  */
+  char *free_name;
+} *ast_isl_name_index_p;
+
+/* Helper for hashing ast_isl_name_index.  */
+
+struct ast_isl_index_hasher
+{
+  typedef ast_isl_name_index value_type;
+  typedef ast_isl_name_index compare_type;
+  static inline hashval_t hash (const value_type *);
+  static inline bool equal (const value_type *, const compare_type *);
+  static inline void remove (value_type *);
+};
+
+/* Computes a hash function for database element E.  */
+
+inline hashval_t
+ast_isl_index_hasher::hash (const value_type *e)
+{
+  hashval_t hash = 0;
+
+  int length = strlen (e->name);
+  int i;
+
+  for (i = 0; i < length; ++i)
+hash = hash | (e->name[i] << (i % 4));
+
+  return hash;
+}
+
+/* Compares database elements ELT1 and ELT2.  */
+
+inline bool
+ast_isl_index_hasher::equal (const value_type *elt1, const compare_type *elt2)
+{
+  return strcmp (elt1->name, elt2->name) == 0;
+}
+
+/* Free the memory taken by a ast_isl_name_index struct.  */
+
+inline void
+ast_isl_index_hasher::remove (value_type *c)
+{
+  if (c->free_name)
+free (c->free_name);
+  free (c);
+}
+
+typed

Re: [GSoC] generation of GCC expression trees from isl ast expressions

2014-07-03 Thread Tobias Grosser

On 03/07/2014 19:24, Roman Gareev wrote:

However, this form doesn't have loop guards which are generated by
>>graphite_create_new_loop_guard in gcc/graphite-isl-ast-to-gimple.c and
>>by graphite_create_new_loop_guard in graphite-clast-to-gimple.c.

>
>
>Maybe the guards are directly constant folded? Can you try with:

I've tried this. It seems that the result is the same. If we consider
the following code, we'll see that the guards are generated:

int
main (int n, int *a)
{
  int i;

  for (i = n; i < 100; i++)
    a[i] = i;

  return 0;
}


[.. GIMPLE code ..]

OK. So this seems to fix your problem.


>>Below is the code of this generation (It still uses isl_int for
>>generation of isl_expr_int, because the error related to isl/val_gmp.h
>>still arises. I've tried to use isl 0.12.2 and 0.13, but gotten the
>>same error).

>
>
>Did using 'extern "C"' around the include statement not help?

I can't build gcc without using 'extern "C"'. After a successful
build, the error arises when I try to compile, for example, the
following code:

int
main (int n, int *a)
{
  int i;

  for (i = n; i < 100; i++)
    a[i] = i;

  return 0;
}


Sorry, I may have missed it. What kind of error was arising?


>Any reason this is not a simple std::map, but you instead
>have this manually implemented hash table?

I think that using std::map may create a new installation dependency,
which requires libstdc++. I've asked the community about this.


Great.


>Please explain why we need a special function to get the upper bound.
>Specifically, explain that isl in some configurations can generate loop
>exit conditions such as:
>
>for (i = ?; i + 2 >= 3 && 3 <= i + m; i++)
>   ;
>
>This get upper bound assumes a certain shape of loop exit condition(
>
>for (i = ?; i < expr; i++)
>
>Also, you need to set the option --no-isl-ast-build-atomic-upper-bound
>in isl_ast_build to be able to rely on this behavior.
>(You did this below)
>
>Do you have a specific motivation, why you don't want to support
>arbitrary expressions? I assume this is necessary for the if - do-while
>optimization. If this is the case please state this.

I haven't found another function for generating loops in Gimple form.
create_empty_loop_on_edge, which is being used now, requires an
upper bound.

  /* The diagram in create_empty_loop_on_edge's comment shows ENTRY_EDGE
     (pred_bb -> succ_bb) being split and a loop of the following shape
     being inserted on it:

       pred_bb:
         iv0 = initial_value;
         goto loop_header;                  (entry_edge)

       loop_header:
         iv_before = phi (iv0, iv_after);

       loop_body:
         iv_after = iv_before + stride;
         if (iv_before < upper_bound)
           goto loop_latch;                 (back edge to loop_header)
         else
           goto succ_bb;                    (exit_e)
  */

Furthermore, at the moment of loop generation we don't have the
induction variable, which is needed for generation of a loop condition
in case the option --no-isl-ast-build-atomic-upper-bound is unset.
The induction variable is returned by create_empty_loop_on_edge. Could
you please advise me another function to generate loops?


>Please explain why we do not just generate a loop that has the loop
>bound at the top, but instead create a structure of the form
>
>if (lb > ub)
>   do {
>
>   } while (lb ..)
>
>(Such a figure, but completed might help).
>
>(I think the original motivation was that later we may be able to prove
>that a loop is executed at least once which allows us to remove the if
>which again enables better loop invariant code motion)

I didn't have special intentions for this. As was mentioned above, I
haven't found another way to generate loops.


I see. The way you generate loops seems to be exactly right (at least 
for now). I only was asking you to add comments explaining why this was 
done this way. It seems you just went the way of least resistance, but 
maybe you can use my previous comments to motivate this a little bit 
more in the source code (I will then go through the comments and add the 
missing pieces of motivation).



+/* We always use sign

Re: [GSoC] Question about std::map

2014-07-03 Thread Tobias Grosser

On 03/07/2014 19:23, Roman Gareev wrote:

Dear gcc contributors,

could you please answer a few questions about std::map? Does gcc have
a policy that forbids using of map in the source code of gcc? Can this
using create a new installation dependency, which requires libstdc++?
I would be very grateful for your comments.


https://gcc.gnu.org/codingconventions.html#Standard_Library

This suggests that using std::map is allowed. Running "grep 'std::' *"
on gcc, we only find a couple of std::make_pair, std::min and std::max 
calls, but I don't see why we should not use std::map. I would say go 
for it if there are no vetos. It seems to be the right tool for what you 
are aiming for and certainly makes the code a lot more readable.


Cheers,
Tobias


Re: [GSoC] Question about std::map

2014-07-04 Thread Tobias Grosser

On 04/07/2014 04:16, Trevor Saunders wrote:

On Thu, Jul 03, 2014 at 07:52:59PM +0200, Tobias Grosser wrote:

On 03/07/2014 19:23, Roman Gareev wrote:

Dear gcc contributors,

could you please answer a few questions about std::map? Does gcc have
a policy that forbids using of map in the source code of gcc? Can this
using create a new installation dependency, which requires libstdc++?
I would be very grateful for your comments.


https://gcc.gnu.org/codingconventions.html#Standard_Library

This suggests that using std::map is allowed. Running "grep 'std::' *"
on gcc, we only find a couple of std::make_pair, std::min and std::max
calls, but I don't see why we should not use std::map. I would say go for it
if there are no vetos. It seems to be the right tool for what you are aiming
for and certainly makes the code a lot more readable.


I'm certainly not opposed to using the stl when it makes sense, and
reinventing our own stl has a fairly high cost.  However I think the
question of if you should use std::map is a complicated one.  First
remember std::map is a tree, not a hash table, and consider which
performance characteristic you want.  Also remember the stl in some
cases trades off possible optimizations to be more generic, if you
implement your own version of something you can say ban arrays larger
than 2^32 and thereby save a bit of space.  When you use the stl you
also need to make sure you don't call things that can throw exceptions
since gcc is built with -fno-exceptions and generally isn't exception
safe.

I think there are some stl things you should basically always prefer
e.g. std::swap, there are some things you should think about before
using, and some things that are best avoided.  Personally I think you
should favor hash tables over std::map most of the time because of their
generally better performance.

btw if you have specific gripes about the stlish bits gcc already has
I'd like to hear them.


Hi Trev,

thanks for the feedback. You seem to have a good idea of which use 
cases suit a std::map. Maybe I can invite you to a patch review 
currently happening on gcc@gcc.gnu.org (Roman, we should do this kind of 
stuff on gcc-patches@ I suppose) under the title "[GSoC] generation of 
GCC expression trees from isl ast expressions". Feel free to ignore most 
of the code. The question I am unsure about is the use of 
ast_isl_index_hasher() and related functions, which take 105 lines of 
code where almost the same functionality could be implemented by a 
one-line use of std::map<>. The maps generated are commonly small and I 
doubt this is on the performance-critical path. I feel very worried 
about adding so much bloat. What would you suggest?


Cheers,
Tobias



Tobias



Re: [GSoC] Question about std::map

2014-07-07 Thread Tobias Grosser

On 05/07/2014 00:03, Trevor Saunders wrote:

On Fri, Jul 04, 2014 at 09:57:11AM +0200, Tobias Grosser wrote:

On 04/07/2014 04:16, Trevor Saunders wrote:

On Thu, Jul 03, 2014 at 07:52:59PM +0200, Tobias Grosser wrote:

On 03/07/2014 19:23, Roman Gareev wrote:

Dear gcc contributors,

could you please answer a few questions about std::map? Does gcc have
a policy that forbids using of map in the source code of gcc? Can this
using create a new installation dependency, which requires libstdc++?
I would be very grateful for your comments.


https://gcc.gnu.org/codingconventions.html#Standard_Library

This suggests that using std::map is allowed. Running "grep 'std::' *"
on gcc, we only find a couple of std::make_pair, std::min and std::max
calls, but I don't see why we should not use std::map. I would say go for it
if there are no vetos. It seems to be the right tool for what you are aiming
for and certainly makes the code a lot more readable.


I'm certainly not opposed to using the stl when it makes sense, and
reinventing our own stl has a fairly high cost.  However I think the
question of if you should use std::map is a complicated one.  First
remember std::map is a tree, not a hash table, and consider which
performance characteristic you want.  Also remember the stl in some
cases trades off possible optimizations to be more generic, if you
implement your own version of something you can say ban arrays larger
than 2^32 and thereby save a bit of space.  When you use the stl you
also need to make sure you don't call things that can throw exceptions
since gcc is built with -fno-exceptions and generally isn't exception
safe.

I think there are some stl things you should basically always prefer
e.g. std::swap, there are some things you should think about before
using, and some things that are best avoided.  Personally I think you
should favor hash tables over std::map most of the time because of their
generally better performance.

btw if you have specific gripes about the stlish bits gcc already has
I'd like to hear them.


Hi Trev,

thanks for the feedback. You seem to have a good idea for which use cases a
std::map is possible. Maybe I can invite you to a patch review currently
happening on gcc@gcc.gnu.org (Roman, we should do this kind of stuff on
gcc-patches@ I suppose) under the title "[GSoC] generation of GCC expression
trees from isl ast expressions". Feel free to ignore most of the code. The
question I am unsure about is the use of ast_isl_index_hasher() and related
function, which takes 105 lines of code and almost the same functionality
could be implemented by a one-line use of std::map<>. The maps generated are


have you tried the hash_map class I introduced a couple days ago? (Maybe
we should teach its generic machinery about strings, but it should
already be better than hash_table for your purpose.)


No. We did not yet. Thanks for pointing it out.

> One thing that

confuses me is that you seem to be mapping from ast node to ast node,
but you do the actual mapping with strings, why is that needed?


Actually it is a mapping from isl_id to tree, but there seems to be some 
additional information mixed in. All this is hidden in the current hash 
map code and makes this rather confusing.


I think the best approach is to use std::map to make the code work with 
the simplest interface possible. After everything is in place we can then 
test whether the move to hash_map gives any performance benefits.



commonly small and I doubt this is in the performance critical path. I feel
very worried about adding so much bloat? What would you suggest.


yeah, that code kind of made my eyes glaze over; on the other hand, you're
talking about going from average constant time to average log time,
which is the sort of thing I'm pretty hesitant to say is a great idea.


The number of elements in these maps is most likely between 3-10.


Its too bad unordered_map isn't an option.


Right.

Thanks again for your help!

Cheers,
Tobias





Re: [GSoC] generation from isl_ast_node_user

2014-07-07 Thread Tobias Grosser

On 07/07/2014 12:33, Roman Gareev wrote:

Hi Tobias,

I think that, according to the std::map feedback, we could use
std::map now and replace it with hash_map later if its performance is
better. However, I propose to temporarily postpone this and work on
gimple code generation from isl_ast_node_user, because we already have
generation of loops with empty bodies and generation from
isl_ast_node_user can be a problem. What do you think about this?

Could you please advise me an algorithm for computing substitutions?
(ClooG uses its own algorithm for this and stores substitutions in
clast_user_stmt. There is an algorithm that is used in Polly, but,
honestly, I don't understand it.)


You may want to take a look at polly commit r212186, where I reworked
and documented how this works.


Could you please advise me on the best way to bind poly basic blocks
to an isl_ast_node_user? I'm using the following code now, but I'm not
sure if it is the right way:

bb_schedule = isl_map_intersect_domain (bb_schedule,
isl_set_copy (pbb->domain));
isl_id *dim_in_id = isl_map_get_tuple_id (bb_schedule, isl_dim_in);
isl_id *new_dim_in_id = isl_id_alloc (isl_id_get_ctx (dim_in_id),
isl_id_get_name (dim_in_id), pbb);
bb_schedule = isl_map_set_tuple_id (bb_schedule, isl_dim_in,
new_dim_in_id);

(I'm allocating an isl_id, which contains a pointer to the poly basic
block, while we're generating the isl_schedule.)


Is this necessary? The id should already be set in
(graphite-sese-to-poly.c):

static isl_id *
isl_id_for_pbb (scop_p s, poly_bb_p pbb)
{
  char name[50];
  snprintf (name, sizeof (name), "S_%d", pbb_index (pbb));
  return isl_id_alloc (s->ctx, name, pbb);
}


gcc_assert (isl_ast_node_get_type (node) == isl_ast_node_user);
isl_ast_expr *user_expr = isl_ast_node_user_get_expr (node);
isl_ast_expr *name_expr = isl_ast_expr_get_op_arg (user_expr, 0);
gcc_assert (isl_ast_expr_get_type (name_expr) == isl_ast_expr_id);
isl_id *name_id = isl_ast_expr_get_id (name_expr);
poly_bb_p pbb = (poly_bb_p) isl_id_get_user (name_id);

(I'm getting this information, while we're handling isl_ast_node_user)


Perfect! (Or at least that's the same approach I have chosen for Polly.)

Do you have any problems with this approach? From my perspective it
looks like a good solution.

Tobias



Re: [GSoC] generation from isl_ast_node_user

2014-07-07 Thread Tobias Grosser

[Forgot to answer two questions]

On 07/07/2014 12:33, Roman Gareev wrote:

Hi Tobias,

I think that, according to the std::map feedback, we could use
std::map now and replace it with hash_map later, if its performance is
better.


Right.

> However, I propose to temporarily postpone this and work on

gimple code generation from isl_ast_node_user, because we already have
generation of loops with empty bodies and generation from
isl_ast_node_user can be a problem. What do you think about this?


As I am sometimes slow in reviewing, maybe you can do this if you find 
free time.


I would prefer to move soon to std::map, as this is the last open piece 
in your loop generation patch and we can finish the review after this is 
done.


Tobias


Re: [GSoC] Question about std::map

2014-07-07 Thread Tobias Grosser

On 07/07/2014 13:14, Jonathan Wakely wrote:

On 7 July 2014 12:08, Tobias Grosser wrote:


The number of elements in these maps is most likely between 3-10.


Then std::map is the wrong solution.

The overhead of dereferencing all the pointers while walking through a
std::map will be higher than the savings you get from logarithmic
lookup.

For ten elements a linear search of a std::vector will probably be
quicker than lookup in a std::map. A binary search of a sorted vector
(which needs no pointer-chasing because it uses random-access
iterators) will definitely be faster.


Very good point. On the other side, we still want to hide this behind a 
map-like interface. So starting with a std::map may be a good thing.
To tune this later we can introduce a specialized vector_map. Such a 
vector_map may not even want to use a std::vector, but a vector class 
that stores its data in stack memory.


Cheers,
Tobias



Re: [GSoC] generation from isl_ast_node_user

2014-07-08 Thread Tobias Grosser

On 08/07/2014 14:50, Roman Gareev wrote:

You may want to take a look at polly commit r212186, where I reworked
and documented how this works.


It is much easier to understand. Thank you! I had been looking at
an older version.


Is this necessary? The id should already be set in
(graphite-sese-to-poly.c):

static isl_id *
isl_id_for_pbb (scop_p s, poly_bb_p pbb)
{
  char name[50];
  snprintf (name, sizeof (name), "S_%d", pbb_index (pbb));
  return isl_id_alloc (s->ctx, name, pbb);

}


Thank you! I didn't know about this function. Should its declaration
be moved to graphite-sese-to-poly.h, or is it better to redefine it in
graphite-isl-ast-to-gimple.c?


No, neither. To my understanding, the isl_ids you get should 
already contain a user pointer to a pbb, which is set at the time 
isl_id_for_pbb is called in graphite-sese-to-poly.c. You should be able to just 
retrieve it. If that does not work, we should figure out why.
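As a sketch of the pattern being described (mock types below, not the real isl or graphite structures): an id carries a name plus an untyped user pointer that round-trips unchanged, so the pbb can be recovered with a plain cast, just as isl_id_get_user() returns the pointer stored by isl_id_alloc().

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Illustrative mocks of the isl_id / poly_bb pattern. In graphite the
// real types are isl_id and poly_bb_p; here pbb and mock_id stand in.
struct pbb { int index; };

struct mock_id {
  std::string name;
  void *user;  // opaque user pointer, as in isl_id_alloc()
};

// Analogous to isl_id_for_pbb: stash the pbb pointer in the id.
mock_id id_for_pbb(pbb *p) {
  char name[50];
  std::snprintf(name, sizeof name, "S_%d", p->index);
  return { name, p };
}

// Analogous to isl_id_get_user: retrieve the pointer and cast it back.
pbb *pbb_from_id(const mock_id &id) {
  return static_cast<pbb *>(id.user);
}
```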


Tobias


Re: [GSoC] generation of Gimple loops with empty bodies

2014-07-15 Thread Tobias Grosser

This is not a patch review, let's move this to gcc@gcc.gnu.org.

On 15/07/2014 17:03, Roman Gareev wrote:

I've found out that int128_integer_type_node and
long_long_integer_type_node are NULL at the moment of definition of
the graphite_expression_size_type. Maybe we should use
long_long_integer_type_node, because, as you said before, using
signed 64 bits has also proved to be very robust. What do you think
about this?


I do not fully understand this message. You first say that 
long_long_integer_type_node is NULL, but then want to use it. This 
does not seem to be a solution. It most likely is the solution, but the 
problem description makes it hard to understand. Is the problem
caused by initialization order issues? Or why are such types NULL?

(I am fine with using 64 bits by default, but I would like to keep the 
possibility to compile with 128 bits to allow the size to be changed
easily during debugging. So using a specific type directly, without
going through a graphite-specific variable, is something I would like to 
avoid.)


Cheers,
Tobias




Re: [GSoC] generation of Gimple loops with empty bodies

2014-07-17 Thread Tobias Grosser

On 16/07/2014 11:40, Richard Biener wrote:

On Tue, Jul 15, 2014 at 5:45 PM, Tobias Grosser  wrote:

This is not a patch review, lets move this to gcc@gcc.gnu.org.


On 15/07/2014 17:03, Roman Gareev wrote:


I've found out that int128_integer_type_node and
long_long_integer_type_node are NULL at the moment of definition of
the graphite_expression_size_type. Maybe we should use
long_long_integer_type_node, because, as you said before, using of
signed 64 has also been proved to be very robust. What do you think
about this?



I do not fully understand this message. You first say that
long_long_integer_type_node is NULL, but then want to use this. This does
not seem to be a solution. Most likely it is the solution, but the problem
description makes it hard to understand it. Is the problem
caused by initialization order issues? Or why are such types NULL?


Because they are not available on all targets or for all languages.

I suggest you use the largest available integer mode via
mode = mode_for_size (MAX_FIXED_MODE_SIZE, MODE_INT, 0);
type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), [01]);


Roman, can you give this a shot?

Cheers,
Tobias



Re: [GSoC] generation of Gimple code from isl_ast_node_block

2014-07-21 Thread Tobias Grosser

On 21/07/2014 14:23, Roman Gareev wrote:

I've attached the patch, which contains generation of Gimple code from
isl_ast_node_block.

However, I've found out a problem. The following example:

int k = 50;
static int __attribute__((noinline))
foo ()
{
   int i, res;
   for (i = 0, res = 0; i < k; i++)
 res += i;

   return res;
}

extern void abort ();

int
main (void)
{
   int res = foo ();


   if (res != 1225)
 abort ();

   return 0;
}

produces the following code, which executes without error:

ISL AST generated by ISL:
{
   for (int c1 = 0; c1 < k.0; c1 += 1)
 S_4(c1);
   S_6();
}

Gimple code:
loop_0 (header = 0, latch = 1, niter = )
{
   bb_2 (preds = {bb_0 }, succs = {bb_3 bb_8 })
   {
 :
 # VUSE <.MEM_3(D)>
 k.0_9 = k;
 if (k.0_9 > 0)
   goto ;
 else
   goto ;

   }
   bb_3 (preds = {bb_2 }, succs = {bb_4 bb_7 })
   {
 :
 # .MEM_8 = VDEF <.MEM_3(D)>
 phi_out_of_ssa.5[0] = 0;
 _20 = k.0_9 > 0;
 if (_20 != 0)
   goto ;
 else
   goto ;

   }
   bb_4 (preds = {bb_3 }, succs = {bb_5 })
   {
 :
 _21 = (signed long) k.0_9;
 _22 = _21 + -1;

   }
   bb_7 (preds = {bb_5 bb_3 }, succs = {bb_8 })
   {
 :
 # .MEM_30 = PHI <.MEM_29(5), .MEM_8(3)>
 # VUSE <.MEM_30>
 res_32 = Close_Phi.6[0];
 # .MEM_33 = VDEF <.MEM_30>
 Cross_BB_scalar_dependence.7[0] = res_32;
 # VUSE <.MEM_33>
 res_17 = Cross_BB_scalar_dependence.7[0];
 _16 = res_17;

   }
   bb_8 (preds = {bb_7 bb_2 }, succs = {bb_1 })
   {
 :
 # res_12 = PHI <_16(7), 0(2)>
 # .MEM_2 = PHI <.MEM_33(7), .MEM_3(D)(2)>
 # VUSE <.MEM_2>
 return res_12;

   }
   loop_2 (header = 5, latch = 6, niter = (unsigned long) ((signed
long) k.0_9 + -1), upper_bound = 9223372036854775806)
   {
 bb_5 (preds = {bb_4 bb_6 }, succs = {bb_6 bb_7 })
 {
   :
   # graphite_IV.8_23 = PHI <0(4), graphite_IV.8_24(6)>
   # .MEM_31 = PHI <.MEM_8(4), .MEM_29(6)>
   # VUSE <.MEM_31>
   res_25 = phi_out_of_ssa.5[0];
   _27 = (int) graphite_IV.8_23;
   res_26 = res_25 + _27;
   # .MEM_28 = VDEF <.MEM_31>
   Close_Phi.6[0] = res_26;
   # .MEM_29 = VDEF <.MEM_28>
   phi_out_of_ssa.5[0] = res_26;
   graphite_IV.8_24 = graphite_IV.8_23 + 1;
   if (graphite_IV.8_23 < _22)
 goto ;
   else
 goto ;

 }
 bb_6 (preds = {bb_5 }, succs = {bb_5 })
 {
   :
   goto ;

 }
   }
}

It is similar to the original code:

loop_0 (header = 0, latch = 1, niter = )
{
   bb_2 (preds = {bb_0 }, succs = {bb_3 bb_8 })
   {
 :
 # VUSE <.MEM_3(D)>
 k.0_9 = k;
 if (k.0_9 > 0)
   goto ;
 else
   goto ;

   }
   bb_3 (preds = {bb_2 }, succs = {bb_5 })
   {
 :
 # .MEM_8 = VDEF <.MEM_3(D)>
 phi_out_of_ssa.5[0] = 0;
 goto ;

   }
   bb_4 (preds = {bb_7 }, succs = {bb_8 })
   {
 :
 # .MEM_19 = PHI <.MEM_18(7)>
 # VUSE <.MEM_19>
 res_17 = Cross_BB_scalar_dependence.7[0];
 _16 = res_17;
 goto ;

   }
   bb_7 (preds = {bb_5 }, succs = {bb_4 })
   {
 :
 # VUSE <.MEM_13>
 res_4 = Close_Phi.6[0];
 # .MEM_18 = VDEF <.MEM_13>
 Cross_BB_scalar_dependence.7[0] = res_4;
 goto ;

   }
   bb_8 (preds = {bb_4 bb_2 }, succs = {bb_1 })
   {
 :
 # res_12 = PHI <_16(4), 0(2)>
 # .MEM_2 = PHI <.MEM_19(4), .MEM_3(D)(2)>
 # VUSE <.MEM_2>
 return res_12;

   }
   loop_1 (header = 5, latch = 6, niter = , upper_bound = 2147483646)
   {
 bb_5 (preds = {bb_3 bb_6 }, succs = {bb_6 bb_7 })
 {
   :
   # i_10 = PHI <0(3), i_6(6)>
   # .MEM_1 = PHI <.MEM_8(3), .MEM_13(6)>
   # VUSE <.MEM_1>
   res_11 = phi_out_of_ssa.5[0];
   res_5 = res_11 + i_10;
   # .MEM_7 = VDEF <.MEM_1>
   Close_Phi.6[0] = res_5;
   # .MEM_13 = VDEF <.MEM_7>
   phi_out_of_ssa.5[0] = res_5;
   i_6 = i_10 + 1;
   if (i_6 < k.0_9)
 goto ;
   else
 goto ;

 }
 bb_6 (preds = {bb_5 }, succs = {bb_5 })
 {
   :
   goto ;

 }
   }
}

If we change the name of variable k to n:

int n = 50;
static int __attribute__((noinline))
foo ()
{
   int i, res;
   for (i = 0, res = 0; i < n; i++)
 res += i;

   return res;
}

extern void abort ();

int
main (void)
{
   int res = foo ();


   if (res != 1225)
 abort ();

   return 0;
}

the following code will be produced, which gives a wrong result:

ISL AST generated by ISL:
{
   S_6();
   for (int c1 = 0; c1 < n.0; c1 += 1)
 S_4(c1);
}


It seems S_6 is now scheduled before S_4, which is surprising, as
dependences from S_4 to S_6 should prevent us from generating a schedule 
that yields such a result. What is the schedule that you give to the isl 
ast generator?


Cheers,
Tobias

P.S.: The patch looks good by itself, but needs some test cases. As you 
have them in this email, we can just add them once we understand this bug.


Re: [GSoC] generation of Gimple code from isl_ast_node_block

2014-07-21 Thread Tobias Grosser

On 21/07/2014 14:55, Roman Gareev wrote:

It seems S_6 is now scheduled before S_4 which is surprising, as
dependences from S_4 to S_6 should prevent us from generating a schedule
that yields such a result. What is the schedule that you give to the isl ast
generator?


The schedule generated by the code, which uses variable k (It executes
without errors):

[k.0] -> { S_6[] -> [1] : exists (e0 = [(-1 + k.0)/4294967296]:
4294967296e0 <= -1 + k.0 and 4294967296e0 >= -2147483647 + k.0);
S_4[i0] -> [0, i0, 0] : exists (e0 = [(-1 + k.0)/4294967296]: i0 >= 0
and 4294967296e0 <= -1 + k.0 and 4294967296e0 >= -4294967296 + k.0 and
4294967296e0 <= -1 + k.0 - i0 and i0 <= 2147483646) }

The schedule generated by the code, which uses variable n:

[n.0] -> { S_6[] -> [1] : exists (e0 = [(-1 + n.0)/4294967296]:
4294967296e0 <= -1 + n.0 and 4294967296e0 >= -2147483647 + n.0);
S_4[i0] -> [0, i0, 0] : exists (e0 = [(-1 + n.0)/4294967296]: i0 >= 0
and 4294967296e0 <= -1 + n.0 and 4294967296e0 >= -4294967296 + n.0 and
4294967296e0 <= -1 + n.0 - i0 and i0 <= 2147483646) }


Perfect. The problem is that S_6 has a one-dimensional schedule [1] and 
S_4 has a three-dimensional schedule [0,i0,0]. For schedules with 
different dimensionality, the isl AST generator cannot define an order 
and just chooses one arbitrarily. The solution to this problem is to 
extend all schedules to the maximal number of schedule dimensions (using 
'0's for the remaining values).


Search for the function extend_scattering() (its implementation is 
unnecessarily verbose and could possibly be simplified by using 
isl_*_equate or isl_*fix*).
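A minimal model of this fix, with plain integer vectors standing in for the isl schedule maps graphite actually uses: once every schedule is padded with trailing zeros up to the maximal dimensionality, lexicographic comparison yields a well-defined order, e.g. S_4's [0, i, 0] sorts before S_6's padded [1, 0, 0] for every i.

```cpp
#include <cassert>
#include <vector>

// A schedule point modeled as a plain vector of integers.
using schedule = std::vector<long>;

// Pad a schedule with trailing 0s up to max_dims dimensions, as
// extend_scattering() does for the real isl scattering maps.
schedule pad_schedule(schedule s, std::size_t max_dims) {
  s.resize(max_dims, 0);  // append 0s for the missing dimensions
  return s;
}

// "Executes before" on equal-length schedules: std::vector's
// operator< is exactly the lexicographic comparison we need.
bool executes_before(const schedule &a, const schedule &b) {
  return a < b;
}
```

With the unpadded one-dimensional [1] against the three-dimensional [0, i, 0], no such comparison is defined, which is why the AST generator was free to pick either order.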


Cheers,
Tobias


Re: [GSoC] checking for the loop parallelism

2014-07-31 Thread Tobias Grosser

On 31/07/2014 16:41, Roman Gareev wrote:

Tobias, could you please advise me how best to determine the depth of
a loop (it is necessary for loop_is_parallel_p)? In
graphite-clast-to-gimple.c it is done by using the number of
dimensions of the domain attached to the clast_for. As far as I
know, isl doesn't offer such an opportunity. If I'm not mistaken,
“2 * (level + 1)” is also not suitable for all cases.


Hi Roman,

you can get this information from the isl_ast_build that was used when 
generating a certain loop (you can access this isl_ast_build from the 
callbacks isl_ast_build_set_before_each_for and 
isl_ast_build_set_after_each_for). With isl_ast_build_get_schedule, you 
can get an incomplete schedule (fewer dimensions than the schedule that 
you gave to the isl ast generator). Specifically, it only contains the 
dimensions of the current loop and all surrounding ones. Consequently, 
the last dimension in this incomplete schedule is the dimension you want 
to check for parallelism.


If you use this schedule in
loop_level_carries_dependences instead of the result from 
scop_get_transformed_schedule you should even be able to detect

parallelism for something like the following:

for (i)
  A[i] += i;

for (i)
  A[0] += i;

Here the first loop is parallel, but the second is not. The old approach 
would have said this dimension is not parallel at all. If you use the 
schedule provided, you only look at the statements involved in a certain 
loop, so the first loop would be detected as parallel.
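A simplified model of this criterion, with dependences represented as plain distance vectors rather than the isl relations graphite actually computes: a loop dimension is parallel if every dependence is either already carried by an outer dimension (a non-zero earlier entry) or has distance zero at this dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Classic distance-vector parallelism check (a simplification of the
// isl-based test): dimension `dim` is parallel iff no dependence is
// carried by it. Each inner vector is one dependence's distance vector.
bool dim_is_parallel(const std::vector<std::vector<int>> &distances,
                     std::size_t dim) {
  for (const auto &d : distances) {
    bool carried_outer = false;
    for (std::size_t i = 0; i < dim; ++i)
      if (d[i] != 0) { carried_outer = true; break; }
    if (!carried_outer && d[dim] != 0)
      return false;  // this dimension carries the dependence
  }
  return true;
}
```

In the example above, the first loop (A[i] += i) contributes no loop-carried dependence, while the second (A[0] += i) contributes one with distance 1, so only the first loop's dimension passes the check.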


Tobias



Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-05 Thread Tobias Grosser

On 04.11.2014 16:17, Tobias Burnus wrote:

Hi all,

currently, contrib/download_prerequisites downloads isl-0.12.2 from
ftp://gcc.gnu.org/pub/gcc/infrastructure/$ISL.tar.bz2

However, that version has a bug which causes an ICE (PR 62289).
At the end of October, ISL 0.14 was released, which should contain a fix for
that issue. Hence, one should consider using 0.14 for /infrastructure/
and download_prerequisites.
Download: http://isl.gforge.inria.fr


Thanks Tobias for bringing this one up.


Disclaimer: I haven't tested that version, yet.

  * * *

The page https://gcc.gnu.org/install/prerequisites.html lists 0.12.2 as
version to be used (without "(or later)").

The configure script does some version checks, but seemingly only to rule
out early versions of ISL, as it only checks for an include file.

According to https://groups.google.com/forum/#!topic/isl-development/CQGVfj60SQM
"0.14 was kept backward compatible with 0.13." [But 0.15 will break it.]

Thus, assuming 0.13 works, 0.14 also should work. Release notes:
http://repo.or.cz/w/isl.git/blob/HEAD:/ChangeLog


Right. 0.13 should work from svn revision 213816.

As Richard pointed out, isl introduced a backward-incompatible change in 
0.13 that unfortunately made it impossible to compile GCC 4.8 with isl 0.13.


Judging from Richard's comments, this seems to continue to cause issues.

@Sven, we need to be very careful with such changes in the future. If it 
is low cost, we may even consider making isl_int available again for

a couple of releases (from 0.14.1 on).


Comments? Especially from Richard and Tobias?

* * *

Finally, is CLooG still needed or will GCC also work with only ISL? Looking
at https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01742.html , it seems as if
it is about the right time to remove it.


CLooG is not necessarily needed. You can run graphite just with ISL. The 
main reason that ISL code generation is not enabled by default is that 
we did not yet get extensive testing and it was unclear who will have 
the time to fix possible bugs.


@Mircae, Roman: Would you have time to help with bug-fixing if we do the 
switch now? (I am happy to review patches and give advice, but can not 
do the full move myself)


Cheers,
Tobias



Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-05 Thread Tobias Grosser

On 06.11.2014 07:04, Roman Gareev wrote:

CLooG is not necessarily needed. You can run graphite just with ISL. The
main reason that ISL code generation is not enabled by default is that we
did not yet get extensive testing and it was unclear who will have the time
to fix possible bugs.


Could you please advise me which test suites should be used to make a
performance comparison between the CLooG and ISL generators? (I would like
to do this even though the old generator is being removed.)


I do not have specific advice. You can use various open-source 
benchmarks, e.g. the LLVM test suite, or, if you have access, you could 
run SPEC or something.



@Mircae, Roman: Would you have time to help with bug-fixing if we do the
switch now? (I am happy to review patches and give advice, but can not do
the full move myself)


I could find time for this. What do you mean by ‘switch’? If I’m not
mistaken, the ISL generator is already used by default. Should we remove
support for the CLooG generator and all files related to it?


Wow, I must really have been sleeping (or just forgetting). The switch 
already happened. This is amazing.


As the ISL code generator has been the default for a while and we did not 
get many bug reports, the actual switch seems to have worked well. We 
could probably still use some more testing, but in this case it is most 
likely time to drop the CLooG support entirely. Are you interested in 
providing the relevant patches?


Also, as Tobias suggested, we should raise the minimum supported isl 
version to 0.14 to be sure PR 62289 is fixed.


Cheers,
Tobias



Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Tobias Grosser

On 06.11.2014 10:05, Richard Biener wrote:

On Thu, Nov 6, 2014 at 8:02 AM, Tobias Grosser  wrote:

On 06.11.2014 07:04, Roman Gareev wrote:


CLooG is not necessarily needed. You can run graphite just with ISL. The
main reason that ISL code generation is not enabled by default is that we
did not yet get extensive testing and it was unclear who will have the
time
to fix possible bugs.



Could you please advise me which test suites should be used to make
performance comparison between CLooG and ISL generator? (I would like
to do this, even though the old generator is removed).



I do not have specific advices. You can use various open source benchmarks
e.g. the LLVM test suite or, if you have access, you could run SPEC or
something.


@Mircae, Roman: Would you have time to help with bug-fixing if we do the
switch now? (I am happy to review patches and give advice, but can not do
the full move myself)



I could find time for this. What do you mean by ‘switch’? If I’m not
mistaken, ISL generator is already used by default. Should we remove
support of CLooG generator and all files related to it?



Wow, I must really have been sleeping (or just forgetting). The switch
already happened. This is amazing.

As the ISL code generator has been default since a while and we did not get
many bug reports, the actual switch seems to have worked well. We could
probably still need some testing, but in this case it is most likely time to
drop the CLooG support entirely. Are you interested to provide the relevant
patches?

Also, as Tobias suggested we should raise the minimal supported isl level to
0.14 to be sure PR 62289 is fixed.


As I am testing with the system isl/cloog, that would be unfortunate, as it
means I'd either drop testing graphite for 4.8 and 4.9 or for 5.0 ...  AFAIK
ISL 12.x and 13+ cannot co-exist in the system due to include file conflicts.


Sven,

is there any chance we can add the deprecated isl_int includes back into 
isl 0.14.1? This would unblock the testing, and we could remove them as 
soon as gcc 4.8/4.9 has been phased out.


This would also fix my Polly code-coverage tests, which have not worked since 
isl_int was moved into different header files, as Ubuntu does not 
want to update isl as long as such an update breaks GCC.


Cheers,
Tobias



Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Tobias Grosser

On 06.11.2014 11:15, Richard Biener wrote:

On 11/6/14, Tobias Grosser  wrote:

On 06.11.2014 10:05, Richard Biener wrote:

On Thu, Nov 6, 2014 at 8:02 AM, Tobias Grosser  wrote:

On 06.11.2014 07:04, Roman Gareev wrote:


CLooG is not necessarily needed. You can run graphite just with ISL.
The
main reason that ISL code generation is not enabled by default is that
we
did not yet get extensive testing and it was unclear who will have the
time
to fix possible bugs.



Could you please advise me which test suites should be used to make
performance comparison between CLooG and ISL generator? (I would like
to do this, even though the old generator is removed).



I do not have specific advices. You can use various open source
benchmarks
e.g. the LLVM test suite or, if you have access, you could run SPEC or
something.


@Mircae, Roman: Would you have time to help with bug-fixing if we do
the
switch now? (I am happy to review patches and give advice, but can not
do
the full move myself)



I could find time for this. What do you mean by ‘switch’? If I’m not
mistaken, ISL generator is already used by default. Should we remove
support of CLooG generator and all files related to it?



Wow, I must really have been sleeping (or just forgetting). The switch
already happened. This is amazing.

As the ISL code generator has been default since a while and we did not
get
many bug reports, the actual switch seems to have worked well. We could
probably still need some testing, but in this case it is most likely time
to
drop the CLooG support entirely. Are you interested to provide the
relevant
patches?

Also, as Tobias suggested we should raise the minimal supported isl level
to
0.14 to be sure PR 62289 is fixed.


As I am testing with system isl/cloog that would be unfortunate as it
means
I'd either drop testing graphite for 4.8 and 4.9 or for 5.0 ...  AFAIK
ISL
12.x and 13+ cannot co-exist in the system due to include file conflicts.


Sven,

is there any chance we can add the deprecated isl_int includes back into
isl 0.14.1. This would unblock the testing and we could remove them as
soon as gcc 4.8/4.9 has been phased out.

This would also fix my polly code coverage tests which do not work since
isl_int has been moved into different header files, as ubuntu does not
want to update isl as long as such an update breaks gcc.


Ah, so it's still there?


Yes, we just need to include some additional header files.

Besides isl/aff.h, we would now also need isl/deprecated/aff.h.


Maybe we can simply add some configury to detect
its location and fix the build with a patch for 4.8 and 4.9? Is there maybe
even a preprocessor macro one can check, that newer ISL provides, that can
tell us where to look for isl_int?


We could detect the need to include these files based on isl's version 
macros.


So you suggest that adding conditional includes would allow us to move 
forward, and this would be part of one of the next gcc point releases.
This would be perfect.

Tobias







Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-06 Thread Tobias Grosser

On 06.11.2014 20:08, Roman Gareev wrote:

As the ISL code generator has been default since a while and we did not get
many bug reports, the actual switch seems to have worked well. We could
probably still need some testing, but in this case it is most likely time to
drop the CLooG support entirely. Are you interested to provide the relevant
patches?

Also, as Tobias suggested we should raise the minimal supported isl level to
0.14 to be sure PR 62289 is fixed.


Yes, if you don’t mind, I’ll try to provide them by the end of the
week. I’m planning to make two patches: the first one removes files
related to the CLooG code generator, and the second one removes support
for the CLooG library from the configure and make files. What do you
think about this?


Sounds good, as long as they compile and pass tests independently. 
Please also remember to update the documentation.


Cheers,
Tobias


Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-08 Thread Tobias Grosser

On 08.11.2014 20:49, Roman Gareev wrote:

Sounds good as long as they will compile and pass tests independently.
Please also remember updating the documentation.


I’m trying to build Graphite without CLooG, but I get the following error:

libbackend.a(graphite-optimize-isl.o): In function `getScheduleForBandList':
/home/roman/sec_trunk/gcc/gcc/graphite-optimize-isl.c:357: undefined
reference to `isl_band_member_is_zero_distance'
collect2: error: ld returned 1 exit status
make[3]: *** [lto1] Error 1
make[3]: *** Waiting for unfinished jobs
libbackend.a(graphite-optimize-isl.o): In function `getScheduleForBandList':
/home/roman/sec_trunk/gcc/gcc/graphite-optimize-isl.c:357: undefined
reference to `isl_band_member_is_zero_distance'
collect2: error: ld returned 1 exit status
make[3]: *** [cc1] Error 1
libbackend.a(graphite-optimize-isl.o): In function `getScheduleForBandList':
/home/roman/sec_trunk/gcc/gcc/graphite-optimize-isl.c:357: undefined
reference to `isl_band_member_is_zero_distance'
collect2: error: ld returned 1 exit status
make[3]: *** [cc1plus] Error 1
rm gcov-tool.pod gfdl.pod fsf-funding.pod gcc.pod cpp.pod gcov.pod
make[3]: Leaving directory `/home/roman/compiled/build_graphite_sec/gcc'
make[2]: *** [all-stage1-gcc] Error 2
make[2]: Leaving directory `/home/roman/compiled/build_graphite_sec'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory `/home/roman/compiled/build_graphite_sec'
make: *** [all] Error 2

Could you please advise me how to fix this? If I’m not mistaken,
r216735 is the only commit related to graphite-optimize-isl.c which
has been made since my latest patch.


This is another incompatibility between isl 0.12 and 0.13. I suggest
first testing this with isl 0.12.

@Sven: Maybe Sven has an idea how to handle this best.

Cheers,
Tobias





Re: [llvm-dev] DragonEgg for GCC v8.x and LLVM v6.x is just able to work

2017-08-21 Thread Tobias Grosser
On Mon, Aug 21, 2017, at 05:31, Leslie Zhai via llvm-dev wrote:
> Hi LLVM and GCC developers,
> 
> My sincere thanks will goto:
> 
> * Duncan, the core developer of llvm-gcc and dragonegg 
> http://llvm.org/devmtg/2009-10/Sands_LLVMGCCPlugin.pdf
> 
> * David, the innovator and developer of GCC 
> https://dmalcolm.fedorapeople.org/gcc/global-state/requirements.html
> 
> and others who give me kind response for teaching me patiently and 
> carefully about how to migrate GCC v4.8.x to GCC v8.x (git-20170818)
> 
> DragonEgg has been migrated to GCC v8.x and LLVM v6.x (svn-r311142), but 
> also able to work for GCC v4.8.x and LLVM v3.3 
> https://reviews.llvm.org/D35667 and it is just able to work now, for 
> example:

Very interesting. We are still using dragonegg to process our weather
modeling codes here at ETH / CSCS. Having it updated to work with the latest
LLVM will provide us with a current baseline, which will also be helpful
to evaluate flang further. Thanks a lot for your effort.

Best,
Tobias

PS: Would be great to have a flang update on the mailing list as well. I
will ask for this in a separate email.

> 
> 
> $CC -fplugin=./dragonegg.so \
>  -fplugin-arg-dragonegg-debug-pass-arguments \
>  -ftime-report \
>  -fverbose-asm \
>  -fplugin-arg-dragonegg-enable-gcc-optzns \
>  -fplugin-arg-dragonegg-emit-ir \
>  -S \
>  test/hello.c \
>  -wrapper gdb,--args
> 
> 
> hello.s (LLVM IR, the extension name is not important)
> 
> ; ModuleID = 'test/hello.c'
> source_filename = "test/hello.c"
> target triple = "x86_64-redhat-linux"
> 
> module asm "\09.ident\09\22GCC: (GNU) 6.4.1 20170727 (Red Hat 6.4.1-1) 
> LLVM: 3.9.1\22"
> 
> @__func__.2210 = internal local_unnamed_addr constant [4 x i8] c"foo\00"
> @.cst = private local_unnamed_addr constant [24 x i8] c"DEBUG: %s, line 
> %d: %s\0A\00", align 8
> @.cst.1 = private local_unnamed_addr constant [13 x i8] 
> c"test/hello.c\00", align 8
> @.cst.2 = private local_unnamed_addr constant [12 x i8] c"Leslie 
> Zhai\00", align 8
> @.cst.3 = private local_unnamed_addr constant [20 x i8] c"%s: Hello 
> World %d\0A\00", align 8
> 
> ; Function Attrs: nounwind uwtable
> define void @foo(...) unnamed_addr #0 {
> entry:
>%"ssa point" = bitcast i32 0 to i32
>br label %""
> 
> "": ; preds = %entry
>%0 = call i32 (i8*, ...) @printf(i8* noalias getelementptr inbounds 
> ([24 x i8], [24 x i8]* @.cst, i64 0, i64 0), [13 x i8]* @.cst.1, i32 4, 
> [4 x i8]* @__func__.2210) #1
>br label %return
> 
> return:   ; preds = %""
>ret void
> }
> 
> declare i32 @printf(i8*, ...)
> 
> ; Function Attrs: nounwind uwtable
> define i32 @main(i32 %argc, i8** %argv) unnamed_addr #0 {
> entry:
>%argc_addr = alloca i32, align 4
>%argv_addr = alloca i8**, align 8
>%n = alloca i32
>%s = alloca i8*
>%"" = alloca i32
>%"alloca point" = bitcast i32 0 to i32
>store i32 %argc, i32* %argc_addr, align 1
>store i8** %argv, i8*** %argv_addr, align 1
>%"ssa point" = bitcast i32 0 to i32
>br label %""
> 
> "": ; preds = %entry
>%0 = call i32 (i8*, ...) @printf(i8* noalias getelementptr inbounds 
> ([20 x i8], [20 x i8]* @.cst.3, i64 0, i64 0), i8* getelementptr 
> inbounds ([12 x i8], [12 x i8]* @.cst.2, i64 0, i64 0), i32 1) #1
>br label %""
> 
> "":   ; preds = %""
>store i32 0, i32* %"", align 1
>br label %return
> 
> return:   ; preds = %""
>%1 = load i32, i32* %"", align 4
>ret i32 %1
> }
> 
> attributes #0 = { nounwind uwtable
> "no-frame-pointer-elim-non-leaf"="true" }
> attributes #1 = { nounwind }
> 
> !llvm.module.flags = !{!0}
> 
> !0 = !{i32 1, !"PIE Level", i32 2}
> 
> 
> $ llc hello.s
> 
>  .text
>  .file"hello.s"
>  # Start of file scope inline 
> assembly
>  .ident"GCC: (GNU) 6.4.1 20170727 (Red Hat 6.4.1-1) LLVM: 3.9.1"
> 
>  # End of file scope inline
>  assembly
>  .globlfoo
>  .p2align4, 0x90
>  .typefoo,@function
> foo:# @foo
>  .cfi_startproc
> # BB#0: # %entry
>  pushq%rbp
> .Ltmp0:
>  .cfi_def_cfa_offset 16
> .Ltmp1:
>  .cfi_offset %rbp, -16
>  movq%rsp, %rbp
> .Ltmp2:
>  .cfi_def_cfa_register %rbp
>  movl$.L.cst, %edi
>  movl$.L.cst.1, %esi
>  movl$4, %edx
>  movl$__func__.2210, %ecx
>  xorl%eax, %eax
>  callqprintf
>  popq%rbp
>  retq
> .Lfunc_end0:
>  .sizefoo, .Lfunc_end0-foo
>  .cfi_endproc
> 
>  .globlmain
>  .p2align4, 0x90
>  .typemain,@function
> main:   

Re: Assigning the result of a function call to a variable in the Gfortran frontend

2017-09-04 Thread Tobias Grosser
Hi Leslie,

I copied you in this thread as you currently worked quite a bit with
dragonegg and we are currently trying to generate metadata that makes it
easier to reason about multi-dimensional fortran arrays.

Best,
Tobias

On Mon, Sep 4, 2017, at 23:06, (IIIT) Siddharth Bhat wrote:
> Hello,
> 
> I've been hacking on the Gfortran frontend to change array index
> expressions to function calls, so that I can inspect them later on in the
> pipeline. I go from Fortran -> LLVM IR (through dragonegg) where I will
> look at the function call nodes.
> 
> However, I'm not able to generate correct IR for this. I can create the
> function call, but I am unable to assign the return value of a function
> call to a variable.
> 
> Here's a link to my experiments here: It includes a patch, a test file
> and
> the GIMPLE output
> 
> .
> 
> Help would be very much appreciated!
> 
> Thanks,
> Siddharth.
> -- 
> Sending this from my phone, please excuse any typos!


Re: Assigning the result of a function call to a variable in the Gfortran frontend

2017-09-05 Thread Tobias Grosser
Thank you leslie!

Best,
Tobias

On Wed, Sep 6, 2017, at 05:25, Leslie Zhai wrote:
> 
> 
> > On 2017-09-05 15:25, Leslie Zhai wrote:
> >
> >
> >> On 2017-09-05 14:14, Tobias Grosser wrote:
> >> Hi Leslie,
> >>
> >> I copied you in this thread as you currently worked quite a bit with
> >> dragonegg and we are currently trying to generate metadata that makes it
> >> easier to reason about multi-dimensional fortran arrays.
> >>
> >> Best,
> >> Tobias
> >>
> >> On Mon, Sep 4, 2017, at 23:06, (IIIT) Siddharth Bhat wrote:
> >>> Hello,
> >>>
> >>> I've been hacking on the Gfortran frontend to change array index
> >>> expressions to function calls, so that I can inspect them later on 
> >>> in the
> >>> pipeline. I go from Fortran -> LLVM IR (through dragonegg) where I will
> >>> look at the function call nodes.
> >>>
> >>> However, I'm not able to generate correct IR for this. I can create
> >>> function call, but I am unable to assign the return value of a function
> >>> call to a variable here.
> >>>
> >>> Here's a link to my experiments here: It includes a patch, a test file
> >>> and
> >>> the GIMPLE output
> >>> <https://gist.github.com/bollu/999184bdb3d0f1569ee0fd0a351689e3#file-m-gimple-L24>
> >>>  
> >>>
> >
> > I rebase the patch for GCC v8.x at first:
> 
> 
> diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
> index 9efb531..5cf5ba9 100644
> --- a/gcc/fortran/trans-array.c
> +++ b/gcc/fortran/trans-array.c
> @@ -91,6 +91,8 @@ along with GCC; see the file COPYING3.  If not see
>   #include "dependency.h"
> 
>   static bool gfc_get_array_constructor_size (mpz_t *, 
> gfc_constructor_base);
> +static tree call_polly_index(stmtblock_t *parent_block, tree 
> *original_index,
> + gfc_array_ref *ar);
> 
>   /* The contents of this structure aren't actually used, just the 
> address.  */
>   static gfc_ss gfc_ss_terminator_var;
> @@ -3251,7 +3253,13 @@ gfc_conv_scalarized_array_ref (gfc_se * se, 
> gfc_array_ref * ar)
> if (build_class_array_ref (se, tmp, index))
>   return;
> 
> +  printf("TIMSTAMP: %s - %s\n", __DATE__, __TIME__);
> +  printf("==\n");
> +
> +  printf("# 1. index(new):\n");
> +  call_polly_index(&se->pre, &index, ar);
> se->expr = gfc_build_array_ref (tmp, index, decl);
> +  printf("==\n");
>   }
> 
> 
> @@ -3335,6 +3343,22 @@ build_array_ref (tree desc, tree offset, tree 
> decl, tree vptr)
> return tmp;
>   }
> 
> +// See: gfc_call_malloc
> +static tree call_polly_index(stmtblock_t *parent_block, tree 
> */*original_index*/,
> +gfc_array_ref */*ar*/) {
> +  tree fncall, var, result;
> +  stmtblock_t block;
> +
> +  var = gfc_create_var(gfc_array_index_type, "pollyindex");
> +  gfc_init_block (&block);
> +
> +  fncall = build_call_expr_loc(input_location, 
> gfor_fndecl_polly_array_index, 0);
> +  gfc_add_modify(&block, var, fold_convert(gfc_array_index_type,
> fncall));
> +  result = gfc_finish_block (&block);
> +  gfc_add_expr_to_block(parent_block, result);
> +  return var;
> +}
> +
> 
>   /* Build an array reference.  se->expr already holds the array
>   descriptor.
>  This should be either a variable, indirect variable reference or 
> component
> diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
> index 74d8606..28ddcfd 100644
> --- a/gcc/fortran/trans-decl.c
> +++ b/gcc/fortran/trans-decl.c
> @@ -95,6 +95,7 @@ static int seen_ieee_symbol;
> 
>   /* Function declarations for builtin library functions.  */
> 
> +tree gfor_fndecl_polly_array_index;
>   tree gfor_fndecl_pause_numeric;
>   tree gfor_fndecl_pause_string;
>   tree gfor_fndecl_stop_numeric;
> @@ -3495,6 +3496,14 @@ gfc_build_builtin_function_decls (void)
> /* ERROR STOP doesn't return.  */
> TREE_THIS_VOLATILE (gfor_fndecl_error_stop_string) = 1;
> 
> +
> +  printf("building polly_array_index function decl...\n");
> +  gfor_fndecl_polly_array_index = gfc_build_library_function_decl (
> +get_identifier (PREFIX("polly_array_index")),
> +gfc_array_index_type, 0);
> +  TREE_THIS_VOLATILE (gfor_fndecl_polly_array_index) = 1;
> +  printf("built polly_array_index function decl...\n");
> +
> gfor_fndecl_pause_numeric = gfc_build_library_function_decl (
>   get_identifier

Re: How could I get alias set information from data_reference_p

2009-07-15 Thread Tobias Grosser
On Tue, 2009-07-14 at 23:34 +0200, Richard Guenther wrote:
> On Tue, Jul 14, 2009 at 6:08 PM, Sebastian Pop wrote:
> > On Tue, Jul 14, 2009 at 11:03, Sebastian Pop wrote:
> >>> Why do you need alias-set numbers?
> >>
> >> We want to represent the alias set information as an extra subscript
> >> on memory accesses: for example,
> >>
> >> if we have A[10] and supposing that A is in alias set 6, this would be
> >> represented as "memory_access[6][10]".
> >>
> >> For B[3][4] with base array B in alias set 7 and 8, we would represent
> >> this as "memory_access[7][3][4] or memory_access[8][3][4]".
> >>
> >> The data dependence test that we currently have in the graphite branch
> >> would work with alias sets represented by an extra dimension to the
> >> array dimensions.
> >
> > And by the way, right now we consider that all the data refs alias each
> > other, that means that we set the alias dimension of the memory
> > accesses to 1 for every data reference.  This makes the test very
> > conservative: in the example above, with the current implementation,
> > we would have:
> >
> > A: memory_access[1][10]
> > B: memory_access[1][3][4]
> 
> Well, so you really want to have sort-of the base objects partitioned
> into must(!)-conflict sets?  Thus,

Is there anything like must-conflict? I thought the alias oracle just
returns may conflict relations.
This information should be passed to graphite without losing any
information.

>   (*p)[1][2]
>   (*q)[1][3]
> 
> is only correct if the resulting accesses may not alias?  (that is,
> p == q?)

What do you mean by is correct?

> 
> No, we don't have such information readily available.  You have
> to compute it.  Likely by building a complete conflict map of
> the bases of all memory references and partitioning it.
> 
> Which doesn't sound like a great idea - it's quadratic.  Thus, can't
> you represent this in a more sparse way?

It is quadratic in what? In the number of data references?

If the only way to find out whether two data references may alias is
asking the oracle, I do not think there is an alternative. We have to
get at least this information, so there will be nb_of_drs^2 calls to
the alias oracle. Or is there an alternative?

I think the only way to get to less complexity is to limit the
information passed to graphite. E.g. we can put everything in the same
alias set as we do now. This is very conservative but definitely faster.

I tend to take the high complexity at the moment, as I do not think we
get many SCoPs that are too big to handle and passing all the details
allows Graphite to do more optimizations. However we can switch to the
old approach if the number of data references passes a certain limit, so
gcc's speed will not suffer.

Tobi




Re: How could I get alias set information from data_reference_p

2009-07-15 Thread Tobias Grosser
On Wed, 2009-07-15 at 13:26 +0200, Richard Guenther wrote:
> On Wed, Jul 15, 2009 at 1:00 PM, Tobias
> Grosser wrote:
> > On Tue, 2009-07-14 at 23:34 +0200, Richard Guenther wrote:
> >> On Tue, Jul 14, 2009 at 6:08 PM, Sebastian Pop wrote:
> >> > On Tue, Jul 14, 2009 at 11:03, Sebastian Pop wrote:
> >> >>> Why do you need alias-set numbers?
> >> >>
> >> >> We want to represent the alias set information as an extra subscript
> >> >> on memory accesses: for example,
> >> >>
> >> >> if we have A[10] and supposing that A is in alias set 6, this would be
> >> >> represented as "memory_access[6][10]".
> >> >>
> >> >> For B[3][4] with base array B in alias set 7 and 8, we would represent
> >> >> this as "memory_access[7][3][4] or memory_access[8][3][4]".
> >> >>
> >> >> The data dependence test that we currently have in the graphite branch
> >> >> would work with alias sets represented by an extra dimension to the
> >> >> array dimensions.
> >> >
> >> > And by the way, right now we consider that all the data refs alias each
> >> > other, that means that we set the alias dimension of the memory
> >> > accesses to 1 for every data reference.  This makes the test very
> >> > conservative: in the example above, with the current implementation,
> >> > we would have:
> >> >
> >> > A: memory_access[1][10]
> >> > B: memory_access[1][3][4]
> >>
> >> Well, so you really want to have sort-of the base objects partitioned
> >> into must(!)-conflict sets?  Thus,
> >
> > Is there anything like must-conflict? I thought the alias oracle just
> > returns may conflict relations.
> 
> True.  This is why I ask ;)
> 
> > This information should be passed to graphite without loosing any
> > information.
> >
> >>   (*p)[1][2]
> >>   (*q)[1][3]
> >>
> >> is only correct if the resulting accesses may not alias?  (that is,
> >> p == q?)
> >
> > What do you mean by is correct?
> 
> I try to ask what you are going to do with the computed alias-set
> numbers and what is the property required for the alias-set numbers
> so that their use do not result in invalid transformations from graphite.
> 
> So, what does the 1 in (*p)[1][2] semantically mean?
> 
> >>
> >> No, we don't have such information readily available.  You have
> >> to compute it.  Likely by building a complete conflict map of
> >> the bases of all memory references and partitioning it.
> >>
> >> Which doesn't sound like a great idea - it's quadratic.  Thus, can't
> >> you represent this in a more sparse way?
> >
> > It is quadratic in what? In the number of data references?
> 
> Yes.  You need to build the full conflict map (or a conflict graph,
> as Li proposes in his final algorithm).
>
> > If the only way we can get the information if two data references may
> > alias is asking the oracle, I do not think there is an alternative. We
> > have to get at least this information. So there will be nb_of_drs^2
> > calls to the alias oracle. Or is there any alternative?
> I don't know.  It depends on how you are going to use the information.

The idea is that we want to get rid of the need to call a function to
know whether two data references alias, and instead - as Sebastian
explained - map them to virtual memory locations.

Let's say we have some data references A, B, C, D, E, F, G that alias
like this:

A -- B    F -- G
|  / |
| /  |
C -- D -- E

What I would like is to create groups of the data references that may
touch the same memory location. So every pair of data references that
may alias has to share at least one alias set.

The easiest solution is to put all of them in one big set. This is
always correct, but very conservative.

AS1 = {A,B,C,D,E,F,G}

The next step would be to make groups of all connected components of the
graph.

AS1 = {A,B,C,D,E}
AS2 = {F,G}

The most exact solution I can think of is to take all maximal complete
subgraphs (cliques), i.e. all subgraphs in which every two elements E1,
E2 are connected by an edge.

AS1 = {A,B,C}
AS2 = {B,C,D}
AS3 = {D,E}
AS4 = {F,G}

So a statement

D[i][4j-5] = B [2j];

has these effects:

may-write [2] [i] [4j-5]
may-write [3] [i] [4j-5]
read [1] [2j]
read [2] [2j]

Another statement might be

... = E[5i][5i+j]

read [3] [5i] [5i+j]

The information is used in our data dependency checks to check and
calculate which dependencies exist.

There we have conflict equalities for ever

Re: How could I get alias set information from data_reference_p

2009-07-15 Thread Tobias Grosser
On Wed, 2009-07-15 at 22:48 +0200, Richard Guenther wrote:
> On Wed, Jul 15, 2009 at 10:46 PM, Richard
> Guenther wrote:
> > On Wed, Jul 15, 2009 at 9:15 PM, Tobias
> > Grosser wrote:
> >>> A note on Lis final graph algorithm.  I don't understand why you want
> >>> to allow data-references to be part of multiple alias-sets?  (Of course
> >>> I don't know how you are going to use the alias-sets ...)
> >>
> >> Just to pass more information to Graphite. The easiest example might be
> >> something like
> >>
> >> A -- B -- C
> >>
> >> if we have
> >>
> >> AS1 = {A,B}
> >> AS2 = {B,C}
> >>
> >> we know that A and C do not alias and therefore do not have any
> >
> > No, from the above you _don't_ know that.  How would you arrive
> > at that conclusion?
> 
> What I want to say is that, if  A -- B -- C is supposed to be the alias graph
> resulting from querying the alias oracle for the pairs (A, B), (A, C), (B, C)
> then this is a result that will never occur.  Because if (A, B) is true
> and (B, C) is true then (A, C) will be true as well.

What about, for example, this case:

void foo (int *b)
{
  int *a;
  int *c;

  if (bar ())
    a = b;
  else
    c = b;
}

I thought this may give us the example above, but it seems I am wrong.
If the alias oracle is transitive that would simplify the algorithm a
lot. Can we rely on the transitivity?

Tobi





Re: Build with graphite (cloog, ppl) as installed on Debian testing (20090823) fails with -Wc++-compat error.

2009-08-24 Thread Tobias Grosser
Hi Toon,

Did you build with the latest cloog-ppl package? It is 0.15.7 and
available at ftp://gcc.gnu.org/pub/gcc/infrastructure/.

Gcc should build without any problems using this package. Can you verify
this?

The problem was in the cloog-ppl headers and was fixed in CLooG-ppl
0.15.4 (I think).

We should add a check for the CLooG revision to make configure fail on
outdated cloog 0.15 revisions.

Otherwise there is not much we can do about this in gcc. These are
bugs fixed in CLooG, and the user needs the latest bugfix release of
CLooG itself.

Tobi



Scev analysing number of loop iterations returns (2^32-1) instead of -1

2009-10-07 Thread Tobias Grosser
I try to analyse this code:
--
int foo (int N)
{
  int ftab[257];
  int i, j;

  for (i = 0; i < N  - 7488645; i++)
j = ftab [i];

  return j;
}
--

The number of iterations I get is:

(unsigned int) N_5(D) + 0x0

However I expect it to be

(unsigned int) N_5(D) + (-1)

At the moment we convert these values in Graphite using int_cst_value
even if 0x0 does not fit in host_integerp(). In some cases we
get free wrapping to -1 (that seems to be correct), however in others
this hackish use of int_cst_value leads to incorrect code.

Now I try to convert this value directly to a gmp value, using this
approach:

--
double_int di = tree_to_double_int (t);
mpz_set_double_int (res, di, TYPE_UNSIGNED (TREE_TYPE (t)));
--


However the "di" value is again (2^32 -1) instead of -1, so the gmp
value I get is not the -1 it actually should be.
The conversion from tree to gmp value seems to be correct as

double_int_negative_p (di) == false, as well as 
TYPE_UNSIGNED (TREE_TYPE (t)) == true

Now the big question: should it be represented as a signed int in the
tree so that my conversion works, or how can I detect that I have to
simulate the wrapping?

I suspect the conversion might happen here:

(gdb) print t.int_cst.common.type.base
$11 = {code = INTEGER_TYPE, side_effects_flag = 0, constant_flag = 0,
addressable_flag = 0, volatile_flag = 0, readonly_flag = 0,
unsigned_flag = 1, ...}
(gdb) print t.type.common.base
$12 = {code = INTEGER_CST, side_effects_flag = 0, constant_flag = 1,
addressable_flag = 0, volatile_flag = 0, readonly_flag = 0,
unsigned_flag = 0, ...}
(gdb) print t.common.type.base
$13 = {code = INTEGER_TYPE, side_effects_flag = 0, constant_flag = 0,
addressable_flag = 0, volatile_flag = 0, readonly_flag = 0,
unsigned_flag = 1, ...}

What is the meaning of the different types? Some have the unsigned_flag
set, others do not.

Thanks a lot
Tobias




Re: Scev analysing number of loop iterations returns (2^32-1) instead of -1

2009-10-07 Thread Tobias Grosser
On Wed, 2009-10-07 at 16:42 +0200, Richard Guenther wrote:
> On Wed, Oct 7, 2009 at 4:37 PM, Tobias Grosser
>  wrote:
> > I try to analyse this code:
> > --
> > int foo (int N)
> > {
> >  int ftab[257];
> >  int i, j;
> >
> >  for (i = 0; i < N  - 7488645; i++)
> >j = ftab [i];
> >
> >  return j;
> > }
> > --
> >
> > The number of iterations I get is:
> >
> > (unsigned int) N_5(D) + 0x0
> >
> > However I expect it to be
> >
> > (unsigned int) N_5(D) + (-1)
> 
> No, that would be (unsigned int) (N_5(D) + -1) instead.
> 
> It's fold that canonicalizes this to the above form - you
> simply have to deal with it (unsigned arithmetic, that is).

OK. So I need to understand this better.

E.g:

int foo (int N)
{ 
  int ftab[257];
  int i, j;
  
  for (i = 0; i < N  - 50; i++)
j = ftab [i];
  
  return j;
} 

Number of latch executions: (unsigned int) N_5(D) + 0x0ffffffcd

What happens if N == 5? I would expect the number of latch iterations to
be 0, as i < 5 - 50 is always false. However using the upper expression I
get something like
5 + 0x0ffffffcd == 0x0ffffffd2
which is a lot bigger than 0.

Thanks for your help

Tobi





Re: Scev analysing number of loop iterations returns (2^32-1) instead of -1

2009-10-07 Thread Tobias Grosser
On Wed, 2009-10-07 at 17:23 +0200, Richard Guenther wrote:
> On Wed, Oct 7, 2009 at 5:16 PM, Tobias Grosser
>  wrote:
> > On Wed, 2009-10-07 at 16:42 +0200, Richard Guenther wrote:
> >> On Wed, Oct 7, 2009 at 4:37 PM, Tobias Grosser
> >>  wrote:
> >> > I try to analyse this code:
> >> > --
> >> > int foo (int N)
> >> > {
> >> >  int ftab[257];
> >> >  int i, j;
> >> >
> >> >  for (i = 0; i < N  - 7488645; i++)
> >> >j = ftab [i];
> >> >
> >> >  return j;
> >> > }
> >> > --
> >> >
> >> > The number of iterations I get is:
> >> >
> >> > (unsigned int) N_5(D) + 0x0
> >> >
> >> > However I expect it to be
> >> >
> >> > (unsigned int) N_5(D) + (-1)
> >>
> >> No, that would be (unsigned int) (N_5(D) + -1) instead.
> >>
> >> It's fold that canonicalizes this to the above form - you
> >> simply have to deal with it (unsigned arithmetic, that is).
> >
> > OK. So I need to understand this better.
> >
> > E.g:
> >
> > int foo (int N)
> > {
> >  int ftab[257];
> >  int i, j;
> >
> >  for (i = 0; i < N  - 50; i++)
> >j = ftab [i];
> >
> >  return j;
> > }
> >
> > Number of latch executions: (unsigned int) N_5(D) + 0x0ffffffcd
> >
> > What happens if N == 5? I would expect the number of latch iterations to
> > be 0 as i < 5 - 50 is always false. However using the upper expression I
> > get something like
> > 5 + 0x0ffffffcd == 0x0ffffffd2
> > which is a lot bigger than 0.
> 
> It's undefined if N == 5 because the loop counter would overflow.


Why?

The loop

for (i=0; i < 5 - 50; i++)

is equivalent to

for (i=0; i < -45; i++)

Which just evaluates to false and will not be executed. How can the loop
counter overflow?

Tobi



Re: Scev analysing number of loop iterations returns (2^32-1) instead of -1

2009-10-07 Thread Tobias Grosser
On Wed, 2009-10-07 at 18:30 +0200, Tobias Grosser wrote:
> On Wed, 2009-10-07 at 17:44 +0200, Richard Guenther wrote:
> > On Wed, Oct 7, 2009 at 5:35 PM, Tobias Grosser
> >  wrote:
> > > On Wed, 2009-10-07 at 17:23 +0200, Richard Guenther wrote:
> > >> On Wed, Oct 7, 2009 at 5:16 PM, Tobias Grosser
> > >>  wrote:
> > >> > On Wed, 2009-10-07 at 16:42 +0200, Richard Guenther wrote:
> > >> >> On Wed, Oct 7, 2009 at 4:37 PM, Tobias Grosser
> > >> >>  wrote:
> > >> >> > I try to analyse this code:
> > >> >> > --
> > >> >> > int foo (int N)
> > >> >> > {
> > >> >> >  int ftab[257];
> > >> >> >  int i, j;
> > >> >> >
> > >> >> >  for (i = 0; i < N  - 7488645; i++)
> > >> >> >j = ftab [i];
> > >> >> >
> > >> >> >  return j;
> > >> >> > }
> > >> >> > --
> > >> >> >
> > >> >> > The number of iterations I get is:
> > >> >> >
> > >> >> > (unsigned int) N_5(D) + 0x0
> > >> >> >
> > >> >> > However I expect it to be
> > >> >> >
> > >> >> > (unsigned int) N_5(D) + (-1)
> > >> >>
> > >> >> No, that would be (unsigned int) (N_5(D) + -1) instead.
> > >> >>
> > >> >> It's fold that canonicalizes this to the above form - you
> > >> >> simply have to deal with it (unsigned arithmetic, that is).
> > >> >
> > >> > OK. So I need to understand this better.
> > >> >
> > >> > E.g:
> > >> >
> > >> > int foo (int N)
> > >> > {
> > >> >  int ftab[257];
> > >> >  int i, j;
> > >> >
> > >> >  for (i = 0; i < N  - 50; i++)
> > >> >j = ftab [i];
> > >> >
> > >> >  return j;
> > >> > }
> > >> >
> > >> > Number of latch executions: (unsigned int) N_5(D) + 0x0ffffffcd
> > >> >
> > >> > What happens if N == 5? I would expect the number of latch iterations to
> > >> > be 0 as i < 5 - 50 is always false. However using the upper expression I
> > >> > get something like
> > >> > 5 + 0x0ffffffcd == 0x0ffffffd2
> > >> > which is a lot bigger than 0.
> > >>
> > >> It's undefined if N == 5 because the loop counter would overflow.
> > >
> > >
> > > Why?
> > >
> > > The loop
> > >
> > > for (i=0; i < 5 - 50; i++)
> > >
> > > is equivalent to
> > >
> > > for (i=0; i < -45; i++)
> > >
> > > Which just evaluates to false and will not be executed. How can the loop
> > > counter overflow?
> > 
> > Ah, indeed.  Sorry for being confused.  Is tree-niter-desc->assumptions
> > or ->may_be_zero non-NULL?
> 
> Yes both. I attached the gdb content for both.

(gdb) call debug_generic_expr (ndesc.assumptions)
1
(gdb) call debug_generic_expr (ndesc.may_be_zero)
0
(gdb) call debug_tree (ndesc.assumptions)
  constant
1>
(gdb) call debug_tree (ndesc.may_be_zero)
  constant
0>

So it seems the "niter" expression should contain the correct solution
for every N. The cases where "niter" is not valid are not fulfilled,
following the description in tree-flow.h.

Tobias



Re: Scev analysing number of loop iterations returns (2^32-1) instead of -1

2009-10-07 Thread Tobias Grosser
On Thu, 2009-10-08 at 08:49 +0200, Zdenek Dvorak wrote:
> Hi,
> 
> > > Ah, indeed.  Sorry for being confused.  Is tree-niter-desc->assumptions
> > > or ->may_be_zero non-NULL?
> > 
> > Yes both. I attached the gdb content for both.
> 
> you need to check may_be_zero, which in your case should contain
> something like N <= 49.  If this condition is true, the number of
> iterations of the loop is zero.  Please also check the comments
> in tree-flow.h, where the possible outcomes of # of iteration analysis
> are explained.

Yes, I already did this:

(gdb) call debug_generic_expr (ndesc.assumptions)
1
(gdb) call debug_generic_expr (ndesc.may_be_zero)
0
(gdb) call debug_tree (ndesc.assumptions)
  constant
1>
(gdb) call debug_tree (ndesc.may_be_zero)
  constant
0>

So it seems the "niter" expression should contain the correct solution
for every N. The cases where "niter" is not valid are not fulfilled,
following the description in tree-flow.h.

Tobias




Re: loop optimization in gcc

2009-10-12 Thread Tobias Grosser
On Sun, 2009-10-11 at 20:20 +0530, sandeep soni wrote:
> Hi All,
> 
> I have been studying the gcc code lately as part of my project. I have
> got info from this mailing list about CFG and DFG information. I want
> to know how gcc uses this information to perform loop optimization.
> Does it follow any particular algorithm, or in particular, what are the
> different techniques that it uses to parallelize the code by
> performing loop optimizations? (Correct me please if this sentence
> is not right.)
> 
> If possible please provide any pointers to any form of literature
> available regarding it.
> 
> Thanks in advance.

Hi,

you also might want to take a look at the Graphite project,
http://gcc.gnu.org/wiki/Graphite, where we do loop optimizations and
automatic parallelization based on the polytope model. If you need any
help, feel free to ask.

Tobias



Re: loop optimization in gcc

2009-10-14 Thread Tobias Grosser
On Wed, 2009-10-14 at 20:12 +0530, sandeep soni wrote:
> > Hi,
> >
> > you also might want to take a look at the Graphite project,
> > http://gcc.gnu.org/wiki/Graphite, where we do loop optimizations and
> > automatic parallelization based on the polytope model. If you need any
> > help, feel free to ask.
> >
> > Tobias
> >
> >
> 
> Hi,
> 
> This seems to be quite interesting and challenging. Moreover, it is very
> close to what we are trying to achieve as well (on a small scale, that
> is). I have started preliminary reading on the polytope model and the
> workings of GRAPHITE. I will ask you if I face any doubts. It would be
> nice to contribute to this project.
> 
> 
> For starters, can you tell me if GRAPHITE also does source-to-source
> transformations or optimizes in some other way? Because I had read
> somewhere else that the polyhedral model used source-to-source
> transformations.

Hi,

you are right. There are several polytope frameworks that work on the
source code level (LooPo, CLooG/Clan from Cedric Bastoul); however,
Graphite works on the intermediate representation tree-ssa in gcc.
Therefore we cannot do any source-to-source transformations.
The idea is to not be limited to specific input languages or special
formatting of the code, but to be able to use the powerful analysis in
the gcc middle end.
This allows us to work on any input language and to detect loops that do
not even look like a loop in the input program (goto-loops). Using the
powerful scalar evolution framework in gcc Graphite also handles loops
that do not look like affine linear loops.
This is a powerful approach, but it is still in its early stages. Basic
loops and simple code transformations already work, but there is still a
lot left to be done.

Tobi



Re: loop optimization in gcc

2009-10-14 Thread Tobias Grosser
On Wed, 2009-10-14 at 23:56 +0530, sandeep soni wrote:
> On Wed, Oct 14, 2009 at 9:02 PM, Tobias Grosser
>  wrote:
> > On Wed, 2009-10-14 at 20:12 +0530, sandeep soni wrote:
> >> > Hi,
> >> >
> >> > you also might want to take a look at the Graphite project,
> >> > http://gcc.gnu.org/wiki/Graphite, where we do loop optimizations and
> >> > automatic parallelization based on the polytope model. If you need any
> >> > help, feel free to ask.
> >> >
> >> > Tobias
> >> >
> >> >
> >>
> >> Hi,
> >>
> >> This seems to be quite interesting and challenging. Moreover, it is very
> >> close to what we are trying to achieve as well (on a small scale, that
> >> is). I have started preliminary reading on the polytope model and the
> >> workings of GRAPHITE. I will ask you if I face any doubts. It would be
> >> nice to contribute to this project.
> >>
> >>
> >> For starters, can you tell me if GRAPHITE also does source-to-source
> >> transformations or optimizes in some other way? Because I had read
> >> somewhere else that the polyhedral model used source-to-source
> >> transformations.
> >
> > Hi,
> >
> > you are right. There are several polytope frameworks that work on the
> > source code level (LooPo, CLooG/Clan from Cedric Bastoul); however,
> > Graphite works on the intermediate representation tree-ssa in gcc.
> > Therefore we cannot do any source-to-source transformations.
> > The idea is to not be limited to specific input languages or special
> > formatting of the code, but to be able to use the powerful analysis in
> > the gcc middle end.
> > This allows us to work on any input language and to detect loops that do
> > not even look like a loop in the input program (goto-loops). Using the
> > powerful scalar evolution framework in gcc Graphite also handles loops
> > that do not look like affine linear loops.
> > This is a powerful approach, but it is still in its early stages. Basic
> > loops and simple code transformations already work, but there is still
> > a lot left to be done.
> >
> > Tobi
> >
> >
> 
> Hi,
> 
>  Sounds absolutely convincing to me. I am keen to contribute to
> this in any way possible. I will first try to understand how it works
> fully. Would you mind me pressing on with some of the issues in the near
> future? I am afraid, though, that they might be a bit more theoretical
> to begin with.

Sure. Just drop me a line.



Re: Graphite: Collapsing of loop nests ?

2009-12-04 Thread Tobias Grosser
On Fri, 2009-12-04 at 10:29 -0600, Sebastian Pop wrote:
> On Fri, Dec 4, 2009 at 09:41, Toon Moene  wrote:
> > I wonder if Graphite can do this one (or is planned to be able to):
> >
> > Another loop optimization that suggests itself
> > is the following, trying to eliminate unnecessary
> > loop constructs:
> > \begin{verbatim}
> >  SUBROUTINE SUB(A, B, C, N, M)
> >  REAL A(N, M), B(N, M), C(N, M)
> >  DO J = 1, M
> > DO I = 1, N
> >C(I, J) = A(I, J) + B(I, J)
> > ENDDO
> >  ENDDO
> >  END
> > \end{verbatim}
> > If we could generate code that looks like what is
> > written below, we would have nothing but
> > {\em one} loop.
> > \begin{verbatim}
> >  DO K = 1, M*N
> > C(K, 1) = A(K, 1) + B(K, 1)
> >  ENDDO
> > \end{verbatim}
> > In this case, this transformation can be done
> > because the tuple $(i,j)$ forms an induction
> > variable $i+n*(j-1)$ in its own right
> > (replaced by $k$ in the {\em collapsed} loop).
> 
> For the moment Graphite wouldn't do this kind of transform.  But I think
> that this could be done: CLooG should generate the following code if we
> ask it to collapse the two loops:
> 
> DO K = 1, M*N
>  I = K mod N
>  J = K / N
>  C(I, J) = A(I, J) + B(I, J)
> ENDDO
> 
> And then one would have to cleanup the scalar arithmetic to have
> linearized array accesses, and that should already be done by GCC in
> the lowering of memory accesses.
> 
> Now a question for Cedric: how would you ask CLooG to generate this
> collapsed loop?

I do not believe this is already possible for arbitrary N.

You would have to write a scattering function like this

s0 = i+n*(j-1)

which is

s0 = i + n*j - n
         ^^^  Here is a product that is not affine linear.

There are extensions to the polytope model to make this possible, but
they are not close to production ready. Just recently Armin
Groesslinger gave a nice talk about this topic. If you want, I can look
for his work describing the solutions.

Tobi



Re: Graphite: Collapsing of loop nests ?

2009-12-04 Thread Tobias Grosser
On Fri, 2009-12-04 at 10:50 -0600, Sebastian Pop wrote:
> On Fri, Dec 4, 2009 at 10:39, Tobias Grosser  
> wrote:
> > I do not believe this is already possible for arbitrary N.
> >
> > You would have to write a scattering function like this
> >
> > s0 = i+n*(j-1)
> >
> > which is
> >
> > s0 = i + n*j - n
> >          ^^^  Here is a product that is not affine linear.
> 
> Yes this is affine linear, but not representable in the current way we
> deal with parameters in the polyhedral model...  Note that constant
> times IV would be allowed, and yes parameters are named constants.

However at the moment we represent parameters as variables that tend to
be constant. So as you said that will not work. Probably there is some
way to regenerate the loops from scratch based on analysing the memory
dependencies, as in this case such a complicated scattering is probably
not required. But this case is definitely not that easy to solve.

> > There are extensions to the polytope model to make this possible, but
> > they are not close to production ready.
> 
> I think that we should revisit this independently of general solutions.
> 
> > Just recently I Armin Groesslinger gave a nice talk about this
> > topic. If you want I can look for his work describing solutions.
> 
> I also have discussed with Armin when I was visiting INRIA earlier
> this year, and I know what he's working on.
> 
> Sebastian



Re: [GRAPHITE] Re: Loop Transformations Question

2010-02-09 Thread Tobias Grosser
On 09.02.2010 19:39, Sebastian Pop wrote:
> On Tue, Feb 9, 2010 at 12:34, Cristianno Martins
>  wrote:
>> Hi,
>>
>> Thanks for the fast reply. Only one more thing: is there some way that
>> I could force it to be signed??
> 
> I guess that you should wait the fixes from Tobias and Ramakrishna to
> CLooG and Graphite to have the type of the IV exposed by CLooG, rather
> than having the original IV type set to the generated IV.
> 
> Sebastian

Yes, it looks as if this is one of the bugs Ramakrishna and I are
working on.

In short, the reason for the bug seems to be that the loop induction
variable is converted to an unsigned int. Later cloog generates a
statement in which a negative value is compared against the unsigned
int. Instead of the expression iv < -something being false, the signed
value is wrapped to an unsigned one, and therefore the whole expression
becomes true.

 I will have a look into this one.

Tobias


Re: GSoC application

2010-04-23 Thread Tobias Grosser

Hi Artem,

On 04/23/10 22:22, Artem Shinkarov wrote:

Hi

I've submitted an application to gcc as part of Google Summer of Code
2010, but I have not received any comments yet. The idea of this
application was discussed here on the mailing list and I received
quite some support.

If people are still deciding whether to accept it or not, that is ok,
but it would be sad if it simply went unreviewed.


at the moment the applications are under review. The results will be
announced by Google on April 26 at 19:00 UTC.


Good luck

Tobi



If anyone is in the position of a reviewer, or can provide some
information, I would be very grateful.

Application: 
http://socghop.appspot.com/gsoc/student_proposal/private/google/gsoc2010/ashinkarov/t127065494824


--
Regards,
Artem Shinkarov




[graphite] Cleanup of command line parameters

2008-10-10 Thread Tobias Grosser
Hi graphities,

Graphite currently has four flags: "-floop-block", "-floop-interchange",
"-floop-stripmine" and "-fgraphite".

If any of these flags is set, we enable the graphite pass and search
for SCoPs.
For every SCoP we try to apply the transformations specified with
"-floop-block", "-floop-interchange" or "-floop-stripmine". Only if
we could apply a transformation do we replace the SCoP with new code
generated from the GRAPHITE data structures; otherwise we do not touch
the GIMPLE representation.
If we only specify "-fgraphite", we never generate code for any SCoP, as
we never tried any transformation. So just using "-fgraphite" is
useless.

What is missing is a way to make GRAPHITE always generate code, even if
we did not apply any transformation.

This has two benefits:
- We can measure the overhead the GIMPLE -> GRAPHITE -> GIMPLE
transformation costs.
- Wider test coverage, as we are able to run the code generator for
every SCoP, not only the SCoPs we are able to transform.


Now:


-fgraphite: Do nothing.
-floop-block, -floop-interchange, -floop-stripmine: Try these
transformations.

Only "-fgraphite":   Do nothing
Only "-floop-block": Only loop blocked SCoPs are changed.
"-fgraphite -floop-block":   Only loop blocked SCoPs are changed.

One solution: Reuse -fgraphite
--

-fgraphite: Enable always code generation
-floop-block, -floop-interchange, -floop-stripmine: Try these
transformations.

Only "-fgraphite":   The identity transformation for all SCoPs.
Only "-floop-block": Only loop blocked SCoPs are changed.
"-fgraphite -floop-block":   All SCoPs are are changed. Some SCoPs are
 loop blocked, others just the identity 
 transformation.

This allows us to get all the benefits, but (mis)uses the -fgraphite
flag. At the moment it has no other meaning, but I could imagine it
enabling an automatic optimizer that applies all available graphite
transformations or selects them automatically.

The other solution is: Use -fgraphite-identity
--

-fgraphite: Enable always code generation
-floop-block, -floop-interchange, -floop-stripmine, -fgraphite-identity:
Try these transformations.

Only "-fgraphite":   Do nothing. Free for later use.
Only "-floop-block": Only loop blocked SCoPs are changed.
Only "-fgraphite-identity":  The identity transformation for all SCoPs.
"-fgraphite-identity
 -floop-block":  All SCoPs are are changed. Some SCoPs are
 loop blocked, others just the identity 
 transformation.

This makes sense, as we only generate code for applied and enabled
transformations. These are loop blocking, interchange, stripmine and
- maybe - the new and very easy to implement identity transformation,
which does nothing but enable code generation.

What do you think about these different solutions?

See you
Tobi



Re: [graphite] Cleanup of command line parameters

2008-10-10 Thread Tobias Grosser
On Fri, 2008-10-10 at 20:35 +0200, Manuel López-Ibáñez wrote:
> 2008/10/10 Tobias Grosser <[EMAIL PROTECTED]>:
> >
> > Now:
> > 
> >
> > -fgraphite: Do nothing.
> > -floop-block, -floop-interchange, -floop-stripmine: Try these
> > transformations.
> 
> This only applies to the branch, does it? There is no do-nothing
> command-line option in trunk, is there?

No, that's the same for trunk. But it seems my description was not that
clear. In fact "-fgraphite" does a lot.

If you enable "-fgraphite", the complete CFG is scanned and we generate
the GRAPHITE representation for every SCoP. This is great for compile
tests of the transformation from GIMPLE to GRAPHITE. So e.g. the
polyhedron test suite can be transformed to GRAPHITE without any
failure, and most of the SPEC test suite as well.

But we only generate GIMPLE out of GRAPHITE if we applied a
transformation. This was intended to reduce compile time, as we do not
regenerate code for SCoPs that never changed.

If you want to test a graphite transformation, use "-floop-block" (The
one implemented and used for testing).

What I would like to discuss (and later implement) is how we could allow
specifying an identity transformation that enables code generation for
all SCoPs. This transformation would probably increase compile time and
generate slower code, but it can be used to test the GRAPHITE code
generation better.

But as you see in my description, I would also like to change the
meaning of -fgraphite.

Solution one would make it generate the identity transformation.
Solution two would make it an automatic optimizer (as yet
unimplemented), which could at the moment default to all available
graphite transformations.

See you
Tobi



RE: [graphite] Cleanup of command line parameters

2008-10-10 Thread Tobias Grosser
On Fri, 2008-10-10 at 13:49 -0500, Jagasia, Harsha wrote:
> Hi Tobias,
> >
> >graphite consists of four flags "-floop-block", "-floop-interchange",
> >"-floop-stripmine" and "-fgraphite".
> >
> >If any of these flags is set, we enable the graphite pass and we search
> >for SCoPs.
> >For every SCoP we try to apply transformations specified with
> >"-floop-block", "-floop-interchange" or "-floop-stripmine". But only if
> >we could apply a transformation, we replace the SCoP with new code
> >generated from the GRAPHITE data structures. Otherwise we do not touch
> >the GIMPLE representation.
> >If we only specify "-fgraphite", we never generate code for any SCoP, as
> >we never tried any transformation. So just using "-fgraphite" is
> >useless.
> >
> >What is missing is a way to make GRAPHITE always generate code, even if
> >we did not apply any transformation.
> >
> >This has two benefits:
> >- We can check the overhead the GIMPLE -> GRAPHITE -> GIMPLE
> >transformation costs.
> >- Wider test coverage, as we are able to run the code generator for
> >every SCoP, not only the SCoPs, we are able to transform.
> >
> >
> 
> I think this is a good idea in the long run (in the 4.5 timeframe). My only 
> issue with it is that we have several bugs today even without generating code 
> unless a transformation is picked. Of cpu2006 benchmarks, all but 4 fail to 
> build. We have memory leak issues, we have not tried to bootstrap gcc with 
> graphite enabled. I think that expanding the coverage or overhead testing 
> only makes sense after we have fixed the existing bugs.
> I personally like the "-fgraphite-identity" approach the best.
Hi Harsha,

I understand your position and I do not want to extend the graphite
behavior in a way that gets us more bugs. I think at the moment the most
important thing we need for graphite is stable code for the front and
backend that we can test on Polyhedron, SPEC and gcc.
So I would like to see this proposal as a step back to basics: just add
a transformation that does nothing (instead of loop blocking) to test
the backend.
This will simplify debugging of the existing backend bugs, and will stop
us from committing new bugs that are hidden by the limited test
coverage.

My dream would be nightly runs of polyhedron, spec and gcc with
-fgraphite-identity to be able to fix the existing bugs.

See you
Tobi





RE: [graphite] Cleanup of command line parameters

2008-10-10 Thread Tobias Grosser
On Fri, 2008-10-10 at 13:51 -0500, Jagasia, Harsha wrote:
> >
> >Hi Tobias,
> >>
> >>graphite consists of four flags "-floop-block", "-floop-interchange",
> >>"-floop-stripmine" and "-fgraphite".
> 
> In fact I also think that we should not expose "-floop-stripmine" as a
> flag because by itself it is never profitable.

Yes, you may be right. The only reason for -floop-stripmine would be
functional testing of the loop strip mining pass; on its own it will
never give a performance improvement.

-floop-stripmine and -floop-interchange are not implemented at the
moment, but I did not want to remove them, as they may be implemented
later. I thought I could implement them over a weekend, but I could not
test them because of other bugs. So I thought the time was better spent
fixing the remaining graphite bugs.
But maybe we should print some error messages if someone uses these
flags.

See you
Tobias



[graphite] Add -fgraphite-identity / Solution two

2008-10-10 Thread Tobias Grosser
Hi,

I would like to add a new command line parameter for graphite. It will
allow us to test graphite without any transformation like loop blocking,
using only a simple GIMPLE->GRAPHITE->GIMPLE identity
transformation.

By adding this command line option, we do not change the behavior of
any other part of graphite or gcc, but gain the possibility to debug,
test and check the performance of the graphite front/backend without the
noise of additional transformations.
Also, we can check the backend for every possible SCoP, not only for
SCoPs that we can transform using loop blocking.
This should make debugging of the current bug reports easier and allow
us to set up nightly tests covering the complete front and backend.

This patch implements solution two of the recent discussion at [EMAIL PROTECTED]

See you
Tobi


2008-10-10  Tobias Grosser  <[EMAIL PROTECTED]>

	* doc/invoke.texi: Add -fgraphite-identity.
	* graphite.c (graphite_apply_transformations): Check for
	-fgraphite-identity.
	* toplev.c (process_options): Add flag_graphite_identity.
	* tree-ssa-loop.c: Add flag_graphite_identity.

diff --git a/gcc/common.opt b/gcc/common.opt
index 6f09dfd..4e68067 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -567,6 +567,10 @@ floop-block
 Common Report Var(flag_loop_block) Optimization
 Enable Loop Blocking transformation
 
+fgraphite-identity
+Common Report Var(flag_graphite_identity) Optimization
+Enable Graphite Identity transformation
+
 fguess-branch-probability
 Common Report Var(flag_guess_branch_prob) Optimization
 Enable guessing of branch probabilities
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 755c422..4fcf975 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -340,7 +340,7 @@ Objective-C and Objective-C++ Dialects}.
 -fira-coalesce -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots [EMAIL PROTECTED] @gol
 -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
--floop-block -floop-interchange -floop-strip-mine @gol
+-floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
 -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
 -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
 -fmudflapir -fmudflapth -fno-branch-count-reg -fno-default-inline @gol
@@ -6116,6 +6116,13 @@ because the innermost loop will iterate over a smaller amount of data
 that can be kept in the caches.  This optimization applies to all the
 languages supported by GCC and is not limited to Fortran.
 
+@item -fgraphite-identity
+@opindex fgraphite-identity
+Enable the identity transformation for graphite. For every SCoP we generate
+the polyhedral representation and transform it back to gimple. Using
+-fgraphite-identity we can check the costs or benefits of the
+GIMPLE -> GRAPHITE -> GIMPLE transformation.
+
 @item -fcheck-data-deps
 @opindex fcheck-data-deps
 Compare the results of several data dependence analyzers.  This option
diff --git a/gcc/graphite.c b/gcc/graphite.c
index a615e2c..ab386d7 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -4912,17 +4912,13 @@ graphite_apply_transformations (scop_p scop)
   if (flag_loop_block)
 transform_done = graphite_trans_scop_block (scop);
 
-#if 0 && ENABLE_CHECKING
-  /* When the compiler is configured with ENABLE_CHECKING, always
- generate code, even if we did not apply any transformation.  This
- provides better code coverage of the backend code generator.
-
- This also allows to check the performance for an identity
- transform: GIMPLE -> GRAPHITE -> GIMPLE; and the output of CLooG
- is never an identity: if CLooG optimizations are not disabled,
- the CLooG output is always optimized in control flow.  */
-  transform_done = true;
-#endif
+  /* Generate code, even if we did not apply any real transformation.
+ This also allows to check the performance for the identity
+ transformation: GIMPLE -> GRAPHITE -> GIMPLE
+ Keep in mind, that CLooG optimizes in control, so the loop structure
+ may change, even if we only use -fgraphite-identity.  */ 
+  if (flag_graphite_identity)
+transform_done = true;
 
   return transform_done;
 }
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 24e4df7..42ad2a4 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1707,7 +1707,8 @@ process_options (void)
   if (flag_graphite
   || flag_loop_block
   || flag_loop_interchange
-  || flag_loop_strip_mine)
+  || flag_loop_strip_mine
+  || flag_graphite_identity)
 sorry ("Graphite loop optimizations cannot be used");
 #endif
 
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 51fc07c..2ea58f6 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -305,7 +305,8 @@ gate_graphite_transforms (void)
 {
   /* Enable -fgraphite pass if any one of the graphite optimization flags 
  is turned on.  */
-  if (flag_loop_block || flag_loop_interchange

Re: [graphite] Cleanup of command line parameters

2008-10-11 Thread Tobias Grosser
Hi,

On Fri, 2008-10-10 at 23:31 +0200, Albert Cohen wrote:
> Tobias Grosser wrote
> > Hi graphities,
> > 
> > graphite consists of four flags "-floop-block", "-floop-interchange",
> > "-floop-stripmine" and "-fgraphite".
> > 
> > If any of these flags is set, we enable the graphite pass and we search
> > for SCoPs.
> > For every SCoP we try to apply transformations specified with
> > "-floop-block", "-floop-interchange" or "-floop-stripmine". But only if
> > we could apply a transformation, we replace the SCoP with new code
> > generated from the GRAPHITE data structures. Otherwise we do not touch
> > the GIMPLE representation.
> > If we only specify "-fgraphite", we never generate code for any SCoP, as
> > we never tried any transformation. So just using "-fgraphite" is
> > useless.
> > 
> > What is missing is a way to make GRAPHITE always generate code, even if
> > we did not apply any transformation.
> > 
> > This has two benefits:
> > - We can check the overhead the GIMPLE -> GRAPHITE -> GIMPLE
> > transformation costs.
> > - Wider test coverage, as we are able to run the code generator for
> > every SCoP, not only the SCoPs, we are able to transform.
> > 
> > 
> > Now:
> > 
> > 
> > -fgraphite: Do nothing.
> > -floop-block, -floop-interchange, -floop-stripmine: Try these
> > transformations.
> > 
> > Only "-fgraphite":   Do nothing
> > Only "-floop-block": Only loop blocked SCoPs are changed.
> > "-fgraphite -floop-block":   Only loop blocked SCoPs are changed.
> > 
> > One solution: Reuse -fgraphite
> > --
> > 
> > -fgraphite: Enable always code generation
> > -floop-block, -floop-interchange, -floop-stripmine: Try these
> > transformations.
> > 
> > Only "-fgraphite":   The identity transformation for all SCoPs.
> > Only "-floop-block": Only loop blocked SCoPs are changed.
> > "-fgraphite -floop-block":   All SCoPs are changed. Some SCoPs are
> >  loop blocked, others just the identity 
> >  transformation.
> > 
> > This allows us to get all the benefits but (mis)uses the -fgraphite
> > flag. At the moment it has no other meaning, but I could think of it
> > containing an automatic optimizer, that applies all available graphite
> > transformations or selects them automatically.
> > 
> > The other solution is: Use -fgraphite-identity
> > --
> > 
> > -fgraphite: Enable always code generation
> > -floop-block, -floop-interchange, -floop-stripmine, -fgraphite-identity:
> > Try these transformations.
> > 
> > Only "-fgraphite":   Do nothing. Free for later use.
> > Only "-floop-block": Only loop blocked SCoPs are changed.
> > Only "-fgraphite-identity":  The identity transformation for all SCoPs.
> > "-fgraphite-identity
> >  -floop-block":  All SCoPs are changed. Some SCoPs are
> >  loop blocked, others just the identity 
> >  transformation.
> > 
> > This makes sense, as we only generate code for applied and enabled
> > transformations. These are loop-blocking, interchange, stripmine and
> > - may be - the new and very easy to write identity transformation, that
> > does nothing, but enable code generation.
> > 
> > What do you think about these different solutions?
> 
> Hi Tobias,
> 
> I am in favor of prefixing all transformation passes with -fgraphite-
> leading to -fgraphite-block -fgraphite-stripmine -fgraphite-identity and 
> the new ones to come. These flags should probably disappear at some 
> point, as a more unified optimization heuristic should arise that 
> combines all transformations, with some parameters to influence the 
> profitability heuristic.
> 
> I agree -fgraphite should be reserved for later use, but not mandatory 
> to use any of the new -fgraphite-xxx options.
> 
> This is just a suggestion with a partial understanding of the option 
> naming scheme in GCC only!

I see what you mean. For me all the graphite flags seem to be temporary
flags, as the idea of the polyhedral model is to allow a lot of
different optimizations that cannot be split up into loop blocking,
interchange, etc.
So in this case the current flags seem to me to just enable different
demo/test optimizers in graphite, which will be replaced later by a
better general one. So I like your idea.

I will submit a patch containing this idea.

See you
Tobias



Re: [graphite] Cleanup of command line parameters [PATCH]

2008-10-11 Thread Tobias Grosser
On Sat, 2008-10-11 at 23:19 +0200, Richard Guenther wrote:
> On Sat, Oct 11, 2008 at 11:13 PM, Sebastian Pop <[EMAIL PROTECTED]> wrote:
> > On Sat, Oct 11, 2008 at 6:46 AM, Richard Guenther
> > <[EMAIL PROTECTED]> wrote:
> >> Note that we cannot really remove switches from the user, but we have to at
> >> least keep them as no-op for backward compatibility.  Which is why I would
> >> like you to think twice at least as to what options you want to add for 
> >> 4.4.
> >> As a rule of thumb, do not add (or at least document) options that are not
> >> useful for users of GCC but only developers of GCC.  Maybe you can add a
> >> glob for debugging options with -fdebug-graphite-XXX instead?
> >>
> >
> > We can enable the -fgraphite-* flags only when ENABLE_CHECKING is defined,
> > such that these are disabled in the released compilers.
> 
> I think that you can keep them enabled without ENABLE_CHECKING.  But I suggest
> to remove user-level documentation of them (documentation in common.opt 
> comments
> should be enough).

Hi,

another patch. It contains:

- Removal of documentation outside of common.opt for (-fgraphite,
  -floop-block, -floop-interchange, -floop-strip-mine)
  This means doc/invoke.texi.
  (Proposed by Richi)

- Removal of flag "-floop-strip-mine", as it will never improve
  performance on its own and so there is no use for it.
  (Proposed by Harsha)

- Rename of -floop-block to -fgraphite-block and -floop-interchange to
  -fgraphite-interchange.
  (Proposed by Albert)

- Addition of -fgraphite-identity.
  (Proposed by Tobias (me); solution 2 selected by Harsha and Sebastian)

- Add a warning, as -fgraphite-interchange is not yet implemented.
  (Tobias (me))

So the graphite command line flags are not available any more in the
user documentation, but can be used to test/debug the graphite code.

What flags exist?

-fgraphite:             Graphite without code generation.
                        Useful for frontend testing.
-fgraphite-identity:    The identity transformation.
                        Useful for backend testing without
                        transformation.
-fgraphite-block:       Loop blocking.
                        Useful for checking available performance
                        improvements with loop blocking.
-fgraphite-interchange: Loop interchange (unimplemented).
                        Useful for checking available performance
                        improvements with loop interchange.

Later I would like -fgraphite to be the automatic optimizer.

See you
Tobi

2008-09-11  Tobias Grosser  <[EMAIL PROTECTED]>

	* common.opt: Remove floop-strip-mine. Rename floop-block to
	fgraphite-block, floop-interchange to fgraphite-interchange.
	Add fgraphite-identity.
	* doc/invoke.texi: Remove documentation for fgraphite*
	* graphite.c: (graphite_apply_transformations): Add support for
	fgraphite-identity.
	* toplev.c (process_options): Rename flags. Add flag_graphite_identity.
	Add "fgraphite-interchange is unimplemented" message.
	* tree-ssa-loop.c (gate_graphite_transforms): Update flags.
	* testsuite/gcc.dg/graphite/block-0.c: Update flags.
	* testsuite/gcc.dg/graphite/block-1.c: Update flags.
	* testsuite/gcc.dg/graphite/scop-16.c: Update flags.
	* testsuite/gcc.dg/graphite/scop-17.c: Update flags.
	* testsuite/gcc.dg/graphite/scop-18.c: Update flags.
	* testsuite/gfortran.dg/graphite/block-1.f90: Update flags.
	* testsuite/gfortran.dg/graphite/block-2.f: Update flags.
	* testsuite/gfortran.dg/graphite/block-3.f90: Update flags.
	* testsuite/gfortran.dg/graphite/block-4.f90: Update flags.
diff --git a/gcc/common.opt b/gcc/common.opt
index 6f09dfd..e5f26ae 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -555,17 +555,17 @@ fgraphite
 Common Report Var(flag_graphite)
 Enable in and out of Graphite representation
 
-floop-strip-mine
-Common Report Var(flag_loop_strip_mine) Optimization
-Enable Loop Strip Mining transformation
+fgraphite-interchange
+Common Report Var(flag_graphite_interchange) Optimization
+Enable Graphite Loop Interchange transformation
 
-floop-interchange
-Common Report Var(flag_loop_interchange) Optimization
-Enable Loop Interchange transformation
+fgraphite-block
+Common Report Var(flag_graphite_block) Optimization
+Enable Graphite Loop Blocking transformation
 
-floop-block
-Common Report Var(flag_loop_block) Optimization
-Enable Loop Blocking transformation
+fgraphite-identity
+Common Report Var(flag_graphite_identity) Optimization
+Enable Graphite Identity transformation
 
 fguess-branch-probability
 Common Report Var(flag_guess_branch_prob) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 755c422..7420b8f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -340,7 +340,6 @@ Objective-C and Objective-C++ Dialects}.
 -fira-coalesce -fno-ira-share-s

Re: [graphite] Cleanup of command line parameters [PATCH]

2008-10-12 Thread Tobias Grosser
Hi,

On Sun, 2008-10-12 at 13:19 +0200, Tobias Burnus wrote:
> Hi,
> 
> Tobias Grosser wrote:
> > another patch. It contains:
> > 
> > - Removal of documentation outside of common.opts for (-fgraphite,
> >   -floop-block, -floop-interchange, -floop-strip-mine)
> >   This means doc/invoke.texi.
> >   (Proposed by Richi)
> 
> While I agree that -fgraphite does not make sense as user option, I'm
> not sure about -floop-block and -floop-interchange. I could imagine that
> some users would like to play with this option (though I have no real
> opinion about this).

The problem I see is that these options are for me like intermediate
demonstration options, as graphite should get an automatic optimizer. So
I am not sure what to do. I see these options:

1. Keep them until we have this optimizer:

Here we have the problem of what to do after that. We could simply
ignore them, or we could make them hints that allow the optimizer to try
specific transformations. But I think it may not even be possible to
disable specific optimizations in all cases while using the polytope
model.
So we might have options that are only used in one or two releases.

2. Keep them hidden like the -fgraphite options, but allow interested
users to try graphite. So we get interested testers, but do not have to
keep these - maybe later misleading - options forever.

> 
> > - Removal of flag "-floop-strip-mine", as it never will improve 
> >   performance and so there will be no use for it.
> >   (Proposed by Harsha)
> 
> I might have misunderstood Harsha, but I think he said that it only does
> not improve the performance if used by itself, i.e. combined with other
> options it is/might be profitable. Thus one may leave it in as
> undocumented option. (I have no opinion about this and I did no tests, I
> only wanted to stress the "by itself".)

No, I am quite sure we do not need this flag. Yes, you are right that
strip mining combined with loop interchange can help to improve
performance. But that combination is exactly what -floop-block does, and
it is already available.

> 
> In any case, those changes should also be documented at
> http://gcc.gnu.org/gcc-4.4/changes.html, which currently lists all options.
> 
> Tobias B

Thanks for your comments, and please correct me if I said something
wrong or could have said something better. I am quite new to gcc
development, so every hint is appreciated.

see you

Tobias



Re: Automatic Parallelization & Graphite - future plans

2009-03-10 Thread Tobias Grosser
Hi Razya,

great to hear about these Graphite plans. Some short comments.

On Tue, 2009-03-10 at 16:13 +0200, Razya Ladelsky wrote:
> [...]
> 
> The first step, as we see it, will teach Graphite that parallel code needs 
> to be produced.
> This means that Graphite will recognize simple parallel loops (using SCoP 
> detection and data dependency analysis), 
> and pass on that information.
>  The information that needs to be conveyed expresses that a loop is 
> parallelizable, and may also include annotations of  more 
> detailed information e.g, the shared/private variables. 
> 
> There are two possible models for the code generation:
> 1. Graphite will annotate parallel loops and pass that information all the 
> way through CLOOG 
> to the current autopar code generator to produce the parallel, GOMP based 
> code.

It might be possible to recognize parallel loops in graphite, but you
should keep in mind that loops do not yet exist in the graphite
polyhedral representation. So you would have to foresee which loops
CLooG will produce. This might be possible, depending on how strict the
scheduling we give to CLooG is. Another problem is that CLooG might
split some loops automatically (if possible) to reduce the control flow.

> 2. Graphite will annotate the parallel loops and CLOOG itself will be 
> responsible of generating 
> the parallel code.

The same as above. It will be hard to mark loops, as loops do not yet exist.

> A point to notice here is that scalars/reductions are
>  currently not 
> handled in Graphite.

We are working heavily on this. Expect it to be ready by the end of
March at the latest - hopefully by the end of this week.

> In the first model, where Graphite calls autopar's code generation, 
> scalars can be handled.

3. Wait for CLooG to generate the new loops. As we still have the
polyhedral information (poly_bb_p) available during code generation, we
can try to update the dependency information using the restrictions
CLooG added, and use the polyhedral dependency analysis to check if
there are any dependencies in the CLooG-generated loops. So we can add a
pass between CLooG and clast-to-gimple that marks parallel loops.

Advantages: - Can be 100% exact; no forecasts, as we are working on the
              actually generated loops.
            - Nice splitting of what is done where:
              1. Graphite is in charge of optimizations (generating
                 parallelism).
              2. CodeGen just detects parallel loops and generates code
                 for them.

> After Graphite finishes its analysis, it calls autopar's reduction 
> analysis, and only then the code
> generation is called (if the scalar analysis determines that the loop 
> still parallelizable, of course)
> 
> Once the first step is accomplished, the following steps will focus on 
> teaching Graphite 
> to find loop transformations (such as skewing, interchange etc.) that 
> expose coarse grain synchronization free parallelism.
> This will be heavily based on the polyhedral data dependence and 
> transformation infrastructures.
> We have not determined which algorithm/ techniques we're going to use for 
> this part.
> 
>  Having synchronization free parallelization integrated in Graphite, will 
> set the ground for 
> handling parallelism requiring a small amount of parallelization. 

Yes, great. This will allow us to experiment with advanced auto
parallelization. I am really looking forward to seeing the first patches!

> This is a rough view for our planned work on autopar in GCC.
> Please feel free to ask/comment.
> 
> Thanks,
> Razya



Re: Google Summer of Code 2009

2009-03-19 Thread Tobias Grosser
Hi Ian,

> Student applications will be accepted from March 23 to April 3.  Each
> student will work with a mentor from the project.  As we've done in past
> years, we need to have a set of mentors who are prepared to work with
> students.  I would like to encourage any interested experienced GCC
> developer to become a mentor for the project.  Being a mentor does not
> require that you actually work with a student: we normally have many
> more available mentors than we do students.  All mentor volunteers (and
> only mentor volunteers) will be able to see and vote on all student
> project proposals.  As in the past, I will accept as a mentor anybody
> who is listed as a maintainer of some part of gcc in the MAINTAINERS
> file.  If you are not a maintainer but still want to be a mentor, I'm
> open to being convinced.

I would very much like to mentor a student during Summer of Code even if
I am not yet a gcc MAINTAINER.
As I have been an active part of GRAPHITE development for over two
years - last year during the Summer of Code 2008, and this year even
full time for three months from AMD Austin - I am sure I am experienced
enough to guide a student working on graphite development.
I have already posted two project proposals I am interested in to the
GCC Summer of Code ideas wiki [1]. Both are about automatic
parallelization in GCC, which is a research area I already worked on
during my time in the LooPo project [2], even before I became active in
gcc.
As I will keep working on graphite and LooPo this summer, I will stay up
to date on graphite and automatic parallelization and in touch with
people who will support me.
I think I can help new students based on my experience as a student last
year, and for myself it will be a great experience to support a new
student getting started in gcc development.

[1] http://gcc.gnu.org/wiki/Graphite/SoC  (More detailed description)
[2] http://www.infosun.fim.uni-passau.de/cl/loopo/

See you

Tobi





Re: Google Summer of Code 2009

2009-03-25 Thread Tobias Grosser
Hi Pranav,

On Tue, 2009-03-24 at 23:32 +0530, Pranav wrote: 
> Hi Sebastian and Tobias,
[...] 
> To delve deeper into compiler's research I will be joining the graduate 
> school at UIUC from next Fall. I feel that working on this project would 
> give me great pleasure and it would be the best way I could have ever 
> spent my long vacations. I think we can implement most of the classical 
> optimizations - loop permutation, loop fusion / fission, strip mining, 
> etc in the summers. As I do not know what all optimizations are expected 
> to be covered in the project could you please help me with the problem 
> before I write my project proposal.

It is great to hear from you and thank you for this interesting
introduction of yourself.
As it seems that you are interested in some optimizations based on the
polyhedral model, I added two new projects to the Graphite Google Summer
of Code ideas page (http://gcc.gnu.org/wiki/Graphite/SoC). I think both
might be interesting for you.
If you are interested in one of these feel free to ask any question, so
we can find out together what you might want to cover in your Google
Summer of Code application

See you

Tobias




Re: {gSoc}application : Automatic parallelization in Graphite

2009-03-26 Thread Tobias Grosser
On Thu, 2009-03-26 at 10:36 +0800, Li Feng wrote:
> Hi all,
> 
> Below is the proposal of this gSoc project.  I'd really like you review and
> comment on this and then I can plan this project better.

Hi Li,

this looks nice. Thanks for your work on this.

Tobias

> 
> Thanks,
> Li Feng
> 
> #title Automatic parallelization in Graphite
> 
> * Synopsis
> 
> With the integration of Graphite to GCC4.4, a strong loop nest
> analysis and transformation engine was introduced. But now it
> does only non-parallel loop generation. My work is to detect
> synchronization free parallel loops and generate parallel code
> for them, which will mainly enables programs to run faster.
> 
[..]

> * Road-map
> 
>  1. Since data dependency analysis is not available, I will first
> integrate autopar's code generation into Graphite. This work will
> be done step by step. (Mid June)
> - Introduce a flag -fgraphite-parallelize that forces auto-parallelization
>   for all loops.
> - Make sure the loops Graphite creates can be handled by autopar's
>   code generation.
> - Call autopar for every loop. (The loops that cannot be parallelized
>   will just fail/crash.)

I think on this point we should discuss with Razya and the others where
your work ends.  Just adapting autopar to handle all graphite loops is a
project on its own. So I do not think it can be done by you in two
weeks.

>  2. Write test cases for the loops that can be parallelized. This will take a
> few discussions with Graphite developers to see which kinds
> of loops we should detect and can auto-parallelize. (End June)
>  3. Write code for parallel loop detection. This part of the job will be
> based on SCoP detection and data dependency, and by that time, data
> dependency analysis should have been done. This part of the work will
> take most of the time. (First week of August)

How much time this point takes depends on how exactly you want to solve
it. I think a correct but conservative implementation might be written
in a week. If you want to detect all parallel loops it will take you
more time.

>  4. Code cleaning and write documents.(Second week of August)

I think it is useful to define the limits of your work a little more
strictly. For me there are two options:

1. You stay more on the autopar/gimple side (Step 1) and adapt autopar
for graphite. This is very necessary for graphite, but it might need
some time to get up to speed in autopar.

2. You stay more on the graphite/polyhedral side. You work on all these
steps, but we limit step 1 to the graphite internal parts. This means
you connect autopar to graphite and try to generate parallel loop nests.
If autopar can not handle a loop nest you analyse it and write a bug
report. Someone else will extend autopar.

As Razya already has some knowledge about autopar and is also
interested in working on parallelization, maybe she is able to support
you with the autopar stuff? Then you might be able to focus more on the
polyhedral part.

> * Profit for GCC
> 
>  - When the auto-parallelization has been done in Graphite, developers
>    can mostly put their effort into loop transformations. Graphite will
>    be in charge of optimizations (generating parallelism) and the
>    autopar code in Graphite will just detect and generate code for it.



Tobias



[graphite] Weekly phone call - Automatic parallelization

2009-04-01 Thread Tobias Grosser
Hi,

to keep everybody updated on what is happening in the GRAPHITE branch, I
would like to post the notes of our weekly phone call.

Attendees: Razya, Li, Konrad, Jan, Tobi, David, Sebastian, Christophe 

Discussed topics: 

  * Data dependencies: Tobias committed a patch filling the access
polyhedrons 
  * Not yet enabled as there are some bootstrap bugs 
  * All arrays access the same alias set 
  * Reductions are not yet included in the access
polyhedrons (Same as last week, but now really
committed) 
  * Bug in building of scattering matrices fixed. 
  * Bug in condition detection on its way to be fixed. 
  * Automatic Parallelization: 
  * Li will start integrating graphite and code generation
(If his summer of code proposal is accepted) 
  * Mark all loops parallel if new flag
"-fgraphite-force-parallel" is set. 
  * How to connect autopar? 
  * Directly in graphite code generation (clast to
gimple): We call the autopar backend after
generating a serial gimple loop. This is the
easiest approach. Also we still have access to
the polyhedral information, so we can use it to
give autopar more information. 
  * By annotating loops: In the graphite pass we
annotate gimple loops as parallel. Later the
autopar pass is scheduled, which bypasses its
analysis if it sees a loop annotated as parallel.
This allows scheduling scalar optimizations in
between graphite and autopar, but we lose the
polyhedral information. As we do not need the
additional information, but we definitely need
some scalar optimizations like PRE, the
"annotating loops" approach seems to be our
first try. 
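The trade-off between the two approaches can be sketched with a toy model. This is illustrative C only: the struct and function names are invented stand-ins, not the real GCC internals (in GCC, the annotation later landed as the loop->can_be_parallel flag).

```c
#include <assert.h>
#include <stdbool.h>

/* Invented stand-in for GCC's loop structure -- illustration only.  */
struct toy_loop
{
  int num;
  bool can_be_parallel;	/* annotation written by the graphite pass */
};

/* Graphite side: once the polyhedral dependence test proves the loop
   carries no dependences, just record the fact and move on.  */
static void
graphite_mark_parallel (struct toy_loop *loop, bool dependence_free)
{
  loop->can_be_parallel = dependence_free;
}

/* Autopar side, scheduled later (possibly after scalar cleanups like
   PRE): bypass its own analysis when the annotation is set, otherwise
   fall back to whatever its own checks concluded.  */
static bool
autopar_accepts (const struct toy_loop *loop, bool own_analysis_ok)
{
  return loop->can_be_parallel || own_analysis_ok;
}
```

The price of the annotation scheme is visible in the sketch: by the time autopar runs, only a single parallel/not-parallel bit survives; the polyhedral information itself is gone.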

The notes of all calls are available on the wiki:
http://gcc.gnu.org/wiki/Graphite_Phone_Call

More information about graphite at:
http://gcc.gnu.org/wiki/Graphite

Tobi






Re: [gSoc] [graphite] general plan for Automatic parallelization in Graphite

2009-04-23 Thread Tobias Grosser
Hi Alex,

On Wed, 2009-04-22 at 08:19 -0700, Alex Turjan wrote:
> Are there any plans to move the partial unrolling phase from
> RTL to Tree-SSA?
> The move would benefit from better (or easier to implement) Tree-SSA alias 
> analysis. 

We thought about loop unrolling in graphite.
It seems to be an interesting project and definitely possible in
general.
But at the moment I do not know of anybody who is working on this. So it
is an open project. If you want to work on this we will definitely
support you. ;-)

Tobi

> 
> Alex
> > 
> > 
> > 
> > 
> > --- On Wed, 4/22/09, Li Feng 
> > wrote:
> > 
> > > From: Li Feng 
> > > Subject: [gSoc] [graphite] general plan for Automatic
> > parallelization in   Graphite
> > > To: "GCC" 
> > > Cc: "Tobias Grosser"
> > , "Sebastian"
> > , "Razya Ladelsky"
> > , konrad.trifuno...@gmail.com
> > > Date: Wednesday, April 22, 2009, 5:10 PM
> > > Hi,
> > > 
> > > It's nice that the proposal 'Automatic parallelization in
> > > Graphite' is accepted. Which means I will be working with great
> > > Graphite developers this summer, trying to implement the project.
> > > 
> > > I have set up a blog for this project, which will mainly be about
> > > this project: 1. plans 2. what I have done 3. related Graphite
> > > internals. You can subscribe to it if you like:
> > > http://summergraphite.blogspot.com/
> > > 
> > > Here is a general plan for this project, to keep you in the loop;
> > > feel free to comment :)
> > > 
> > >  1. Mark the innermost loop parallel  [done]
> > > 
> > >  2. Try to schedule the autopar pass after Graphite, and enable
> > >     code generation if flag_graphite_force_parallel is set
> > >     - There should be some discussion with Razya about her plan
> > >       for the autopar part
> > >     - But before that, I'll try to schedule autopar first
> > > 
> > >  3. I may try to write testcases for the loops that should be
> > >     parallel, from simple to hard, and check autopar's code
> > >     generation part, to make sure it works correctly as we
> > >     expect.
> > >     - The testcases are important. There should be some detailed
> > >       discussion, maybe with Sebastian and Konrad, to see what
> > >       kind of loops we can/decide to handle.
> > >     - Check autopar's code generation with
> > >       flag_graphite_force_parallel set on these testcases, and
> > >       report bugs if it goes wrong.
> > > 
> > >  4. Try to write code for deciding if a loop can be parallel with
> > >     a data dependency test under the polyhedral model.
> > >     - Try to understand the interface of the data dependency test
> > >     - Write code: if the data dependency test succeeds, mark the
> > >       loop parallel
> > > 
> > > Cheers,
> > > Li
> 
> 
>   



Re: Polyhedral Model

2009-04-27 Thread Tobias Grosser
On Mon, 2009-04-27 at 19:58 +0100, Dave Korn wrote:
> Cristianno Martins wrote:
> 
> > Well, I didn't find anything about a implementation of this kind of
> > optimization inside of gcc. Also, I need to know if someone is working
> > on something like this using the gcc compiler.
> 
> http://gcc.gnu.org/wiki/Graphite
> 

Yes, this is the right place, and you are welcome to take part in the
development of Graphite. To be honest, there is a lot of work left to be
done, even though several people are already working on Graphite. ;-)
If you want, you can join our weekly phone calls on Wednesday, or just
send some questions to the mailing lists so we can help you get
started with graphite.
The file that might be interesting for you is graphite-poly.h in the
graphite branch. This is the interface/data structure of our polyhedral
representation. It would be great if you could look into it and give us
some feedback.

Tobias



Re: Polyhedral Model

2009-04-27 Thread Tobias Grosser
On Mon, 2009-04-27 at 19:26 -0300, Cristianno Martins wrote:
> Hi,
> 
> Thank you for helping me with that information. From now on, I'll be
> checking the Graphite framework, and I intend to contribute to it by
> providing support for automatic parallelization. However, my project
> focuses on multicore architectures; I guess this is not a problem, is
> it?

This is great. At the moment this is a very hot topic. If you need more
information on this part there are several people also interested in
automatic parallelization.

Razya Ladelsky from IBM:

She wrote a larger part of tree-parloops.c. This is the first basic
implementation of automatic parallelization in gcc using OpenMP.

Sebastian Pop:

He wrote the first draft of tree-parloops.c

Li Feng:

Currently working in his Google Summer of Code project to extend
tree-parloops.c to take advantage of the analysis Graphite can offer us.

Me:

I am from the LooPo group in Passau. We have been working on the
polyhedral model for a long time and would like to try some of our ideas
in gcc. This will start in summer 2009, as there is still some work left
to make graphite stable enough.

There are several other people generally interested in automatic
parallelization whom I do not know yet or who have not yet contributed
code but work on the research side.

Also there are more people working on Graphite in general.

Feel free to join our team. It would be great to read a little bit more
about your ideas.
If you need any help to get started with Graphite - just drop us a line.

Tobias

> Thanks in advance,
> 
> On Mon, Apr 27, 2009 at 6:46 PM, Tobias Grosser  wrote:
> > [...]
> 
> 
> 



[graphite] Weekly phone call notes

2009-04-29 Thread Tobias Grosser
Hi gcc developers, hi graphities

here are some notes from our weekly phone call. Unfortunately I forgot
to send out the notes from the last two phone calls, but I hope to get
them out more regularly. Believe me! ;-)

http://gcc.gnu.org/wiki/Graphite_Phone_Call/2009_04_29

Attendees: Li, Jan, Konrad, Sebastian, Tobias, David, Christophe 

  * Reductions: Diego OK with going out of SSA. 
  * Patch not yet ready. The higher code coverage exposes an
IVSTACK bug. 
  * IVStack: Sebastian is working on removing it 
  * Data dependence analysis: Committed to graphite branch. 
  * Not yet enabled, as the data reference still does not
bootstrap. The lexicographic smaller than constraint is
not yet added. This is not wrong but very conservative.
Dependence test should be enabled with
-fgraphite-identity. 
  * Loop transformations: We need some simple transformations to
check data dependency analysis. 
  * Pranav? We did not hear anything yet. 
  * Data reference: Last bugs have to be fixed. 
  * Not yet enabled. Blocks data dependence analysis. 
  * PCP: 
  * Jan Implemented simplification/canonicalization for
expressions. This will allow a systematic translation of
expressions to constraints. Jan will work on a reduced
polyhedral interface that is used in PCP. First PCP
integration in gcc - later extended by reductions. 

There is no good representation for reductions in the
general case, like conditions and PHI nodes
(Sebastian). Maybe for a subset of the reductions we
can add commutativity of operations. 

  * Autopar: Li prepared some patches to trigger autopar by
graphite. 
  * Autopar failed in the graphite branch even without
graphite enabled. It seems we introduced a bug while
using some code of it for graphite. The breakage happens
in the reductions part. 
  * Merge from trunk: 
  * Sebastian wants to work on this. Changes in the
vectorizer; the last merge was 4 months ago. He tried
in mid-March but it was difficult, so he stopped
because of ENOTIME. 
  * Parts of graphite working well?: The polyhedral part of the
graphite architecture will be described in Tobias' paper at
the summit and the PCP part by Jan. 
  * Worldwide interest: Cristianno Martins from Brazil showed some
interest 
  * http://gcc.gnu.org/ml/gcc/2009-04/msg00711.html 

  * We already have people from the US, Europe and China
contributing. 





Re: [graphite] Weekly phone call notes

2009-04-29 Thread Tobias Grosser
On Wed, 2009-04-29 at 23:57 +0200, Richard Guenther wrote:
> On Wed, Apr 29, 2009 at 11:34 PM, Tobias Grosser
>  wrote:
> > Hi gcc developers, hi graphities
> >
> > here are some notes from our weekly phone call. Unfortunately I missed
> > to send out the notes from the last two phone calls, but I hope to get
> > them out more regulary. Believe in me! ;-)
> >
> > http://gcc.gnu.org/wiki/Graphite_Phone_Call/2009_04_29
> >
> > Attendees: Li, Jan, Konrad, Sebastian, Tobias, David, Christophe
> >
> >  * Reductions: Diego OK with going out of SSA.
> 
> You will lose all points-to information.  I think going out of SSA is
> a very bad idea.

It makes life a lot easier and it is just for reductions. ;-) But you
are right, we should get SSA back.

The point was more whether it is the right moment to try to regenerate
SSA, as the polyhedral model is actually not single assignment. Or
actually the other way around: more single assignment than SSA.

We postponed this complication or silently hoped that a later pass might
clean up our mess. Actually Sebastian already looked at the output and
there seems to be a later gcc pass, that cleans it up and reintroduces
SSA. But we have to investigate this further or at least document it
correctly. The final goal has to be to get an optimized loop, that is
still in SSA. The way we take is not completely clear yet.

> > [...]



[graphite] Weekly phone call notes

2009-05-06 Thread Tobias Grosser
Hi folks, hi graphities,

here are the latest notes from our graphite phone call.

It is also available on the wiki:

http://gcc.gnu.org/wiki/Graphite_Phone_Call/2009_05_06

All the best

Tobi


Attendees: Sebastian, Tobias, Christophe, Albert, Li, Jan, Razya,
Konrad, Antoniu 

  * Sebastian: 
  * Working on IVstack removal, 
  * Several other patches like "remove strcmp". (Will be
committed as soon as possible) 
  * in clast-to-gimple: 
  * There is a problem finding the type of induction
variables and upper bound expressions, as cloog
does not carry information about them. 
  * We will try "unsigned long long" and
have to insert casts. (This might
trigger problems in the vectorizer and
might be slow. Maybe we can optimize the
size of the iv later.) Before, we used
the type of the old IV. But with
strip mining there is no 1-to-1 relation
between ivs, so there is not always a
type. Another idea: infer types from upper
and lower bound expressions. But that does
not seem to work either. This blocks the
work on removing the IVSTACK. 
  * Removal of IVStack: Blocked by types, but
already triggers some bugs. 
  * Reductions: Blocked by IVStack. 
  * Li: 
  * Mark loops as parallel with -fgraphite-force-parallel
(committed). 
  * Trigger autopar with loop->can_be_parallel (committed). 

  * Started testsuite for graphite_autopar (sent for review
to gcc-patches). 
  * Autopar fails in the graphite branch on this line: 

  red = reduction_phi (reduction_list, reduc_phi);
  if (red == NULL)
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
		 "  FAILED: it is not a part of reduction.\n");
      return false;
    }


  * Tobias: 
  * Fix bugs to enable data reference building (Now only
gfortran.dg/transpose_conjg_1.f90 fails). 
  * Bootstrapped dependency testing. Worked except one test
case (gfortran.dg/cray_pointers_2.f90). 
  * Working on his paper about the polyhedral part of
graphite to attract more research in this area. 
  * Jan: 
  * Got gcc summit paper accepted. Will be about the design
of Graphite: IR, components of the Graphite
infrastructure, testsuite, example, internals,
integration with external prototyping tools (POCC). We
should have some discussions about who describes what in
graphite, as Tobias also has a paper accepted. 
  * Worked on the translation of PCP to Polyhedral
representation. 
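The type problem Sebastian describes above -- CLooG emits new induction variables and bounds without any type information -- admits the "infer from the bounds" idea mentioned in the notes. A minimal sketch, assuming the generated loop's largest iteration count is known; iv_bits_for_bound is an invented helper, not GCC code:

```c
#include <assert.h>
#include <stdint.h>

/* Invented helper: smallest unsigned bitwidth whose range covers the
   loop's largest possible iteration count.  Preferring this over a
   blanket "unsigned long long" avoids 64-bit ivs (and the casts they
   force) for small loops, which matters for the vectorizer.  */
static int
iv_bits_for_bound (uint64_t upper_bound)
{
  if (upper_bound <= UINT8_MAX)
    return 8;
  if (upper_bound <= UINT16_MAX)
    return 16;
  if (upper_bound <= UINT32_MAX)
    return 32;
  return 64;
}
```

As the notes say, with strip mining there is no 1-to-1 relation between old and new ivs, so the bound has to come from the generated loop's own expressions rather than from the original iv's type.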



Re: Polyhedral Model

2009-06-10 Thread Tobias Grosser
On Mon, 2009-04-27 at 19:26 -0300, Cristianno Martins wrote:
> Hi,
> 
> Thank you for helping me with those informations. From now on, I'll be
> checking the Graphite framework, and I intend to contribute to that by
> providing support to automatic parallelization. However, my project
> focus on multicore architectures; I guess this is not a problem, is
> it?

Hey Cristianno,

I have not heard anything from you for a while. How is your project
going?

Tobi

> 
> Thanks in advance,
> 
> On Mon, Apr 27, 2009 at 6:46 PM, Tobias Grosser  wrote:
> > [...]
> 



Re: Compile times for gcc with ppl/cloog backened?

2010-05-09 Thread Tobias Grosser

Hi,

On 05/09/10 05:09, ajmcello wrote:

I've got a quad core 3.2Ghz FreeBSD-8 system with 8GB of ram. I
compiled and installed Cloog-PPL and PPL, mpfr, gmp, mpc, polylib,
etc. I'm using make -j 4, and my gcc compile has been going for about
24 hours. Is this normal or did something go terribly wrong? It is
still building, but each file uses about 20-30 hours of CPU time to
build (according to top). I think this is the 3rd compile. After the
first, I recompiled all of my tools (binutils, coreutils, etc) with
the new gcc and then started the build over. This last one, I think
stage1 and stage2 went by relatively quick, but the 3rd and final
stage has been taking forever. I wrote a test program and compiled it
with prev-gcc/xgcc with no arguments, and it took about 5 seconds.
Just curious.


No, this is too much. On our nightly tester gcc bootstraps on an 8-core 
machine in 45 minutes.


Two things I am not sure about in your config.

1. You said you installed polylib.

There is no dependency on polylib. The cloog version you installed 
should link against PPL, not polylib.


2. You enabled lto

Did you try once without lto? I have not tested graphite with lto yet, 
so there might either be a bug, or LTO exposes _a lot_ of optimization 
opportunities.


Cheers

Tobi



Here's my runtime options. Thanks.


./configure --prefix=/usr/gnu \
--disable-werror \
--disable-ppl-version-check \
--disable-cloog-version-check \
--enable-cxx \
--enable-shared \
--enable-static \
--enable-bootstrapp \
--enable-lto \
--enable-objc-gc \
--enable-gold \
--enable-stage1-checking=all \
--enable-64-bit-bfd  \
--enable-languages=c,c++,lto \
--with-pic \
--with-libiconv=/usr/gnu \
--with-gmp-include=/usr/gnu/include \
--with-gmp-lib=/usr/gnu/lib \
--with-gmp-build=/wl2k/src/gmp/gmp-5.0.1 \
--with-gmp=/usr/gnu \
--with-mpfr-include=/usr/gnu/include \
--with-mpfr-lib=/usr/gnu/lib \
--with-mpfr-build=/wl2k/src/mpfr/mpfr-2.4.2 \
--with-mpfr=/usr/gnu \
--with-cloog=/usr/gnu \
--with-cloog-include=/usr/gnu/include \
--with-cloog-lib=/usr/gnu/lib \
--with-ppl=/usr/gnu \
--with-ppl-include=/usr/gnu/include \
--with-ppl-lib=/usr/gnu/lib \
--with-mpc=/usr/gnu \
--with-mpc-include=/usr/gnu/include \
--with-mpc-lib=/usr/gnu/lib \
--with-gnu-ld \
--with-ld=/usr/gnu/bin/ld \
--with-gnu-as \
--with-as=/usr/gnu/bin/as




Re: Question about polyhedral model and solver used in GRAPHITE

2010-06-17 Thread Tobias Grosser

On 06/17/2010 10:41 AM, Simone Pellegrini wrote:

Dear all,
I was reading one of the first technical reports of GRAPHITE and it
looks like several libraries have been considered for integration into
GCC for solving the ILP problem and polyhedral model representation. In
the document the Omega library, PIP, PolyLib and PPL are mentioned.

I am wondering about which library is actually used in the current
implementation of the GRAPHITE framework and if possible I would like to
know the motivations behind this choice. I actually need to build
dependence analysis on top of a source-to-source compiler and I find
myself in the 'nightmare' of choosing a suitable library for supporting
the polyhedral model analysis and transformation.


Hi Simone,

currently we use PPL. At the time we wanted to commit Graphite it was 
the only realistic choice.


The alternative was Polylib. However it was GPLv2 only and there was no 
chance to change the license. (I heard that it was changed recently).

So there was not too much choice.

Furthermore, technically polylib is a very old implementation with a 
plain interface that is not that easy to understand. Polyhedra are, 
e.g., represented as two-dimensional matrices.

Also its support for integer polyhedra is limited and sometimes wrong.

On the other hand there is PPL, a nicely developed library that since 
the most recent release also includes a PIP solver. It has a nice 
constraint-based interface, good documentation and is GPLv3 licensed. 
Unfortunately integer polyhedra are not that easy to use in PPL.


Recently a new library called isl showed up. It is developed by Sven 
Verdoolaege and is state of the art in terms of integer polyhedra. It 
is plain C and also supports a nice constraint-based interface. 
Furthermore it has native support for existentially quantified variables.
It already proved useful in CLooG [1] and is also used in Polly [2] 
because of its liberal LGPL license.
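The difference in interface style can be made concrete with a toy example. This is illustrative C, not the actual polylib, PPL, or isl API: in the matrix style, each constraint is one row of coefficients, and membership of a point is a dot product per row.

```c
#include <assert.h>
#include <stdbool.h>

/* Matrix-style polyhedron, polylib-flavoured but purely illustrative:
   each row { ineq, ci, cn, c0 } encodes
   ci*i + cn*n + c0 >= 0 (if ineq) or == 0 (otherwise).  */
struct toy_constraint { int ineq, ci, cn, c0; };

/* The iteration domain { i : 0 <= i < n } as two inequality rows.  */
static const struct toy_constraint domain[] = {
  { 1,  1, 0,  0 },	/*  i          >= 0 */
  { 1, -1, 1, -1 },	/* -i + n - 1  >= 0, i.e. i <= n - 1 */
};

static bool
point_in_domain (int i, int n)
{
  for (unsigned r = 0; r < sizeof domain / sizeof domain[0]; r++)
    {
      int v = domain[r].ci * i + domain[r].cn * n + domain[r].c0;
      if (domain[r].ineq ? v < 0 : v != 0)
	return false;
    }
  return true;
}
```

A constraint-based interface instead builds the same set from named constraint objects (or, in isl's case, can parse a textual description like "[n] -> { [i] : 0 <= i < n }"), which is far easier to read and to get right than juggling coefficient columns.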


You said you are planning to work on source-to-source compilation and 
are going to develop a dependency analysis.

In this case I highly recommend you check whether this is not already 
available. There is candl from Cedric Bastoul, which does dependency 
analysis, and furthermore isl has an integrated dependency solver.

You might just want to use one of those directly.

Furthermore I would be interested in the toolchain you are developing 
and whether there is any possibility to integrate it with existing 
tools. We are working on an open polyhedral interchange format called 
openscop/scoplib that will allow different polyhedral tools to 
exchange their data.

So you could e.g. write the frontend/backend and plug in tools 
supporting SCoPlib.


Currently these tools exist that support SCoPlib:

* Graphite: Frontend/Backend in GCC
* Polly: Frontend/Backend in LLVM
* Clan: C Source frontend
* Candl: Dependency analysis
* Pluto: Integrated toolchain for source-to-source compilation

Please ask if you need further information or you want to introduce us 
to your project.


Cheers
Tobi

[1] http://www.cloog.org
[2] http://wiki.llvm.org/Polyhedral_optimization_framework



Re: gcc.gnu.org/infrastructure - newer versions of GMP/mpfr/mpc/isl/cloog?

2014-01-27 Thread Tobias Grosser

On 01/27/2014 08:29 PM, Tobias Burnus wrote:

Hello,

motivated by the recent MPC 1.0.2 announcement, I looked at
./contrib/download_prerequisites and also at
ftp://gcc.gnu.org/pub/gcc/infrastructure/ to see which versions are
offered there.

Question: Would it make sense to place newer versions into
infrastructure and update ./contrib/download_prerequisites for those? I
believe most distros use newer versions nowadays and as some bugs have
been fixed in newer versions...

* GMP: infrastructure 4.3.2 (2010-01-08), current: 5.1.3 (2013-09-30)
* mpfr: infrastructure 2.4.2 (2009-11-30), current: 3.1.2 (2013-03-13)
* mpc: infrastructure 0.8.1 (2009-12-08), current: 1.0.2 (2014-01-15)
* ISL: infrastructure 0.11.1 (2012-12-12), current: 0.12.2 (2014-01-12)
* CLooG: infrastructure 0.18.0 (2012-12-20), current: 0.18.1
(2013-10-11) [Or 0.18.2 (2013-12-20) according to the GIT tag, but that
release only added a howto_cloog_release.txt file ...]


Hi Tobias,

that sounds like a great idea. Internally we are currently working on 
preparing graphite for the isl 0.13.0 release, which is a large 
improvement and e.g. provides a compute-out facility that allows us to 
stop dependence analysis in case the dependence problem is too complex 
to solve. This would address some of the open graphite bugs. Even before 
this is ready, upgrading to CLooG 0.18.1 and isl 0.12.1 would probably 
be a good thing to do.


Cheers,
Tobias



Re: asking your advice about bug

2014-02-03 Thread Tobias Grosser

On 02/03/2014 11:11 AM, Роман Гареев wrote:



Dear Graphite contributors,


 I am new to Graphite and want to find out if I could work on it. Could you
point me to a simple bug, please? I would be very grateful for your advice.


Hi,

maybe this bug seems a good start:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59586

A segfault is normally rather easy to debug. If you have problems, I am 
very happy to assist.


Cheers,
Tobias


Re: asking your advice about bug

2014-02-05 Thread Tobias Grosser

On 02/04/2014 07:08 PM, Roman Gareev wrote:



Thank you for your advice! I've started working on a fix.


Great. Feel free to ask in case you need some help.

Tobias




Re: gcc.gnu.org/infrastructure - newer versions of GMP/mpfr/mpc/isl/cloog?

2014-02-13 Thread Tobias Grosser

On 02/13/2014 08:19 AM, Richard Biener wrote:

On Mon, Jan 27, 2014 at 8:40 PM, Tobias Grosser  wrote:

On 01/27/2014 08:29 PM, Tobias Burnus wrote:


[...]



[...]


I've tested building the 4.8 branch with cloog 0.18.1 and isl 0.12.2 and
that seems to work.  Updating the infrastructure dir sounds good to me,
I'll do it.


Thanks Richi!

Could we also make the minimal library requirement for gcc the following 
two?


Cheers,
Tobias



Re: gcc.gnu.org/infrastructure - newer versions of GMP/mpfr/mpc/isl/cloog?

2014-02-13 Thread Tobias Grosser

On 02/13/2014 08:37 AM, Richard Biener wrote:

On Thu, Feb 13, 2014 at 2:20 PM, Tobias Grosser  wrote:

On 02/13/2014 08:19 AM, Richard Biener wrote:


[...]

Could we also make the minimal library requirement for gcc the following
two?


I see no reason to do that at this point (I suppose only dropping support
for ISL < 0.12.2 would help you?), but I have updated the recommended
versions as documented in install.texi to those two.

When trunk re-opens for stage1 we can drop support for older
versions once we make code changes that make those versions no
longer work.


Perfect. That works for us.


Pretty high on my wishlist and a good motivation would be to get
rid of the cloog dependency by using the ISL code generator ;)
I'll help as good as I can with all problems that arise in GCC
specific areas such as GIMPLE and SSA.


We seem to agree on the next steps. Using the isl code generator is 
pretty high on the wish list; only fixing the last P1 bugs is higher.


Cheers,
Tobias



Re: asking your advice about bug

2014-02-14 Thread Tobias Grosser

On 02/12/2014 11:51 AM, Roman Gareev wrote:

Hi Roman,

thanks for the quick feedback!


I've found out that this bug appeared in revision 189156
(svn://gcc.gnu.org/svn/gcc/trunk) and a similar error message appeared in
revision 191757 (svn://gcc.gnu.org/svn/gcc/trunk) (maybe it's because of
changes in diagnostic.c). If subtract_commutative_associative_deps, a
function located in gcc/graphite-dependences.c, is commented out, the
error disappears. I am trying to find a bug in this function now. Could
you please answer a few questions about it?

1) Where can I find a description of the algorithm for finding associative
commutative reductions that is used in subtract_commutative_associative_deps?


It seems as if there is no such description available. This is a problem 
by itself and we should probably add documentation about what is going 
on. Unfortunately, I did not write the code and I also don't really get 
what it is doing. The intuition seems to be that the dependences between 
a set of reduction statements are computed and those dependences are 
then removed from the overall set of dependences.
This is in general a good idea, but the code could use some 
improvements and fixes. Several things look shady here. Here is one example:


1) We only remove dependences, but we should also add new ones

Let 'a->a' be reduction dependences. Then dependences between 'a'
can only be removed in case we add new dependences between 'b1' and all
'a', and between all 'a' and 'b2'.

b1
 |
 a -> a -> a
           |
           b2

This should probably be fixed, but I don't think this is the problem of 
the current bug report. In fact, to fix the bug report, I don't even 
think we need to understand the full algorithm. The first question to 
ask is:


Why are we segfaulting? Which statement is causing the segfault?



2) What is the number returned by isl_union_map_compute_flow? (I haven't
found its description in “Integer Set Library: Manual”)

3) I've found the following terms in subtract_commutative_associative_deps:
“may accesses”, “must access”. “Integer Set Library: Manual” gives the
following definition: «If any of the source accesses are marked as being
may accesses, then there will be a dependence to the last must access and
to any may access that follows this last must access». Could you please
describe their meaning? Are they related to transitively-covered
dependences?


Thanks Sven for answering those.

Tobias



Re: asking your advice about bug

2014-02-24 Thread Tobias Grosser

On 02/17/2014 06:50 PM, Roman Gareev wrote:



Hi Tobias,


  thanks for the answer!


Hi Roman,

sorry for missing this mail.



  I think that the segfault is being caused by NULL arguments being passed to
compute_deps by loop_level_carries_dependences. This is causing an
assignment of NULL values to the following parameters of compute_deps:
must_raw_no_source,
may_raw_no_source, must_war_no_source, may_war_no_source,
must_waw_no_source, may_waw_no_source. They are being passed to 
subtract_commutative_associative_deps
and dereferenced in the following statements:


  *must_raw_no_source = isl_union_map_subtract (*must_raw_no_source,
x_must_raw_no_source);


  *may_raw_no_source = isl_union_map_subtract (*may_raw_no_source,
x_may_raw_no_source);


  *must_war_no_source = isl_union_map_subtract (*must_war_no_source,
x_must_war_no_source);


  *may_war_no_source = isl_union_map_subtract (*may_war_no_source,
x_may_war_no_source);


  *must_waw_no_source = isl_union_map_subtract (*must_waw_no_source,
x_must_waw_no_source);


  *may_waw_no_source = isl_union_map_subtract (*may_waw_no_source,
x_may_waw_no_source);

  This is the reason for the segfault. (All functions mentioned above are located
in gcc/graphite-dependences.c)



Interesting analysis.


  I think that this can be solved by the addition to
subtract_commutative_associative_deps of NULL checking of the following
variables: must_raw_no_source, may_raw_no_source, must_war_no_source,
may_war_no_source, must_waw_no_source, may_waw_no_source. I've implemented
this in the patch, which can be found below.


Yes, this would be a 'solution'. However, I am in fact surprised that 
those variables are NULL at all. Do you have an idea why this is the 
case? Understanding this would help to understand if the patch you 
propose is actually the right solution or if it is just hiding a 
previous bug.


Cheers,
Tobias


Re: asking your advice about bug

2014-03-06 Thread Tobias Grosser

On 03/02/2014 08:06 PM, Roman Gareev wrote:




Yes, this would be a 'solution'. However, I am in fact surprised that
those variables are NULL at all. Do you have an idea why this is the
case? Understanding this would help to understand if the patch you
propose is actually the right solution or if it is just hiding a
previous bug.



Hi Tobias,

After consideration of almost all the code in gcc/graphite-dependences.c,
I think that the NULL arguments being passed to compute_deps are
appropriate
for loop_level_carries_dependences.


I slowly come to the same conclusion. Thanks a lot for digging deeper 
into this.



In my opinion, loop_level_carries_dependences uses the following algorithm
to determine if the loop at the level DEPTH carries dependences:

This function uses compute_deps for finding the RAW, WAR and WAW dependences
of all basic blocks in the body of the given loop. Subsequently, it tries
to determine the presence of these dependences at the given level. I think
that it tries to find loop-independent dependences [1] in carries_deps.
Therefore it maps the relation of dependences to the relation of the
corresponding time-stamps and intersects the result with the relation in
which all the inputs before DEPTH occur at the same time as the output,
and the input at DEPTH occurs before the output.


Yes, it checks if all dependences are carried by outer loops already. If 
the intersection is not empty, some dependences are carried by the DEPTH 
we currently check and the loop is consequently not parallel.


I might be wrong, but I suppose that no_source dependences are
*unnecessary* for this algorithm.


Yes, this analysis is correct.

In that light, I believe your previous patch is correct. I will review 
it and we can probably commit it. Thanks for working on this!


Tobias


Re: asking your advice about bug

2014-03-06 Thread Tobias Grosser

On 02/17/2014 06:50 PM, Roman Gareev wrote:



Hi Tobias,


  thanks for the answer!


  I think that the segfault is being caused by NULL arguments being passed to
compute_deps by loop_level_carries_dependences. This is causing an
assignment of NULL values to the following parameters of compute_deps:
must_raw_no_source,
may_raw_no_source, must_war_no_source, may_war_no_source,
must_waw_no_source, may_waw_no_source. They are being passed to 
subtract_commutative_associative_deps
and dereferenced in the following statements:


  *must_raw_no_source = isl_union_map_subtract (*must_raw_no_source,
x_must_raw_no_source);


  *may_raw_no_source = isl_union_map_subtract (*may_raw_no_source,
x_may_raw_no_source);


  *must_war_no_source = isl_union_map_subtract (*must_war_no_source,
x_must_war_no_source);


  *may_war_no_source = isl_union_map_subtract (*may_war_no_source,
x_may_war_no_source);


  *must_waw_no_source = isl_union_map_subtract (*must_waw_no_source,
x_must_waw_no_source);


  *may_waw_no_source = isl_union_map_subtract (*may_waw_no_source,
x_may_waw_no_source);


  This is the reason for the segfault. (All functions mentioned above are located
in gcc/graphite-dependences.c)


  I think that this can be solved by the addition to
subtract_commutative_associative_deps of NULL checking of the following
variables: must_raw_no_source, may_raw_no_source, must_war_no_source,
may_war_no_source, must_waw_no_source, may_waw_no_source. I've implemented
this in the patch, which can be found below.


  Tested x86_64-unknown-linux-gnu, applying to revisions 189156, 207802
(svn://gcc.gnu.org/svn/gcc/trunk) and 207802
(svn://gcc.gnu.org/svn/gcc/branches/ibm/gcc-4_8-branch)


  Thanks for your answers and advice, Sven!


  --

Roman Gareev



patch


diff --git a/gcc/graphite-dependences.c b/gcc/graphite-dependences.c
index b0f8680..002e3d1 100644
--- a/gcc/graphite-dependences.c
+++ b/gcc/graphite-dependences.c
@@ -424,24 +424,83 @@ subtract_commutative_associative_deps (scop_p scop,
  &x_may_waw_no_source);
gcc_assert (res == 0);

-   *must_raw = isl_union_map_subtract (*must_raw, x_must_raw);
-   *may_raw = isl_union_map_subtract (*may_raw, x_may_raw);
-   *must_raw_no_source = isl_union_map_subtract (*must_raw_no_source,
- x_must_raw_no_source);
-   *may_raw_no_source = isl_union_map_subtract (*may_raw_no_source,
-x_may_raw_no_source);
-   *must_war = isl_union_map_subtract (*must_war, x_must_war);
-   *may_war = isl_union_map_subtract (*may_war, x_may_war);
-   *must_war_no_source = isl_union_map_subtract (*must_war_no_source,
- x_must_war_no_source);
-   *may_war_no_source = isl_union_map_subtract (*may_war_no_source,
-x_may_war_no_source);
-   *must_waw = isl_union_map_subtract (*must_waw, x_must_waw);
-   *may_waw = isl_union_map_subtract (*may_waw, x_may_waw);
-   *must_waw_no_source = isl_union_map_subtract (*must_waw_no_source,
- x_must_waw_no_source);
-   *may_waw_no_source = isl_union_map_subtract (*may_waw_no_source,
-x_may_waw_no_source);
+   if (must_raw)
+ *must_raw = isl_union_map_subtract (*must_raw, x_must_raw);
+   else
+ isl_union_map_free (x_must_raw);
+
+   if (may_raw)
+ *may_raw = isl_union_map_subtract (*may_raw, x_may_raw);
+   else
+ isl_union_map_free (x_may_raw);


In my understanding, it is sufficient to guard the no_source statements, no?


+
+   if (must_raw_no_source)
+ {
+   *must_raw_no_source = isl_union_map_subtract (*must_raw_no_source,
+ x_must_raw_no_source);
+ }
+   else
+ isl_union_map_free (x_must_raw_no_source);


Could you remove the '{' '}' around the first statement?


+
+   if (may_raw_no_source)
+ {
+   *may_raw_no_source = isl_union_map_subtract (*may_raw_no_source,
+x_may_raw_no_source);
+ }
+   else
+ isl_union_map_free (x_may_raw_no_source);


Could you remove the '{' '}' around the first statement?


+   if (must_war)
+ *must_war = isl_union_map_subtract (*must_war, x_must_war);
+   else
+ isl_union_map_free (x_must_war);
+
+   if (may_war)
+ *may_war = isl_union_map_subtract (*may_war, x_may_w

Re: Integration of ISL code generator into Graphite

2014-03-14 Thread Tobias Grosser

On 03/14/2014 09:21 PM, Roman Gareev wrote:

Dear gcc contributors,

I am going to try to participate in Google Summer of Code 2014. My
project is "Integration of ISL code generator into Graphite".


My proposal can be found at the following link:
https://drive.google.com/file/d/0B2Wloo-931AoTWlkMzRobmZKT1U/edit?usp=sharing
I would be very grateful for your comments, feedback and ideas about its
improvement.


Thanks Roman,

I will have a look later on. For now, please make sure you register your 
proposal in Google Melange right away. You can always upload improved 
versions later.


Thanks,
Tobias



Re: Integration of ISL code generator into Graphite

2014-03-19 Thread Tobias Grosser

On 03/14/2014 09:18 PM, Roman Gareev wrote:

Dear gcc contributors,

I am going to try to participate in Google Summer of Code 2014. My project
is "Integration of ISL code generator into Graphite".

My proposal can be found at the following link:
https://drive.google.com/file/d/0B2Wloo-931AoTWlkMzRobmZKT1U/edit?usp=sharing
I would be very grateful for your comments, feedback and ideas about its
improvement.


Hi Roman,

sorry for the delay. The proposal looks already very promising. Here 
some more comments:


- Graphite no longer supports CLooG-PPL. This part of the software is 
history. No need to talk about it in the proposal. We currently have 
CLooG-isl and want to replace it with the isl internal AST generator.


3) Benefits

- We currently do not have license issues with cloog-isl. So no need to 
talk about them.


- Users currently do not need to choose between the CLooG-PPL and isl
  backends. The PPL backend is not supported any more.


Actual new benefits:

- Support for unrolling, full/partial tile separation, as well as 
fine-grained code size adjustments


- In unreleased isl 0.13.0, support for compute out feature

- Improved code generation quality


4)

- CLooG is not only meant to generate C code. Using the clasts is a 
supported and direct way to use CLooG, as it is CLooG's library interface.


- "New internal representation will be generated by ISL. Its structure is 
planned to be similar to the CLAST tree, but can be changed ..."

What does this mean? The isl_ast representation is already defined. Are 
you saying that isl may generate an ast that is different in structure 
to the clast tree currently generated? Or are you saying we still need 
to define the isl_ast and its nodes itself?

- "can be union" -> "can be the union"

4)

- It is great that you schedule explicit time to get familiar with
  the libraries and to do testing and bug fixing.

However, to get familiar, it would be nice to list a couple of small
projects/steps that you can use for this. E.g. in the first week
you could already generate an isl_build and an isl_union_map schedule,
pass it to the isl code generation and dump the resulting AST
to a file.

- "Designing of new internal format"

What kind of new format do you need to generate?

- "Code generation from internal form"

This is the whole project. You really should divide this into smaller 
steps. You could e.g. first just create a simple loop ignoring any basic 
blocks. You could then generate a simple loop with a single basic block. 
Then a loop with more complex ast expressions, then nested loops and 
ifs, ...


It would be very helpful if you could think about such steps and 
possibly an approach to ensure you systematically add all pieces of the 
code generation.


Also, it would be good to understand which parts of the code generation 
can be copied/shared with the existing cloog-based code generation and 
which parts need to be rewritten.



5)

gcc should bootstrap

6)

- Another nice-to-have would be to use the full/partial tile separation 
for the tiles generated by the isl scheduler.


cheers,
tobi


Re: Integration of ISL code generator into Graphite

2014-03-21 Thread Tobias Grosser

On 03/21/2014 12:04 PM, Roman Gareev wrote:

Hi Tobias,

thank you for all your comments! I've tried to consider them in the
improved version of my proposal, which can be found at the following
link:
https://drive.google.com/file/d/0B2Wloo-931AoeUlYOHhETVBvY3M/edit?usp=sharing


- In unreleased isl 0.13.0, support for compute out feature


I haven't found information about this feature and isl 0.13.0. Could
you please give me a link to be referred to in the proposal?


Section 1.4.1 of the isl manual documents the following functions:

void isl_ctx_set_max_operations(isl_ctx *ctx,
unsigned long max_operations);
unsigned long isl_ctx_get_max_operations(isl_ctx *ctx);
void isl_ctx_reset_operations(isl_ctx *ctx);


- Improved code generation quality


I also haven't found a code quality comparison between CLooG and the ISL
code generator. Do you mean that the ISL code generator can improve code
quality with unrolling, full/partial tile separation, and fine-grained
code size adjustments?


We have an unpublished paper on this. We should probably make this a 
tech report at some point.



- "New internal representation will be generated by ISL. Its structure is
planned to be similar to the CLAST tree, but can be changed ..."

What does this mean? The isl_ast representation is already defined. Are you
saying that isl may generate an ast that is different in structure to the
clast tree currently generated? Or are you saying we
still need to define the isl_ast and its nodes itself?


I wanted to say that ISL will generate the ISL AST from the polyhedral
representation. This ISL AST (with pointers to the original basic blocks
instead of statements) will be the internal representation for Graphite,
which should be traversed and transformed into the GIMPLE CFG. I
eliminated the mention of this internal representation in the improved
version of the proposal.


Good.

I think this proposal is already very nice. I have some last comments in 
case you want to really polish it:


o 26-31 May: Get familiar with CLooG generation?

Why is this necessary?


Also, for the remaining time-line, I think you could start working on 
making this more detailed.


Instead of having separate testing and bug-fixing weeks, I think it 
would be optimal to have one topic per week that you plan to finish 
(including testing and fixing), as well as a brief idea how to do it. 
You already have a good idea of what is needed, but you could detail 
this further by looking through the existing source code (or by looking 
into Polly's IslCodeGeneration.cpp) to see what is needed.


In which week are you e.g. planning to write the code to generate 
isl_ast_expr?


Which pieces of the code are you planning to reuse?

Are you planning to split off pieces of the code that can be shared by 
CLooG and isl, before fully removing CLooG?


Cheers,
Tobias






Re: Integration of ISL code generator into Graphite

2014-03-24 Thread Tobias Grosser

On 03/24/2014 02:08 PM, Mircea Namolaru wrote:

Hi,

1. Maybe there is also an opportunity to improve the design of the Graphite
code generation; this would give you another benefit: code that is more
robust and easier to maintain and extend.

I suggest a design where the code generation from ISL AST is based
on attributes (properties) of the ISL AST nodes.


Hi Mircea,

could you give examples of the 'attributes' you are thinking of?

Tobias



Re: Some Tests in gcc.dg/graphite/ are failing for AVR target

2014-03-29 Thread Tobias Grosser

On 03/28/2014 03:52 PM, K_s, Vishnu wrote:

Hi all,

Some of the tests added in gcc.dg/graphite are failing for the AVR target,
because the sizes of the arrays defined are too large for AVR. I'm wondering
whether it is possible to reduce the sizes of the arrays in the tests.

One example is gcc.dg/graphite/scop-0.c, which is failing with error
"size of array 'a' is too large" when compiling with avr-gcc.

The following tests are also failing for avr-gcc with the error "size of
array used is too large".

gcc.dg/graphite/scop-22.c
gcc.dg/graphite/scop-3.c
gcc.dg/graphite/scop-dsyr2k.c
gcc.dg/graphite/scop-dsyrk.c
gcc.dg/graphite/scop-mvt.c
gcc.dg/graphite/scop-sor.c
gcc.dg/graphite/pr46185.c


Hi Vishnu,

could you provide a patch that fixes these tests for your platform by 
reducing the array sizes? That makes the intention very clear and we could 
just review it. (I am almost certain it is trivial to accept.)


Cheers,
Tobias



Re: Anyone used Graphite of Gentoo recently?

2014-03-30 Thread Tobias Grosser

On 03/31/2014 06:25 AM, Vladimir Kargov wrote:

On 27 March 2014 18:39, Mircea Namolaru  wrote:

The domain is computed on the basis of the information provided by
number_of_latch_executions, which returns the tree expression

(unsigned int) maxLen_6(D) - (unsigned int) minLen_4(D)

For signed integers (without the cast to unsigned int) this seems to be the
correct value. As an unsigned int expression it leads to an incorrect domain.


I've got a couple of questions with regards to wrapping. Pasting the
relevant code from graphite-sese-to-poly.c:

static isl_pw_aff *
extract_affine (scop_p s, tree e, __isl_take isl_space *space)
{
...
   // e comes from number_of_latch_executions()
   type = TREE_TYPE (e);
   if (TYPE_UNSIGNED (type))
 res = wrap (res, TYPE_PRECISION (type));

1) Any idea why wrapping occurs only for unsigned expressions?


In the C standard, unsigned operations have defined overflow semantics 
in the form of wrapping. Signed operations instead have undefined behavior, 
which means we can assume that no wrapping occurs and consequently do 
not need to model it.



2) What exactly is wrapping needed for? I can't find it in the old PPL
implementation (though it could have been called differently).


It is necessary to model the defined overflow behavior of unsigned integers.


Also, no matter what the logic behind wrapping is, it does seem
suspicious that this code checks the signedness of the result of
number_of_latch_execution() (which may be arbitrary, as far as I
understood), and not of the loop induction variable, for example.


It is very suspicious. It is a rather difficult topic and I doubt it is 
implemented correctly. Also, I do not think that implementing wrapping 
this way is the right approach. Instead, we should just model it to
extract a run-time check that verifies that no wrapping occurs and then 
go on without wrapping.


Regarding this bug there are two directions to go:

1) It is necessary to understand if we need to do wrapping on the result 
of number_of_latch_executions. Meaning, we should understand the 
semantics of this function and the expression it returns.


2) Is there something else happening? From my observations it seems
that the generated code, at least for this test case, should behave 
correctly and the relevant loop should be executed exactly twice. 
Surprisingly this is not the case. It would be interesting to understand 
what exactly prevents this loop from being executed. (From the clast, 
this is not obvious to me.)


Tobi


Re: Anyone used Graphite of Gentoo recently?

2014-03-31 Thread Tobias Grosser

On 03/31/2014 10:10 AM, Richard Biener wrote:

On Mon, Mar 31, 2014 at 8:23 AM, Tobias Grosser  wrote:

On 03/31/2014 06:25 AM, Vladimir Kargov wrote:
Regarding this bug there are two directions to go:

1) It is necessary to understand if we need to do wrapping on the result of
number_of_latch_executions. Meaning, we should understand the semantics of
this function and the expression it returns.

2) Is there something else happening? From my observations it seems
that the generated code at least for this test case should behave correctly
and the relevant loop should be executed exactly twice. Surprisingly this is
not the case. It would be interesting to understand what exactly prevents
this loop from being executed. (From the clast, this is not obvious to me).


Note that there are always two sides of the "undefined overflow" thing

  1. in the input IL whether an operation may overflow or not (check
TYPE_OVERFLOW_UNDEFINED, _not_ !TYPE_UNSIGNED)


Thanks Richi. I was not aware of this, but we really should check for 
TYPE_OVERFLOW_UNDEFINED instead.


I think for now we may want to limit ourselves to 
TYPE_OVERFLOW_UNDEFINED, as the wrapping code is not very reliable and 
will cause very ugly code.



  2. in the GIMPLE IL generated from clast - all operations that satisfy
TYPE_OVERFLOW_UNDEFINED may not have different overflow behavior
than in the original untransformed program (no intermediate overflows
unless they were already present).  Usually that's not very easy to guarantee,
thus re-writing everything into unsigned arithmetic may be easiest.


The code we generate should not have any overflows. In case there are 
earlier defined overflows we should at best bail out. This is the safest 
approach. Anything else requires some more investigations.


Cheers,
Tobias



Graphite news

2012-02-09 Thread Tobias Grosser

Hi,

it has been quiet around Graphite for a while and I think it is more 
than time to give an update on Graphite.


== The Status of Graphite ==

Graphite has been around for a while in GCC. During this time a lot of 
people tested Graphite and Sebastian fixed many bugs. As of today the 
Graphite infrastructure is pretty stable and already hosts specific 
optimizations such as loop-interchange, blocking and loop-flattening.


However, during the development of Graphite we also found areas where
we are still way behind our possibilities.
First of all we realized that the use of a rational polyhedral library, 
even though it provides some functionality for integer polyhedra, is 
blocking us. Rational polyhedra worked OK for some time, but we 
have now come to a point where the absence of real integer polyhedra is 
causing problems. We have bugs that cannot be solved, just because 
rational polyhedra do not represent correctly the set of integer points 
in the loop iterations.
Another deficit in Graphite is the absence of a generic optimizer. Even 
though classical loop transformations work well for certain problems, 
one of the major selling points of polyhedral techniques is the 
possibility to go beyond classical loop transformations and to forget 
about the corresponding pass ordering issues. Instead it is possible to 
define a generic cost function for which to optimize. We currently do 
not take advantage of this possibility and therefore miss possible 
performance gains.
And as a last point, Graphite still does not apply to as much code as it 
could. We cannot transform a lot of code, not only because of the 
missing support for casts (for which we need integer polyhedra), but 
also because of an ad hoc SCoP detection and because some passes in the
GCC pass order complicate Graphite's job. Moving these roadblocks out 
of the way should significantly increase the amount of code we can optimize.


== The pipeline of upcoming graphite changes ==

As just pointed out there is still a lot of work to be done. We have 
been aware of this and we actually have several ongoing projects to get 
this work done.


0. Moving to recent version of CLooG.

Graphite was relying for a long time on CLooG-PPL, a CLooG version 
Sebastian forked and ported to PPL, because of copyright issues at that 
time. The fork was never officially maintained by cloog.org, but always 
by Sebastian himself. This was a significant maintenance burden and 
meant that we were cut off from improvements in the official CLooG 
library. In 2011 we had Andreas Simbuerger as a Summer of Code student, 
who added support for using the official cloog.org version. The 
cloog.org version proved to be very stable, but we could not yet switch 
entirely over, as this version uses isl as its polyhedral library, which 
would introduce another library dependency to GCC (ppl, CLooG and now 
isl). One solution to get this patch in without increasing the number of 
library dependencies is to follow CLooG and replace PPL with isl. As 
this was desirable for several other reasons, Sebastian went ahead:

1. The integer set library

Back in September Sebastian started the work to move Graphite to an 
actual integer set library. We have chosen isl [1], which is nowadays 
probably the most advanced open source integer set library*. The patch 
set as posted in September was incomplete and in parts incorrect. I 
finished the patch set. With the new patch set the core graphite 
transformations work entirely with isl. The only exceptions are the 
interchange cost function, the openscop export/import and the 
loop-flattening pass. Due to the native support for integer maps and 
especially due to how we can combine sets and maps with isl, the isl
implementation of graphite functions is often a lot simpler and easier 
to understand. But, more importantly, it finally allows us to model 
modulo wrapping and undefined overflow characteristics, and it solves 
several other issues we had due to the use of rational polyhedra.


2. A real polyhedral optimizer

To get a real, generic polyhedral optimizer for Graphite we have chosen 
the Pluto algorithm. The original implementation of Pluto is available 
here [2], the relevant publications are [3] and [4]. Pluto is a 
polyhedral optimizer that uses a single cost function to optimize 
simultaneously for data locality, parallelism and tileability. It has 
shown good results on various kernels [5] (or see the papers) and Uday, 
the original author, was employed to reimplement it in IBM XL. We added 
an implementation of this algorithm to isl. My recent patch set enables 
Graphite to use this new optimizer. Even though the patch is an early 
draft and definitely needs tuning to match the results of the original 
implementation, it is a great starting point for a real polyhedral 
optimizer in Graphite.


3. A new SCoP detection

Vladimir Kargov has implemented a new, structured SCoP detection within 
his 2010 Google Summer of Code project. Th

Re: Graphite news

2012-02-09 Thread Tobias Grosser

On 02/09/2012 11:05 PM, Vladimir Makarov wrote:

On 02/09/2012 07:42 AM, Tobias Grosser wrote:


== Who will do all the work? ==

After reading all the open projects you may wonder who will do all the
work. Unfortunately Sebastian switched jobs at the end of 2011, such
that we lost one full-time contributor. Furthermore, I am myself also
not working full time on Graphite, but on my PhD, where I am
funded to work on the LLVM Polly project. This means developer
resources are currently rare.

To solve this issue, I believe the best approach is to share as much
infrastructure as possible between different projects.


I just thought about licensing. If you are going to share the code
between GCC and LLVM, who will be the owner (FSF or UIUC)? But maybe I
understood you incorrectly and you meant only libraries (isl etc).


Both. Libraries are simple, as they are just used. For code I can only 
share code that I have written. That means, if I have written code I can 
add it at both places. If people start to contribute, moving code 
becomes more difficult. However, I assume that most of the stuff that 
can be shared will actually be part of isl, and only compiler specific 
stuff goes into GCC and LLVM. So I do not expect that a lot of code 
needs to be moved later on.


Tobi



Re: Graphite news

2012-02-09 Thread Tobias Grosser

On 02/09/2012 11:37 PM, Tobias Grosser wrote:

On 02/09/2012 11:05 PM, Vladimir Makarov wrote:

On 02/09/2012 07:42 AM, Tobias Grosser wrote:


== Who will do all the work? ==

After reading all the open projects you may wonder who will do all the
work. Unfortunately Sebastian switched jobs at the end of 2011, such
that we lost one full-time contributor. Furthermore, I am myself also
not working full time on Graphite, but on my PhD, where I am
funded to work on the LLVM Polly project. This means developer
resources are currently rare.

To solve this issue, I believe the best approach is to share as much
infrastructure as possible between different projects.


I just thought about licensing. If you are going to share the code
between GCC and LLVM, who will be the owner (FSF or UIUC)? But maybe I
understood you incorrectly and you meant only libraries (isl etc).


Both. Libraries are simple, as they are just used. For code, I can only
share code that I have written. That means if I have written code, I can
add it in both places. If people start to contribute, moving code
becomes more difficult. However, I assume that most of the stuff that
can be shared will actually be part of isl, and only compiler-specific
stuff goes into GCC and LLVM. So I do not expect that a lot of code
will need to be moved later on.


Or maybe I should be more specific.

I think the stuff that can actually be shared are polyhedral concepts 
and algorithms. Those are mathematical algorithms which are so high 
level that they are compiler independent. For Graphite and Polly itself 
I do not see too much code reuse. Obviously they will share some 
techniques and ideas, but the actual implementation will strongly depend 
on the compiler it is built for. Here we have not only different coding 
styles and conventions, but also different needs for how we interact with 
the rest of each compiler, which heuristics we choose and how in general 
we use the polyhedral techniques. I have earlier implemented some 
algorithms for both compilers, but as soon as they are pushed upstream 
they slowly start to diverge and adapt to what is best for each 
compiler. From then on, I believe the sharing will most probably happen 
through shared experience and ideas, not actual code. Not having the same 
code base will also allow both projects to explore new directions 
individually, which may result in useful ideas for both.


Cheers
Tobi


Re: Graphite news

2012-02-09 Thread Tobias Grosser

On 02/10/2012 01:58 AM, Vladimir Makarov wrote:

On 12-02-09 5:54 PM, Tobias Grosser wrote:

On 02/09/2012 11:37 PM, Tobias Grosser wrote:

On 02/09/2012 11:05 PM, Vladimir Makarov wrote:

On 02/09/2012 07:42 AM, Tobias Grosser wrote:


== Who will do all the work? ==

After reading all the open projects you may wonder who will do all the
work? Unfortunately, Sebastian switched jobs at the end of 2011, such
that we lost one full-time contributor. Furthermore, I am myself also
not working full time on Graphite, but on my PhD, where I am
funded to work on the LLVM Polly project. This means developer
resources are currently scarce.

To solve this issue, I believe the best approach is to share as much
infrastructure as possible between different projects.


I just thought about licensing. If you are going to share the code
between GCC and LLVM, who will be the owner (FSF or UIUC)? But maybe I
understood you incorrectly and you meant only libraries (isl etc).


Both. Libraries are simple, as they are just used. For code, I can only
share code that I have written. That means if I have written code, I can
add it in both places. If people start to contribute, moving code
becomes more difficult. However, I assume that most of the stuff that
can be shared will actually be part of isl, and only compiler-specific
stuff goes into GCC and LLVM. So I do not expect that a lot of code
will need to be moved later on.


Or maybe I should be more specific.

I think the stuff that can actually be shared are polyhedral concepts
and algorithms. Those are mathematical algorithms which are so high
level that they are compiler independent. For Graphite and Polly
itself I do not see too much code reuse. Obviously they will share
some techniques and ideas, but the actual implementation will strongly
depend on the compiler it is built for. Here we have not only
different coding styles and conventions, but also different needs for how
we interact with the rest of each compiler, which heuristics we choose
and how in general we use the polyhedral techniques. I have earlier
implemented some algorithms for both compilers, but as soon as they are
pushed upstream they slowly start to diverge and adapt to what is
best for each compiler. From then on, I believe the sharing will most
probably happen through shared experience and ideas, not actual code. Not
having the same code base will also allow both projects to explore new
directions individually, which may result in useful ideas for both.


Hey Vladimir,

thanks for your feedback!


Personally, I'd prefer that you focus more on GCC, but it probably does
not depend only on you. I think the final results for LLVM will look
better than for GCC, because LLVM-generated code is not as optimized as
GCC's.


No, not entirely. There have been various reasons to start with 
Polly/LLVM, one being the need for GPU code generation. Even though I now 
need to split my development time, I am still very interested in 
continuing the development of Graphite.



Thanks for the clarification, Tobias. In general, it looks OK to me,
although there are always some advantages and disadvantages to such an
approach: better resource usage, but less competition, as the approach to
loop optimizations in both compilers will be basically the same.


It really depends on who steps in to do the work. I will myself 
obviously not take two entirely different approaches for GCC and LLVM, 
but I welcome everybody to join the loop optimization efforts. If people 
want to take different paths, I am the last person to block anybody. 
Please complain loudly if any of the directions I propose for Graphite 
does not sound right to you. You are invited to either convince me to 
change directions or to experiment with a different approach. I am eager 
to learn where I fail badly or where I could do better.


In terms of competition, I agree, and I also see both positive and 
negative points. One good point I see is that with the joint polyhedral 
base library, the competition now moves to the kind of high-level 
optimizations implemented, instead of competing over the quality of the 
low-level infrastructure.


Cheers
Tobi


Re: Iterating over RTL in Graphite

2012-02-17 Thread Tobias Grosser

On 02/17/2012 08:34 PM, David Malcolm wrote:

On Thu, 2012-02-16 at 19:17 -0400, Arnaldo wrote:

Hello everyone,

I'm working on an extension to the Graphite pass of GCC 4.4.0.  My
intention is to associate costs to RTL instructions by adding them as
RTX attributes to a machine description file, and to read them back
during the Graphite pass by iterating through each basic block.

Is the RTL available during this optimization pass? I'm not sure this
is the case as I get a segfault when trying to iterate over the RTL
with the code below ("internal compiler error: Segmentation fault"). I
don't need the fully resolved RTL, just to be able to read the
attribute given an RTL instruction.

I've tried debugging the compiler with gdb but it can't find the
debugging symbols even though they're there.  I'll keep trying to get
gdb to work but any leads on reading these attributes from within
Graphite is greatly appreciated.

I don't know about GCC 4.4, but a while back I wrote a script using my
GCC Python plugin to draw a "subway map" of GCC 4.6's passes:
http://gcc.gnu.org/ml/gcc/2011-07/msg00157.html
which you can see here:
http://gcc-python-plugin.readthedocs.org/en/latest/tables-of-passes.html

If I'm reading things correctly, the graphite passes happen whilst the
code is still in gimple form: the blocks are converted to RTL form in
the "expand" pass, which happens about 20 or so passes later.

Caveat: I'm not familiar with the insides of graphite, and am
relatively new to gcc's insides, so I could be wrong; also the script
relies on the pass flags, and they're not necessarily correct either...


Yes, graphite works on GIMPLE. I believe I have never seen RTL when 
working on graphite, so I doubt it is easily available. (Maybe it is, 
but it is definitely not used within graphite).


Cheers
Tobi



Re: GCC 4.7.0 Release Candidate available from gcc.gnu.org

2012-03-03 Thread Tobias Grosser

On 03/02/2012 08:40 PM, Dennis Clarke wrote:

If all goes well, I'd like to release 4.7.0 in about three weeks.


I'll drop it on Solaris. Give it a push. Do we really, really need that
ppl/cloog stuff? I have never seen it build and pass any tests, ever,
even once, on Solaris, with or without Sun Studio compilers or GCC or
prayer and geek magic under a full moon. Seriously.


Given that PPL is a C++ library, you need to build it with g++ since
Studio CC and g++ aren't ABI-compatible.

That said, I'm using them in my Solaris GCC bootstraps, although it
admittedly took some effort to build initially and using it requires you
to jump through hoops to get the configure options right.  I agree that
this is an incredible mess right now.


I found it to be entirely too much work to be trusted and considered
stable long term. So therefore I generally kick ppl/cloog to the curb
and focus on core C, C++, gfortran and Ada, with Objective-C thrown in, without
the ppl/cloog bits at all. I have been very happy with my results, and
the recent 4.6.3 RC was the cleanest and purest result I have seen
to date, on Solaris 8 on old SPARC, no less:

http://gcc.gnu.org/ml/gcc-testresults/2012-02/msg02786.html


Hi Dennis,

I am planning to update graphite to a newer version of CLooG, which does 
not rely on PPL (but uses isl, a plain C library).


Is the C++ issue the only problem on Solaris, or are there further 
problems with CLooG? It would be good to know or test this in advance. 
Unfortunately, I do not have any Solaris machine, and there is also none 
on the compile farm. Would you mind testing CLooG 0.17 [1] and reporting 
whether 'make check' succeeds?


Thanks a lot
Tobi

[1] 
http://www.bastoul.net/cloog/pages/download/count.php3?url=./cloog-0.17.0.tar.gz


Re: Graphite news

2012-06-22 Thread Tobias Grosser

On 06/22/2012 12:41 PM, Richard Guenther wrote:

On Thu, Mar 1, 2012 at 3:21 PM, Richard Guenther
  wrote:

On Thu, Feb 9, 2012 at 1:42 PM, Tobias Grosser  wrote:

Hi,

it has been quiet around Graphite for a while and I think it is more than
time to give an update on Graphite.

== The Status of Graphite ==

Graphite has been around for a while in GCC. During this time a lot of
people tested Graphite and Sebastian fixed many bugs. As of today the
Graphite infrastructure is pretty stable and already hosts specific
optimizations such as loop interchange, blocking and loop flattening.

However, during the development of Graphite we also found areas where
we still fall well short of what is possible.
First of all, we realized that the use of a rational polyhedral library, even
though it provides some functionality for integer polyhedra, is blocking us.
Rational polyhedra worked OK for some time, but we have now come to
a point where the absence of real integer polyhedra is causing problems. We
have bugs that cannot be solved, simply because rational polyhedra do not
correctly represent the set of integer points in the loop iterations.
Another deficit in Graphite is the absence of a generic optimizer. Even
though classical loop transformations work well for certain problems, one of
the major selling points of polyhedral techniques is the possibility to go
beyond classical loop transformations and to forget about the corresponding
pass ordering issues. Instead it is possible to define a generic cost
function for which to optimize. We currently do not take advantage of this
possibility and therefore miss possible performance gains.
And as a last point, Graphite still does not apply to as much code as it
could. We cannot transform a lot of code, not only because of the missing
support for casts (for which we need integer polyhedra), but also because of
an ad hoc SCoP detection and because some passes in the
GCC pass order complicate Graphite's job. Moving these roadblocks out of
the way should significantly increase the amount of code we can optimize.
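To make the rational-versus-integer deficit concrete, here is a toy Python illustration (this is not Graphite or isl code; the function names and the enumeration bound are invented for the example). A rational library can report a domain as non-empty even though no loop iteration ever executes:

```python
from fractions import Fraction

# Toy one-dimensional iteration domain: {i : lo <= 2*i <= hi}.
# A rational polyhedral library answers emptiness over the rationals;
# Graphite actually needs the answer over the integers.

def rational_nonempty(lo, hi):
    # Over the rationals, 2*i can take any value in [lo, hi], so the
    # domain is non-empty whenever the interval itself is non-empty.
    return Fraction(lo) <= Fraction(hi)

def integer_points(lo, hi):
    # Over the integers, only even values of 2*i are reachable.
    # (Enumeration over a small range is enough for this toy example.)
    return [i for i in range(-100, 101) if lo <= 2 * i <= hi]

# The domain {i : 1 <= 2*i <= 1} contains the rational point i = 1/2
# but no integer point at all -- exactly the kind of mismatch that
# makes rational polyhedra give wrong answers about loop iterations.
print(rational_nonempty(1, 1))   # rationally non-empty
print(integer_points(1, 1))      # no integer iterations
```

An exact integer set library such as isl answers the second question directly, which is why bugs of this kind simply disappear with the switch.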

== The pipeline of upcoming graphite changes ==

As just pointed out there is still a lot of work to be done. We have been
aware of this and we actually have several ongoing projects to get this work
done.

0. Moving to recent version of CLooG.

Graphite relied for a long time on CLooG-PPL, a CLooG version Sebastian
forked and ported to PPL because of copyright issues at that time. The fork
was never officially maintained by cloog.org, but always by Sebastian
himself. This was a significant maintenance burden and meant that we were
cut off from improvements in the official CLooG library. With Andreas
Simbuerger we had a Summer of Code student in 2011 who added support for
using the official cloog.org version. The cloog.org version
proved to be very stable, but we could not yet switch over entirely,
as this version uses isl as its polyhedral library, which would introduce
another library dependence into GCC (ppl, CLooG and now isl). One solution to
get this patch in without increasing the number of library dependences is
to follow CLooG and replace PPL with isl. As this was
desirable for several other reasons, Sebastian went ahead:

1. The integer set library

Back in September, Sebastian started the work to move Graphite to an actual
integer set library. We have chosen isl [1], which is nowadays probably the
most advanced open-source integer set library. The patch set as posted in
September was incomplete and in parts incorrect. I finished the patch set.
With the new patch set, the core Graphite transformations work entirely with
isl. The only exceptions are the interchange cost function, the OpenScop
export/import and the loop-flattening pass. Due to the native support for
integer maps, and especially due to how we can combine sets and maps with
isl, the isl
implementation of Graphite functions is often a lot simpler and easier to
understand. But, more importantly, it finally allows us to model modulo
wrapping and undefined-overflow characteristics, and it solves several other
issues we had due to the use of rational polyhedra.
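The "combine sets and maps" style of working can be sketched with a toy Python model (again, not isl code; isl represents these objects symbolically, while this sketch enumerates a small finite domain to illustrate the idea of applying a schedule map to an iteration set and composing maps):

```python
# Toy model of integer-set/integer-map operations, assuming a
# one-dimensional domain small enough to enumerate.

def apply_map(domain, f):
    """Image of an integer set under an integer map."""
    return {f(i) for i in domain}

def compose(f, g):
    """Map composition: first apply g, then f."""
    return lambda i: f(g(i))

iterations = set(range(0, 8))     # iteration set {i : 0 <= i < 8}
reverse = lambda i: 7 - i         # schedule map: reverse the loop
shift = lambda i: i + 2           # schedule map: shift by 2

# Reversal permutes the set onto itself; composing a shift on top
# moves the whole schedule, all expressed as set/map algebra.
print(sorted(apply_map(iterations, reverse)))
print(sorted(apply_map(iterations, compose(shift, reverse))))
```

In isl, the analogous operations work on symbolic descriptions with parameters and multiple dimensions, which is what makes the Graphite functions built on them so much shorter.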

2. A real polyhedral optimizer

To get a real, generic polyhedral optimizer for Graphite, we have chosen the
Pluto algorithm. The original implementation of Pluto is available here [2];
the relevant publications are [3] and [4]. Pluto is a polyhedral optimizer
that uses a single cost function to optimize simultaneously for data
locality, parallelism and tileability. It has shown good results on various
kernels [5] (or see the papers), and Uday, the original author, was employed
to reimplement it in IBM XL. We added an implementation of this algorithm to
isl. My recent patch set enables Graphite to use this new optimizer. Even
though the patch is an early draft and definitely needs tuning to match the
results of the original implementation, it is a great starting point for a
real polyhedral optimizer in Graphite.
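Tiling, one of the transformations a Pluto-style cost function trades off against parallelism and locality, can be illustrated with a toy Python sketch (not Graphite, isl or Pluto code; the sizes are arbitrary). The point is that tiling only reorders the iteration set, it never changes which iterations execute:

```python
# Original and tiled iteration orders for a 2D loop nest.
N, TILE = 8, 4

# Original order: plain row-major traversal of an N x N domain.
original = [(i, j) for i in range(N) for j in range(N)]

# Tiled order: walk TILE x TILE blocks, then the points inside each
# block. min() handles domains that are not a multiple of TILE.
tiled = [(i, j)
         for ii in range(0, N, TILE)
         for jj in range(0, N, TILE)
         for i in range(ii, min(ii + TILE, N))
         for j in range(jj, min(jj + TILE, N))]

# Same iteration set, different order: tiling improves cache locality
# without changing the work performed.
print(sorted(original) == sorted(tiled))  # same set of iterations
print(original == tiled)                  # but a different order
```

A real polyhedral optimizer derives such a schedule (and legality conditions for it) from the dependence information, rather than hard-coding the transformation as above.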

3. A new SCoP detection

Val
