Re: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking

2014-03-14 Thread Richard Sandiford
Matthew Fortune  writes:
> Richard Sandiford  writes:
>> Matthew Fortune  writes:
>> >> I think instead we should have a configuration switch that allows a
>> >> particular -mfp option to be inserted alongside -mabi=32 if no
>> >> explicit -mfp is given.  This is how most --with options work.  Maybe
>> >> --with-fp-32={32|64|xx}?  Specific triples could set a default value
>> >> if they like.
>> >> E.g. the MTI, SDE and mipsisa* ones would probably want to default to
>> >> --with-32-fp=xx.  Triples aimed at MIPS IV and below would stay as
>> >> they are.  (MIPS IV is sometimes used with -mabi=32.)
>> >>
>> >> --with-fp-32 isn't the greatest name but is at least consistent with
>> >> --with-arch-32 and -mabi=32.  Maybe --with-fp-32=64 is so weird that
>> >> breaking consistency is better though.
>> >
>> > Tying the use of fpxx by default to a configure time setting is OK
>> > with me. When enabled it would still have to follow the rules as
>> > defined in the design in that it can only apply to architectures that
>> > can support the variant.
>> 
>> Right.  It's really equivalent to putting the -mfp on every command line
>> that doesn't have one.
>> 
>> > Currently that means everything but mips1.
>> 
>> Yeah, using -mips1 on a --with-{o}32-fp=xx toolchain would be an error.
>> 
>> > I'm not sure this is the same as tying an ABI to an architecture as
>> > both fp32 and fpxx are O32 and link compatible. Perhaps the configure
>> > switch would be --with-o32-fp={32|64|xx}. This shows it is just an O32
>> > related setting.
>> 
>> What I meant is that -march= and -mips shouldn't imply a different -mfp
>> setting.  The -mfp setting should be self-contained and it should be an
>> error if the architecture isn't compatible.
>> 
>> We might be in violent agreement here :-)  Like I say, I was just a bit
>> worried by the earlier -mips32r2 thing because there was a time when a
>> -mips option really could imply things like -mabi, -mgp and -mfp.
>> 
>> --with-o32-fp would be OK with me.  I'm just worried about the ABI being
>> spelt differently from -mabi=, but there's probably no perfect
>> alternative.
>
> I'd like to encourage the perspective that -mfp* options do not lead to
> a different ABI in the same sense that other variations do. While it is
> true that the calling conventions and code generation rules vary, 2 out
> of 3 combinations of -mfp32 -mfpxx and -mfp64 with -mabi=o32 are link
> compatible.

-mfp32 and -mfp64 aren't link-compatible though, so -mfp is part of the ABI.
What you're adding is a new variant that is individually link-compatible
with the other two (but obviously not both simultaneously).  It's a third
ABI variant in itself.

> The introduction of the modeless O32 ABI is intended to
> remove the part of the O32 definition that says 'FR=0' and hence the
> architecture then gets to dictate this and the generated code is still
> O32. It is true today that we have several architectures that mandate
> FR=0, some that cannot support fpxx and some that can support all fp*
> variations. I see nothing preventing the future having an architecture
> only supporting FR=1 though which we should also think about.

Agreed.

> When considering such a scenario it would be highly desirable for the
> following to just work as I believe architectural restrictions should
> be accounted for when designing default options. If the architecture
> gives no choice then it should just work IMO:
>
> Some ideas (speculating that someone builds a core called mips_n with
> only FR=1):
>
> --with-o32-fp=32
>
> mips-*-gcc -march=mips1 fp.c ==> generates fp32 code
> mips-*-gcc -march=mips2 fp.c ==> generates fp32 code 
> mips-*-gcc -march=mips32r2 fp.c ==> generates fp32 code 
> mips-*-gcc -march=mips32r2 -mfp64 fp.c ==> generates fp64 code
> mips-*-gcc -march=mips_n fp.c ==> generates fp64 code
>
> --with-o32-fp=xx
>
> mips-*-gcc -march=mips1 fp.c ==> generates fp32 code
> mips-*-gcc -march=mips2 fp.c ==> generates fpxx code 
> mips-*-gcc -march=mips32r2 fp.c ==> generates fpxx code 
> mips-*-gcc -march=mips32r2 -mfp64 fp.c ==> generates fp64 code
> mips-*-gcc -march=mips_n fp.c ==> generates fp64 code
>
> --with-o32-fp=64
>
> mips-*-gcc -march=mips1 fp.c ==> generates fp32 code
> mips-*-gcc -march=mips2 fp.c ==> generates fpxx code 
> mips-*-gcc -march=mips32r2 fp.c ==> generates fp64 code 
> mips-*-gcc -march=mips32r2 -mfp64 fp.c ==> generates fp64 code
> mips-*-gcc -march=mips32r2 -mfpxx fp.c ==> generates fpxx code
> mips-*-gcc -march=mips_n fp.c ==> generates fp64 code
>
> With these defaults, the closest supported ABI is used for each
> architecture based on the --with-o32-fp build option. The only one I
> really care about is the middle one as it makes full use of the O32 FPXX
> ABI without a user needing to account for arch restrictions.
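
A minimal sketch of the fallback rule the quoted tables seem to imply:
pick the supported variant closest to the configured default, with the
variants ordered fp32 < fpxx < fp64. The capability masks and all names
below are assumptions read off the tables (mips_n is the hypothetical
FR=1-only core from the quote), an explicit -mfp* option is not modeled,
and this is not GCC code.

```c
#include <assert.h>
#include <stdlib.h>

/* O32 FP variants, ordered so that "closeness" is distance in this enum.  */
enum fp_mode { FP_32 = 0, FP_XX = 1, FP_64 = 2 };

/* One bit per variant an architecture can support.  */
#define SUPPORTS(m) (1u << (m))

/* Assumed capabilities, taken from the quoted tables (illustration only).  */
#define ARCH_MIPS1    (SUPPORTS (FP_32))
#define ARCH_MIPS2    (SUPPORTS (FP_32) | SUPPORTS (FP_XX))
#define ARCH_MIPS32R2 (SUPPORTS (FP_32) | SUPPORTS (FP_XX) | SUPPORTS (FP_64))
#define ARCH_MIPS_N   (SUPPORTS (FP_64))   /* hypothetical FR=1-only core */

/* Return the supported variant closest to the configured default.
   On a tie the lower-numbered variant wins; no row of the tables
   exercises that case.  */
static enum fp_mode
choose_fp_mode (enum fp_mode configured, unsigned supported)
{
  enum fp_mode best = FP_32;
  int best_dist = 100;

  assert (supported != 0);
  for (int m = FP_32; m <= FP_64; m++)
    if (supported & SUPPORTS (m))
      {
        int dist = abs (m - (int) configured);
        if (dist < best_dist)
          {
            best_dist = dist;
            best = (enum fp_mode) m;
          }
      }
  return best;
}
```

For example, choose_fp_mode (FP_64, ARCH_MIPS2) yields FP_XX, matching
the -march=mips2 row under --with-o32-fp=64.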

Note that --with-* options just insert a canned -mfoo=bar option under
certain conditions, with those conditions being the same regardless of "bar".
So --with-o32-fp=32 should inser

RE: dom requires PROP_loops

2014-03-14 Thread Paulo Matos
> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: 13 March 2014 18:46
> To: Paulo Matos
> Cc: gcc@gcc.gnu.org
> Subject: RE: dom requires PROP_loops
> 
> On March 13, 2014 5:00:53 PM CET, Paulo Matos  wrote:
> >> -Original Message-
> >> From: Richard Biener [mailto:richard.guent...@gmail.com]
> >> Sent: 13 March 2014 13:24
> >> To: Paulo Matos
> >> Cc: gcc@gcc.gnu.org
> >> Subject: Re: dom requires PROP_loops
> >>
> >>
> >> Probably RTL cfgcleanup needs the same treatment as GIMPLE cfgcleanup
> >> then - allow removal if loop properties allow it.
> >>
> >
> >In both cfgcleanup.c and tree-cfgcleanup.c I can see code that protects
> >loop latches, but I see no code that allows removal of latch if
> >property allows it.
> >From what you say I would expect this would already be implemented in
> >tree-cfgcleanup.c, however what actually happens is that since
> >current_loops is non-null (PROP_loops is not destroyed in tree
> >loopdone), tree-cfgcleanup call chain ends up calling
> >cleanup_tree_cfg_bb on the bb loop latch and tree_forwarder_block_p
> >returns false for bb because of the following code thereby not removing
> >the latch:
> >  if (current_loops)
> >    {
> >      basic_block dest;
> >      /* Protect loop latches, headers and preheaders.  */
> >      if (bb->loop_father->header == bb)
> >        return false;
> >      dest = EDGE_SUCC (bb, 0)->dest;
> >
> >      if (dest->loop_father->header == dest)
> >        return false;
> >    }
> >
> >Why do we need to protect the latch?
> 
> You are looking at old sources.
>

That's correct. I was looking at 4.8. Let me take a look at what trunk is 
doing... :)
 
> Richard.
> 
> >Paulo Matos
> >
> >> Richard.
> >>
> 



Re: SET_EXPR_LOCATION usage for unused tree?

2014-03-14 Thread Richard Biener
On Thu, Mar 13, 2014 at 10:44 PM, Thomas Schwinge
 wrote:
> Hi!
>
> In gcc/c/c-parser.c:c_parser_omp_clause_num_threads (as well as other,
> similar functions), what is the point of setting the boolean tree c's
> location, given that this tree won't be used in the following?
>
>   /* Attempt to statically determine when the number isn't positive.  */
>   c = fold_build2_loc (expr_loc, LE_EXPR, boolean_type_node, t,
>                        build_int_cst (TREE_TYPE (t), 0));
>   if (CAN_HAVE_LOCATION_P (c))
>     SET_EXPR_LOCATION (c, expr_loc);
>   if (c == boolean_true_node)
>     {
>       warning_at (expr_loc, 0,
>                   "%<num_threads%> value must be positive");
>       t = integer_one_node;
>     }
>   [c not used anymore]
>
> Both with and without the SET_EXPR_LOCATION, the error is the same:
>
> ../../loop.c: In function 'main':
> ../../loop.c:10:34: warning: 'num_threads' value must be positive
>  #pragma omp parallel num_threads(-1)
>                                   ^

That can be simplified further, to avoid building the tree at all when it
doesn't fold, with

   c = fold_binary (LE_EXPR, boolean_type_node, t,
                    build_int_cst (TREE_TYPE (t), 0));
   if (c && c == boolean_true_node)
     {
       warning_at (

Richard.

>
> Grüße,
>  Thomas


Re: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking

2014-03-14 Thread Richard Sandiford
Matthew Fortune  writes:
> The spec on:
> https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
> has been updated and attempts to account for all the feedback. Not
> everything has been possible to simplify/rework as requested but I
> believe I have managed to address many points cleanly.

(FWIW there seem to be some weird line breaks in the page which make
it a bit hard to read.)

The main thing that stood out for me was section 9.  If we have the
attributes and the program header (both good to have IMO) then we
shouldn't have an ELF flag too.  "Static" consumers should use the
attribute and "dynamic" consumers should use the program header.
The main point of encoding future info in a program header was to
relieve the pressure on the ELF flags.

As far as the program header encoding goes: I was thinking of a more
general mechanism that specifies a block of data, a bit like the current
PT_MIPS_OPTIONS does.  Encoding the information directly in the enumeration
wouldn't scale well, since we'd end up with the same problem as we have
now for ELF flags.  It would also be a bit wasteful to specify two bits
of information this way since the other parts of the header structure
don't carry any weight.

Thanks,
Richard


Legitimize address after reload

2014-03-14 Thread David Guillen
Hello,

I'm writing a simple gcc backend and I'm experiencing a weird thing
regarding address legitimation process. Two scenarios:

If I only allow addresses to be either a register or symbols my gcc
works. To do so I add the restrictions into the
TARGET_LEGITIMATE_ADDRESS_P macro. This makes gcc to force registers
for all the addresses.

If I allow also a 'PLUS' expression to be a valid address (adding the
restriction that the two addends are a register and a constant) it
happens (sometimes) that gcc comes up with an expression like this
one:

 (plus:SI (plus:SI (reg:SI somereg)
                   (const_int 4))
          (const_int 8))


After taking a look at the 386 backend (and others) I just discovered
that there is a function called LEGITIMIZE_RELOAD_ADDRESS which is
responsible for handling this case. My issue is that this function is
not being called and, from what I saw while debugging, it seems that
the offending RTX expression is created after the address_reload pass,
and thus impossible for this pass to legitimize the address.

Looking at other architectures it seems that they are doing more or
less the same, so I don't know what the issue might be.

Do you have any idea?

Thanks,
David


Re: Legitimize address after reload

2014-03-14 Thread Julian Brown
On Fri, 14 Mar 2014 12:52:35 +0100
David Guillen  wrote:

> If I allow also a 'PLUS' expression to be a valid address (adding the
> restriction that the two addends are a register and a constant) it
> happens (sometimes) that gcc comes up with an expression like this
> one:
> 
>  (plus:SI (plus:SI (reg:SI somereg)
>                    (const_int 4))
>           (const_int 8))
> 
> 
> After taking a look at the 386 backend (and others) I just discovered
> that there is a function called LEGITIMIZE_RELOAD_ADDRESS which is
> responsible for handling this case. My issue is that this function is
> not being called and, from what I saw while debugging, it seems that
> the offending RTX expression is created after the address_reload pass,
> and thus impossible for this pass to legitimize the address.

Look at how e.g. the ARM backend and others handle the "strict"
parameter to the legitimate_address hook -- you need to use that to
forbid pseudo registers being allowed in RTXs in the strict case.
LEGITIMIZE_RELOAD_ADDRESS is probably a red herring (at least for the
simple cases you're probably dealing with to start with), and isn't used
for LRA anyway. Getting these bits right can be very fiddly! The (plus
(reg) (const)) operands can arise before/during register
elimination, IIRC. (You might need to get the register-elimination bits
right, too...)

Just a guess, anyway. (http://gcc.gnu.org/wiki/reload might be helpful
if you've not read it.)

Julian
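
A minimal sketch of the strict/non-strict distinction described above,
using a toy RTL model rather than GCC's real rtx types or hook
signature. All names (toy_rtx, TOY_FIRST_PSEUDO, ...) are hypothetical,
and the strict case is simplified: it accepts hard registers only and
ignores pseudos that have already been assigned a hard register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; these are NOT GCC's definitions.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS };

struct toy_rtx
{
  enum toy_code code;
  int value;                        /* register number or constant value */
  const struct toy_rtx *op0, *op1;  /* operands of TOY_PLUS, else NULL */
};

/* Assumed layout: register numbers below this bound are hard registers,
   everything at or above it is a pseudo.  */
#define TOY_FIRST_PSEUDO 16

static bool
toy_base_reg_ok (const struct toy_rtx *x, bool strict)
{
  if (x->code != TOY_REG)
    return false;
  /* Non-strict: any register may still end up as a base register.
     Strict: only a hard register is acceptable.  */
  return !strict || x->value < TOY_FIRST_PSEUDO;
}

/* Accept (reg) and (plus (reg) (const_int)); reject everything else,
   including a nested (plus (plus (reg) (const_int)) (const_int)).  */
static bool
toy_legitimate_address_p (const struct toy_rtx *x, bool strict)
{
  assert (x != NULL);
  if (x->code == TOY_REG)
    return toy_base_reg_ok (x, strict);
  if (x->code == TOY_PLUS)
    return toy_base_reg_ok (x->op0, strict)
           && x->op1->code == TOY_CONST_INT;
  return false;
}
```

In this model the nested PLUS from the original report is rejected in
both modes, while (plus (reg) (const_int)) on a pseudo is accepted only
in the non-strict case.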


RE: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking

2014-03-14 Thread Matthew Fortune
Richard Sandiford  writes:
> Matthew Fortune  writes:
> > The spec on:
> > https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
> > has been updated and attempts to account for all the feedback. Not
> > everything has been possible to simplify/rework as requested but I
> > believe I have managed to address many points cleanly.
> 
> (FWIW there seem to be some weird line breaks in the page which make it
> a bit hard to read.)

Apologies, I edited it offline and didn't check the result carefully enough. 
I'll clean it up. 

> The main thing that stood out for me was section 9.  If we have the
> attributes and the program header (both good to have IMO) then we
> shouldn't have an ELF flag too.  "Static" consumers should use the
> attribute and "dynamic" consumers should use the program header.
> The main point of encoding future info in a program header was to
> relieve the pressure on the ELF flags.

I know what you mean. I kept the ELF flag around because it firstly already 
exists (with the correct meaning as it happens) and secondly ELF flags are 
already consumed in the program loader whereas a small amount of new framework 
in the kernel is needed for the loader to respond to program headers. The 
'executable stack' header is currently consumed but the mechanism is not 
extensible today. My thinking is that the ELF flag eases us into the program 
loader but could validly be dropped/not required long term. It is largely 
ignored by the tools anyway in favour of the program headers.

I am happy to remove the ELF flag if I can confirm with our MIPS kernel 
developers that they can implement the program header inspection sooner rather 
than later.
 
> As far as the program header encoding goes: I was thinking of a more
> general mechanism that specifies a block of data, a bit like the current
> PT_MIPS_OPTIONS does.  Encoding the information directly in the
> enumeration wouldn't scale well, since we'd end up with the same problem
> as we have now for ELF flags.  It would also be a bit wasteful to
> specify two bits of information this way since the other parts of the
> header structure don't carry any weight.

I was trying to avoid the need for a program header to refer to a block of data 
as that is another part of the object that has to be loaded to determine the 
flag information. There are 2^28 processor specific program headers available 
which seems quite generous (I half thought of using 2 for the two modes), but I 
do also recognise that most of the header then becomes wasted space. I guess 
there may be some complaint if we choose to abuse every field of a header to 
encode information (i.e. address, size, alignment etc) but this would be a nice 
compact way to store flags. It would be more visible to put flags in the 
address fields as these are already printed by readelf et al. but the processor 
specific flags are not. Personally I'd open up all the fields to abuse over 
adding a block of data. The block of data increases the complexity of the 
program loader and dynamic loader as they have to ensure more of an object is 
read in order to make a decision. The extra data needed from an object would 
also be target specific, all do-able I'm just not sure on complexity. I wonder 
if Joseph or Maciej have any thoughts here as I believe they discussed this 
idea of using program headers in the past. Since I'm far from being an expert 
in this area I'm OK with anything as long as I can get all maintainers of 
dynamic loaders and program loaders to agree (ha!). Bionic, glibc, uclibc and 
linux kernel are the primary targets here.
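
The 2^28 figure can be checked from the processor-specific p_type range
in the ELF gABI. A small sketch, with the bounds written out locally
under assumed SKETCH_* names (they carry the same values <elf.h>
provides as PT_LOPROC and PT_HIPROC); the helper functions are
illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Bounds of the processor-specific program header type range.  */
#define SKETCH_PT_LOPROC UINT32_C (0x70000000)
#define SKETCH_PT_HIPROC UINT32_C (0x7fffffff)

/* The inclusive range holds exactly 2^28 distinct p_type values.  */
static_assert (SKETCH_PT_HIPROC - SKETCH_PT_LOPROC + 1
               == (UINT32_C (1) << 28),
               "2^28 processor-specific program header types");

/* Number of p_type values reserved for processor-specific use.  */
static uint32_t
proc_specific_pt_count (void)
{
  return SKETCH_PT_HIPROC - SKETCH_PT_LOPROC + 1;
}

/* Nonzero if P_TYPE falls in the processor-specific range.  */
static int
is_proc_specific_pt (uint32_t p_type)
{
  return p_type >= SKETCH_PT_LOPROC && p_type <= SKETCH_PT_HIPROC;
}
```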

Regards,
Matthew


Re: Legitimize address after reload

2014-03-14 Thread David Guillen
Thanks for your info Julian.

I actually read all the docs and I think I 'more or less' understand
the inner workings of gcc.
What surprises me most is that during the non-strict RTL generation I
do not see any 'strange' address pattern but during the post-reload
process the non-legitimate address comes up. I guess it is due to the
fact that one of the PLUS operands is a memory operand (a local
var.?), thus resulting in double indirect memory address.

In any case I'm not using the strict parameter and I'm assuming
strict is zero, that is, not checking the hard registers themselves.
This is because any reg is OK for base reg. I'm pretty sure I'm
behaving similarly to arm, cris or x86 backends.

Thanks,
David



2014-03-14 13:11 GMT+01:00 Julian Brown :
> On Fri, 14 Mar 2014 12:52:35 +0100
> David Guillen  wrote:
>
>> If I allow also a 'PLUS' expression to be a valid address (adding the
>> restriction that the two addends are a register and a constant) it
>> happens (sometimes) that gcc comes up with an expression like this
>> one:
>>
>>  (plus:SI (plus:SI (reg:SI somereg)
>>                    (const_int 4))
>>           (const_int 8))
>>
>>
>> After taking a look at the 386 backend (and others) I just discovered
>> that there is a function called LEGITIMIZE_RELOAD_ADDRESS which is
>> responsible for handling this case. My issue is that this function is
>> not being called and, from what I saw while debugging, it seems that
>> the offending RTX expression is created after the address_reload pass,
>> and thus impossible for this pass to legitimize the address.
>
> Look at how e.g. the ARM backend and others handle the "strict"
> parameter to the legitimate_address hook -- you need to use that to
> forbid pseudo registers being allowed in RTXs in the strict case.
> LEGITIMIZE_RELOAD_ADDRESS is probably a red herring (at least for the
> simple cases you're probably dealing with to start with), and isn't used
> for LRA anyway. Getting these bits right can be very fiddly! The (plus
> (reg) (const)) operands can arise before/during register
> elimination, IIRC. (You might need to get the register-elimination bits
> right, too...)
>
> Just a guess, anyway. (http://gcc.gnu.org/wiki/reload might be helpful
> if you've not read it.)
>
> Julian


Re: Reg Alloc Problem.

2014-03-14 Thread Umesh Kalappa
Hi All,

Following up on the problem quoted below, i.e. making a specific set of
registers, which is a subset of the general register set, the base
registers.

In the *.c.208.ira log we see:

Pass 0 for finding pseudo/allocno costs


r21: preferred BASE_REGS, alternative GENERAL_REGS, allocno GENERAL_REGS

a2 (r21,l0) best BASE_REGS, allocno GENERAL_REGS

r19: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS

a0 (r19,l0) best GENERAL_REGS, allocno GENERAL_REGS

r18: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS

a1 (r18,l0) best GENERAL_REGS, allocno GENERAL_REGS



  a0(r19,l0) costs: LOW_8BIT_REGS:0 BASE_REGS:0 GENERAL_REGS:0 ALL_REGS:0 MEM:8

  a1(r18,l0) costs: LOW_8BIT_REGS:0 BASE_REGS:0 GENERAL_REGS:0 ALL_REGS:0 MEM:8

  a2(r21,l0) costs: LOW_8BIT_REGS:2 BASE_REGS:0 GENERAL_REGS:4 ALL_REGS:4 MEM:8


Here IRA chooses GENERAL_REGS over the preferred BASE_REGS for the r21
pseudo. I'm looking for the cause in our backend, but in the meantime,
if anyone in the group can share their experience with this, it would
help me solve the issue quickly.

Thank you
~Umesh


On Wed, Mar 12, 2014 at 7:30 PM, Umesh Kalappa  wrote:
> Hi All,
>
> We are porting gcc 4.8.1 to a new target that has paired 16-bit
> registers such as AB, CD and EF. We modeled this in reg_class with AB,
> CD and EF as 16-bit pair_regs, CD and EF as 16-bit base_regs, and A, B,
> C, D, E and F as 8-bit general_regs.
>
> We are stuck on the issues below:
>
> 1) How do we model this so that the register allocator picks the
> respective base_regs, i.e. CD or EF instead of AB, as shown in the
> case below?
>
> LD AB ,_a;//invalid instead of it should be emit LD CD ,_a
>
> LD (AB),#100;  // invalid instead of it should be emit LD (CD),#100
>
>
> Please note  that we override  the target hook like REGNO_REG_CLASS
> ,but still no luck here .
>
>  2)Current target enforce the restrictions on  the pair register set
> usage for multiplication  like
>
> MUL A,B  or MUL C,D or  MUL E,F
>
> But not MUL A,C or MUL  B,C  etc not across the pair_regs .
>
>
> Could anyone please shed some light here? It would be much appreciated
> and a great help to us.
>
>  Thank you for the patience
>
> ~Umesh


Re: [gsoc 2014] moving fold-const patterns to gimple

2014-03-14 Thread Prathamesh Kulkarni
On Thu, Mar 13, 2014 at 4:44 PM, Richard Biener
 wrote:
> On Tue, Mar 11, 2014 at 12:20 PM, Richard Biener
>  wrote:
>> On Mon, Mar 10, 2014 at 7:29 PM, Prathamesh Kulkarni
>>  wrote:
>>> Hi Richard,
>>> Sorry for the late reply. I would like to have few clarifications
>>> regarding the following points:
>>>
>>> a) Pattern matching: Currently, gimple_match_and_simplify() matches
>>> patterns one-by-one. Could we use a decision tree to do the matching
>>> instead (similar to insn-recog.c) ?
>>> For the moment, let's consider pattern matching on only unary
>>> expressions without valueize and predicates:
>>> pattern 1: (negate (negate @0))
>>> pattern 2: (negate (bit_not @0))
>>>
>>> from the two AST's corresponding to patterns (match::op), we can build
>>> a decision tree:
>>> Something similar to:
>>>            NEGATE_EXPR
>>>   NEGATE_EXPR     BIT_NOT_EXPR
>>>
>>> and then generate code corresponding to this decision tree in gimple-match.c
>>> so the generated code should look something similar to:
>>>
>>> tree
>>> gimple_match_and_simplify (enum tree_code code, tree type, tree op0,
>>>                            gimple_seq *seq, tree (*valueize)(tree))
>>> {
>>>   if (code == NEGATE_EXPR)
>>>     {
>>>       tree captures[4] = {};
>>>       if (TREE_CODE (op0) != SSA_NAME)
>>>         return NULL_TREE;
>>>       gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>       if (!is_gimple_assign (def_stmt))
>>>         return NULL_TREE;
>>>       tree op = gimple_assign_rhs1 (def_stmt);
>>>       if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>         {
>>>           /* pattern (negate (negate @0)) matched */
>>>         }
>>>       else if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>         {
>>>           /* pattern (negate (bit_not @0)) matched */
>>>         }
>>>       else
>>>         return NULL_TREE;
>>>     }
>>>   else
>>>     return NULL_TREE;
>>> }
>>>
>>> For commutative ops, the pattern can be duplicated by walking the
>>> children of the node in reverse order.
>>> (I am not exactly clear so far about representing binary operators in
>>> a decision tree) Is this the right way to go ? I shall try to shortly
>>> post a patch that implements this.
>>
>> Yes, that's the way to go (well, I'd even use a switch ()).
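
A minimal sketch of the switch-based decision tree suggested here, over
a toy expression type instead of GIMPLE. The types and names (toy_expr,
toy_match_unary, ...) are hypothetical and only illustrate how the two
patterns share the outer NEGATE test; this is not what genmatch emits.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression codes; stand-ins for tree codes, not GCC's enum.  */
enum toy_code { TOY_LEAF, TOY_NEGATE, TOY_BIT_NOT };

struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op0;   /* operand of a unary node, NULL for a leaf */
};

enum toy_match { MATCH_NONE, MATCH_NEG_NEG, MATCH_NEG_BIT_NOT };

/* Walk the decision tree: switch on the outer code first, then on the
   code of its operand, so the shared NEGATE prefix is tested only once.
   On a match, *CAPTURE is set to the operand bound to @0.  */
static enum toy_match
toy_match_unary (const struct toy_expr *e, const struct toy_expr **capture)
{
  assert (e != NULL && capture != NULL);
  *capture = NULL;
  switch (e->code)
    {
    case TOY_NEGATE:
      switch (e->op0->code)
        {
        case TOY_NEGATE:
          *capture = e->op0->op0;   /* (negate (negate @0)) */
          return MATCH_NEG_NEG;
        case TOY_BIT_NOT:
          *capture = e->op0->op0;   /* (negate (bit_not @0)) */
          return MATCH_NEG_BIT_NOT;
        default:
          return MATCH_NONE;
        }
    default:
      return MATCH_NONE;
    }
}
```

Adding a third pattern that starts with negate only adds a case to the
inner switch; the outer test is not repeated.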
>>
>>> b) Targeting GENERIC, separating AST from gimple/generic:
>>> For generating a GENERIC pattern should there be another pattern
>>> something like match_and_simplify_generic ?
>>
>> Yes, there is an existing API in GCC for this that operates on GENERIC.
>> It's fold_unary_loc, fold_binary_loc, fold_ternary_loc.  The interface
>> the GENERIC match_and_simplify variant provides should match
>> that one.
>>
>>> Currently, the AST data structures (operand, expr, etc.)
>>> are tied to gimple (gen_gimple_match, gen_gimple_transform).
>>> We could also have similar functions: gen_generic_match,
>>> gen_generic_transform for generating GENERIC ?
>>
>> Yeah, but I'm not sure if keeping the (virtual) methods for generating
>> code makes much sense with a rewritten code generator.
>>
>>> Instead will it be better if we separate the AST
>>> from target IR (generic/gimple) and make simplify a visitor on AST
>>> (make simplify
>>> abstract class, with simplify_generic and simplify_gimple visitor
>>> classes that generate corresponding IR code) ?
>>
>> Yes.  Keep in mind the current state of genmatch.c is "quick hack
>> to make playing with the API side and with patterns possible" ;)
>>
>>> c) Would it be a good idea in define_match <name> <pattern>, for
>>> name to act as a substitute for pattern (similar to flex pattern
>>> definitions), so the name can be used in bigger patterns ?
>>
>> Maybe, I suppose we'll see when adding more patterns.
>>
>>> d) This is silly, but maybe use constants to denote corresponding
>>> tree nodes ?
>>> for example instead of { build_int_cst (integer_type_node, 0); }, one
>>> could directly write 0, to denote a INTEGER_CST node with value 0.
>>
>> Yes, that might be possible - though it will require more knowledge
>> in the pattern matcher (you also want to match '0'?) and the code
>> generator.
>>
>>> e) There was a mention on the thread, regarding testing of patterns
>>> integrated into DSL. I wasn't able to understand that clearly. Could
>>> you explain that briefly ?
>>
>> DSL?  Currently I'd say it would be nice to make sure each pattern
>> is triggered by at least one GCC testcase - this requires looking
>> at a particular pass dump (that of forwprop or ccp are probably most suitable
>> as they are run very early).  I mentioned the possibility to do offline
>> (thus not with C testcases) testing but that would require some tool
>> to do that and it would be correctness testing (some automatic proof
>> generation tool - ISTR academics have this kind of stuff).  But that was
>> just an idea.
>>
>>> Regarding gsoc proposal, I would like to align it on the following points:
>>> a) Pattern matching using decision tree
>>
>> good.
>>
>>> b) Generate GIMPLE fold

Re: [gsoc 2014] moving fold-const patterns to gimple

2014-03-14 Thread Prathamesh Kulkarni
On Fri, Mar 14, 2014 at 9:01 PM, Prathamesh Kulkarni
 wrote:
> On Thu, Mar 13, 2014 at 4:44 PM, Richard Biener
>  wrote:
>> On Tue, Mar 11, 2014 at 12:20 PM, Richard Biener
>>  wrote:
>>> On Mon, Mar 10, 2014 at 7:29 PM, Prathamesh Kulkarni
>>>  wrote:
>>>> Hi Richard,
>>>> Sorry for the late reply. I would like to have few clarifications
>>>> regarding the following points:
>>>>
>>>> a) Pattern matching: Currently, gimple_match_and_simplify() matches
>>>> patterns one-by-one. Could we use a decision tree to do the matching
>>>> instead (similar to insn-recog.c) ?
>>>> For the moment, let's consider pattern matching on only unary
>>>> expressions without valueize and predicates:
>>>> pattern 1: (negate (negate @0))
>>>> pattern 2: (negate (bit_not @0))
>>>>
>>>> from the two AST's corresponding to patterns (match::op), we can build
>>>> a decision tree:
>>>> Something similar to:
>>>>            NEGATE_EXPR
>>>>   NEGATE_EXPR     BIT_NOT_EXPR
>>>>
>>>> and then generate code corresponding to this decision tree in
>>>> gimple-match.c
>>>> so the generated code should look something similar to:
>>>>
>>>> tree
>>>> gimple_match_and_simplify (enum tree_code code, tree type, tree op0,
>>>>                            gimple_seq *seq, tree (*valueize)(tree))
>>>> {
>>>>   if (code == NEGATE_EXPR)
>>>>     {
>>>>       tree captures[4] = {};
>>>>       if (TREE_CODE (op0) != SSA_NAME)
>>>>         return NULL_TREE;
>>>>       gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>       if (!is_gimple_assign (def_stmt))
>>>>         return NULL_TREE;
>>>>       tree op = gimple_assign_rhs1 (def_stmt);
>>>>       if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>         {
>>>>           /* pattern (negate (negate @0)) matched */
>>>>         }
>>>>       else if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>         {
>>>>           /* pattern (negate (bit_not @0)) matched */
>>>>         }
>>>>       else
>>>>         return NULL_TREE;
>>>>     }
>>>>   else
>>>>     return NULL_TREE;
>>>> }
>>>>
>>>> For commutative ops, the pattern can be duplicated by walking the
>>>> children of the node in reverse order.
>>>> (I am not exactly clear so far about representing binary operators in
>>>> a decision tree) Is this the right way to go ? I shall try to shortly
>>>> post a patch that implements this.
>>>
>>> Yes, that's the way to go (well, I'd even use a switch ()).
>>>
>>>> b) Targeting GENERIC, separating AST from gimple/generic:
>>>> For generating a GENERIC pattern should there be another pattern
>>>> something like match_and_simplify_generic ?
>>>
>>> Yes, there is an existing API in GCC for this that operates on GENERIC.
>>> It's fold_unary_loc, fold_binary_loc, fold_ternary_loc.  The interface
>>> the GENERIC match_and_simplify variant provides should match
>>> that one.
>>>
>>>> Currently, the AST data structures (operand, expr, etc.)
>>>> are tied to gimple (gen_gimple_match, gen_gimple_transform).
>>>> We could also have similar functions: gen_generic_match,
>>>> gen_generic_transform for generating GENERIC ?
>>>
>>> Yeah, but I'm not sure if keeping the (virtual) methods for generating
>>> code makes much sense with a rewritten code generator.
>>>
>>>> Instead will it be better if we separate the AST
>>>> from target IR (generic/gimple) and make simplify a visitor on AST
>>>> (make simplify
>>>> abstract class, with simplify_generic and simplify_gimple visitor
>>>> classes that generate corresponding IR code) ?
>>>
>>> Yes.  Keep in mind the current state of genmatch.c is "quick hack
>>> to make playing with the API side and with patterns possible" ;)
>>>
>>>> c) Would it be a good idea in define_match <name> <pattern>, for
>>>> name to act as a substitute for pattern (similar to flex pattern
>>>> definitions), so the name can be used in bigger patterns ?
>>>
>>> Maybe, I suppose we'll see when adding more patterns.
>>>
>>>> d) This is silly, but maybe use constants to denote corresponding
>>>> tree nodes ?
>>>> for example instead of { build_int_cst (integer_type_node, 0); }, one
>>>> could directly write 0, to denote a INTEGER_CST node with value 0.
>>>
>>> Yes, that might be possible - though it will require more knowledge
>>> in the pattern matcher (you also want to match '0'?) and the code
>>> generator.
>>>
>>>> e) There was a mention on the thread, regarding testing of patterns
>>>> integrated into DSL. I wasn't able to understand that clearly. Could
>>>> you explain that briefly ?
>>>
>>> DSL?  Currently I'd say it would be nice to make sure each pattern
>>> is triggered by at least one GCC testcase - this requires looking
>>> at a particular pass dump (that of forwprop or ccp are probably most 
>>> suitable
>>> as they are run very early).  I mentioned the possibility to do offline
>>> (thus not with C testcases) testing but that would require some tool
>>> to do that and it would be correctness testing (some automatic proof
>>> generation tool - ISTR academics have this kind of stuff).  Bu

Re: [gsoc 2014] moving fold-const patterns to gimple

2014-03-14 Thread Marc Glisse

On Fri, 14 Mar 2014, Prathamesh Kulkarni wrote:


> I had a look at PR 14753
> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14753) from the first
> link. I have tried to implement those transforms (attached patch,
> stage-1 compiled).
> I have written the transforms to operate on GENERIC.

Why not directly gimple or the .pd file?

> Is that correct ?
> The patterns mentioned in the links were:
> a) (X >> CST1) >= CST2 -> X >= CST2 << CST1
> however, an expression Y >= CST gets folded to Y > CST - 1
> so the transform I wrote:
> (X >> CST1) > CST2 -> X > CST2 << CST1

That's not the same, try X=1, CST1=1, CST2=0.

> b) (X & ~CST) == 0 -> X <= CST

Uh, that can't be true for all constants, only some with a very specific
shape (7 is 2^3-1).

--
Marc Glisse


Re: Legitimize address after reload

2014-03-14 Thread Jeff Law

On 03/14/14 05:52, David Guillen wrote:

> Hello,
>
> I'm writing a simple gcc backend and I'm experiencing a weird thing
> regarding address legitimation process. Two scenarios:
>
> If I only allow addresses to be either a register or symbols my gcc
> works. To do so I add the restrictions into the
> TARGET_LEGITIMATE_ADDRESS_P macro. This makes gcc to force registers
> for all the addresses.
>
> If I allow also a 'PLUS' expression to be a valid address (adding the
> restriction that the two addends are a register and a constant) it
> happens (sometimes) that gcc comes up with an expression like this
> one:
>
>  (plus:SI (plus:SI (reg:SI somereg)
>                    (const_int 4))
>           (const_int 8))
>
>
> After taking a look at the 386 backend (and others) I just discovered
> that there is a function called LEGITIMIZE_RELOAD_ADDRESS which is
> responsible for handling this case. My issue is that this function is
> not being called and, from what I saw while debugging, it seems that
> the offending RTX expression is created after the address_reload pass,
> and thus impossible for this pass to legitimize the address.
>
> Looking at other architectures it seems that they are doing more or
> less the same, so I don't know what the issue might be.

LEGITIMIZE_RELOAD_ADDRESS is a hook that allows the target to rewrite
invalid addresses during the reload pass in such a way that reload can
generate the address more efficiently than the generic code in reload
can do.


It should always be safe for LEGITIMIZE_RELOAD_ADDRESS to do nothing. 
For your problem I'm sure it's a total red herring.


Jeff



Re: [gsoc 2014] moving fold-const patterns to gimple

2014-03-14 Thread Prathamesh Kulkarni
On Fri, Mar 14, 2014 at 9:25 PM, Marc Glisse  wrote:
> On Fri, 14 Mar 2014, Prathamesh Kulkarni wrote:
>
>> I had a look at PR 14753
>> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14753) from the first
>> link. I have tried to implement those transforms (attached patch,
>> stage-1 compiled).
>> I have written the transforms to operate on GENERIC.
>
>
> Why not directly gimple or the .pd file?
>
>
>> Is that correct ?
>> The patterns mentioned in the links were:
>> a) (X >> CST1) >= CST2 -> X >= CST2 << CST1
>> however, an expression Y >= CST gets folded to Y > CST - 1
>> so the transform I wrote:
>> (X >> CST1) > CST2 -> X > CST2 << CST1
>
>
> That's not the same, try X=1, CST1=1, CST2=0.
Ah yes. Shall following be correct ?
(X >> CST1) > CST2 -> X > ( (CST2 + 1) << CST1 ) - 1
Works correctly for X=1, CST1 = 1, CST2 = 0

(X >> CST1) > CST2
=> (X >> CST1) >= (CST2 + 1)   // this pattern is mentioned in PR
=> X >= (CST2 + 1) << CST1
=> X > ((CST2 + 1) << CST1) - 1
>
>
>> b) (X & ~CST) == 0 -> X <= CST
>
>
> Uh, that can't be true for all constants, only some with a very specific
> shape (7 is 2^3-1).
Agreed. Shall the pattern be folded if CST is 2^(n-1) ?
>
> --
> Marc Glisse


Re: [gsoc 2014] moving fold-const patterns to gimple

2014-03-14 Thread Richard Biener
On Fri, Mar 14, 2014 at 4:31 PM, Prathamesh Kulkarni
 wrote:
> On Thu, Mar 13, 2014 at 4:44 PM, Richard Biener
>  wrote:
>> On Tue, Mar 11, 2014 at 12:20 PM, Richard Biener
>>  wrote:
>>> On Mon, Mar 10, 2014 at 7:29 PM, Prathamesh Kulkarni
>>>  wrote:
>>>> Hi Richard,
>>>> Sorry for the late reply. I would like to have a few clarifications
>>>> regarding the following points:
>>>>
>>>> a) Pattern matching: Currently, gimple_match_and_simplify() matches
>>>> patterns one-by-one. Could we use a decision tree to do the matching
>>>> instead (similar to insn-recog.c) ?
>>>> For the moment, let's consider pattern matching on only unary
>>>> expressions without valueize and predicates:
>>>> pattern 1: (negate (negate @0))
>>>> pattern 2: (negate (bit_not @0))
>>>>
>>>> from the two AST's corresponding to patterns (match::op), we can build
>>>> a decision tree:
>>>> Something similar to:
>>>>
>>>>          NEGATE_EXPR
>>>>          /         \
>>>>   NEGATE_EXPR   BIT_NOT_EXPR
>>>>
>>>> and then generate code corresponding to this decision tree in
>>>> gimple-match.c
>>>> so the generated code should look something similar to:
>>>>
>>>> tree
>>>> gimple_match_and_simplify (enum tree_code code, tree type, tree op0,
>>>>                            gimple_seq *seq, tree (*valueize)(tree))
>>>> {
>>>>   if (code == NEGATE_EXPR)
>>>>     {
>>>>       tree captures[4] = {};
>>>>       if (TREE_CODE (op0) != SSA_NAME)
>>>>         return NULL_TREE;
>>>>       gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>       if (!is_gimple_assign (def_stmt))
>>>>         return NULL_TREE;
>>>>       tree op = gimple_assign_rhs1 (def_stmt);
>>>>       /* Dispatch on the code of the statement defining op0.  */
>>>>       if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>         {
>>>>           /* pattern (negate (negate @0)) matched */
>>>>         }
>>>>       else if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>         {
>>>>           /* pattern (negate (bit_not @0)) matched */
>>>>         }
>>>>       else
>>>>         return NULL_TREE;
>>>>     }
>>>>   else
>>>>     return NULL_TREE;
>>>> }
>>>>
>>>> For commutative ops, the pattern can be duplicated by walking the
>>>> children of the node in reverse order.
>>>> (I am not exactly clear so far about representing binary operators in
>>>> a decision tree) Is this the right way to go ? I shall try to shortly
>>>> post a patch that implements this.
>>>
>>> Yes, that's the way to go (well, I'd even use a switch ()).
>>>
>>>> b) Targeting GENERIC, separating AST from gimple/generic:
>>>> For generating a GENERIC pattern should there be another pattern
>>>> something like match_and_simplify_generic ?
>>>
>>> Yes, there is an existing API in GCC for this that operates on GENERIC.
>>> It's fold_unary_loc, fold_binary_loc, fold_ternary_loc.  The interface
>>> the GENERIC match_and_simplify variant provides should match
>>> that one.
>>>
>>>> Currently, the AST data structures (operand, expr, etc.)
>>>> are tied to gimple (gen_gimple_match, gen_gimple_transform).
>>>> We could also have similar functions: gen_generic_match,
>>>> gen_generic_transform for generating GENERIC ?
>>>
>>> Yeah, but I'm not sure if keeping the (virtual) methods for generating
>>> code makes much sense with a rewritten code generator.
>>>
>>>> Instead will it be better if we separate the AST
>>>> from target IR (generic/gimple) and make simplify a visitor on AST
>>>> (make simplify abstract class, with simplify_generic and
>>>> simplify_gimple visitor classes that generate corresponding IR code) ?
>>>
>>> Yes.  Keep in mind the current state of genmatch.c is "quick hack
>>> to make playing with the API side and with patterns possible" ;)
>>>
>>>> c) Shall it be a good idea in define_match <name> <pattern>, for
>>>> name to act as a substitute for pattern (similar to flex pattern
>>>> definitions), so the name can be used in bigger patterns ?
>>>
>>> Maybe, I suppose we'll see when adding more patterns.
>>>
>>>> d) This is silly, but maybe use constants to denote corresponding
>>>> tree nodes ?
>>>> for example instead of { build_int_cst (integer_type_node, 0); }, one
>>>> could directly write 0, to denote a INTEGER_CST node with value 0.
>>>
>>> Yes, that might be possible - though it will require more knowledge
>>> in the pattern matcher (you also want to match '0'?) and the code
>>> generator.
>>>
>>>> e) There was a mention on the thread, regarding testing of patterns
>>>> integrated into DSL. I wasn't able to understand that clearly. Could
>>>> you explain that briefly ?
>>>
>>> DSL?  Currently I'd say it would be nice to make sure each pattern
>>> is triggered by at least one GCC testcase - this requires looking
>>> at a particular pass dump (that of forwprop or ccp are probably most
>>> suitable as they are run very early).  I mentioned the possibility to
>>> do offline (thus not with C testcases) testing but that would require
>>> some tool to do that and it would be correctness testing (some
>>> automatic proof
>>> generation tool - ISTR academics have this kind of stuff).  Bu

Re: [gsoc 2014] moving fold-const patterns to gimple

2014-03-14 Thread Marc Glisse

On Fri, 14 Mar 2014, Prathamesh Kulkarni wrote:


> On Fri, Mar 14, 2014 at 9:25 PM, Marc Glisse  wrote:
>> On Fri, 14 Mar 2014, Prathamesh Kulkarni wrote:
>>
>>> The patterns mentioned in the links were:
>>> a) (X >> CST1) >= CST2 -> X >= CST2 << CST1
>>> however, an expression Y >= CST gets folded to Y > CST - 1
>>> so the transform I wrote:
>>> (X >> CST1) > CST2 -> X > CST2 << CST1
>>
>> That's not the same, try X=1, CST1=1, CST2=0.
>
> Ah yes. Shall following be correct ?
> (X >> CST1) > CST2 -> X > ( (CST2 + 1) << CST1 ) - 1
> Works correctly for X=1, CST1 = 1, CST2 = 0


Looks better. Though there is still the case where the new constant 
overflows, in which case we can fold the comparison to false.



>>> b) (X & ~CST) == 0 -> X <= CST
>>
>> Uh, that can't be true for all constants, only some with a very specific
>> shape (7 is 2^3-1).
>
> Agreed. Shall the pattern be folded if CST is 2^(n-1) ?


Wrong parentheses. And I didn't really think about it, so that may not be 
the right test.


I think it would be a good idea to write, in comments, next to each 
non-trivial transformation, a short "proof" (at least some form of 
explanation). It would help people re-reading it later see quickly why the 
conditions are what they are.


--
Marc Glisse


Re: Legitimize address after reload

2014-03-14 Thread DJ Delorie

David Guillen  writes:
> In any case I'm not using the restrict variable and I'm assuming
> strict is zero, that is, not checking the hard registers themselves.
> This is because any reg is OK for base reg. I'm pretty sure I'm
> behaving similarly to arm, cris or x86 backends.

"strict" doesn't mean which hard register it is, "strict" means whether
or not it's a hard register at all.

If "strict" is true, you must assume any REG which isn't a real hard
register (i.e. REGNO >= FIRST_PSEUDO_REGISTER) does NOT match.


Integration of ISL code generator into Graphite

2014-03-14 Thread Roman Gareev
Dear gcc contributors,

I am going to try to participate in Google Summer of Code 2014. My
project is "Integration of ISL code generator into Graphite".


My proposal can be found at the following link:
https://drive.google.com/file/d/0B2Wloo-931AoTWlkMzRobmZKT1U/edit?usp=sharing
I would be very grateful for your comments, feedback and ideas about
its improvement.


-

Roman Gareev


Re: Integration of ISL code generator into Graphite

2014-03-14 Thread Tobias Grosser

On 03/14/2014 09:21 PM, Roman Gareev wrote:

> Dear gcc contributors,
>
> I am going to try to participate in Google Summer of Code 2014. My
> project is "Integration of ISL code generator into Graphite".
>
> My proposal can be found at the following link:
> https://drive.google.com/file/d/0B2Wloo-931AoTWlkMzRobmZKT1U/edit?usp=sharing
> I would be very grateful for your comments, feedback and ideas about
> its improvement.


Thanks Roman,

I will have a look later on. For now, please make sure you register
your proposal in Google Melange right away. You can always upload
improved versions later.


Thanks,
Tobias



PLEASE RE-ADD MIRRORS

2014-03-14 Thread Dan D .
Hello,

We previously had these same mirrors up under Go-Part.com but then changed our 
domain to Go-Parts.com. The mirror links then dropped off. We apologize deeply 
for this, and assure you that this is a one-time event. Going forward, the 
mirrors will stay up for a very long time to come, and are being served from 
very reliable and fast servers, and being monitored and maintained by a very 
competent server admin team.

PLEASE ADD:

(USA)
http://mirrors-usa.go-parts.com/gcc
ftp://mirrors.go-parts.com/gcc
rsync://mirrors.go-parts.com/gcc


(Australia)
http://mirrors-au.go-parts.com/gcc
ftp://mirrors-au.go-parts.com/gcc
rsync://mirrors-au.go-parts.com/gcc

(Russia)
http://mirrors-ru.go-parts.com/gcc
ftp://mirrors-ru.go-parts.com/gcc
rsync://mirrors-ru.go-parts.com/gcc


Thanks,
Dan   

PLEASE RE-ADD MIRRORS (small correction)

2014-03-14 Thread Dan D .
I made a small mistake below on the ftp/rsync mirrors for the USA mirror. They 
should be:

(USA)
http://mirrors-usa.go-parts.com/gcc
ftp://mirrors-usa.go-parts.com/gcc
rsync://mirrors-usa.go-parts.com/gcc


> From: dan1...@msn.com
> To: gcc@gcc.gnu.org
> Subject: PLEASE RE-ADD MIRRORS
> Date: Fri, 14 Mar 2014 16:53:22 -0700
>
> Hello,
>
> We previously had these same mirrors up under Go-Part.com but then changed 
> our domain to Go-Parts.com. The mirror links then dropped off. We apologize 
> deeply for this, and assure you that this is a one-time event. Going forward, 
> the mirrors will stay up for a very long time to come, and are being served 
> from very reliable and fast servers, and being monitored and maintained by a 
> very competent server admin team.
>
> PLEASE ADD:
>
> (USA)
> http://mirrors-usa.go-parts.com/gcc
> ftp://mirrors.go-parts.com/gcc
> rsync://mirrors.go-parts.com/gcc
>
>
> (Australia)
> http://mirrors-au.go-parts.com/gcc
> ftp://mirrors-au.go-parts.com/gcc
> rsync://mirrors-au.go-parts.com/gcc
>
> (Russia)
> http://mirrors-ru.go-parts.com/gcc
> ftp://mirrors-ru.go-parts.com/gcc
> rsync://mirrors-ru.go-parts.com/gcc
>
>
> Thanks,
> Dan