"Parallel" mode iterators

2014-08-21 Thread Dominik Vogt
Maybe someone can help me a little bit with machine descriptions.
Assuming there is a define_insn pattern "foo", and that
pattern takes two arguments.  The first argument should be of the
types DI, SI or HI, and the second argument is always half the
size of the first argument.

One can define mode iterators for

  (define_mode_iterator ITER1 [DI SI HI])
  (define_mode_iterator ITER2 [SI HI QI])

Is it possible to write something like this:

  (define_insn "foo" 
[(set (match_operand:ITER1 0 ...) 
 ...
[(match_operand:ITER1 1 ...)
 (match_operand:ITER2 2 ...)]
 ...

so that the pattern is copied only for the combinations DI-SI,
SI-HI and HI-QI, not for all nine combinations of the two
iterators?  (Or is there another way to get mode of the second
argument depending on the first argument?)

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: "Parallel" mode iterators

2014-08-21 Thread Ilya Verbin
2014-08-21 11:39 GMT+04:00 Dominik Vogt :
> One can define mode iterators for
>
>   (define_mode_iterator ITER1 [DI SI HI])
>   (define_mode_iterator ITER2 [SI HI QI])
>
> Is it possible to write something like this:
>
>   (define_insn "foo"
> [(set (match_operand:ITER1 0 ...)
>  ...
> [(match_operand:ITER1 1 ...)
>  (match_operand:ITER2 2 ...)]
>  ...
>
> so that the pattern is copied only for the combinations DI-SI,
> SI-HI and HI-QI, not for all nine combinations of the two
> iterators?  (Or is there another way to get mode of the second
> argument depending on the first argument?)

Look at ssehalfvecmode in i386/sse.md:

(define_mode_attr ssehalfvecmode
  [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI") (V8DI "V4DI")
   (V32QI "V16QI") (V16HI  "V8HI") (V8SI  "V4SI") (V4DI "V2DI")
   (V16QI  "V8QI") (V8HI   "V4HI") (V4SI  "V2SI")
   (V16SF "V8SF") (V8DF "V4DF")
   (V8SF  "V4SF") (V4DF "V2DF")
   (V4SF  "V2SF")])

(define_expand "avx_vextractf128<mode>"
  [(match_operand:<ssehalfvecmode> 0 "nonimmediate_operand")
   (match_operand:V_256 1 "register_operand")
   (match_operand:SI 2 "const_0_to_1_operand")]

  -- Ilya
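
The effect of tying the second mode to the first through a mode attribute, rather than through a second iterator, can be modelled in a few lines of C++. This is only a sketch of the counting argument, not how the .md reader is implemented; the function names and the attribute name HALF are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ModePair = std::pair<std::string, std::string>;

// Two independent iterators: every mode of the first is paired with every
// mode of the second, which is the nine-way expansion the question wants
// to avoid.
std::vector<ModePair> expand_two_iterators(const std::vector<std::string>& iter1,
                                           const std::vector<std::string>& iter2) {
    std::vector<ModePair> out;
    for (const auto& a : iter1)
        for (const auto& b : iter2)
            out.emplace_back(a, b);
    return out;
}

// One iterator plus an attribute: the second mode is looked up from the
// first, so there is exactly one copy per iterator value.  In .md syntax
// this would read roughly as follows (HALF is a hypothetical name):
//   (define_mode_attr HALF [(DI "SI") (SI "HI") (HI "QI")])
//   ... (match_operand:<HALF> 2 ...) ...
std::vector<ModePair> expand_with_attribute(const std::vector<std::string>& iter1,
                                            const std::map<std::string, std::string>& half) {
    std::vector<ModePair> out;
    for (const auto& a : iter1)
        out.emplace_back(a, half.at(a));
    return out;
}

// Small self-check using the modes from the question.
void check_expansions() {
    const std::vector<std::string> iter1 = {"DI", "SI", "HI"};
    const std::vector<std::string> iter2 = {"SI", "HI", "QI"};
    const std::map<std::string, std::string> half = {
        {"DI", "SI"}, {"SI", "HI"}, {"HI", "QI"}};
    assert(expand_two_iterators(iter1, iter2).size() == 9);
    assert(expand_with_attribute(iter1, half).size() == 3);
}
```

With the attribute, only ITER1 is iterated, and the pattern is copied for DI-SI, SI-HI and HI-QI only.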


Re: LTO bootstrap compare errors for ARM64

2014-08-21 Thread Richard Biener
On Wed, Aug 20, 2014 at 6:12 PM, Jan Hubicka  wrote:
>> On Wed, Aug 20, 2014 at 9:28 AM, Venkataramanan Kumar
>>  wrote:
>> > Hi Honza,
>> >
>> > After discussing with Richard Biener via
>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077, it looks like it is
>> > an existing problem in trunk and is masked due to the fact that stage1
>> > and stage2 compilers in trunk are built with enable-checking and hence
>> > use the same garbage collection tuning parameters.
>>
>> Note that it works on trunk with --enable-checking=release for
>> whatever reason.
>
> Strange, do you know how the IR is affected by garbage collection?

Well, for at least one remaining case it seems that type (parts) are
re-used differently, so the input at the LTO-out streaming has
type (parts) duplicated (or not) dependent on GC.  The PR tells
about the case I didn't finish tracking down.

The case fixed is somebody modifying sth in place we stored into
hash tables for re-use.  Not sure what the second case is.

I still think we should try to understand why trunk is not affected?

Richard.

> Honza
>>
>> > In release branches, stage 1 is built with some checks like "gc" but
>> > stage 2 is not.
>> > These gc parameters affect the LTO IR and it gets streamed differently.
>> >
>> > Currently for release branches we have a workaround of setting the same
>> > gc parameters for stage1 and stage2 builds, or building stage1 with
>> > --enable-checking=release.
>> >
>> > regards,
>> > Venkat
>> >
>> >
>> > On 11 August 2014 16:20, Venkataramanan Kumar
>> >  wrote:
>> >> Hi Honza,
>> >>
>> >> I did not find any differences in tree level dumps. These are the dump
>> >> differences in IPA
>> >>
>> >> In gimple-fold.c.000i.cgraph
>> >>
>> >> (--Snip--)
>> >> < _Z25gimple_build_omp_continueP9tree_nodeS0_/761
>> >> (gimple_build_omp_continue(tree_node*, tree_node*)) @0x3ff7ebda548
>> >> ---
>> >>> _Z25gimple_build_omp_continueP9tree_nodeS0_/761 
>> >>> (gimple_build_omp_continue(tree_node*, tree_node*)) @0x3ff92b5a548
>> >> 28865c28865
>> >> < _Z26gimple_build_omp_taskgroupP21gimple_statement_base/760
>> >> (gimple_build_omp_taskgroup(gimple_statement_base*)) @0x3ff7ebda400
>> >> ---
>> >>> _Z26gimple_build_omp_taskgroupP21gimple_statement_base/760 
>> >>> (gimple_build_omp_taskgroup(gimple_statement_base*)) @0x3ff92b5a400
>> >> 28875c28875
>> >> < _Z23gimple_build_omp_masterP21gimple_statement_base/759
>> >> (gimple_build_omp_master(gimple_statement_base*)) @0x3ff7ebda2b8
>> >> ---
>> >>> _Z23gimple_build_omp_masterP21gimple_statement_base/759 
>> >>> (gimple_build_omp_master(gimple_statement_base*)) @0x3ff92b5a2b8
>> >> 28885c28885
>> >> < _Z24gimple_build_omp_sectionP21gimple_statement_base/758
>> >> (gimple_build_omp_section(gimple_statement_base*)) @0x3ff7ebda170
>> >> ---
>> >>> _Z24gimple_build_omp_sectionP21gimple_statement_base/758 
>> >>> (gimple_build_omp_section(gimple_statement_base*)) @0x3ff92b5a170
>> >> (--Snip--)
>> >>
>> >>
>> >> In gimple.c.044i.profile_estimate
>> >>
>> >> (--Snip--)
>> >>
>> >> 1987c1987
>> >> < vec::qsort(int (*)(void const*, void
>> >> const*)) (struct vec * const this, int (*) (const void *, const
>> >> void *) cmp)
>> >> ---
>> >>> vec::qsort(int (*)(void const*, void 
>> >>> const*)) (struct vec * const this, int (*) (const void *, const 
>> >>> void *) cmp)
>> >> (--Snip--)
>> >>
>> >> gimple.c.048i.inline
>> >>
>> >> (--Snip--)
>> >>
>> >> <   min size:   6
>> >> ---
>> >>>   min size:   0
>> >> 6590c6590
>> >> <   min size:   14
>> >> ---
>> >>>   min size:   0
>> >> 6607c6607
>> >> <   min size:   28
>> >> (--Snip--)
>> >>
>> >> On 7 August 2014 19:14, Jan Hubicka  wrote:
>> >>>>
>> >>>> As a first step I compared the "objdump -D" dump between
>> >>>> "stage2-gcc/gimple.o" and "stage3-gcc/gimple.o".  Differences are in
>> >>>> LTO sections .gnu.lto_.decls.0, .gnu.lto_.symtab.
>> >>>> Ref: http://paste.ubuntu.com/7949238/
>> >>>
>> >>> If you see the differences already in .o files (i.e. at compile time),
>> >>> I think the next step is to produce -fdump-tree-all -fdump-ipa-all
>> >>> dumps of stage2-gcc/gimple.o and stage3-gcc/gimple.o and see how they
>> >>> differ.
>> >>>
>> >>> Debugging misoptimization of LTO stage2 compiler will be interesting -
>> >>> I guess we can first try to identify what is wrong rather than usual
>> >>> bisecting method...
>> >>>
>> >>> Honza
>> >>>>
>> >>>> No differences when using "objdump -d".
>> >>>>
>> >>>> Next I passed "-save-temps" to stage2 and stage3 builds and compared
>> >>>> the assembly files. The differences are in strings dumped via .ascii
>> >>>> and .string directives.
>> >>>>
>> >>>> Next I checked the flags passed to the stage 2 and stage 3 builds. It
>> >>>> is same and below is the flag set being passed.
>> >>>>
>> >>>> -save-temps -O2 -g -flto -flto=jobserver -frandom-seed=1
>> >>>> -ffat-lto-objects -DIN_GCC -fno-exceptions -fno-rtti
>> >>>> -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
>> >>>> -Wcast-qual

Re: [match-and-simplify] express conversions in transform ?

2014-08-21 Thread Prathamesh Kulkarni
On Tue, Aug 19, 2014 at 4:35 PM, Richard Biener
 wrote:
> On Tue, Aug 19, 2014 at 12:18 PM, Prathamesh Kulkarni
>  wrote:
>> I was wondering how to write transform that involves conversions
>> (without using c_expr) ?
>> for eg:
>> (T1)(~(T2) X)  -> ~(T1) X
>> // T1, T2 have the same precision and X is an integral type not narrower
>> // than T1, T2
>>
>> this could be written as:
>> (simplify
>>   (convert (bit_not (convert@0 integral_op_p@1)))
>>   if-expr
>>   transform)
>>
>> How would the transform be written ?
>> with c_expr we could probably write as: (is that correct?)
>> (bit_not { fold_convert (type, @1); })
>
> No, unfortunately while that's correct for GENERIC it doesn't work
> for GIMPLE code-gen.  c_exprs are required to evaluate to
> "atomics", that is, non-expressions.
>
>> I was wondering whether we should have an explicit support for
>> conversions (something equivalent of "casting") ?
>> sth like:
>> (bit_not (cast type @0))
>
> Indeed simply writing
>
>   (bit_not (convert @1))
>
> will make the code-generator "convert" @0 to its own type
> (I knew the hack to simply use the first operands type is not
> going to work in all cases... ;).  Other conversion operators
> include FIX_TRUNC_EXPR (float -> int), FLOAT_EXPR
> (int -> float).
>
> For most cases it works by accident as the outermost expression
> knows its type via 'type'.
>
> In the cases where we need to specify a type I'd like to avoid
> making the type-converting operators take an additional operand.
> Instead can we make it use a operator flag-with-argument?  Like
>
>   (bit_not (convert:type @1))
>
> or for the type of a capture (convert:t@2 @1).
>
> We can also put in some more intelligence in automatically
> determining the type to convert to.  In your example we
> know the bit_not is of type 'type' and this has to match
> the type of its operands thus the 'convert' needs to convert
> to 'type' as well.  I'm sure there are cases we can't
> auto-compute though, like chains of conversions (but not sure
> they matter in practice).
>
> So let's try that - Micha will like more intelligence in the generator ;)
>
> I suggest to add a predicate to the generator conversion_op_p
> to catch the various converting operators and then assert that
> during code-gen we are going to create it with 'type' (outermost
> type), not TREE_TYPE (ops[0]).  Then start from the cases
> we want to support and special-case (which requires passing
> down the type of the enclosing expression).
Um sorry, I am not sure if I understood that correctly.
How would conversion_op_p know when to use 'type' and not TREE_TYPE (ops[0])?
Would conversion_op_p be used somewhat like in the attached
patch (illustrative only)?

Thanks,
Prathamesh
>
> Thanks,
> Richard.
>
>> in case we want to cast to a type of another capture
>> we could use:
>> (foo (cast (type @0) @1)) ?  // cast @1 to @0's type
>> here type would be an operator that extracts type of the capture.
>>
>> Another use of having type as an operator I guess would be for
>> type-matching (not sure if it's useful).
>> for instance:
>> (foo (bar @0 @1) @(type @0)@1)
>> that checks if @0, @1 have same types.
>>
>> I was wondering on the same lines if we could introduce new
>> keyword "precision" analogous to type ?
>> (precision @0) would be short-hand for
>> TYPE_PRECISION (TREE_TYPE (@0))
>>
>> So far I found couple patterns in fold_unary_loc with conversions
>> in transform:
>> (T1)(X p+ Y) into ((T1)X p+ Y)
>> (T1)(~(T2) X) -> ~(T1) X
>> (T1) (X * Y) -> (T1) X * (T1) Y
>>
>> Thanks,
>> Prathamesh
Index: genmatch.c
===
--- genmatch.c	(revision 214258)
+++ genmatch.c	(working copy)
@@ -859,6 +859,18 @@ check_no_user_id (simplify *s)
 
 /* Code gen off the AST.  */
 
+bool
+conversion_op_p (e_operation *oper)
+{
+  if (strcmp (oper->op->id, "CONVERT_EXPR") != 0
+      && strcmp (oper->op->id, "FIX_TRUNC_EXPR") != 0
+      && strcmp (oper->op->id, "FLOAT_EXPR") != 0)
+    return false;
+
+  // ??? how to assert convert should use 'type'
+  return false;
+}
+
 void
 expr::gen_transform (FILE *f, const char *dest, bool gimple, int depth)
 {
@@ -875,9 +887,13 @@ expr::gen_transform (FILE *f, const char
   /* ???  Have another helper that is like gimple_build but may
 	 fail if seq == NULL.  */
   fprintf (f, "  if (!seq)\n"
-	   "{\n"
-	   "  res = gimple_simplify (%s, TREE_TYPE (ops%d[0])",
+	   "{\n");
+  if (!conversion_op_p (operation))
+	fprintf (f, "  res = gimple_simplify (%s, TREE_TYPE (ops%d[0])",
 	   operation->op->id, depth);
+  else
+	fprintf (f, "  res = gimple_simplify (%s, type", operation->op->id);
+
   for (unsigned i = 0; i < ops.length (); ++i)
 	fprintf (f, ", ops%d[%u]", depth, i);
   fprintf (f, ", seq, valueize);\n");


Re: LTO inhibiting dwarf lexical blocks output

2014-08-21 Thread Richard Biener
On Wed, Aug 20, 2014 at 6:18 PM, Jan Hubicka  wrote:
>> On August 18, 2014 8:46:00 PM CEST, Jan Hubicka  wrote:
>> >>
>> >> The following seems to fix it.  In testing now.
>> >
>> >Will streaming as non-reference prevent DECL from being merged and
>> >tails of BLOCK_VAR chains
>> >to be corrupted?
>>
>> Yes, the decl ends up in the function section then, not the global types
>> and decls one.
>
> Hmm, breaking one decl rule.  tree-inliner used to do that years ago, too.
> When a function declaring an extern in local scope got inlined, we
> duplicated the node.  I fixed that by pushing these to nonlocalized_list
> instead.

Well, those decls are only there for debug info so nonlocalized stuff
is only a memory optimization.  I also remember trying to figure out
a testcase convincing me we should stream that vector for LTO
(we don't!).  And I failed.

> Perhaps we could do that earlier, in FE (or fixup in free lang data), for
> all extern decls, avoiding those duplications.

So I concluded the same - either we should get rid of that or we should
fix the FEs to put all non-automatics into this vector.

Or simply put _all_ decls into a vector and not a DECL_CHAIN.

Richard.

> Honza
>>
>> Richard.
>>
>> >
>> >Honza
>> >>
>> >> Richard.
>> >>
>> >> > Richard.
>> >> >
>> >> >> Thanks.
>> >> >> Aldy
>>


Re: Frame pointer optimization issues

2014-08-21 Thread Richard Earnshaw
On 21/08/14 00:24, Richard Henderson wrote:
> On 08/20/2014 08:22 AM, Wilco Dijkstra wrote:
>> 2. Change the mid-end to call _frame_pointer_required even when
>> !flag_omit_frame_pointer.
> 
> Um, it does that already.  At least as far as I can see from
> ira_setup_eliminable_regset and update_eliminables.
> 
> It turns out to be much easier to re-enable a frame pointer for a given
> function than to disable a frame pointer.  Thus I believe that you should
> approach -momit-leaf-frame-pointer as setting flag_omit_frame_pointer, and
> then re-enabling it in frame_pointer_required.  This requires more than one
> line in common/config/arch/arch.c, but it shouldn't be much more than ten.
>
>> A second issue with frame pointers is that update_eliminables() in
>> reload1.c might set frame_pointer_needed to false without any checks.
>
> How?  I don't see that path, since the very first thing update_eliminables
> does is call frame_pointer_required -- even before it calls can_eliminate.
> 
> Incidentally, I was working on exactly this (plus improving the unwind info)
> before I left on vacation a couple weeks ago.  Note that you'll also need to
> remove x29 from the fixed registers before eliminating the frame pointer does
> any real good.

Removing x29 from the list of fixed registers will cause any code
relying on a frame chain to crash horribly (external profiling agents,
for example); this conforms to the second option for frame-pointer use
in AAPCS64.  I've seen very little code that really benefits from an
additional register here (performance mostly comes from savings in the
prologue/epilogue), so I think users should have to explicitly remove it
from the fixed list (-fcall-saved-x29) if that's their preference.

R.



Re: [match-and-simplify] express conversions in transform ?

2014-08-21 Thread Richard Biener
On Thu, Aug 21, 2014 at 10:42 AM, Prathamesh Kulkarni
 wrote:
> On Tue, Aug 19, 2014 at 4:35 PM, Richard Biener
>  wrote:
>> On Tue, Aug 19, 2014 at 12:18 PM, Prathamesh Kulkarni
>>  wrote:
>>> I was wondering how to write transform that involves conversions
>>> (without using c_expr) ?
>>> for eg:
>>> (T1)(~(T2) X)  -> ~(T1) X
>>> // T1, T2 have the same precision and X is an integral type not narrower
>>> // than T1, T2
>>>
>>> this could be written as:
>>> (simplify
>>>   (convert (bit_not (convert@0 integral_op_p@1)))
>>>   if-expr
>>>   transform)
>>>
>>> How would the transform be written ?
>>> with c_expr we could probably write as: (is that correct?)
>>> (bit_not { fold_convert (type, @1); })
>>
>> No, unfortunately while that's correct for GENERIC it doesn't work
>> for GIMPLE code-gen.  c_exprs are required to evaluate to
>> "atomics", that is, non-expressions.
>>
>>> I was wondering whether we should have an explicit support for
>>> conversions (something equivalent of "casting") ?
>>> sth like:
>>> (bit_not (cast type @0))
>>
>> Indeed simply writing
>>
>>   (bit_not (convert @1))
>>
>> will make the code-generator "convert" @0 to its own type
>> (I knew the hack to simply use the first operands type is not
>> going to work in all cases... ;).  Other conversion operators
>> include FIX_TRUNC_EXPR (float -> int), FLOAT_EXPR
>> (int -> float).
>>
>> For most cases it works by accident as the outermost expression
>> knows its type via 'type'.
>>
>> In the cases where we need to specify a type I'd like to avoid
>> making the type-converting operators take an additional operand.
>> Instead can we make it use a operator flag-with-argument?  Like
>>
>>   (bit_not (convert:type @1))
>>
>> or for the type of a capture (convert:t@2 @1).
>>
>> We can also put in some more intelligence in automatically
>> determining the type to convert to.  In your example we
>> know the bit_not is of type 'type' and this has to match
>> the type of its operands thus the 'convert' needs to convert
>> to 'type' as well.  I'm sure there are cases we can't
>> auto-compute though, like chains of conversions (but not sure
>> they matter in practice).
>>
>> So let's try that - Micha will like more intelligence in the generator ;)
>>
>> I suggest to add a predicate to the generator conversion_op_p
>> to catch the various converting operators and then assert that
>> during code-gen we are going to create it with 'type' (outermost
>> type), not TREE_TYPE (ops[0]).  Then start from the cases
>> we want to support and special-case (which requires passing
>> down the type of the enclosing expression).
> Um sorry, I am not sure if I understood that correctly.
> How would conversion_op_p know when to use 'type' and not TREE_TYPE (ops[0]) ?
> conversion_op_p would be used somewhat similar to in the attached
> patch (illustrate only) ?

Yeah, kind of.  Attached is my try which handles

   (bit_not (convert (plus @0 (convert @1))))

fine (converting @1 to the type of @0) but obviously not

   (convert (convert @1))

which is impossible to auto-guess and not

  (bit_not (convert (plus (convert @0) @1)))

though that can be made work by first generating code for @1
and then for (convert @0) so it can access the type of @1
to convert to its type.

I'll give the patch proper testing and will install it.

Thanks,
Richard.

> Thanks,
> Prathamesh
>>
>> Thanks,
>> Richard.
>>
>>> in case we want to cast to a type of another capture
>>> we could use:
>>> (foo (cast (type @0) @1)) ?  // cast @1 to @0's type
>>> here type would be an operator that extracts type of the capture.
>>>
>>> Another use of having type as an operator I guess would be for
>>> type-matching (not sure if it's useful).
>>> for instance:
>>> (foo (bar @0 @1) @(type @0)@1)
>>> that checks if @0, @1 have same types.
>>>
>>> I was wondering on the same lines if we could introduce new
>>> keyword "precision" analogous to type ?
>>> (precision @0) would be short-hand for
>>> TYPE_PRECISION (TREE_TYPE (@0))
>>>
>>> So far I found couple patterns in fold_unary_loc with conversions
>>> in transform:
>>> (T1)(X p+ Y) into ((T1)X p+ Y)
>>> (T1)(~(T2) X) -> ~(T1) X
>>> (T1) (X * Y) -> (T1) X * (T1) Y
>>>
>>> Thanks,
>>> Prathamesh
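
One of the fold_unary_loc patterns quoted above, (T1)(~(T2) X) -> ~(T1) X, can be sanity-checked with fixed-width integers. The sketch below is only an illustration: the choice of T1 = int32_t, T2 = uint32_t and a 64-bit X, as well as the function names, are assumptions made for the example. It relies on narrowing conversions being modular, which is guaranteed since C++20 and is what GCC documents for earlier standards.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the pattern: (T1)(~(T2) X)
// with T1 = int32_t and T2 = uint32_t.
std::int32_t before_transform(std::int64_t x) {
    return static_cast<std::int32_t>(~static_cast<std::uint32_t>(x));
}

// Right-hand side of the pattern: ~(T1) X.
std::int32_t after_transform(std::int64_t x) {
    return ~static_cast<std::int32_t>(x);
}

// T1 and T2 have the same precision and X is not narrower than either,
// so both forms keep the same low 32 bits and then invert them.
bool transform_holds(std::int64_t x) {
    return before_transform(x) == after_transform(x);
}

// A few sample values, including one with bits set above bit 31.
void check_samples() {
    assert(transform_holds(0));
    assert(transform_holds(-1));
    assert(transform_holds(0x100000005LL));
}
```

The precondition matters: if T2 were narrower than T1, the inner conversion would discard bits that the outer form keeps, and the two sides could differ.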




RE: Frame pointer optimization issues

2014-08-21 Thread Thomas Preud'homme
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
> Wilco Dijkstra
> 
> One could hack this a bit further and set flag_omit_frame_pointer = 2 to
> differentiate between a user setting and the override hack, but that's just
> making things even worse.  So I see 3 possible solutions:
>
> 1. Add a copy of flag_omit_frame_pointer, and only modify that in the
> override.  This is the generic correct solution that allows any kind of
> modifications on the copies.  This could be done by making all flags
> separate variables and automating the copy in the options parsing code.
> Any code that writes the x_flag_ variables should eventually be fixed to
> stop doing this to avoid these bugs (i386 does this 22 times and c6x 2x).

I just sent a patch [1] following a similar approach for a problem on ARM target
when compiling for thumb1 and optimizing for size. In this case call_used_regs
will be modified to not save some registers in the prologue, ignoring any
value the user might have set.

[1] https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01970.html

So I think a general solution would be useful.

Best regards,

Thomas





Re: [match-and-simplify] express conversions in transform ?

2014-08-21 Thread Richard Biener
On Thu, Aug 21, 2014 at 11:57 AM, Richard Biener
 wrote:
> On Thu, Aug 21, 2014 at 10:42 AM, Prathamesh Kulkarni
>  wrote:
>> On Tue, Aug 19, 2014 at 4:35 PM, Richard Biener
>>  wrote:
>>> On Tue, Aug 19, 2014 at 12:18 PM, Prathamesh Kulkarni
>>>  wrote:
>>>> I was wondering how to write transform that involves conversions
>>>> (without using c_expr) ?
>>>> for eg:
>>>> (T1)(~(T2) X)  -> ~(T1) X
>>>> // T1, T2 have the same precision and X is an integral type not narrower
>>>> // than T1, T2
>>>>
>>>> this could be written as:
>>>> (simplify
>>>>   (convert (bit_not (convert@0 integral_op_p@1)))
>>>>   if-expr
>>>>   transform)
>>>>
>>>> How would the transform be written ?
>>>> with c_expr we could probably write as: (is that correct?)
>>>> (bit_not { fold_convert (type, @1); })
>>>
>>> No, unfortunately while that's correct for GENERIC it doesn't work
>>> for GIMPLE code-gen.  c_exprs are required to evaluate to
>>> "atomics", that is, non-expressions.
>>>
>>>> I was wondering whether we should have an explicit support for
>>>> conversions (something equivalent of "casting") ?
>>>> sth like:
>>>> (bit_not (cast type @0))
>>>
>>> Indeed simply writing
>>>
>>>   (bit_not (convert @1))
>>>
>>> will make the code-generator "convert" @0 to its own type
>>> (I knew the hack to simply use the first operands type is not
>>> going to work in all cases... ;).  Other conversion operators
>>> include FIX_TRUNC_EXPR (float -> int), FLOAT_EXPR
>>> (int -> float).
>>>
>>> For most cases it works by accident as the outermost expression
>>> knows its type via 'type'.
>>>
>>> In the cases where we need to specify a type I'd like to avoid
>>> making the type-converting operators take an additional operand.
>>> Instead can we make it use a operator flag-with-argument?  Like
>>>
>>>   (bit_not (convert:type @1))
>>>
>>> or for the type of a capture (convert:t@2 @1).
>>>
>>> We can also put in some more intelligence in automatically
>>> determining the type to convert to.  In your example we
>>> know the bit_not is of type 'type' and this has to match
>>> the type of its operands thus the 'convert' needs to convert
>>> to 'type' as well.  I'm sure there are cases we can't
>>> auto-compute though, like chains of conversions (but not sure
>>> they matter in practice).
>>>
>>> So let's try that - Micha will like more intelligence in the generator ;)
>>>
>>> I suggest to add a predicate to the generator conversion_op_p
>>> to catch the various converting operators and then assert that
>>> during code-gen we are going to create it with 'type' (outermost
>>> type), not TREE_TYPE (ops[0]).  Then start from the cases
>>> we want to support and special-case (which requires passing
>>> down the type of the enclosing expression).
>> Um sorry, I am not sure if I understood that correctly.
>> How would conversion_op_p know when to use 'type' and not
>> TREE_TYPE (ops[0]) ?
>> conversion_op_p would be used somewhat similar to in the attached
>> patch (illustrate only) ?
>
> Yeah, kind of.  Attached is my try which handles
>
>    (bit_not (convert (plus @0 (convert @1))))
>
> fine (converting @1 to the type of @0) but obviously not
>
>    (convert (convert @1))
>
> which is impossible to auto-guess and not
>
>   (bit_not (convert (plus (convert @0) @1)))
>
> though that can be made work by first generating code for @1
> and then for (convert @0) so it can access the type of @1
> to convert to its type.
>
> I'll give the patch proper testing and will install it.

And if we really need sth like (convert (convert @0)) then
this can either use a c-expr or we can invent sth like
(convert (match-type @1 (convert @0)))

that is, add a "fake" binary operation to guess the type.

Richard.

> Thanks,
> Richard.
>
>> Thanks,
>> Prathamesh
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> in case we want to cast to a type of another capture
>>>> we could use:
>>>> (foo (cast (type @0) @1)) ?  // cast @1 to @0's type
>>>> here type would be an operator that extracts type of the capture.
>>>>
>>>> Another use of having type as an operator I guess would be for
>>>> type-matching (not sure if it's useful).
>>>> for instance:
>>>> (foo (bar @0 @1) @(type @0)@1)
>>>> that checks if @0, @1 have same types.
>>>>
>>>> I was wondering on the same lines if we could introduce new
>>>> keyword "precision" analogous to type ?
>>>> (precision @0) would be short-hand for
>>>> TYPE_PRECISION (TREE_TYPE (@0))
>>>>
>>>> So far I found couple patterns in fold_unary_loc with conversions
>>>> in transform:
>>>> (T1)(X p+ Y) into ((T1)X p+ Y)
>>>> (T1)(~(T2) X) -> ~(T1) X
>>>> (T1) (X * Y) -> (T1) X * (T1) Y
>>>>
>>>> Thanks,
>>>> Prathamesh


Possible "C++ for embedded systems" WG21 working group

2014-08-21 Thread Daniel Gutson
Hi folks,

   since the last WG21 meeting held at Rapperswil, where I gave a
presentation of issues and opportunities to improve the C++ language
regarding embedded systems development, I have been pursuing the creation
of a Committee Working Group.

There was agreement that the issues were relevant, and I was told that
the group could be created if more people are interested and involved.
I created a mailing list where these issues are being discussed and
some proposals are being drafted, to be presented at the next WG21
meeting at Urbana.

If anyone is interested in participating, please send me an email
(daniel dot gutson at taller technologies dot com) telling me if you
have ever attended a WG21 meeting and/or would be willing to
participate in one.
Having gcc and clang maintainers would be great, to have an implementer
POV in the discussions.

Thanks!

   Daniel.

-- 

Daniel F. Gutson
Chief Engineering Officer, SPD


San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina


Phone: +54 351 4217888 / +54 351 4218211

Skype: dgutson


RE: Frame pointer optimization issues

2014-08-21 Thread Wilco Dijkstra
> Richard Henderson wrote:
> On 08/20/2014 08:22 AM, Wilco Dijkstra wrote:
> > 2. Change the mid-end to call _frame_pointer_required even when
> > !flag_omit_frame_pointer.
> 
> Um, it does that already.  At least as far as I can see from
> ira_setup_eliminable_regset and update_eliminables.

No, in ira_setup_eliminable_regset the frame pointer is always forced if 
!flag_omit_frame_pointer without allowing frame_pointer_required to override it:

  frame_pointer_needed
= (! flag_omit_frame_pointer
   ...
   || targetm.frame_pointer_required ());

This would allow targets to choose whether to do leaf frame pointer optimization:

  frame_pointer_needed
= ((! flag_omit_frame_pointer && targetm.frame_pointer_required ())
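
The difference between the two conditions can be seen by modelling only the two terms that are visible in the snippets. This is a sketch under that assumption: the conditions elided by "..." are left out, the parameter target_requires_fp stands in for the result of targetm.frame_pointer_required (), and the function names are hypothetical.

```cpp
#include <cassert>

// Current form: without -fomit-frame-pointer the frame pointer is forced,
// whatever the target hook returns.
bool needed_current(bool flag_omit_frame_pointer, bool target_requires_fp) {
    return !flag_omit_frame_pointer || target_requires_fp;
}

// First term of the proposed form: without -fomit-frame-pointer the target
// hook decides.  Any further terms of the full expression are not modelled.
bool needed_proposed_first_term(bool flag_omit_frame_pointer,
                                bool target_requires_fp) {
    return !flag_omit_frame_pointer && target_requires_fp;
}

// The interesting case: -fno-omit-frame-pointer and a target hook that
// says this function (for example a leaf) does not need a frame pointer.
void check_leaf_case() {
    assert(needed_current(false, false));
    assert(!needed_proposed_first_term(false, false));
}
```

In the modelled case the current form keeps the frame pointer while the proposed term lets the target drop it, which is the hook a leaf frame pointer optimization would need.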

> It turns out to be much easier to re-enable a frame pointer for a given
> function than to disable a frame pointer.  Thus I believe that you should
> approach -momit-leaf-frame-pointer as setting flag_omit_frame_pointer, and
> then re-enabling it in frame_pointer_required.  This requires more than one
> line in common/config/arch/arch.c, but it shouldn't be much more than ten.

As I explained it is not correct to force flag_omit_frame_pointer to be true.
This is what is done today and it fails in various cases.  So unless the way
options are handled is changed, this possibility is out.

> > A second issue with frame pointers is that update_eliminables() in
> > reload1.c might set frame_pointer_needed to false without any checks.
>
> How?  I don't see that path, since the very first thing update_eliminables
> does is call frame_pointer_required -- even before it calls can_eliminate.

update_eliminables() does indeed call frame_pointer_required at the start,
however this only blocks elimination *from* HARD_FRAME_POINTER_REGNUM, while
the code at the end clears frame_pointer_needed if FRAME_POINTER_REGNUM can
be eliminated into any register other than HARD_FRAME_POINTER_REGNUM.  The
middle bit of the function is not relevant as HARD_FRAME_POINTER_REGNUM
should only be eliminable into SP (but even if, say, it could be eliminable
into another register X, it will only block eliminations of X to SP).

So frame_pointer_needed can be cleared even when frame_pointer_required is
true...

In principle, if this function worked reliably, we could implement leaf FPO
using this mechanism.  Unfortunately it doesn't: update_eliminables is not
called in trivial leaf functions even when can_eliminate always returns
true, so the frame pointer is never removed.  Additionally I'd be worried
about compilation performance as it would introduce extra register
allocation passes for ~50% of functions.

Wilco





[NEW PLATFORM] [HELP] GCC on the Broadcom VideoCore IV VPU

2014-08-21 Thread Mohamed MEDIOUNI
The BCM2835 (the RPi chip) has a general-purpose CPU with a custom ISA and
SIMD extensions (and no MMU).

As I am developing a FOSS blob for the Pi (github.com/freeblob), I need
something less crappy than the ACK (it can only create 5 vars per function).

The GCC port for VC4 has mangled stack frames and is available at
github.com/mm120/gcc-vc4, branch vc4.

Can anyone figure out what is wrong in the code and submit a patch (pull
request, mainlining, or a custom repo)?

Thanks in advance.


gcc-4.8-20140821 is now available

2014-08-21 Thread gccadmin
Snapshot gcc-4.8-20140821 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20140821/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch 
revision 214296

You'll find:

 gcc-4.8-20140821.tar.bz2 Complete GCC

  MD5=e88381671a58922a04cc9f287e8ed8a8
  SHA1=d834783f4f9b9b09f6bb4babfbab9097ab9e6317

Diffs from 4.8-20140814 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Is that a problem?

2014-08-21 Thread lin zuojian
Hi,
After applying a patch to GCC to make it warn about strict aliasing
violations, like this:
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b6ecaa4..95e745c 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2913,6 +2913,10 @@ setup_one_parameter (copy_body_data *id, tree p, tree value, tree fn,
}
 }
 
+  if (warn_strict_aliasing > 2)
+    if (strict_aliasing_warning (TREE_TYPE (rhs), TREE_TYPE(p), rhs))
+      warning (OPT_Wstrict_aliasing, "during inlining function %s into function %s", fndecl_name(fn), function_name(cfun));
+
Compiling gcc/testsuite/g++.dg/opt/pmf1.C triggers that warning:

gcc/testsuite/g++.dg/opt/pmf1.C: In function 'int main()':
gcc/testsuite/g++.dg/opt/pmf1.C:72:42: warning: dereferencing type-punned 
pointer will break strict-aliasing rules. With expression: &t, type of 
expresssion: struct Container *, type to cast: struct WorldObject * const. 
[-Wstrict-aliasing]
 t.forward(itemfunptr, &Item::fred, 1);
  ^
gcc/testsuite/g++.dg/opt/pmf1.C:72:42: warning: during inlining function void 
WorldObject::forward(memfunT, arg1T, arg2T) [with memfunT = void 
(Container::*)(void (Item::*)(int), int); arg1T = void (Item::*)(int); arg2T = 
int; Derived = Container] into function int main() [-Wstrict-aliasing]

Is that a problem here? We try to cast type Container to its base
type WorldObject, and that violates strict aliasing? Let's take
a look at this.

--
Lin Zuojian 


Re: Is that a problem?

2014-08-21 Thread lin zuojian
Hi,
I know what is going on now.  strict_aliasing_warning does not take
TBAA into account.  We might want to fix it.
--
Lin Zuojian
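
A reduced sketch of the situation described by the warning text may help. The class shapes below are hypothetical and much simplified; they are not the contents of pmf1.C, only a CRTP base and a derived class with the same names as in the diagnostic. Passing &t as the implicit 'this' of a base-class member is an ordinary derived-to-base pointer conversion, which the aliasing rules permit.

```cpp
#include <cassert>

// Hypothetical CRTP base: it forwards a call to a member of Derived.
template <class Derived>
struct WorldObject {
    // Calling forward() on a Derived object converts Derived* to
    // WorldObject<Derived>* for the implicit 'this' argument.
    template <typename MemFun, typename Arg>
    int forward(MemFun memfun, Arg arg) {
        Derived& self = static_cast<Derived&>(*this);
        return (self.*memfun)(arg);
    }
};

// Hypothetical derived class using the base above.
struct Container : WorldObject<Container> {
    int value = 0;
    int add(int n) {
        value += n;
        return value;
    }
};

int call_through_base() {
    Container t;
    // &t has type Container*; binding it to the base's 'this' accesses the
    // base subobject of t, so no type punning takes place.
    t.forward(&Container::add, 41);
    return t.forward(&Container::add, 1);
}
```

A warning on such a call would therefore be a false positive, which fits the observation that the check does not take the base/derived relationship into account.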