Re: No git tag for GCC 4.8.2

2013-12-10 Thread Andreas Schwab
"H.J. Lu"  writes:

> There are git tags for GCC 4.8.0 and 4.8.1.  But git tag
> for GCC 4.8.2 is missing.

Fixed.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


break in statement expression in while condition fails to compile

2013-12-10 Thread Prathamesh Kulkarni
The following code fails to compile with gcc-4.8.2.

int main(void)
{
while ( ({ break; 0; }) )
;
return 0;
}

foo.c:3:14: error: break statement not within loop or switch
   while ( ({ break; 0; }) )
  ^
Is this a compile-error or is it a bug in GCC ?
clang-3.2 seems to compile it.

I came across a thread  on this issue
in context of for loop, but I couldn't get a definite answer.
http://gcc.gnu.org/ml/gcc-help/2013-07/msg00100.html

I think the reason the above code fails to compile is that
c_break_label has the value size_zero_node instead of NULL_TREE
when c_finish_bc_stmt() gets called.

static void
c_parser_while_statement (c_parser *parser)
{
  tree block, cond, body, save_break, save_cont;
  location_t loc;
  gcc_assert (c_parser_next_token_is_keyword (parser, RID_WHILE));
  c_parser_consume_token (parser);
  block = c_begin_compound_stmt (flag_isoc99);
  loc = c_parser_peek_token (parser)->location;
  cond = c_parser_paren_condition (parser);
  save_break = c_break_label;
  c_break_label = NULL_TREE;
  save_cont = c_cont_label;
  c_cont_label = NULL_TREE;
  body = c_parser_c99_block_statement (parser);
  c_finish_loop (loc, cond, NULL, body, c_break_label, c_cont_label, true);
  add_stmt (c_end_compound_stmt (loc, block, flag_isoc99));
  c_break_label = save_break;
  c_cont_label = save_cont;
}

c_parser_paren_condition() is called *before* c_break_label and
c_cont_label are set to NULL_TREE:

cond = c_parser_paren_condition (parser);
save_break = c_break_label;
c_break_label = NULL_TREE;
save_cont = c_cont_label;
c_cont_label = NULL_TREE;

If c_parser_paren_condition (parser) is instead placed
*after* setting c_break_label and c_cont_label to NULL_TREE, the above
code compiles correctly (I changed the order, rebuilt cc1, and the
above test case compiled without error):

save_break = c_break_label;
c_break_label = NULL_TREE;
save_cont = c_cont_label;
c_cont_label = NULL_TREE;
cond = c_parser_paren_condition (parser);

I guess that's because of the following if-else sequence
in c_finish_bc_stmt(loc, label_p, is_break):

tree c_finish_bc_stmt (location_t loc, tree *label_p, bool is_break)
{
  label = *label_p;
  skip = !block_may_fallthru (cur_stmt_list);
  if (!label)
    {
      if (!skip)
        *label_p = label = create_artificial_label (loc);
    }
  else if (TREE_CODE (label) == LABEL_DECL)
    printf ("2nd time here\n");
  else switch (TREE_INT_CST_LOW (label))
    {
    case 0:
      if (is_break)
        error_at (loc, "break statement not within loop or switch");
    }
  // rest of function
}

This function gets called from here (in c-parser.c):
case RID_BREAK:
c_parser_consume_token (parser);
stmt = c_finish_bc_stmt (loc, &c_break_label, true);
goto expect_semicolon;

Initially c_break_label (and c_cont_label) are set to
size_zero_node in start_function() in c-decl.c:
c_break_label  = c_cont_label = size_zero_node;
where size_zero_node is non-null,
so control enters the else switch() branch
and the error "break statement not within loop or switch" gets printed.

If c_parser_paren_condition() is instead called
after setting c_break_label (and c_cont_label) to NULL_TREE,
then control enters the if (!label) {} branch.
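
The ordering problem can also be modelled outside GCC. What follows is
only a minimal sketch in plain C, not GCC code: break_label,
outside_loop_marker, finish_break and parse_while_with_break_in_cond are
made-up names that mimic the roles of c_break_label, size_zero_node,
c_finish_bc_stmt and c_parser_while_statement.

```c
#include <stddef.h>

/* Toy stand-ins for the parser state; none of these names are GCC's. */
static int outside_loop_marker;          /* plays the role of size_zero_node */
static void *break_label = &outside_loop_marker;
static int error_count;

/* Models the check made for a `break`: NULL means "inside a loop, label
   not created yet"; the marker means "not inside any loop or switch". */
static void finish_break(void)
{
    static int artificial_label;

    if (break_label == NULL)
        break_label = &artificial_label; /* create the label lazily */
    else if (break_label == &outside_loop_marker)
        error_count++;                   /* "break statement not within loop or switch" */
}

/* Models parsing `while (({ break; 0; })) ;` directly inside a function.
   If reset_first is nonzero, the label is cleared before the condition
   is parsed.  Returns the number of errors reported. */
int parse_while_with_break_in_cond(int reset_first)
{
    void *save_break;

    break_label = &outside_loop_marker;  /* state at function start */
    error_count = 0;

    if (reset_first) {
        save_break = break_label;
        break_label = NULL;
        finish_break();                  /* condition parsed after the reset */
    } else {
        finish_break();                  /* condition parsed before the reset */
        save_break = break_label;
        break_label = NULL;
    }
    /* The loop body is empty in the test case. */
    break_label = save_break;            /* restore the outer state */
    return error_count;
}
```

In this model, parse_while_with_break_in_cond (0) corresponds to the
current order and reports one error, while parse_while_with_break_in_cond (1)
corresponds to the reordered code and reports none.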


Thanks and Regards,
Prathamesh


Unoptimal code.

2013-12-10 Thread Umesh Kalappa
Hi All,

Below are the patterns defined for the mov and add instructions.
[(set (match_operand:HI 0 "general_mov_operand" "=r,rRA")
(match_operand:HI 1 "general_mov_operand" "rRAi,ri"))]
  ""
 {

}
)

(define_insn "addhi3"
  [(set (match_operand:HI 0 "register_operand" "=Ar")
(plus:HI (match_operand:HI 1 "register_operand" "%0")
 (match_operand:HI 2 "general_mov_operand" "Ar")))]
  ""
  "add\t%0, (%2)"
)

The problem we are stuck with is that the compiler emits suboptimal code for
the testcase below with the -O0 option:

int a,b;

int func()
{
   return a=b;
}

.s file

 ld  BC, (a)
 ld  WA, (b)
 add WA, BC
 ld  (a), WA
 ret

The compiler tries to load a and b into the registers BC and WA
respectively in expand_assignment, adds them, and then stores
the result back to a.

But if you look at the addhi3 definition, it states that I'm allowed to
emit an instruction like

add WA,(a)

where the second operand can use register indirect addressing.

I can write a peephole pattern to optimize the emitted code to something like

.s file

 ld  WA, (b)
 add WA, (a)
 ld  (a), WA
 ret


The reason for the suboptimal code is that the code is expanded to
load the memory contents into registers and then update the add
operands accordingly. I don't want this to happen.

I would be glad if somebody from the group could share their experience or
offer some insight into how I can achieve this.

Thanks
~Umesh


Issues with GCSE pre step and double hard registers

2013-12-10 Thread Claudiu Zissulescu
Hi,

Our ARC processor has a multiplication operation that returns a 64 bit result 
into a fixed register pair named  like this:

mlo:DI=zero_extend(r159:SI)*sign_extend(r181:SI)

The GCSE RTL PRE step has some difficulty handling hard register pair 
information. The following example illustrates my problem:

   18: mlo:DI=zero_extend(r159:SI)*sign_extend(r170:SI)
  REG_EQUAL zero_extend(r159:SI)*0xcccd
  REG_DEAD r170:SI
  REG_UNUSED mlo:SI

   20: r168:SI=mhi:SI 0>>0x3
  REG_DEAD mhi:SI
  REG_EQUAL udiv(r159:SI,0xa)



   36: mlo:DI=zero_extend(r159:SI)*sign_extend(r181:SI)
  REG_EQUAL zero_extend(r159:SI)*0xcccd
  REG_DEAD r181:SI
  REG_UNUSED mlo:SI

   38: r179:SI=mhi:SI 0>>0x3
  REG_DEAD mhi:SI
  REG_EQUAL udiv(r159:SI,0xa)

The "reg_avail_info" structure is missing information about the mhi register. 
The information for mlo, the first register in the register pair, is computed 
correctly. Because the mhi information is missing, instruction 38 is 
considered an anticipated expression (due to a faulty return value from 
oprs_unchanged_p()). This leads to the removal of instruction 38 (GCSE via 
insn 20) and all the dependent instructions.

A possible solution is to check if reg_avail_info holds initialized data in 
oprs_unchanged_p() function, avoiding false positive returns, like this:

--- a/gcc/gcse.c
+++ b/gcc/gcse.c
@@ -881,6 +881,8 @@ oprs_unchanged_p (const_rtx x, const_rtx insn, int avail_p)
   {
struct reg_avail_info *info = &reg_avail_info[REGNO (x)];

+   if (info->last_bb == NULL)
+ return 0;
if (info->last_bb != current_bb)
  return 1;
if (avail_p)
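
The effect of the extra check can be illustrated with a small stand-alone
model. This is only a sketch under stated assumptions, not the gcse.c
code: struct reg_info, reg_unchanged_p and the check_uninit parameter are
hypothetical, and the two comparisons after the avail_p test are an
assumed simplification of what the real function does with instruction
positions.

```c
#include <stddef.h>

/* Hypothetical, simplified per-register record. */
struct reg_info {
    const void *last_bb;  /* block of the last recorded set, NULL if none */
    int first_set;        /* position of the first set in that block */
    int last_set;         /* position of the last set in that block */
};

/* Returns 1 if the register is treated as unchanged at insn_pos in
   current_bb, 0 otherwise.  check_uninit enables the proposed guard. */
int reg_unchanged_p(const struct reg_info *info, const void *current_bb,
                    int insn_pos, int avail_p, int check_uninit)
{
    if (check_uninit && info->last_bb == NULL)
        return 0;         /* proposed guard: no data recorded, assume changed */
    if (info->last_bb != current_bb)
        return 1;         /* no set recorded in this block */
    if (avail_p)
        return info->last_set < insn_pos;   /* assumed simplification */
    return info->first_set >= insn_pos;     /* assumed simplification */
}
```

With a record that was never filled in (last_bb == NULL), the model
answers "unchanged" without the guard and "changed" with it, which is the
false positive described above.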


Please let me know if this is an acceptable solution for my issue.

Thank you,
Claudiu Zissulescu


Re: Unoptimal code.

2013-12-10 Thread Richard Biener
On Tue, Dec 10, 2013 at 4:03 PM, Umesh Kalappa  wrote:
> Hi All,
>
> Below is the patterns defined  for the  mov and add  instruction
> .
> [(set (match_operand:HI 0 "general_mov_operand" "=r,rRA")
> (match_operand:HI 1 "general_mov_operand" "rRAi,ri"))]
>   ""
>  {
>
> }
> )
>
> (define_insn "addhi3"
>   [(set (match_operand:HI 0 "register_operand" "=Ar")
> (plus:HI (match_operand:HI 1 "register_operand" "%0")
>  (match_operand:HI 2 "general_mov_operand" "Ar")))]
>   ""
>   "add\t%0, (%2)"
> )
>
> The problem we stuck with is that the compiler emit unoptimal code for
> the below testcase with -O0 option
>
> int a,b;
>
> int func()
> {
>return a=b;
> }
>
> .s file
>
>  ld  BC, (a)
>  ld  WA, (b)
>  add WA, BC
>  ld  (a), WA
>  ret
>
> the compiler try to load a and b to the register BC and WA
> respectively in the expand_assignment and add them , then store back
> the result to a.
>
> But  if you see the addhi3 definition ,it states that i'm allowed to
> emit instruction like
>
> add WA,(a)
>
> where second operand can be register indirect addressing .
>
> I can write peephole pattern to optimize  the emitted code like
>
> .s file
>
>  ld  WA, (b)
>  add WA, (a)
>  ld  (a), WA
>  ret
>
>
> the reason for the unoptimal  code is that  the code is expanded to
> load  the memory contents  to the registers  and then update the add
> operands accordingly. I don't want this to happen .
>
> I will be glad ,if somebody from the group  share their experience or
> through some insights  how i can achieve this .

Use -On with n > 0.

Richard.

> Thanks
> ~Umesh


cpp0x test suite PASS/FAIL

2013-12-10 Thread BELBACHIR Selim
Hi,

I'm running the C++ testsuite on my GCC 4.7.3 port and I get the following 
result for the test auto27.C

PASS: g++.dg/cpp0x/auto27.C -std=c++98 std (test for errors, line 3)
PASS: g++.dg/cpp0x/auto27.C -std=c++98 auto (test for errors, line 3)
PASS: g++.dg/cpp0x/auto27.C -std=c++98 no type (test for errors, line 3)
PASS: g++.dg/cpp0x/auto27.C -std=c++98 (test for excess errors)
FAIL: g++.dg/cpp0x/auto27.C -std=c++11 std (test for errors, line 3)
FAIL: g++.dg/cpp0x/auto27.C -std=c++11 auto (test for errors, line 3)
FAIL: g++.dg/cpp0x/auto27.C -std=c++11 no type (test for errors, line 3)
PASS: g++.dg/cpp0x/auto27.C -std=c++11 (test for excess errors)



auto27.C :

auto main()->int   // { dg-error "std=" "std" { target c++98 } }
                   // { dg-error "auto" "auto" { target c++98 } 3 }
                   // { dg-error "no type" "no type" { target c++98 } 3 }
{ }



I don't understand whether DejaGnu is telling me that the test passed or failed...

When I use -std=c++98 option, I get the 3 expected errors and when I use 
-std=c++11, I get no error. That seems to be the expected result but I don't 
understand why the word FAIL appears in the log...

Should I ignore the FAILs when the comment contains '(test for errors' and 
consider that those tests are parts of a larger test with comment '(test for 
excess errors'  ?

   Regards,

Selim


Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling

2013-12-10 Thread Ramana Radhakrishnan
On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos  wrote:
> Hi,
>
> Near the start of schedule_block, find_modifiable_mems is called if 
> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It seems on 
> c6x backend currently uses this.
> However, it's quite strange that this is not a requirement for all backends 
> since find_modifiable_mems, moves all my dependencies in SD_LIST_HARD_BACK to 
> SD_LIST_SPEC_BACK even though I don't have DO_SPECULATION enabled.
>
> Since dependencies are accessed later on from try_ready (for example), I 
> would have thought that it would be always good not to call 
> find_modifiable_mems,  given that it seems to 'literally' break dependencies.
>
> Is the behaviour of find_modifiable_mems a bug or somehow expected?


It's funny how I've been trying to track down a glitch and ended up
asking the same question today. Additionally if I use
TARGET_SCHED_SET_SCHED_FLAGS on a port that doesn't use the selective
scheduler, this does nothing. Does anyone know why is this the default
for ports where we don't turn on selective scheduling and might need a
hook to turn this off ?

regards
Ramana

>
> Cheers,
>
> Paulo Matos
>
>


Re: break in statement expression in while condition fails to compile

2013-12-10 Thread Florian Weimer

On 12/10/2013 02:21 PM, Prathamesh Kulkarni wrote:

The following code fails to compile with gcc-4.8.2.

int main(void)
{
 while ( ({ break; 0; }) )
 ;
 return 0;
}

foo.c:3:14: error: break statement not within loop or switch
while ( ({ break; 0; }) )
   ^
Is this a compile-error or is it a bug in GCC ?
clang-3.2 seems to compile it.

I came across a thread  on this issue
in context of for loop, but I couldn't get a definite answer.
http://gcc.gnu.org/ml/gcc-help/2013-07/msg00100.html


There's also bug .

I think it's fine to reject such code.

--
Florian Weimer / Red Hat Product Security Team


Re: proposal to make SIZE_TYPE more flexible

2013-12-10 Thread Joseph S. Myers
On Mon, 9 Dec 2013, DJ Delorie wrote:

> First pass at actual code.  I took the path of using a new macro in
> TARGET-modes.def and having genmodes build the relevent tables.  Part
> of the table is created by genmodes, the rest is created at runtime.

This seems mostly plausible, though I don't see anything to ensure that 
__intN does not exist at all if the size matches one of the standard C 
types, or if the mode fails targetm.scalar_mode_supported_p.  (To move 
__int128 to this system, given that its availability will depend on the 
options passed to the compiler, I suppose the macro call to create the 
type will be in machmode.def alongside the definition of TImode, but 
whether it does in fact end up creating a type will depend on such 
conditionals in the compiler - and code will need to check for NULL tree 
nodes for types not supported with the given options passed to the 
compiler.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: cpp0x test suite PASS/FAIL

2013-12-10 Thread Joseph S. Myers
On Tue, 10 Dec 2013, BELBACHIR Selim wrote:

> FAIL: g++.dg/cpp0x/auto27.C -std=c++11 std (test for errors, line 3)
> FAIL: g++.dg/cpp0x/auto27.C -std=c++11 auto (test for errors, line 3)
> FAIL: g++.dg/cpp0x/auto27.C -std=c++11 no type (test for errors, line 3)

That means that the desired result is an error message on that line, and 
either there was no such error message or the error message did not match 
what the testcase expected.

> Should I ignore the FAILs when the comment contains '(test for errors' 
> and consider that those tests are parts of a larger test with comment 
> '(test for excess errors' ?

No, FAILs indicate a bug in either the compiler or the testcase (or in 
your test environment, etc.); don't ignore them.

-- 
Joseph S. Myers
jos...@codesourcery.com


RE: cpp0x test suite PASS/FAIL

2013-12-10 Thread BELBACHIR Selim
I get exactly the same behaviour as with my native Linux compiler. I don't 
understand why the DejaGnu exp files print such a FAIL.

3 errors have to be printed when using -std=c++98 and 0 errors have to be printed 
when using -std=c++11. That's what my compiler does.

The selector 'target c++98' (in { dg-error "std=" "std" { target c++98 } }, for 
example) does not prevent the FAIL from being printed when the -std=c++11 option 
is used. Only the last 'test for excess errors' seems to understand that no 
errors have to be printed when using -std=c++11.


Here is the DejaGnu log :

Running  gcc-4.7.3/gcc/testsuite/g++.dg/dg.exp ...
ALWAYS_CXXFLAGS set to additional_flags= ldflags= 
additional_flags=-fmessage-length=0
Executing on host:  prism-g++ gcc-4.7.3/gcc/testsuite/g++.dg/cpp0x/auto27.C   
-fmessage-length=0 -std=c++98  -pedantic-errors -Wno-long-long  -S -o auto27.s  
  (timeout = 300)
gcc-4.7.3/gcc/testsuite/g++.dg/cpp0x/auto27.C:3:14: error: ISO C++ forbids 
declaration of 'main' with no type [-pedantic]
gcc-4.7.3/gcc/testsuite/g++.dg/cpp0x/auto27.C:3:14: error: top-level 
declaration of 'main' specifies 'auto'
gcc-4.7.3/gcc/testsuite/g++.dg/cpp0x/auto27.C:3:14: error: trailing return type 
only available with -std=c++11 or -std=gnu++11
compiler exited with status 1
output is:
gcc-4.7.3/gcc/testsuite/g++.dg/cpp0x/auto27.C:3:14: error: ISO C++ forbids 
declaration of 'main' with no type [-pedantic]
gcc-4.7.3/gcc/testsuite/g++.dg/cpp0x/auto27.C:3:14: error: top-level 
declaration of 'main' specifies 'auto'
gcc-4.7.3/gcc/testsuite/g++.dg/cpp0x/auto27.C:3:14: error: trailing return type 
only available with -std=c++11 or -std=gnu++11
PASS: g++.dg/cpp0x/auto27.C -std=c++98 std (test for errors, line 3)
PASS: g++.dg/cpp0x/auto27.C -std=c++98 auto (test for errors, line 3)
PASS: g++.dg/cpp0x/auto27.C -std=c++98 no type (test for errors, line 3)
PASS: g++.dg/cpp0x/auto27.C -std=c++98 (test for excess errors)
Executing on host: prism-g++ gcc-4.7.3/gcc/testsuite/g++.dg/cpp0x/auto27.C   
-fmessage-length=0 -std=c++11  -pedantic-errors -Wno-long-long  -S 
-DSIGNAL_SUPPRESS -DNO_TRAMPOLINES -DSTACK_SIZE=0x4800 -o auto27.s(timeout 
= 300)
FAIL: g++.dg/cpp0x/auto27.C -std=c++11 std (test for errors, line 3)
FAIL: g++.dg/cpp0x/auto27.C -std=c++11 auto (test for errors, line 3)
FAIL: g++.dg/cpp0x/auto27.C -std=c++11 no type (test for errors, line 3)
PASS: g++.dg/cpp0x/auto27.C -std=c++11 (test for excess errors)




-----Original Message-----
From: Joseph Myers [mailto:jos...@codesourcery.com] 
Sent: Tuesday, 10 December 2013 18:22
To: BELBACHIR Selim
Cc: gcc@gcc.gnu.org
Subject: Re: cpp0x test suite PASS/FAIL

On Tue, 10 Dec 2013, BELBACHIR Selim wrote:

> FAIL: g++.dg/cpp0x/auto27.C -std=c++11 std (test for errors, line 3)
> FAIL: g++.dg/cpp0x/auto27.C -std=c++11 auto (test for errors, line 3)
> FAIL: g++.dg/cpp0x/auto27.C -std=c++11 no type (test for errors, line 
> 3)

That means that the desired result is an error message on that line, and either 
there was no such error message or the error message did not match what the 
testcase expected.

> Should I ignore the FAILs when the comment contains '(test for errors' 
> and consider that those tests are parts of a larger test with comment 
> '(test for excess errors' ?

No, FAILs indicate a bug in either the compiler or the testcase (or in your 
test environment, etc.); don't ignore them.

--
Joseph S. Myers
jos...@codesourcery.com


RE: cpp0x test suite PASS/FAIL

2013-12-10 Thread Joseph S. Myers
On Tue, 10 Dec 2013, BELBACHIR Selim wrote:

> The selector 'target c++98' (in { dg-error "std=" "std" { target c++98 } 
> } for example) do not prevent the FAIL to be printed when -std=c++11 
> options is used.

Well, that would be a bug in one of (a) the test harness code, (b) the way 
the selector is used, (c) your DejaGnu installation.  In none of those 
cases is ignoring the FAIL appropriate; both (a) and (c) could well cause 
other problems with inaccurate test results elsewhere in the testsuite.  
You'll need to investigate why DejaGnu isn't behaving as intended on your 
system.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: proposal to make SIZE_TYPE more flexible

2013-12-10 Thread DJ Delorie

> This seems mostly plausible, though I don't see anything to ensure that 
> __intN does not exist at all if the size matches one of the standard C 
> types,

My thought here was that, since each __intN is specified by the
target, they'd know to only do so if it doesn't match an existing (for
that target) type.  But what if it does?  What if the standard types
changed based on command line options, and the target wanted to offer
an __intN as a "this will always work" alternative to standard types?

I did code it such that standard types are preferred over __intN types
in the searches.

> or if the mode fails targetm.scalar_mode_supported_p.

As I mentioned, I haven't added that part yet.  My though there is to
add a runtime flag to the int_n_trees[] array and have each loop over
that data check the flag.  I suppose I could check for the types being
NULL but IIRC there are cases that only use the genmodes data and not
the types.  For aesthetic reasons I wanted to keep the flag separate.

> (To move __int128 to this system, given that its availability will

I put all the new code alongside the __int128 code, so the __int128
code can just be cut out after the new code is in place, but that's
down the road a bit.


Re: proposal to make SIZE_TYPE more flexible

2013-12-10 Thread Joseph S. Myers
On Tue, 10 Dec 2013, DJ Delorie wrote:

> > This seems mostly plausible, though I don't see anything to ensure that 
> > __intN does not exist at all if the size matches one of the standard C 
> > types,
> 
> My thought here was that, since each __intN is specified by the
> target, they'd know to only do so if it doesn't match an existing (for
> that target) type.  But what if it does?  What if the standard types
> changed based on command line options, and the target wanted to offer
> an __intN as a "this will always work" alternative to standard types?

If you have such alternative types, you have extra complications: either 
they are the same as a standard type (and you need to define what the 
order of preference is and keep it constant, to avoid __intN changing from 
int to long at random), or they are distinct (and you need to define 
integer promotion ranks and ensure the front end code handles promotions 
for types the same size as int correctly).  Not having such alternatives 
avoids such complications.  Thus, I'd rather any possibility of such types 
came later after the basic support for types not matching the standard 
types.

(For the types you do have, there's a need to define C++ name mangling.  
There are various other issues for such extended types - see PR 50441 
comment 3, and PR 43622 - but you can safely ignore such issues for now as 
pre-existing problems with __int128.  Whereas __int128 does have mangling 
defined in cp/mangle.c, and targets can already define their own mangling 
for target-specific types through a hook.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: proposal to make SIZE_TYPE more flexible

2013-12-10 Thread DJ Delorie

> (For the types you do have, there's a need to define C++ name mangling.  

I mentioned this before, and I don't have a good solution for it.
Both C++ and LTO need a mangled form of __intN types.


Re: Issues with GCSE pre step and double hard registers

2013-12-10 Thread Steven Bosscher
On Tue, Dec 10, 2013 at 4:17 PM, Claudiu Zissulescu wrote:
> Hi,
>
> Our ARC processor has a multiplication operation that returns a 64 bit result 
> into a fixed register pair named  like this:
>
> mlo:DI=zero_extend(r159:SI)*sign_extend(r181:SI)
>
> The GCSE rtl pre step has some difficulties to handle hard register pair 
> information. To exemplify my problem please see the following example:
>
>18: mlo:DI=zero_extend(r159:SI)*sign_extend(r170:SI)
>   REG_EQUAL zero_extend(r159:SI)*0xcccd
>   REG_DEAD r170:SI
>   REG_UNUSED mlo:SI
>
>20: r168:SI=mhi:SI 0>>0x3
>   REG_DEAD mhi:SI
>   REG_EQUAL udiv(r159:SI,0xa)
>
> 
>
>36: mlo:DI=zero_extend(r159:SI)*sign_extend(r181:SI)
>   REG_EQUAL zero_extend(r159:SI)*0xcccd
>   REG_DEAD r181:SI
>   REG_UNUSED mlo:SI
>
>38: r179:SI=mhi:SI 0>>0x3
>   REG_DEAD mhi:SI
>   REG_EQUAL udiv(r159:SI,0xa)
>
> The "reg_avail_info" structure misses information about mhi register. The mlo 
> information, the first register in the register pair, the information is 
> correctly computed.  Due to the missing mhi information,  the instruction 38 
> is considered an anticipated expression (due to faulty return of function 
> oprs_unchanged_p() ). This leads to removal of instruction 38 ( gcse via insn 
> 20) and all the dependent instructions.
>
> A possible solution is to check if reg_avail_info holds initialized data in 
> oprs_unchanged_p() function, avoiding false positive returns, like this:
>
> --- a/gcc/gcse.c
> +++ b/gcc/gcse.c
> @@ -881,6 +881,8 @@ oprs_unchanged_p (const_rtx x, const_rtx insn, int 
> avail_p)
>{
> struct reg_avail_info *info = &reg_avail_info[REGNO (x)];
>
> +   if (info->last_bb == NULL)
> + return 0;
> if (info->last_bb != current_bb)
>   return 1;
> if (avail_p)
>
>
> Please let me know if this is an acceptable solution for my issue.

I don't think this is the right fix for the problem. GCSE doesn't
handle expressions containing hard registers, oprs_unchanged_p should
never even see expressions involving hard registers.

What is the expression that is recorded as anticipated in insn 38? Is
it "mhi:SI 0>>0x3" or "udiv(r159:SI,0xa)" from the REG_EQUAL note?

Ciao!
Steven


Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling

2013-12-10 Thread Maxim Kuvyrkov
On 11/12/2013, at 5:17 am, Ramana Radhakrishnan  
wrote:

> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos  wrote:
>> Hi,
>> 
>> Near the start of schedule_block, find_modifiable_mems is called if 
>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It seems on 
>> c6x backend currently uses this.
>> However, it's quite strange that this is not a requirement for all backends 
>> since find_modifiable_mems, moves all my dependencies in SD_LIST_HARD_BACK 
>> to SD_LIST_SPEC_BACK even though I don't have DO_SPECULATION enabled.
>> 
>> Since dependencies are accessed later on from try_ready (for example), I 
>> would have thought that it would be always good not to call 
>> find_modifiable_mems,  given that it seems to 'literally' break dependencies.
>> 
>> Is the behaviour of find_modifiable_mems a bug or somehow expected?

"Breaking" a dependency in scheduler involves modification of instructions that 
would allow scheduler to move one instruction past the other.  The most common 
case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];" which can be 
transformed into "r3 = [r1 + 4]; r2 = r1 + 4;".  Breaking a dependency is not 
ignoring it, speculatively or otherwise; it is an equivalent code 
transformation to allow scheduler more freedom to fill up CPU cycles.
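
As a rough illustration of why this transformation is an equivalence, the
two orderings can be written against a toy machine in C. This is a sketch
only: struct machine, run_original and run_transformed are invented for
the example and have nothing to do with the scheduler's data structures.

```c
#include <stdint.h>

/* Toy machine state: three registers and a small word-indexed memory. */
struct machine {
    int32_t r1, r2, r3;
    int32_t mem[16];
};

/* Original order: r2 = r1 + 4; r3 = [r2];
   The load depends on the result of the add. */
void run_original(struct machine *m)
{
    m->r2 = m->r1 + 4;
    m->r3 = m->mem[m->r2];
}

/* After breaking the dependency: r3 = [r1 + 4]; r2 = r1 + 4;
   The load no longer reads r2, so it can be issued first. */
void run_transformed(struct machine *m)
{
    m->r3 = m->mem[m->r1 + 4];
    m->r2 = m->r1 + 4;
}
```

Starting from the same state (with r1 + 4 inside the bounds of mem), both
functions leave identical values in r2 and r3; only the order in which the
add and the load happen differs.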

> 
> 
> It's funny how I've been trying to track down a glitch and ended up
> asking the same question today. Additionally if I use
> TARGET_SCHED_SET_SCHED_FLAGS on a port that doesn't use the selective
> scheduler, this does nothing. Does anyone know why is this the default
> for ports where we don't turn on selective scheduling and might need a
> hook to turn this off ?

SCHED_FLAGS is used to enable or disable various parts of GCC scheduler.  On an 
architecture that supports speculative scheduling with recovery (IA64) it can 
turn this feature on or off.  The documentation for various features of 
sched-rgn, sched-ebb and sel-sched is not the best and one will likely get 
weird artefacts by trying out non-default settings.

I believe that only the IA64 backend supports selective scheduling reliably.  I've 
seen other ports trying out selective scheduling, but I don't know whether those 
efforts got positive results.

--
Maxim Kuvyrkov
www.kugelworks.com




Re: Dependency confusion in sched-deps

2013-12-10 Thread Maxim Kuvyrkov
On 6/12/2013, at 9:44 pm, shmeel gutl  wrote:

> On 06-Dec-13 01:34 AM, Maxim Kuvyrkov wrote:
>> On 6/12/2013, at 8:44 am, shmeel gutl  wrote:
>> 
>>> On 05-Dec-13 02:39 AM, Maxim Kuvyrkov wrote:
>>>> Dependency type plays a role for estimating costs and latencies between 
>>>> instructions (which affects performance), but using wrong or imprecise 
>>>> dependency type does not affect correctness.
>>> On multi-issue architectures it does make a difference. Anti dependence 
>>> permits the two instructions to be issued during the same cycle whereas 
>>> true dependency and output dependency would forbid this.
>>> 
>>> Or am I misinterpreting your comment?
>> On VLIW-flavoured machines without resource conflict checking -- "yes", it 
>> is critical not to use anti dependency where an output or true dependency 
>> exist.  This is the case though, only because these machines do not follow 
>> sequential semantics for instruction execution (i.e., effects from previous 
>> instructions are not necessarily observed by subsequent instructions on the 
>> same/close cycles.
>> 
>> On machines with internal resource conflict checking having a wrong type on 
>> the dependency should not cause wrong behavior, but "only" suboptimal 
>> performance.
>> 
>> 
...
> Earlier in the thread you wrote
>> Output dependency is the right type (write after write).  Anti dependency is 
>> write after read, and true dependency is read after write.
> Should the code be changed to accommodate vliw machines.. It has been there 
> since the module was originally checked into trunk.

The usual solution for VLIW machines is to have assembler split VLIW bundles 
that have internal dependencies and execute them on different cycles.  The idea 
is for compiler to strive to do its best to produce code without any internal 
dependencies, but it is up to assembler to do the final check and fix any 
occasional problems.  [A good assembler has to do this work anyway to 
accommodate for mistakes in hand-written assembly.]

The scheduler is expected to produce code with no internal dependencies for 
VLIW machines 99% of the time.  This 99% effectiveness is good enough since 
scheduler is often not the last pass that touches code, and subsequent 
transformations can screw up VLIW bundles anyway.
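
To make the terminology concrete, here is a small C sketch of the three
dependency kinds as defined earlier in this thread (true = read after
write, anti = write after read, output = write after write), together with
the same-cycle rule mentioned for multi-issue machines. Everything in it
is hypothetical: struct insn, classify_dep and can_share_bundle are
illustration names, registers are reduced to bitmasks, and the bundling
rule is an assumption that a real target may not share.

```c
enum dep_kind { DEP_NONE, DEP_TRUE, DEP_ANTI, DEP_OUTPUT };

/* One instruction, reduced to bitmasks of registers read and written. */
struct insn {
    unsigned reads;
    unsigned writes;
};

/* Classify how `second` depends on `first` (first comes earlier in
   program order).  If several kinds apply, true takes priority over
   output, and output over anti. */
enum dep_kind classify_dep(struct insn first, struct insn second)
{
    if (first.writes & second.reads)
        return DEP_TRUE;      /* read after write */
    if (first.writes & second.writes)
        return DEP_OUTPUT;    /* write after write */
    if (first.reads & second.writes)
        return DEP_ANTI;      /* write after read */
    return DEP_NONE;
}

/* Assumed rule: only independent or anti-dependent instructions may be
   issued in the same cycle. */
int can_share_bundle(struct insn first, struct insn second)
{
    enum dep_kind kind = classify_dep(first, second);
    return kind == DEP_NONE || kind == DEP_ANTI;
}
```

An assembler-side check of the kind described above could apply a test
like can_share_bundle to each pair in a bundle and split the bundle when
it fails.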

--
Maxim Kuvyrkov
www.kugelworks.com




Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling

2013-12-10 Thread Ramana Radhakrishnan
On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov  wrote:
> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan  
> wrote:
>
>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos  wrote:
>>> Hi,
>>>
>>> Near the start of schedule_block, find_modifiable_mems is called if 
>>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It seems 
>>> on c6x backend currently uses this.
>>> However, it's quite strange that this is not a requirement for all backends 
>>> since find_modifiable_mems, moves all my dependencies in SD_LIST_HARD_BACK 
>>> to SD_LIST_SPEC_BACK even though I don't have DO_SPECULATION enabled.
>>>
>>> Since dependencies are accessed later on from try_ready (for example), I 
>>> would have thought that it would be always good not to call 
>>> find_modifiable_mems,  given that it seems to 'literally' break 
>>> dependencies.
>>>
>>> Is the behaviour of find_modifiable_mems a bug or somehow expected?
>
> "Breaking" a dependency in scheduler involves modification of instructions 
> that would allow scheduler to move one instruction past the other.  The most 
> common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];" which can 
> be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;".  Breaking a dependency is 
> not ignoring it, speculatively or otherwise; it is an equivalent code 
> transformation to allow scheduler more freedom to fill up CPU cycles.


Yes, but there are times when it does this a bit too aggressively and
this looks like the cause for a performance regression that I'm
investigating on ARM. I was looking for a way of preventing this
transformation and there doesn't seem to be an easy one other than the
obvious hack.

Additionally there appears to be no way to control "flags" in a
backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES . Again if the
DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
it looks like we should allow for these to also be handled or describe
TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
scheduler.

>
>>
>>
>> It's funny how I've been trying to track down a glitch and ended up
>> asking the same question today. Additionally if I use
>> TARGET_SCHED_SET_SCHED_FLAGS on a port that doesn't use the selective
>> scheduler, this does nothing. Does anyone know why is this the default
>> for ports where we don't turn on selective scheduling and might need a
>> hook to turn this off ?
>
> SCHED_FLAGS is used to enable or disable various parts of GCC scheduler.  On 
> an architecture that supports speculative scheduling with recovery (IA64) it 
> can turn this feature on or off.  The documentation for various features of 
> sched-rgn, sched-ebb and sel-sched is not the best and one will likely get 
> weird artefacts by trying out non-default settings.


Well, it appears as though TARGET_SCHED_SET_SCHED_FLAGS is only valid
with the selective scheduler on as above and is a no-op as far as
sched-rgn goes. This whole area could do with some improved
documentation - I'll follow up with some patches to see if I can
improve the situation.

Thanks for your reply though.

regards
Ramana

>
> --
> Maxim Kuvyrkov
> www.kugelworks.com
>
>


Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling

2013-12-10 Thread Maxim Kuvyrkov
On 11/12/2013, at 11:14 am, Ramana Radhakrishnan  
wrote:

> On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov  wrote:
>> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan  
>> wrote:
>> 
>>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos  wrote:
>>>> Hi,
>>>> 
>>>> Near the start of schedule_block, find_modifiable_mems is called if 
>>>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It seems 
>>>> on c6x backend currently uses this.
>>>> However, it's quite strange that this is not a requirement for all 
>>>> backends since find_modifiable_mems, moves all my dependencies in 
>>>> SD_LIST_HARD_BACK to SD_LIST_SPEC_BACK even though I don't have 
>>>> DO_SPECULATION enabled.
>>>> 
>>>> Since dependencies are accessed later on from try_ready (for example), I 
>>>> would have thought that it would be always good not to call 
>>>> find_modifiable_mems,  given that it seems to 'literally' break 
>>>> dependencies.
>>>> 
>>>> Is the behaviour of find_modifiable_mems a bug or somehow expected?
>> 
>> "Breaking" a dependency in scheduler involves modification of instructions 
>> that would allow scheduler to move one instruction past the other.  The most 
>> common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];" which can 
>> be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;".  Breaking a dependency is 
>> not ignoring it, speculatively or otherwise; it is an equivalent code 
>> transformation to allow scheduler more freedom to fill up CPU cycles.
> 
> 
> Yes, but there are times when it does this a bit too aggressively and
> this looks like the cause for a performance regression that I'm
> investigating on ARM. I was looking for a way of preventing this
> transformation and there doesn't seem to be an easy one other than the
> obvious hack.

If you want to prevent a particular transformation from occurring, then you
need to investigate why the scheduler thinks that there is nothing better to
do than to schedule an instruction which requires breaking a dependency.
"Breaking" a dependency only increases the pool of instructions available to
schedule, and your problem seems to lie in why the wrong instruction is
selected from that pool.

Are you sure that the problem is introduced by dependency breaking, rather than 
dependency breaking exposing a latent bug?
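
To make the point about equivalence concrete, here is a minimal sketch of the
transformation quoted above ("r2 = r1 + 4; r3 = [r2];" becoming
"r3 = [r1 + 4]; r2 = r1 + 4;") on a toy machine model.  The struct, the
function names and the memory contents are made up for illustration and are
not GCC code; the sketch only checks that both orders leave r2 and r3 in the
same state.

```c
#include <stdint.h>

/* Toy machine state: three registers and a small word-indexed memory.
   Everything here is illustrative; none of it is GCC code. */
struct machine {
    int32_t r1, r2, r3;
    int32_t mem[16];
};

/* Fill memory with known values and set the base register. */
static void init_machine(struct machine *m, int32_t r1)
{
    m->r1 = r1;
    m->r2 = 0;
    m->r3 = 0;
    for (int i = 0; i < 16; i++)
        m->mem[i] = 100 + i;
}

/* Original order: r2 = r1 + 4; r3 = [r2];
   The load depends on the add through r2. */
static void run_original(struct machine *m)
{
    m->r2 = m->r1 + 4;
    m->r3 = m->mem[m->r2];
}

/* After breaking the dependency: r3 = [r1 + 4]; r2 = r1 + 4;
   The load address is rewritten in terms of r1, so the load no longer
   needs the result of the add and can be issued first. */
static void run_broken(struct machine *m)
{
    m->r3 = m->mem[m->r1 + 4];
    m->r2 = m->r1 + 4;
}

/* Returns 1 when both orders leave r2 and r3 identical.
   r1 must be in the range 0..11 so that r1 + 4 stays inside mem. */
int same_result(int32_t r1)
{
    struct machine a, b;
    init_machine(&a, r1);
    init_machine(&b, r1);
    run_original(&a);
    run_broken(&b);
    return a.r2 == b.r2 && a.r3 == b.r3;
}
```

Calling same_result with any base in 0..11 returns 1 in this model, which is
all that "equivalent code transformation" means here; it says nothing about
whether the reordered code is faster on a given core.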

> 
> Additionally there appears to be no way to control "flags" in a
> backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES . Again if the
> DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
> it looks like we should allow for these to also be handled or describe
> TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
> scheduler.

I'm not sure I follow you here.  Any port can define 
TARGET_SCHED_SET_SCHED_FLAGS and set current_sched_info->flags to whatever it 
thinks is appropriate.  E.g., c6x does this to disable dependency breaking for 
a particular kind of loops.
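
A rough, self-contained model of that interaction may help.  The names below
mirror the ones used in this thread, but the struct layouts, the bit value of
the flag and the model_* functions are assumptions made for the sketch, not
GCC's definitions.  It shows a hook setting a bit in a global flags word, and
the scheduler skipping the step where find_modifiable_mems would run when that
bit is set.

```c
/* Stand-in definitions.  The bit value and struct layouts are assumptions
   for this sketch only; they are not taken from GCC's headers. */
#define DONT_BREAK_DEPENDENCIES 0x1u

struct sched_info_model { unsigned int flags; };
struct spec_info_model  { unsigned int mask; };

/* Plays the role of current_sched_info in this model. */
static struct sched_info_model model_sched_info;

/* Counts how often the dependency-breaking step would have run. */
static int modifiable_mems_calls;

/* Plays the role of a port's TARGET_SCHED_SET_SCHED_FLAGS implementation:
   it writes the global flags and fills in the spec_info parameter. */
void model_set_sched_flags(struct spec_info_model *spec_info)
{
    model_sched_info.flags |= DONT_BREAK_DEPENDENCIES;
    spec_info->mask = 0;   /* no speculation requested in this model */
}

/* Plays the role of the check near the start of schedule_block. */
void model_schedule_block(void)
{
    if (!(model_sched_info.flags & DONT_BREAK_DEPENDENCIES))
        modifiable_mems_calls++;   /* find_modifiable_mems would run here */
}

/* Runs one scheduling "pass" with or without the hook and returns how
   many times the dependency-breaking step ran. */
int model_run(int use_hook)
{
    struct spec_info_model spec = { 0 };

    model_sched_info.flags = 0;
    modifiable_mems_calls = 0;
    if (use_hook)
        model_set_sched_flags(&spec);
    model_schedule_block();
    return modifiable_mems_calls;
}
```

In this model, model_run(0) reports one call to the breaking step and
model_run(1) reports none.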

> 
>> 
>>> 
>>> 
>>> It's funny how I've been trying to track down a glitch and ended up
>>> asking the same question today. Additionally if I use
>>> TARGET_SCHED_SET_SCHED_FLAGS on a port that doesn't use the selective
>>> scheduler, this does nothing. Does anyone know why is this the default
>>> for ports where we don't turn on selective scheduling and might need a
>>> hook to turn this off ?
>> 
>> SCHED_FLAGS is used to enable or disable various parts of GCC scheduler.  On 
>> an architecture that supports speculative scheduling with recovery (IA64) 
>> it can turn this feature on or off.  The documentation for various features 
>> of sched-rgn, sched-ebb and sel-sched is not the best and one will likely 
>> get weird artefacts by trying out non-default settings.
> 
> 
> Well, it appears as though TARGET_SCHED_SET_SCHED_FLAGS is only valid
> with the selective scheduler on as above and is a no-op as far as
> sched-rgn goes. This whole area could do with some improved
> documentation - I'll follow up with some patches to see if I can
> improve the situation.

I don't think this is the case.  TARGET_SCHED_SET_SCHED_FLAGS has two outputs: 
one is SPEC_INFO structure (which is used for IA64 only, both for sel-sched and 
sched-rgn), and the other one is modification of current_sched_info->flags, 
which affects all schedulers (sched-rgn, sched-ebb and sel-sched) and all ports.

--
Maxim Kuvyrkov
www.kugelworks.com






Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling

2013-12-10 Thread Ramana Radhakrishnan
On Wed, Dec 11, 2013 at 12:02 AM, Maxim Kuvyrkov  wrote:
> On 11/12/2013, at 11:14 am, Ramana Radhakrishnan  
> wrote:
>
>> On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov  wrote:
>>> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan  
>>> wrote:
>>>
>>>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos  wrote:
>>>>> Hi,
>>>>>
>>>>> Near the start of schedule_block, find_modifiable_mems is called if
>>>>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It seems
>>>>> on c6x backend currently uses this.
>>>>> However, it's quite strange that this is not a requirement for all
>>>>> backends since find_modifiable_mems, moves all my dependencies in
>>>>> SD_LIST_HARD_BACK to SD_LIST_SPEC_BACK even though I don't have
>>>>> DO_SPECULATION enabled.
>>>>>
>>>>> Since dependencies are accessed later on from try_ready (for example), I
>>>>> would have thought that it would be always good not to call
>>>>> find_modifiable_mems,  given that it seems to 'literally' break
>>>>> dependencies.
>>>>>
>>>>> Is the behaviour of find_modifiable_mems a bug or somehow expected?
>>>
>>> "Breaking" a dependency in scheduler involves modification of instructions 
>>> that would allow scheduler to move one instruction past the other.  The 
>>> most common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];" 
>>> which can be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;".  Breaking a 
>>> dependency is not ignoring it, speculatively or otherwise; it is an 
>>> equivalent code transformation to allow scheduler more freedom to fill up 
>>> CPU cycles.
>>
>>
>> Yes, but there are times when it does this a bit too aggressively and
>> this looks like the cause for a performance regression that I'm
>> investigating on ARM. I was looking for a way of preventing this
>> transformation and there doesn't seem to be an easy one other than the
>> obvious hack.
>
> If you want to prevent a particular transformation from occurring, then you
> need to investigate why the scheduler thinks that there is nothing better to
> do than to schedule an instruction which requires breaking a dependency.
> "Breaking" a dependency only increases the pool of instructions available to
> schedule, and your problem seems to lie in why the wrong instruction is
> selected from that pool.
>
> Are you sure that the problem is introduced by dependency breaking, rather 
> than dependency breaking exposing a latent bug?

From my reading, the dependencies being broken are on addresses in an
unrolled memcpy-type loop. The original expectation is that switching
this to an add and a negative offset gives more ILP in theory, but in
practice the effects appear to be worse because of secondary issues
that I'm still investigating.

>
>>
>> Additionally there appears to be no way to control "flags" in a
>> backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES . Again if the
>> DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
>> it looks like we should allow for these to also be handled or describe
>> TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
>> scheduler.
>
> I'm not sure I follow you here.  Any port can define 
> TARGET_SCHED_SET_SCHED_FLAGS and set current_sched_info->flags to whatever it 
> thinks is appropriate.  E.g., c6x does this to disable dependency breaking 
> for a particular kind of loops.

Ah, that will probably work and that's probably what I was missing. In
general, though, I don't think an interface that sets global state
from within a backend is the best approach in the long term. I wasn't
expecting a hook to set global state in this way, especially when it
takes a parameter.

Thanks for the emails and the clarifications - useful enough for me to
try something in the morning.

regards
Ramana

>
>>
>>>


>>>> It's funny how I've been trying to track down a glitch and ended up
>>>> asking the same question today. Additionally if I use
>>>> TARGET_SCHED_SET_SCHED_FLAGS on a port that doesn't use the selective
>>>> scheduler, this does nothing. Does anyone know why is this the default
>>>> for ports where we don't turn on selective scheduling and might need a
>>>> hook to turn this off ?
>>>
>>> SCHED_FLAGS is used to enable or disable various parts of GCC scheduler.  
>>> On an architecture that supports speculative scheduling with recovery 
>>> (IA64) it can turn this feature on or off.  The documentation for various 
>>> features of sched-rgn, sched-ebb and sel-sched is not the best and one will 
>>> likely get weird artefacts by trying out non-default settings.
>>
>>
>> Well, it appears as though TARGET_SCHED_SET_SCHED_FLAGS is only valid
>> with the selective scheduler on as above and is a no-op as far as
>> sched-rgn goes. This whole area could do with some improved
>> documentation - I'll follow up with some patches to see if I can
>> improve the situation.
>
> I don't think this is the case.  TARGET_

Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling

2013-12-10 Thread Maxim Kuvyrkov
On 11/12/2013, at 3:45 pm, Ramana Radhakrishnan  
wrote:

> On Wed, Dec 11, 2013 at 12:02 AM, Maxim Kuvyrkov  wrote:
>> On 11/12/2013, at 11:14 am, Ramana Radhakrishnan  
>> wrote:
>> 
>>> On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov  
>>> wrote:
>>>> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan
>>>> wrote:
>>>>
>>>>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos  wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Near the start of schedule_block, find_modifiable_mems is called if
>>>>>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It
>>>>>> seems on c6x backend currently uses this.
>>>>>> However, it's quite strange that this is not a requirement for all
>>>>>> backends since find_modifiable_mems, moves all my dependencies in
>>>>>> SD_LIST_HARD_BACK to SD_LIST_SPEC_BACK even though I don't have
>>>>>> DO_SPECULATION enabled.
>>>>>>
>>>>>> Since dependencies are accessed later on from try_ready (for example), I
>>>>>> would have thought that it would be always good not to call
>>>>>> find_modifiable_mems,  given that it seems to 'literally' break
>>>>>> dependencies.
>>>>>>
>>>>>> Is the behaviour of find_modifiable_mems a bug or somehow expected?
>>>>
>>>> "Breaking" a dependency in scheduler involves modification of instructions
>>>> that would allow scheduler to move one instruction past the other.  The
>>>> most common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];"
>>>> which can be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;".  Breaking a
>>>> dependency is not ignoring it, speculatively or otherwise; it is an
>>>> equivalent code transformation to allow scheduler more freedom to fill up
>>>> CPU cycles.
>>> 
>>> 
>>> Yes, but there are times when it does this a bit too aggressively and
>>> this looks like the cause for a performance regression that I'm
>>> investigating on ARM. I was looking for a way of preventing this
>>> transformation and there doesn't seem to be an easy one other than the
>>> obvious hack.
>> 
>> If you want to prevent a particular transformation from occurring, then you
>> need to investigate why the scheduler thinks that there is nothing better
>> to do than to schedule an instruction which requires breaking a dependency.
>> "Breaking" a dependency only increases the pool of instructions available
>> to schedule, and your problem seems to lie in why the wrong instruction is
>> selected from that pool.
>> 
>> Are you sure that the problem is introduced by dependency breaking, rather 
>> than dependency breaking exposing a latent bug?
> 
> From my reading, the dependencies being broken are on addresses in an
> unrolled memcpy-type loop. The original expectation is that switching
> this to an add and a negative offset gives more ILP in theory, but in
> practice the effects appear to be worse because of secondary issues
> that I'm still investigating.

Is this happening in the 1st or 2nd scheduling pass?  From your comments I get 
a feeling that dependency breaking is introducing an additional instruction, 
rather than adding an offset to a memory reference.  Ideally, dependency 
breaking during 1st scheduling pass should be more conservative and avoid too 
many new instructions (e.g., by breaking a dependency only if nothing 
whatsoever can be scheduled on the current cycle).  Dependency breaking during 
2nd scheduling pass can be more aggressive as it can make sure that adding 
offset to a memory instruction will not cause it to be split.
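
As a sketch only, the policy described above could be written as a small
predicate.  The enum, the function name and both parameters are hypothetical;
this is not how the scheduler currently decides, just one way to express
"conservative in the 1st pass, more aggressive in the 2nd".

```c
/* Hypothetical names throughout; this is not GCC code. */
enum sched_pass { FIRST_PASS, SECOND_PASS };

/* Returns 1 if breaking a dependency would be allowed under the policy
   sketched in this message.

   ready_count:       number of instructions that can be scheduled on the
                      current cycle without breaking anything.
   stays_single_insn: non-zero if the memory instruction with the added
                      offset is known to remain one instruction, i.e. it
                      will not be split. */
int may_break_dependency(enum sched_pass pass, int ready_count,
                         int stays_single_insn)
{
    if (pass == FIRST_PASS)
        /* Conservative: break only if nothing whatsoever can be
           scheduled on the current cycle. */
        return ready_count == 0;

    /* 2nd pass: more aggressive, as long as the rewritten memory
       instruction will not be split. */
    return stays_single_insn != 0;
}
```

In the 1st pass the stays_single_insn argument is deliberately ignored, on the
assumption that this information is only reliable in the 2nd pass.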

> 
>> 
>>> 
>>> Additionally there appears to be no way to control "flags" in a
>>> backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES . Again if the
>>> DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
>>> it looks like we should allow for these to also be handled or describe
>>> TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
>>> scheduler.
>> 
>> I'm not sure I follow you here.  Any port can define 
>> TARGET_SCHED_SET_SCHED_FLAGS and set current_sched_info->flags to whatever 
>> it thinks is appropriate.  E.g., c6x does this to disable dependency 
>> breaking for a particular kind of loops.
> 
> Ah, that will probably work and that's probably what I was missing. In
> general, though, I don't think an interface that sets global state
> from within a backend is the best approach in the long term. I wasn't
> expecting a hook to set global state in this way, especially when it
> takes a parameter.

Originally TARGET_SCHED_SET_SCHED_FLAGS was setting current_sched_info->flags 
and nothing else, hence the name.  The parameter spec_info appeared later to 
hold flags related to IA64-specific speculative scheduling.


--
Maxim Kuvyrkov
www.kugelworks.com