gcc 4.3.1 optimizations with restrict / -fargument-noalias / VLA
I've been experimenting with which optimizations gcc is willing to apply depending on the kind of function arguments and the compiler flags. Perhaps someone can comment on the strange behaviour I observed: if I understand the concept of C99's "restrict" qualifier for function arguments correctly, it gives basically the same "guarantees" to the compiler as -fargument-noalias-global. However, gcc applies much more aggressive optimizations (for my test kernels the new predictive commoning seems to be most beneficial) with -fargument-noalias, and seems to take no advantage of "restrict" in C99 (or of the __restrict__ extension, in C99 as well as in C++).

Stranger still, optimization seems to be at least as good for C99's VLAs, even without restrict and without -fargument-noalias. I don't have the C99 standard to look that up, but I could not find a hint on the web that VLA arguments are forbidden to alias.

Am I overlooking something here? Why is "restrict" not useful in that context (but -fargument-noalias is)? And why does it work for VLAs without either of the two? Regards, Markus
machine-dependent passes on GIMPLE/SSA trees?
Dear GCC Developers, I am currently trying to get familiar with the basic functionality (and its implementation) of GCC (cc1). After generating the GIMPLE representation of the input parse tree, a huge number of optimisation passes take place. These passes are said to be hardware-independent. However, I am wondering whether there are perhaps also some passes which make use of the machine descriptions. I am thinking of loop optimisations which can be performed much better if the appropriate optimisation pass knows whether vector machine instructions are available, which would allow it to parallelize the loop (tree-level if-conversion for the vectorizer). Are there also other optimisation passes working on the GIMPLE/SSA representation which make use of any machine-dependent features? Thanks for help, Markus Franke
poisoned macro definitions
Dear GCC Developers, I want to port an existing backend (based on version gcc-2.7.2.3) to the most recent release (gcc-4.1.1). During the compilation process I get several messages about poisoned macro definitions. The macros which cause problems are listed below: ---snip--- "CPP_PREDEFINES" "TARGET_SWITCHES" "EXTRA_CC_MODES" "STRUCT_VALUE_REGNUM" "NO_RECURSIVE_FUNCTION_CSE" "FUNCTION_PROLOGUE" "FUNCTION_EPILOGUE" "TARGET_BELL" "TARGET_BS" "TARGET_TAB" "TARGET_NEWLINE" "TARGET_VT" "TARGET_FF" "TARGET_CR" "EASY_DIV_EXPR" "PREDICATE_CODES" "CONST_COSTS" "SELECT_RTX_SECTION" "SELECT_SECTION" "ASM_IDENTIFY_GCC" "ASM_FILE_START" "ASM_FILE_END" "ASM_OPEN_PAREN" "ASM_CLOSE_PAREN" "ASM_GLOBALIZE_LABEL" "FUNCTION_ARG_PARTIAL_NREGS" ---snap--- I read something about poisoned macros and that they shouldn't be used anymore. But in fact I was not able to find any documentation about these macros. When were they declared poisoned, and especially why? What should be done instead of using these macros? Just commenting everything out can't be a solution. I also looked in the GCC Internals manual, without any success. Thanks for your help, Markus Franke
needed header files in the backend's .c file
Dear GCC Developers / Users, I am trying to port a GCC backend from GCC 2.7.2.3 to GCC 4.1.1. After having had a look at some already existing backends like the PDP11 one, I found out that a lot of new header files have been added as includes to the backend's .c file. My question is whether some kind of standard set of header files exists which is needed by every backend. Can somebody give me a list or something like that? I have already had a look at the Internals manual without finding anything about it. Thanks in advance, Markus Franke
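For orientation, a 4.x-era backend's main .c file typically opens with an include preamble along these lines (modeled on existing ports such as pdp11; the exact set varies by port, so treat this as a sketch, not a canonical list):

```c
/* config.h, system.h and coretypes.h must come first, in this order. */
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "rtl.h"
#include "regs.h"
#include "hard-reg-set.h"
#include "insn-config.h"
#include "conditions.h"
#include "output.h"
#include "insn-attr.h"
#include "flags.h"
#include "recog.h"
#include "tree.h"
#include "expr.h"
#include "toplev.h"
#include "tm_p.h"
#include "target.h"
#include "target-def.h"
```

Beyond the first three, each port includes only what its own code actually uses, so the best guide is a recently written port of similar complexity.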
relevant files for target backends
Dear GCC Developers/Users, I am trying to port a backend from GCC version 2.7.2.3 to version 4.1.1. As far as I have seen, a lot of new relevant files were introduced to build a good structure for new backends. I am wondering where to define the prototypes for the functions in the backend's target.c. Shall the prototypes be defined in target-protos.h, in target.h, or in target.c? As far as I understand, the prototypes should be defined in target-protos.h, right? But if I do so, several errors/warnings arise because of undeclared prototypes.

Another question is where target macros should be defined. As far as I can see, target.c has something like the following structure:

---snip---
#define ...
#define ...

struct gcc_target targetm = TARGET_INITIALIZER;
---snap---

But there are also a lot of macros defined in target.h. What kind of macros should be defined in target.h, and which macros should be defined in target.c? I try to write good and standardised code in order to contribute my development. I would appreciate any help. By the way, I have already had a look in the GCC Internals manual, but I am still a bit confused. Thanks in advance and regards, Markus Franke
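To make the split concrete, here is a sketch with invented settings (the macro and hook names are real GCC ones, but the values and my_output_function_prologue are placeholders): plain compile-time macros describing the machine go in the backend's .h file, while target hooks are installed into targetm in the .c file.

```c
/* In the backend's .h file: plain target macros (placeholder values). */
#define BYTES_BIG_ENDIAN 1
#define WORDS_BIG_ENDIAN 1
#define UNITS_PER_WORD   4

/* In the backend's .c file: override selected hook defaults from
   target-def.h, then instantiate the hook vector once.  */
#undef  TARGET_ASM_FUNCTION_PROLOGUE
#define TARGET_ASM_FUNCTION_PROLOGUE my_output_function_prologue

struct gcc_target targetm = TARGET_INITIALIZER;
```

TARGET_INITIALIZER (from target-def.h) fills in every hook with its default, so the .c file only needs to #define the hooks the port actually overrides.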
Re: relevant files for target backends
Thank you for your response. I understood everything you said, but I am still confused about the file target-protos.h. Which prototypes have to be defined there? Thanks in advance, Markus Franke

pranav bhandarkar wrote:
>> I am wondering where to define the prototypes for the functions in
>> target.c. Shall the prototypes be defined in target-protos.h, in
>> target.h, or in target.c? As far as I understand, the prototypes
>> should be defined in target-protos.h, right? But if I do so, several
>> errors/warnings arise because of undeclared prototypes.
>>
>> Another question is where target macros should be defined. As far as I
>> can see, target.c has something like the following structure:
>>
>> ---snip---
>> #define ...
>>
>> struct gcc_target targetm = TARGET_INITIALIZER;
>
> target.h is used to define macros that give information such as the
> register classes, whether the target is little-endian or not, the sizes
> of integral types, etc.
> The file target.c, as you rightly said, defines the targetm
> structure that holds pointers to target-related functions and data.
> Such functions are defined in the .c file, and the target hooks are
> #defined in the .c file.
> HTH,
> Pranav
-- Nichts ist so praktisch wie eine gute Theorie!
Re: relevant files for target backends
Thank you very much. That was exactly the information I was looking for. I will think about a contribution to the GCC Internals manual. Thanks again, Markus Franke

Rask Ingemann Lambertsen wrote:
> On Tue, Jan 16, 2007 at 11:24:56AM +0100, Markus Franke wrote:
>
>> I am wondering where to define the prototypes for the functions in
>> target.c. Shall the prototypes be defined in target-protos.h, in
>> target.h, or in target.c? As far as I understand, the prototypes
>> should be defined in target-protos.h, right? But if I do so, several
>> errors/warnings arise because of undeclared prototypes.
>
> All functions and variables not declared static in target.c should
> have a prototype in target-protos.h. Also, does this patch help?
>
> Index: gcc/var-tracking.c
> ===
> --- gcc/var-tracking.c  (revision 120287)
> +++ gcc/var-tracking.c  (working copy)
> @@ -106,6 +106,7 @@
>  #include "expr.h"
>  #include "timevar.h"
>  #include "tree-pass.h"
> +#include "tm_p.h"
>
>  /* Type of micro operation.  */
>  enum micro_operation_type
>
>> But there are also a lot of macros defined in target.h. What kind of
>> macros should be defined in target.h, and which macros should be
>> defined in target.c?
>
> Those listed as macros in the documentation should go into target.h,
> while those listed as target hooks should go into target.c.
>
>> I try to write good and standardised code in order to contribute my
>> development. I would appreciate any help. By the way, I already had a
>> look in the GCC Internals manual but I am still a bit confused.
>
> I would like to encourage you to submit a patch for the GCC Internals
> manual to make it clearer.
-- Nichts ist so praktisch wie eine gute Theorie!
order of local variables in stack frame
Dear GCC Developers, I am working on a target backend for the DLX architecture and I have a question concerning the layout of the stack frame. Here is a simple test C program:

---snip---
int main(void)
{
    int a = 1;
    int b = 2;
    int c = a + b;
    return c;
}
---snap---

The initialisation of the variables a and b produces the following output:

---snip---
movl    $1, -24(%ebp)
movl    $2, -20(%ebp)
---snap---

Although I have declared "STACK_GROWS_DOWNWARD", the variables a and b lie upwards in memory (-24 < -20). Shouldn't it be the other way around, because the stack should grow downwards towards smaller addresses? I think it should be like this:

---snip---
movl    $1, -20(%ebp)
movl    $2, -24(%ebp)
---snap---

Please let me know whether I have misunderstood something completely. If this behaviour is correct, what can I do to change it to the other way around? Which macro variable do I have to change? Thanks in advance, Markus Franke
Re: order of local variables in stack frame
Well, you are right. The code looks good and works, too. But I have a kind of reference implementation which is based on GCC 2.7.2.3. In that version the local variables are allocated the other way around, the way I expected. Obviously, the order of allocation has changed somewhere between then and 4.1.1. I just wanted to know whether I can correct this; if not, that's also OK. Thanks, Markus

Robert Dewar wrote:
> Markus Franke wrote:
>
>> Please let me know whether I have misunderstood something completely. If
>> this behaviour is correct, what can I do to change it to the other way
>> around? Which macro variable do I have to change?
>
> There is no legitimate reason to care about the order of variables
> in the local stack frame! Or at least I don't see one, why do *you*
> care? Generally one may want to reorder the variables for alignment
> purposes anyway.
-- Nichts ist so praktisch wie eine gute Theorie!
warning: source missing a mode?
Dear GCC Developers, I have problems with a DLX backend which I have ported to GCC 4.1.1. During the compilation of gcc I get warnings about missing mode definitions in the machine description file. The following instruction template is affected:

---snip---
;;
;; calls that return int in r1
;;
(define_insn "call_val_internal_return_r1"
  [(parallel [(set (reg:SI 1)
                   (call (match_operand:QI 0 "sym_ref_mem_operand" "")
                         (match_operand 1 "" "i")))
              (clobber (reg:SI 31))])]
  ""
  "jal\\t%S0%("
  [(set_attr "type" "jump")
   (set_attr "mode" "none")])
---snap---

I think the warning is caused by the second parameter of the set instruction, right? But I don't know where to specify the source mode. I have already had a look into the GCC Internals manual without success. Any suggestions how to fix this problem? Regards, Markus Franke
Re: warning: source missing a mode?
Hello, thank you for your answer. Having changed the code in the way you suggested, I still get the same warning message. Any further suggestions? Regards, Markus

Ian Lance Taylor wrote:
> Markus Franke <[EMAIL PROTECTED]> writes:
>
>> ---snip---
>> ;;
>> ;; calls that return int in r1
>> ;;
>> (define_insn "call_val_internal_return_r1"
>> [(parallel [(set (reg:SI 1)
>>      (call (match_operand:QI 0 "sym_ref_mem_operand" "")
>>            (match_operand 1 "" "i")))
>> (clobber (reg:SI 31))])]
>>  ""
>>  "jal\\t%S0%("
>>  [(set_attr "type" "jump")
>>   (set_attr "mode" "none")])
>> ---snap---
>>
>> I think the warning is caused by the second parameter of the set
>> instruction, right? But I don't know where to specify the source mode.
>> I had already a look into the GCC Internals manual without success.
>
> The missing mode is here:
>     (match_operand 1 "" "i")
> That should most likely be
>     (match_operand:SI 1 "" "i")
>
> Ian
-- Nichts ist so praktisch wie eine gute Theorie!
Re: warning: source missing a mode?
Ian Lance Taylor wrote:
> Oh, yeah, you probably want to say call:SI too.

Yes, I can do this and the warning message disappears. But now I get an internal error about a non-matching RTL expression when compiling a test program. Without "call:SI" I get warnings during the compilation of gcc, but the compilation of my test program works. :-)
Re: warning: source missing a mode?
Ian Lance Taylor wrote:
> Presumably the insn which doesn't match uses call with some mode other
> than SI. What mode does it use? What generated that insn with some
> other mode?

Well, the internal compiler error which is thrown looks like this:

---snip---
Processing file 2120-2.c
2120-2.c: In function 'foo':
2120-2.c:13: error: unrecognizable insn:
(call_insn 14 13 15 1 (parallel [
            (set:SI (reg:SI 1 r1)
                (call (mem:QI (symbol_ref:SI ("odd") [flags 0x3]) [0 S1 A8])
                    (const_int 8 [0x8])))
            (clobber (reg:SI 31 r31))
        ]) -1 (nil)
    (expr_list:REG_EH_REGION (const_int 0 [0x0])
        (nil))
    (nil))
2120-2.c:13: internal compiler error: in extract_insn, at recog.c:2084
Please submit a full bug report, with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
---snap---

By the way, this source code file is from the GCC testsuite (torture tests). My corresponding RTL template now looks like this:

---snip---
;;
;; calls that return int in r1
;;
(define_insn "call_val_internal_return_r1"
  [(parallel [(set (reg:SI 1)
                   (call:SI (match_operand:QI 0 "sym_ref_mem_operand" "")
                            (match_operand:SI 1 "" "i")))
              (clobber (reg:SI 31))])]
  ""
  "jal\\t%S0%("
  [(set_attr "type" "jump")
   (set_attr "mode" "none")])
---snap---

> Actually, most targets don't use a mode with call, since a call can
> often return multiple modes. But then they don't use a mode for the
> result of the call, either. Look at what other targets do.

So far I haven't found a similar template where the return value being set is a "reg:SI". I will keep on searching. :-) Regards, Markus Franke
Re: warning: source missing a mode?
Hello, thanks for the reply. I have been reading and comparing machine description files from other backends. Now I am using a template like this:

---snip---
(define_insn "call_value"
  [(parallel [(set (match_operand 0 "" "")
                   (call (mem:QI (match_operand:QI 1 "sym_ref_mem_operand" ""))
                         (match_operand:SI 2 "" "")))
              (clobber (reg:SI 31))])]
  ""
  "jal\\t%S0%("
  [(set_attr "type" "jump")
   (set_attr "mode" "none")])
---snap---

This pattern matches for function calls, but unfortunately now I get a segmentation fault during the compilation of the following program:

---snip---
int odd(int i)
{
    return i & 0x1;
}

int foo(int i, int j)
{
    int a;
    a = odd(i + j);
    return a;
}
---snap---

The segmentation fault occurs in the last line of the foo() function. Any suggestions? Regards and thanks for any comments, Markus

Richard Henderson wrote:
> On Thu, Feb 22, 2007 at 11:10:09AM +0100, Markus Franke wrote:
>
>> ;; calls that return int in r1
>> ;;
>> (define_insn "call_val_internal_return_r1"
>> [(parallel [(set (reg:SI 1)
>>      (call (match_operand:QI 0 "sym_ref_mem_operand" "")
>
> Why would you have a specific pattern for (reg:SI 1) as
> the return value? I suggest you look at other ports for
> guidance here. Normal is something like
>
> (define_insn "*call_value_1"
>   [(set (match_operand 0 "" "")
>         (call (mem:QI (match_operand:SI 1 "call_insn_operand" "rsm"))
>               (match_operand:SI 2 "" "")))]
>   ...)
>
> where both the return value and the call have no mode specified.
>
> r~
-- Nichts ist so praktisch wie eine gute Theorie!
call_value pattern
Dear GCC Developers, I have a rather simple "call_value" instruction pattern in my machine description:

---snip---
(define_insn "call_value"
  [(set (match_operand 0 "" "")
        (call (match_operand:QI 1 "" "")
              (match_operand:SI 2 "" "")))]
  ""
  "jal\\t%S0%("
  [(set_attr "type" "jump")
   (set_attr "mode" "none")])
---snap---

However, when I try to compile the following example I get an internal compiler error:

---snip---
int odd(int i)
{
    return i & 0x1;
}

int foo(int i, int j)
{
    int a;
    a = odd(i + j);
    return a;
}
---snap---

If I simply omit the "a = odd(i + j);" line, everything works fine, so it has something to do with this call. What am I doing wrong? It seems so simple, but I can't figure out what's wrong with my pattern. Regards, Markus Franke
error: unable to find a register to spill in class 'FP_REGS'
Dear GCC Developers/Users, I am working on a port of a target backend. I have a problem when compiling the following program:

---snip---
short b = 5;
short c = 5;

int main(void)
{
    long int a[b][c];
    a[1][1] = 5;
    return 0;
}
---snap---

During compilation I get the following error message:

---snip---
simple_alloc.c: In function 'main':
simple_alloc.c:8: error: unable to find a register to spill in class 'FP_REGS'
simple_alloc.c:8: error: this is the insn:
(insn 45 47 46 0 (set (subreg:SI (reg:DI 92 [ D.1212 ]) 4)
        (reg:SI 1 r1 [orig:93 D.1211 ] [93])) 40 {movsi_general} (nil)
    (expr_list:REG_DEAD (reg:SI 1 r1 [orig:93 D.1211 ] [93])
        (expr_list:REG_NO_CONFLICT (reg:SI 1 r1 [orig:93 D.1211 ] [93])
            (nil))))
simple_alloc.c:8: confused by earlier errors, bailing out
---snap---

After a while of diving through the gcc source code I can't get a clue what's going wrong. The problem is that the reload pass fails because it can't find any free registers for spilling, right? But the compiler says that it wants to spill in class 'FP_REGS', and the above RTL statement doesn't have anything to do with floating-point arithmetic!

I found out that in gcc/reload1.c:find_reg() the following if statement never gets satisfied:

---snip---
if (! TEST_HARD_REG_BIT (not_usable, regno)
    && ! TEST_HARD_REG_BIT (used_by_other_reload, regno)
    && HARD_REGNO_MODE_OK (regno, rl->mode))
---snap---

This in the end causes the error message, because no suitable registers could be found. Does anybody have an idea what could be wrong in the machine description, or where to start looking for the error? Any suggestions are welcome. Regards, Markus Franke
-- Nichts ist so praktisch wie eine gute Theorie!
Re: error: unable to find a register to spill in class 'FP_REGS'
Hello, thanks for your answer. Here is an excerpt of the .00.expand file for insn 45:

---snip---
(insn 45 47 46 1 (set (subreg:SI (reg:DI 92 [ D.1212 ]) 4)
        (reg:SI 93 [ D.1211 ])) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:SI 93 [ D.1211 ])
        (nil)))
---snap---

That means the compiler has to reload the pseudo registers 92 and 93 for this instruction, right?

Jim Wilson wrote:
> and lreg ones. The greg one will have a section listing all of the
> reloads generated, find the list of reloads generated for this insn 45.

The relevant data for instruction 45 in .greg looks like this:

---snip---
;; Function main

;; Need 2 regs of class FP_REGS (for insn 15).
;; Need 2 regs of class ALL_REGS (for insn 15).
Spilling reg 32.
Spilling reg 33.

;; Register dispositions:
69 in 3  70 in 8  71 in 4  72 in 8  73 in 5  74 in 8  75 in 6  76 in 1
77 in 8  78 in 9  79 in 10  80 in 9  81 in 8  82 in 9  83 in 8  84 in 10
85 in 8  86 in 9  87 in 8  88 in 8  89 in 9  90 in 7  91 in 9  92 in 9
93 in 10  94 in 2

;; Hard regs used: 1 2 3 4 5 6 7 8 9 10 29 30 31 32 33

(insn 45 44 47 (use (reg:SI 8 r8)) -1 (nil)
    (nil))
---snap---

> lreg will have info about register class preferencing. It will tell you
> what register class the compiler wants to use for this insn.

Same for the .lreg file:

---snip---
;; Function main

95 registers.

Register 69 used 7 times across 0 insns.
Register 70 used 2 times across 0 insns; pointer.
Register 71 used 6 times across 0 insns.
Register 72 used 2 times across 0 insns.
Register 73 used 6 times across 0 insns.
Register 74 used 2 times across 0 insns; pointer.
Register 75 used 7 times across 0 insns.
Register 76 used 2 times across 0 insns; pref FP_REGS.
Register 77 used 4 times across 0 insns.
Register 78 used 2 times across 0 insns.
Register 79 used 2 times across 0 insns.
Register 80 used 2 times across 0 insns.
Register 81 used 2 times across 0 insns.
Register 82 used 2 times across 0 insns.
Register 83 used 2 times across 0 insns.
Register 84 used 2 times across 0 insns.
Register 85 used 2 times across 0 insns; pref FP_REGS; pointer.
Register 86 used 2 times across 0 insns.
Register 87 used 2 times across 0 insns.
Register 88 used 4 times across 0 insns.
Register 89 used 2 times across 0 insns.
Register 90 used 6 times across 0 insns.
Register 91 used 2 times across 0 insns.
Register 92 used 2 times across 0 insns; pointer.
Register 93 used 2 times across 0 insns.
Register 94 used 2 times across 0 insns; crosses 1 call; pref FP_REGS.

0 basic blocks.

;; Register 69 in 3.  ;; Register 70 in 8.  ;; Register 71 in 4.
;; Register 72 in 8.  ;; Register 73 in 5.  ;; Register 74 in 8.
;; Register 75 in 6.  ;; Register 76 in 1.  ;; Register 77 in 8.
;; Register 78 in 9.  ;; Register 79 in 10. ;; Register 80 in 9.
;; Register 81 in 8.  ;; Register 82 in 9.  ;; Register 83 in 8.
;; Register 84 in 10. ;; Register 85 in 8.  ;; Register 86 in 9.
;; Register 87 in 8.  ;; Register 88 in 8.  ;; Register 89 in 9.
;; Register 90 in 7.  ;; Register 91 in 9.  ;; Register 92 in 9.
;; Register 93 in 10. ;; Register 94 in 2.

(insn 45 44 47 (use (reg:SI 88)) -1 (nil)
    (nil))
---snap---

> The fact that this insn doesn't do FP isn't important. What is
> important is how the pseudo-regs are used. If the pseudo-reg 92 is used
> in 10 insns, and 8 of them are FP insns and 2 are integer move insns,
> then the register allocator will prefer an FP reg, since that should
> give the best overall result, as only 2 insns will need reloads. If it
> used an integer reg, then 8 insns would need reloads.

OK, that's clear, thanks for explaining. Nevertheless, I can't figure out from the above files what's wrong; maybe I am just lacking the right interpretation of these intermediate files. Can you spot any abnormal behaviour in the above excerpts? Thanks in advance, Markus Franke
Re: error: unable to find a register to spill in class 'FP_REGS'
Again with the attachment and a CC to the mailing list. Sorry for missing this. Regards, Markus

---

Hello, thanks for your instructions. Indeed you were right: I had mixed up some files. Again an excerpt of the output files:

---snip---
// expand
(insn 45 47 46 1 (set (subreg:SI (reg:DI 92 [ D.1212 ]) 4)
        (reg:SI 93 [ D.1211 ])) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:SI 93 [ D.1211 ])
        (nil)))
---snap---

---snip---
// lreg
(insn 45 47 46 0 (set (subreg:SI (reg:DI 92 [ D.1212 ]) 4)
        (reg:SI 93 [ D.1211 ])) 40 {movsi_general} (nil)
    (expr_list:REG_DEAD (reg:SI 93 [ D.1211 ])
        (expr_list:REG_NO_CONFLICT (reg:SI 93 [ D.1211 ])
            (nil))))
---snap---

---snip---
// greg
;; Function main (main)

Spilling for insn 45.
---snap---

Additionally, you can find the three files in the attachment. It would be really nice of you if you could have a look at them. Maybe you can get a clue what's going wrong. Regards, Markus Franke

Jim Wilson wrote:
> Markus Franke wrote:
>
>> That means the compiler has to reload the pseudo registers 92 and 93 for
>> this instruction, right?
>
> First we do register allocation. Then, after register allocation, if
> the chosen hard registers don't match the constraints, then we use
> reload to fix it.
>
>> The relevant data for instruction 45 in .greg looks like that:
>
> Insn 45 in the greg dump looks nothing like the insn 45 in the expand
> dump, which means you are looking at the wrong insn here. But it was
> insn 45 in the original mail. Did you change the testcase perhaps? Or
> use different optimization options?
>
> The info we are looking for should look something like this
>   Reloads for insn # 13
>   Reload 0: reload_out (SI) = (reg:SI 97)
>       R1_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
>       reload_out_reg: (reg:SI 97)
>       reload_reg_rtx: (reg:SI 1 %r1)
>
>> ;; Register 92 in 9.
>> ;; Register 93 in 10.
>
> This tells us that pseudo 92 was allocated to hard reg 9, and pseudo 93
> was allocated to hard reg 10. I didn't see reg class preferencing info
> for these regs, but maybe it is in one of the other dump files.
>
> The earlier message has rtl claiming that pseudo 92 got allocated to
> register 1 (r1). I seem to be getting inconsistent information here.
-- Nichts ist so praktisch wie eine gute Theorie!

;; Function main (main)

;; Generating RTL for tree basic block 0

;;
;; Full RTL generated for this function:
;;
(note 2 0 6 NOTE_INSN_DELETED)

;; Start of basic block 0, registers live: (nil)
(note 6 2 3 0 [bb 0] NOTE_INSN_BASIC_BLOCK)
(note 3 6 180 0 NOTE_INSN_FUNCTION_BEG)
(insn 180 3 4 0 (set (reg:SI 139)
        (reg/f:SI 29 r29)) -1 (nil)
    (nil))
(note 4 180 5 0 NOTE_INSN_DELETED)
(call_insn 5 4 7 0 (parallel [
            (call (mem:QI (symbol_ref:SI ("__main") [flags 0x41]) [0 S1 A8])
                (const_int 0 [0x0]))
            (clobber (reg:SI 31 r31))
        ]) -1 (nil)
    (expr_list:REG_EH_REGION (const_int 0 [0x0])
        (nil))
    (nil))
;; End of basic block 0, registers live: (nil)

;; Start of basic block 1, registers live: (nil)
(note 7 5 8 1 [bb 1] NOTE_INSN_BASIC_BLOCK)
(insn 8 7 9 1 (set (reg:SI 108)
        (reg/f:SI 29 r29)) -1 (nil)
    (nil))
(insn 9 8 11 1 (set (reg:SI 70 [ saved_stack3 ])
        (reg:SI 108)) -1 (nil)
    (nil))
(insn 11 9 12 1 (set (reg/f:SI 109)
        (symbol_ref:SI ("c") [flags 0x2])) -1 (nil)
    (nil))
(insn 12 11 13 1 (set (reg:HI 106 [ c0 ])
        (mem/c/i:HI (reg/f:SI 109) [0 c+0 S2 A16])) -1 (nil)
    (nil))
(insn 13 12 14 1 (set (reg:SI 105 [ D.1199 ])
        (sign_extend:SI (reg:HI 106 [ c0 ]))) -1 (nil)
    (nil))
(insn 14 13 15 1 (set (reg:SI 104 [ D.1200 ])
        (plus:SI (reg:SI 105 [ D.1199 ])
            (const_int -1 [0x]))) -1 (nil)
    (nil))
(insn 15 14 16 1 (set (reg:SI 103 [ D.1201 ])
        (reg:SI 104 [ D.1200 ])) -1 (nil)
    (nil))
(insn 16 15 19 1 (set (reg:SI 102 [ D.1202 ])
        (sign_extend:SI (reg:HI 106 [ c0 ]))) -1 (nil)
    (nil))
(insn 19 16 17 1 (clobber (reg:DI 101 [ D.1203 ])) -1 (nil)
    (insn_list:REG_LIBCALL 18 (nil)))
(insn 17 19 18 1 (set (subreg:SI (reg:DI 101 [ D.1203 ]) 4)
        (reg:SI 102 [ D.1202 ])) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:SI 102 [ D.1202 ])
        (nil)))
(insn 18 17 22 1 (set (subreg:SI (reg:DI 101 [ D.1203 ]) 0)
        (const_int 0 [0x0])) -1 (nil)
    (insn_list:REG_RETVAL 19 (expr_list:REG_NO_CONFLICT (reg:SI 102 [ D.1202 ])
        (nil))))
(insn 22 18 20 1 (clobber (reg:DI 110)) -1 (nil)
    (insn_list:REG_LIBCALL 21 (nil)))
(insn 20 22 21 1 (set (subreg:SI (reg:DI 110) 0)
        (and:SI (subreg:SI (reg:DI 101 [ D.1203 ]) 0)
            (const_int 15 [0xf]))) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:DI 101 [ D.1203 ])
        (nil)))
(insn 21 20 25 1 (set (subr
Re: Skipping incompatible libraries on a SPARC cross compile
On Tue, Nov 08, 2005 at 09:17:10AM -0700, Mark Cuss wrote:
> Hi Eric
>
> sparc-sun-solaris2.9-objdump -f returns the following:
> libc.so:
> start address 0x
> ...

Congratulations, this must be the longest top-post ever.
-- Markus
Re: debug-early branch merged into mainline
On 2015.06.06 at 18:52 -0400, Aldy Hernandez wrote:
> On 06/06/2015 05:47 PM, Aldy Hernandez wrote:
> > On 06/06/2015 03:33 PM, Jan Hubicka wrote:
> >> Aldy,
> >> also at PPC64le LTO bootstrap (at gcc112) dies with:
> >> ^
> >> 0x104ae8f7 check_die
> >> ../../gcc/dwarf2out.c:5715
> >
> > Hmmm... this is in the LTO/ltrans stage? If so, that's weird. The LTO
> > path does not do the early DIE dance. Since check_die() is a new sanity
> > check for DIEs, I wonder if this is a latent bug that was already there.
>
> It looks like ppc64le fails to bootstrap with the same comparison
> failure aarch64 fails with, so I need to take a look at ppc64le regardless.
>
> However, for your particular problem, I wonder if this was a preexisting
> condition. Would you mind reproducing your problem without my
> debug-early patchset, but with the attached patch?
>
> The commit prior to debug-early is:
>
> git commit d51560f9afc5c8a826bcfa6fc90a96156b623559
> trunk@224160
>
> The attached patch adds the sanity check, but without the debug-early
> work. If you still get a failure, this is a pre-existing bug.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66468 for a small testcase. It looks like this is not a pre-existing issue, because with your sanity check alone there is no failure.
-- Markus
Re: Moving to git
On 2015.08.21 at 06:47 -0700, H.J. Lu wrote:
> On Fri, Aug 21, 2015 at 6:37 AM, Ramana Radhakrishnan wrote:
> > On Fri, Aug 21, 2015 at 11:48 AM, Jonathan Wakely wrote:
> >> On 21 August 2015 at 11:44, Ramana Radhakrishnan wrote:
> >>>>
> >>>> Absolutely, a non-fast-forward push is anathema for anything other people
> >>>> might be working on. The git repository already prohibits this; people that
> >>>> want to push-rebase-push their own branches need to delete the branch before
> >>>> pushing again.
> >>>
> >>> On the FSF trunk and the main release branches - I agree this is a
> >>> complete no-no.
> >>>
> >>> A push-rebase-push development model is possible / may be useful when
> >>> the developers collaborating on that branch agree on that model.
> >>
> >> Teams following a different model could use a separate repo shared by
> >> those developers, not the gcc.gnu.org one. It's much easier to do that
> >> with git.
> >
> > Yes you are right they sure can, but one of the reasons that teams are
> > doing their development on a feature branch is so that they can obtain
> > feedback and collaborate with others in the community. People wanting
> > to adopt more aggressive uses of git should be allowed to do so in
> > their private branches as long as they are not able to mess up the
> > official branches in the repository.
> >
> > If there is no way to have some branches in a repo allow rebasing and
> > others not, that's fine but I'd like to know that's the case.
> >
> > Adopting restrictions on the official branches is quite right (list
> > below not extensive but it sounds like) ...
> >
> > a. no rebase / rewriting history
> > b. no git merges from feature branches.
>
> One very frustrating thing for me is that "git bisect" doesn't always
> work. I think cherry-pick is OK, but probably not rebase nor merge.
>
> Can we enforce that "git bisect" must work on official branches?
The Linux kernel uses merges all the time, yet "git bisect" works without any issues. So this is not a reason to forbid merges.

BTW, while I have your attention: why are you constantly creating (rebasing) and deleting branches? Why not simply use a local git tree for that purpose?
-- Markus
Re: Offer of help with move to git
On 2015.08.23 at 11:36 -0500, Segher Boessenkool wrote:
> On Sun, Aug 23, 2015 at 12:26:25PM -0400, Eric S. Raymond wrote:
> > One way to do it would be to mine the list archives for not just names
> > but name-date pairs. With a little scripting work that could be processed
> > into a sequence of map files, each one valid for a known span of dates. The
> > only assumption required is that an email address is valid for a person
> > until explicitly superseded by a different address in the archive.
>
> We also have a MAINTAINERS file (in the toplevel dir of the repo) that
> should hold useful email addresses for everyone, at any point in time.
> Of course sometimes people forget to update it. It also does not hold
> the actual account names, but you can almost always get those from the
> checkin to the MAINTAINERS file itself (or correlate with ChangeLogs,
> etc.) Won't that work better than the ML archives?

Another possibility would be to simply use the @gcc.gnu.org addresses. That should make the mapping pretty straightforward.
-- Markus
Re: compiling gcc 5.x from scratch: issues
On 2015.09.09 at 08:36 +, Michael Mishourovsky wrote:
> At my work I would like to have a recent gcc installed, but I have no
> sudo rights to update the current gcc (it's 4.4.7, and the OS is
> Red Hat Linux).
>
> So I checked out the latest version of gcc via svn and, following the
> guidelines given on the gcc web page, tried to download the
> prerequisites, configure gcc to be compiled into a separate folder,
> and start make. But I got a lot of errors, including errors in test
> files (easy to fix) and errors during linking (which seems to be
> almost the final step); eventually I failed to build it. As this
> software is extensively tested, could you tell me whether someone has
> tried to build the latest gcc (say, 5.1 or 5.2) from scratch,
> following the guidelines on the gcc website, and succeeded? If
> possible, could you check whether it is still possible to compile
> gcc, and update the guidelines if something has changed over time?

Please make sure that you use a build directory outside the gcc source tree. You may simply follow: https://gcc.gnu.org/install/ If the issue still persists, please post the exact errors that you encounter.
-- Markus
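For reference, a minimal out-of-tree build along the lines of the install guide (the paths and the --prefix are placeholders to adapt; no sudo is needed with a prefix under $HOME):

```shell
# Build gcc outside the source tree, as https://gcc.gnu.org/install/
# recommends; building inside the tree is a common cause of failure.
cd ~/src/gcc                       # the checked-out sources (placeholder path)
./contrib/download_prerequisites   # fetches gmp, mpfr, mpc
mkdir ~/src/gcc-build && cd ~/src/gcc-build
../gcc/configure --prefix=$HOME/opt/gcc --disable-multilib
make -j4
make install
```

Afterwards, putting $HOME/opt/gcc/bin at the front of PATH selects the new compiler without touching the system gcc.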
Re: Devirtualization causing undefined symbol references at link?
On 2015.11.16 at 14:18 -0800, Steven Noonan wrote: > Hi folks, > > (I'm not subscribed to the list, so please CC me on all responses.) > > This is using GCC 5.2 on Linux x86_64. On a project at work I've found > that one of our shared libraries refuses to link because of some > symbol references it shouldn't be making. If I add "-fno-devirtualize > -fno-devirtualize-speculatively" to the compile flags, the issue goes > away and everything links/runs fine. The issue does *not* appear on > GCC 4.8 (which is used by our current production toolchain). > > First of all, does anyone have any ideas off the top of their head why > devirtualization would break like this? > > Second, I'm looking for any ideas on how to gather meaningful data to > submit a useful bug report for this issue. The best idea I've come up > with so far is to preprocess one of the sources with the incorrect > references and use 'delta' to reduce it to a minimal preprocessed > source file that references one of these incorrect symbols. > Unfortunately this is a sluggish process because such a minimal test > case would need to compile correctly to an object file -- so "delta" > is reducing it very slowly. So far I'm down from 11MB preprocessed > source to 1.1MB preprocessed source after running delta a few times. These undefined references are normally user errors. For example, when you define an inline function, you need to link with the symbols it uses. 
markus@x4 /tmp % cat main.ii
struct A {
  void foo();
};
struct B {
  A v;
  virtual void bar() { v.foo(); }
};
struct C {
  B *w;
  void Test() {
    if (!w)
      return;
    while (1)
      w->bar();
  }
};
C a;
int main() { a.Test(); }

markus@x4 /tmp % g++ -fno-devirtualize -O2 -Wl,--no-undefined main.ii
markus@x4 /tmp % g++ -O2 -Wl,--no-undefined main.ii
/tmp/ccEvh2dL.o:main.ii:function B::bar(): error: undefined reference to 'A::foo()'
/tmp/ccEvh2dL.o:main.ii:function main: error: undefined reference to 'A::foo()'
collect2: error: ld returned 1 exit status

Instead of delta you could try creduce. It is normally much quicker:

https://github.com/csmith-project/creduce

-- Markus
Re: Devirtualization causing undefined symbol references at link?
On 2015.11.23 at 11:11 -0800, Steven Noonan wrote: > On Tue, Nov 17, 2015 at 1:09 AM, Markus Trippelsdorf > wrote: > > On 2015.11.16 at 14:18 -0800, Steven Noonan wrote: > >> Hi folks, > >> > >> (I'm not subscribed to the list, so please CC me on all responses.) > >> > >> This is using GCC 5.2 on Linux x86_64. On a project at work I've found > >> that one of our shared libraries refuses to link because of some > >> symbol references it shouldn't be making. If I add "-fno-devirtualize > >> -fno-devirtualize-speculatively" to the compile flags, the issue goes > >> away and everything links/runs fine. The issue does *not* appear on > >> GCC 4.8 (which is used by our current production toolchain). > >> > >> First of all, does anyone have any ideas off the top of their head why > >> devirtualization would break like this? > >> > >> Second, I'm looking for any ideas on how to gather meaningful data to > >> submit a useful bug report for this issue. The best idea I've come up > >> with so far is to preprocess one of the sources with the incorrect > >> references and use 'delta' to reduce it to a minimal preprocessed > >> source file that references one of these incorrect symbols. > >> Unfortunately this is a sluggish process because such a minimal test > >> case would need to compile correctly to an object file -- so "delta" > >> is reducing it very slowly. So far I'm down from 11MB preprocessed > >> source to 1.1MB preprocessed source after running delta a few times. > > > > These undefined references are normally user errors. For example, when > > you define an inline function, you need to link with the symbols it > > uses. 
> > > > markus@x4 /tmp % cat main.ii > > struct A { > > void foo(); > > }; > > struct B { > > A v; > > virtual void bar() { v.foo(); } > > }; > > struct C { > > B *w; > > void Test() { > > if (!w) > > return; > > while (1) > > w->bar(); > > } > > }; > > C a; > > int main() { a.Test(); } > > > > markus@x4 /tmp % g++ -fno-devirtualize -O2 -Wl,--no-undefined main.ii > > markus@x4 /tmp % g++ -O2 -Wl,--no-undefined main.ii > > /tmp/ccEvh2dL.o:main.ii:function B::bar(): error: undefined reference to > > 'A::foo()' > > /tmp/ccEvh2dL.o:main.ii:function main: error: undefined reference to > > 'A::foo()' > > collect2: error: ld returned 1 exit status > > > > Instead of using delta you could try creduce instead. It is normally > > much quicker: > > > > https://github.com/csmith-project/creduce > > > > creduce did make a much smaller test case, and it's actually sort of > readable. I'm not sure that I selected for the right criteria in my > test script though. It appears to exhibit the negative behavior we're > observing at least. > > --- > namespace panorama { > class A { > public: > virtual int *AccessIUIStyle() = 0; > }; > class CUIPanel : A { > int *AccessIUIStyle() { return AccessStyle(); } > int *AccessStyle() const; > }; > class B { > float GetSplitterPosition(); > A *m_pIUIPanel; > }; > } > using namespace panorama; > float B::GetSplitterPosition() { > m_pIUIPanel->AccessIUIStyle(); > return 0.0f; > } Yes. It is the same issue that I've pointed out in my example above. You need to either link with the object file that provides the _ZNK8panorama8CUIPanel11AccessStyleEv symbol. Or move the definition of panorama::CUIPanel::AccessIUIStyle() to the file that defines panorama::CUIPanel::AccessStyle(). -- Markus
Re: getting bugzilla access for my account
On 2016.01.02 at 03:49 -0500, Mike Frysinger wrote: > seeing as how i have commit access to the gcc tree, could i have > my bugzilla privs extended as well ? atm i only have normal ones > which means i only get to edit my own bugs ... can't dupe/update > other ones people have filed. couldn't seem to find docs for how > to request this, so spamming this list. Just log in with your gcc email address... -- Markus
Re: Status of GCC 6 on x86_64 (Debian)
On 2016.01.22 at 11:27 -0800, Martin Michlmayr wrote: > * Martin Michlmayr [2016-01-21 21:09]: > > * 13: test suite failures (segfaults and similar); not clear if the > > package or if GCC is at fault. > > Rene Engelhard pointed me to > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69327 > which might explain some of these segfaults. My guess would be that most segfaults are caused by the much more aggressive optimization based on the assumption that "this" must never be NULL in C++. QT5, Chromium and Kdevelop all call methods from a NULL pointer and crash for this reason. -fno-delete-null-pointer-checks is a workaround. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68853 for an analysis of the Chromium crash. -- Markus
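A minimal illustration of the pattern that breaks (class and member names are hypothetical): GCC 6 assumes `this` is never null inside a member function, so defensive checks like the one below can be folded away unless -fno-delete-null-pointer-checks is given.

```cpp
#include <cassert>

// Hypothetical class showing the crash pattern: a defensive
// null-'this' check. GCC 6 may fold the check to 'true' at -O2,
// because calling a member function through a null pointer is
// already undefined behavior.
struct Widget {
    int id = 42;
    bool valid() { return this != nullptr; }  // may be optimized away
    int get_id() { return id; }
};

// Widget *q = nullptr; q->valid();  // UB: the check offers no protection
```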
Re: Status of GCC 6 on x86_64 (Debian)
On 2016.02.03 at 01:13 +0100, Matthias Klose wrote: > On 22.01.2016 08:27, Matthias Klose wrote: > >On 22.01.2016 06:09, Martin Michlmayr wrote: > >>In terms of build failures, I reported 520 bugs to Debian. Most of them > >>were new GCC errors or warnings (some packages use -Werror and many > >>-Werror=format-security). > >> > >>Here are some of the most frequent errors see: > > > >[...] > >Martin tagged these issues; https://wiki.debian.org/GCC6 has links with these > >bug searches. > > Now added the issues with the gcc6-unknown tag, including packages with > build failures in running the test suites, which might point out wrong-code > issues. Looks like Google Mock (of googletest) invokes undefined behavior (member call on null "this" pointer). So potentially all packages that use googletest in their testsuite may be affected. I've opened: https://github.com/google/googletest/issues/705 -- Markus
Re: gnu-gabi group
On 2016.02.19 at 12:57 -0800, H.J. Lu wrote: > On Mon, Feb 15, 2016 at 10:20 AM, Jose E. Marchesi > wrote: > > > > > Great. I'll ask overseers to create a mailinglist. [...] > > > > Done [1] [2]. If y'all need a wiki too, just ask. > > > > [1] gnu-g...@sourceware.org > > [2] https://sourceware.org/ml/gnu-gabi/ > > > > The link to the "GNU GABI project web page" in > > https://sourceware.org/ml/gnu-gabi is broken. > > How do I subscribe gnu-abi mailing list? The project page just > points to the mailing list archive. There is no option to subscribe > it. https://sourceware.org/lists.html#ml-requestor -- Markus
Re: [isocpp-parallel] Proposal for new memory_order_consume definition
On 2016.02.27 at 15:10 -0800, Paul E. McKenney via llvm-dev wrote: > On Sat, Feb 27, 2016 at 11:16:51AM -0800, Linus Torvalds wrote: > > On Feb 27, 2016 09:06, "Paul E. McKenney" > > wrote: > > > > > > > > > But we do already have something very similar with signed integer > > > overflow. If the compiler can see a way to generate faster code that > > > does not handle the overflow case, then the semantics suddenly change > > > from twos-complement arithmetic to something very strange. The standard > > > does not specify all the ways that the implementation might deduce that > > > faster code can be generated by ignoring the overflow case, it instead > > > simply says that signed integer overflow invoked undefined behavior. > > > > > > And if that is a problem, you use unsigned integers instead of signed > > > integers. > > > > Actually, in the case of there Linux kernel we just tell the compiler to > > not be an ass. We use > > > > -fno-strict-overflow > > That is the one! > > > or something. I forget the exact compiler flag needed for "the standard is > > as broken piece of shit and made things undefined for very bad reasons". > > > > See also there idiotic standard C alias rules. Same deal. > > For which we use -fno-strict-aliasing. Do not forget -fno-delete-null-pointer-checks. So the kernel obviously is already using its own C dialect, that is pretty far from standard C. All these options also have a negative impact on the performance of the generated code. -- Markus
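For the signed-overflow point above, a small sketch: unsigned arithmetic wraps by definition, while the signed equivalent is undefined and may be "optimized" in surprising ways unless -fno-strict-overflow (or -fwrapv) is used.

```cpp
#include <cassert>
#include <limits>

// Unsigned overflow is well-defined modular arithmetic, so this
// wraps to 0 at the maximum value.
unsigned wrap_inc(unsigned x) { return x + 1; }

// Signed overflow is undefined behavior: at -O2 GCC assumes it cannot
// happen and may fold this check to 'true' -- exactly the class of
// surprise that -fno-strict-overflow suppresses.
bool naive_check(int x) { return x + 1 > x; }  // unreliable at INT_MAX
```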
Re: Aggressive load in gcc when accessing escaped pointer?
On 2016.03.18 at 22:05 +0800, Cy Cheng wrote:
> Hi,
>
> Please look at this c code:
>
> typedef struct _PB {
>   void* data; /* required.*/
>   int f1_;
>   float f2_;
> } PB;
>
> PB** bar(PB** t);
>
> void qux(PB* c) {
>   bar(&c); /* c is escaped because of bar */
>   c->f1_ = 0;
>   c->f2_ = 0.f;
> }
>
> // gcc-5.2.1 with -fno-strict-aliasing -O2 on x86
> call bar
> movq 8(%rsp), %rax <= load pointer c
> movl $0, 8(%rax)
> movl $0x, 12(%rax)
>
> // intel-icc-13.0.1 with -fno-strict-aliasing -O2 on x86
> call bar(_PB**)
> movq (%rsp), %rdx <= load pointer c
> movl %ecx, 8(%rdx)
> movq (%rsp), %rsi <= load pointer c
> movl %ecx, 12(%rsi)
>
> GCC only loads pointer c once, but if I implement bar in this way:
> PB** bar(PB** t) {
>   char* ptr = (char*)t;
>   *t = (PB*)(ptr-8);
>   return t;
> }
>
> Does this violate this C99 standard rule
> (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf):
>
> > 6.5.16 Assignment operators
> > "3. ... The side effect of updating the stored value of the left operand
> > shall occur between the previous and the next sequence point."

No. We discussed this on IRC today and Richard Biener pointed out that
bar cannot make c point to &c - 8, because computing that pointer would
be invalid. So c->f1_ cannot clobber c itself.

-- Markus
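A sketch of the rule Richard Biener cited, using the struct from the question (function name is illustrative): pointer arithmetic is only defined within an object (plus one past its end), so bar() cannot legally compute (char*)&c - 8 and make c point at itself. Hence the store through c cannot clobber c, and one load of the pointer suffices.

```cpp
#include <cassert>

// The struct from the question.
struct PB {
    void *data;
    int f1_;
    float f2_;
};

// Illustrative: because no valid pointer can alias the local copy of
// 'c' itself, the compiler may keep 'c' cached across the stores.
int store_then_reload(PB *c) {
    c->f1_ = 0;
    c->f2_ = 0.f;
    return c->f1_;   // may reuse the single cached load of 'c'
}
```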
Re: diag color
On 2016.07.31 at 08:08 +0200, phi gcc wrote: > Hi All, > > Reposting this here from gcc-help. > > > I got the impression that I got colors in diag output depsite the fact > that I got no GCC env var setup. > > The version I used is > CX08$ cc --version > cc (Ubuntu 5.4.0-6ubuntu1~16.04.1) 5.4.0 20160609 > > This is a bit of a problem because I don't use an xterm, and the term > I used don't understand vt100 (xterm, etc..) escape sequence. > > my TERM env var is properly setup, emacs etc all works with my TERM > and terminfo database that define how to do colors. > > Would it be posible that gcc honors the TERM variable setup, if using > termcap/terminfo is too complicated, at least disable coloring if TERM > is not vt100 compatible. Well, the current setup works for the vast majority of users. You can either use -fno-diagnostics-color in your CFLAGS or configure gcc with: --with-diagnostics-color=never. -- Markus
Re: diag color
On 2016.07.31 at 10:46 +0200, phi gcc wrote: > While a simple getenv("TERM") to setup the default of the color > predicate before going to the sequence of testing CFLAGS, et the > optargs, would cost almost nothing. If you want a full explanation of the current behavior please read the comments in gcc/diagnostic-color.c. -- Markus
Re: Potential bug about the order of destructors of static objects and atexit() callbacks in C++?
On 2016.08.01 at 18:16 +0800, lh mouse wrote:
> Hello GCC developers,
>
> Reading the ISO C++ standard,
> > 3.6.4 Termination [basic.start.term]
> > 3 If the completion of the initialization of an object with
> > static storage duration is sequenced before a call to std::atexit
> > (see <cstdlib>, 18.5), the call to the function passed to std::atexit
> > is sequenced before the call to the destructor for the object. ...
>
> Notwithstanding the vagueness of 'the completion of the initialization of an object',
> the following program:
>
> #include <cstdio>
> #include <cstdlib>
>
> enum class state {
>     null,
>     initialized,
>     destroyed,
> };
>
> extern void broken_atexit();
>
> struct global_data {
>     state s;
>
>     global_data()
>         : s(state::null)
>     {
>         std::puts("delegated constructor");
>     }
>     global_data(int)
>         : global_data()
>     {
>         s = state::initialized;
>         std::atexit(&broken_atexit);
>         std::puts("delegating constructor");
>     }
>     ~global_data(){
>         s = state::destroyed;
>     }
> } data(1);
>
> void broken_atexit(){
>     if(data.s == state::destroyed){
>         std::puts("attempt to use a destroyed object?");
>         std::abort();
>     }
>     std::puts("okay");
> }
>
> int main(){
> }
>
> , when compiled with GCC, results in use of a destroyed object:
>
> lh_mouse@lhmouse-dev:~$ g++ test.cc -std=c++11
> lh_mouse@lhmouse-dev:~$ ./a.out
> delegated constructor
> delegating constructor
> attempt to use a destroyed object?
> Aborted
> lh_mouse@lhmouse-dev:~$
>
> The reason of this problem is that GCC front-end registers the dtor after
> the delegating constructor returns, which is invoked before the other
> callback registered inside the delegating constructor body.
>
> The problem would be gone only if the GCC front-end registers the dtor after
> the delegated constructor returns.
>
> Is this a GCC bug?

I don't think so. All compilers (clang, icc, visual C++) behave the same.
Also these kinds of questions regarding the C++ standard should be asked on a more appropriate forum like stackoverflow.com or the ISO C++ group: https://groups.google.com/a/isocpp.org/forum/?fromgroups#!forum/std-discussion -- Markus
Re: Chasing a potential wrong-code bug on trunk
On 2016.11.17 at 10:49 +0100, Martin Reinecke wrote: > Hi, > > At some point in May 2016 there was a patch to the gcc trunk which > caused one of my numerical codes to give incorrect results when compiled > with this gcc version. This may of course be caused by some undefined > behavior I'm unknowingly invoking in the code, or it may be a code > generation bug in gcc. I tried to isolate the exact gcc commit that > caused the change, but I got stuck... You should check this first by compiling with -fsanitize=undefined and fixing all issues that may pop up. -- Markus
git server is stuck
The git server seems to be stuck for over a day. Latest revision on it is r243504. Latest svn revision is r243523. -- Markus
Re: git server is stuck
On 2016.12.11 at 09:59 -0500, Jason Merrill wrote: > On Dec 11, 2016 2:41 AM, "Markus Trippelsdorf" > wrote: > > The git server seems to be stuck for over a day. > > Latest revision on it is r243504. > > Latest svn revision is r243523. > Yes, someone branched the entire SVN repository instead of just the source > tree, and git-svn has been dutifully importing all of that. I'll try to > deal with it later today. This is the svn revision in question, by Kelvin Nilsen: https://gcc.gnu.org/viewcvs/gcc?view=revision&sortby=date&revision=243505 It was later deleted by Mike Meissner: https://gcc.gnu.org/viewcvs/gcc/branches/ibm/?sortby=date&view=log -- Markus
Re: Is there any possibility to parallel compilation in a single file?
On 2014.07.29 at 08:07 +, Gengyulei (Gengyl) wrote: > Hi: > > Is there any possibility to parallel the compilation in a single file > scope? For large application the compilation time is long, although > we can parallel the process at the level of files, we still try to > find a way to accelerate the compilation in a single file. Can we > change some serial process into > > Parallel? Could you give me some advices? Thank you very much. Compiling with -flto= and gcc-4.9 should help. -- Markus
Re: GCC version bikeshedding
On 2014.07.29 at 19:14 +0200, Richard Biener wrote: > On July 29, 2014 6:45:13 PM CEST, Eric Botcazou > wrote: > >> I think that if anybody has strong objections, now is the time to > >make > >> them. Otherwise I think we should go with this plan. > > > >IMHO the cure is worse than the disease. > > > >> Given that there is no clear reason to ever change the major version > >> number, making that change will not convey any useful information to > >> our users. So let's just drop the major version number. Once we've > >> made that decision, then the next release (in 2015) naturally becomes > >> 5.0, the release after that (in 2016) becomes 6.0, etc. > > > >I don't really understand the "naturally": if you drop the major > >version > >number, the next release should be 10.0, not 5.0. > > 10.0 would be even better from a marketing perspective. Since gcc is released annually why not tie the version to the year of the release, instead of choosing an arbitrary number? 15.0 -- Markus
Re: GCC 5 snapshots produce broken kernel for powerpc-e500v2-linux-gnuspe?
On 2014.09.09 at 17:35 +0800, Arseny Solokha wrote: > Hello, > > I've recently faced an issue I'm afraid I currently unable to debug. When > building an arbitrary version of Linux kernel for powerpc-e500v2-linux-gnuspe > target, it seems gcc prior to 5 produces a good image which boots just fine, > and > current gcc 5 snapshots (4.10.0-alpha20140810 for example) produce an image > which hangs just after U-Boot hands over to the kernel. > > This behavior is well reproducible on real hardware as well as under qemu. > I've > prepared a minimal kernel config which is dysfunctional as is but still enough > to demonstrate the problem in qemu. I believe the exact Linux version number > doesn't actually matter here, but see the attachment for details. > > Compare the output produced by u-boot and this minified kernel build using > gcc 4.9.1 and 4.10.0-alpha20140810 snapshot. > > I now have completely no idea what to do next to find a cause of (1) gcc 5 > snapshots producing unbootable kernel, gcc trunk also miscompiles x86_64 kernels currently, but I haven't looked deeper yet. The best way to narrow down the issue is to use git (or svn) bisect to find out which gcc revision causes the miscompile. Then you can md5sum the kernel object files for the bad revision and for the first good revision and compare the results. After that you can look at the disassembly of the object files, for which md5sum differs, and try to figure out the reason why. -- Markus
Re: gcc 4.7.4 lto build failure
On 2014.09.19 at 13:15 +0100, Rogelio Serrano wrote: > /home/rogelio/gcc-build/./prev-gcc/g++ > -B/home/rogelio/gcc-build/./prev-gcc/ -B/x86_64-unknown-linux-gnu/bin/ > -nostdinc++ > -B/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs > -B/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs > -I/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu > -I/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include > -I/home/rogelio/gcc-4.7.4/libstdc++-v3/libsupc++ > -L/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs > -L/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs > -g -O2 -flto=jobserver -frandom-seed=1 -fprofile-use -DIN_GCC > -fno-exceptions -fno-rtti -W -Wall -Wno-narrowing -Wwrite-strings > -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long > -Wno-variadic-macros -Wno-overlength-strings -DHAVE_CONFIG_H > -static-libstdc++ -static-libgcc -o cc1 c-lang.o c-family/stub-objc.o > attribs.o c-errors.o c-decl.o c-typeck.o c-convert.o c-aux-info.o > c-objc-common.o c-parser.o tree-mudflap.o c-family/c-common.o > c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o > c-family/c-gimplify.o c-family/c-lex.o c-family/c-omp.o > c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o > c-family/c-pragma.o c-family/c-pretty-print.o c-family/c-semantics.o > c-family/c-ada-spec.o i386-c.o default-c.o \ > cc1-checksum.o main.o libbackend.a libcommon-target.a libcommon.a > ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a libcommon.a > ../libcpp/libcpp.a ../libiberty/libiberty.a > ../libdecnumber/libdecnumber.a-lmpc -lmpfr -lgmp -rdynamic -ldl > -lz > c-family/c-format.o (symbol from plugin): warning: memset used with > constant zero length parameter; this could be due to transposed > parameters > collect2: error: ld returned 1 exit status > > > Any pointers how to debug this? 
> > I configured with: > > ../gcc-4.7.4/configure --prefix= --libexecdir=/lib --enable-shared > --enable-threads=posix --enable-__cxa_atexit --enable-clocale > --enable-languages=c,lto --disable-multilib --with-system-zlib > --enable-lto --with-build-config=bootstrap-lto > > then did a profiledbootstrap make When using profiledbootstrap you should add --disable-werror to the configuration flags. -- Markus
Re: gcc 4.7.4 lto build failure
On 2014.09.19 at 14:55 +0200, Markus Trippelsdorf wrote: > On 2014.09.19 at 13:15 +0100, Rogelio Serrano wrote: > > /home/rogelio/gcc-build/./prev-gcc/g++ > > -B/home/rogelio/gcc-build/./prev-gcc/ -B/x86_64-unknown-linux-gnu/bin/ > > -nostdinc++ > > -B/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs > > -B/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs > > -I/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu > > -I/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include > > -I/home/rogelio/gcc-4.7.4/libstdc++-v3/libsupc++ > > -L/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs > > -L/home/rogelio/gcc-build/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs > > -g -O2 -flto=jobserver -frandom-seed=1 -fprofile-use -DIN_GCC > > -fno-exceptions -fno-rtti -W -Wall -Wno-narrowing -Wwrite-strings > > -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long > > -Wno-variadic-macros -Wno-overlength-strings -DHAVE_CONFIG_H > > -static-libstdc++ -static-libgcc -o cc1 c-lang.o c-family/stub-objc.o > > attribs.o c-errors.o c-decl.o c-typeck.o c-convert.o c-aux-info.o > > c-objc-common.o c-parser.o tree-mudflap.o c-family/c-common.o > > c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o > > c-family/c-gimplify.o c-family/c-lex.o c-family/c-omp.o > > c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o > > c-family/c-pragma.o c-family/c-pretty-print.o c-family/c-semantics.o > > c-family/c-ada-spec.o i386-c.o default-c.o \ > > cc1-checksum.o main.o libbackend.a libcommon-target.a libcommon.a > > ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a libcommon.a > > ../libcpp/libcpp.a ../libiberty/libiberty.a > > ../libdecnumber/libdecnumber.a-lmpc -lmpfr -lgmp -rdynamic -ldl > > -lz > > c-family/c-format.o (symbol from plugin): warning: memset used with > > constant zero length parameter; this could be 
due to transposed > > parameters > > collect2: error: ld returned 1 exit status > > > > > > Any pointers how to debug this? > > > > I configured with: > > > > ../gcc-4.7.4/configure --prefix= --libexecdir=/lib --enable-shared > > --enable-threads=posix --enable-__cxa_atexit --enable-clocale > > --enable-languages=c,lto --disable-multilib --with-system-zlib > > --enable-lto --with-build-config=bootstrap-lto > > > > then did a profiledbootstrap make > > When using profiledbootstrap you should add --disable-werror to the > configuration flags. Hmm, I think this is actually a linker bug. Could you try gold? See: https://sourceware.org/bugzilla/show_bug.cgi?id=16746 -- Markus
Re: Devirtualize virtual call hierarchy if just base dtor exists
On 2014.10.22 at 17:15 +0200, Martin Liška wrote:
> Hello.
>
> I've been playing with following example:
>
> #include <cstdlib>
>
> class Base
> {
> public:
>     virtual ~Base() {}
> };
>
> class Derived: public Base
> {
> };
>
> #define N 1000
>
> int main()
> {
>     Base **b = (Base **)malloc (sizeof(Base *) * N);
>     for (unsigned i = 0; i < N; i++)
>         b[i] = new Derived();
>
>     for (unsigned i = 0; i < N; i++)
>         delete b[i];
>
>     return 0;
> }
>
> Where I would like to somehow give an advice to devirtualize
> machinery. My motivation is to inline destruction in 'delete b[i]'.
> 'final' keyword does not solve my problem:
>
> a.c:9:7: error: virtual function ‘virtual Derived::~Derived()’
> class Derived: public Base
>       ^
> a.c:6:11: error: overriding final function ‘virtual Base::~Base()’
>     virtual ~Base() final {}

What about:

class Derived final: public Base {};

-- Markus
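A compilable sketch of that suggestion: mark only the derived class final (not the base destructor), so the hierarchy still compiles and calls through a Derived pointer can be devirtualized and inlined.

```cpp
#include <cassert>

// Marking the most-derived class 'final' tells the compiler that no
// further overrides can exist, so virtual calls through a Derived*
// can be turned into direct, inlinable calls (including the
// destructor call performed by 'delete').
struct Base {
    virtual ~Base() {}
    virtual int id() const { return 0; }
};

struct Derived final : public Base {
    int id() const override { return 1; }
};

int call_id(const Derived *d) {
    return d->id();  // Derived is final: direct call
}
```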
More useful support for low-end ARM architecture
Hello gcc folks, recently I started to expand a project of mine running mainly on AVR ATmega to low end ARM chips. To my enlightment, gcc supports these thingies already. To my disappointment, this support is pretty complicated. One faces at least a much steeper learning curve that on AVR. Accordingly I suggested on the avr-libc mailing list to do similar work for ARM, Cortex-M0 to Cortex-M4. At least four people expressed interest, it looks like arm-libc is about to be born. To those not knowing what this is, I talk here about all-in-one CPUs (MCUs) with memory and some peripherals already on the chip. Program memory can be as low as 8 kB, RAM as low as 1 kB. Usually they're programmed bare-metal, this is, without any operating system. If you want to take a look at a simple Hello World application, here is one: https://bugs.launchpad.net/gcc-arm-embedded/+bug/1387906 Looking at its Makefile, it requires quite a number of flags, among them nine -I with custom paths, --specs, -T and also auto-generated C files. Lots of stuff average programmers probably don't even know it exists. One of the interested persons on the avr-libc mailing list explained what's missing, much better than I could: > I think what the other responders missed is that avr-libc (via its > integration with binutils and gcc) gives you two key pieces of > functionality: > > -mmcu=atmega88 > #include > > You *also* get classic libc functionality (printf, etc) that's provided > on ARM by newlib etc, but that's not the big deal IMO. > > The #include is *relatively* easy, [... no topic for gcc ...] > > The -mmcu= functionality is even more deeply useful, although less > easily repeatable on ARM. It brings in the relevant linker script, > startup code, vector tables, and all the other infrastructure. 
*THAT* > is what makes it possible to write a program like: > > #include > int main() { > DDRD = 0x01;PORTD = 0x01; > } > > # avr-gcc -mmcu=atmega88 -o test test.c > # avrdude > > Writing a program for your random ARM chip requires digging *deeply* > into the various websites or IDEs of the manufacturer, trying to find > the right files (the filenames alone of which vary in strange ways), > probably determining the right way to alter them because the only > example you found was for a different chip in the same line, and then > hoping you've got everything glued together properly. > > I want to be able to write the above program (modified for the right > GPIO) and run: > > # arm-none-eabi-gcc -mmcu=stm32f405 -o test test.c This is why I joined here, we'd like to get -mmcu for all the ARM flavours. It should pick up a linker script which works in most cases on its own. It should also bring in startup code, so code writers don't have to mess with stuff happening before main(). And not to forget, pre-set #defines like __ARM_LPC1114__, __ARM_STM32F405__, etc. - How would we proceed in general? - Many flavours at once, or better start with one or two, adding more when these work? - Did AVR support make things we should not repeat? Thanks for discussing, Markus P.S.: arm-libc discussion so far can be followed here: http://lists.nongnu.org/archive/html/avr-libc-dev/2014-11/threads.html -- - - - - - - - - - - - - - - - - - - - Dipl. Ing. (FH) Markus Hitter http://www.jump-ing.de/
Re: testing policy for C/C++ front end changes
Am 13.11.2014 um 14:08 schrieb Fabien Chêne: > Perhaps that would make sense to mention the existence of the compile > farm, and add link to it. Good idea. Bonus points for adding a script which executes all the required steps. Markus -- - - - - - - - - - - - - - - - - - - - Dipl. Ing. (FH) Markus Hitter http://www.jump-ing.de/
Re: More explicit what's wrong with this: FAILED: Bootstrap (build config: lto; languages: all; trunk revision 217898) on x86_64-unknown-linux-gnu
On 2014.11.21 at 16:16 +0100, Toon Moene wrote: > See: https://gcc.gnu.org/ml/gcc-testresults/2014-11/msg02259.html > > What's not in the log file sent to gcc-results: See: http://thread.gmane.org/gmane.comp.gcc.patches/327449 -- Markus
Re: GCC 4.9.2 -O3 gives a seg fault / GCC 4.8.2 -O3 works
On 2015.01.06 at 03:18 -0500, Paul Smith wrote: > Hi all. It's possible my code is doing something illegal, but it's also > possible I've found a problem with -O3 optimization in GCC 4.9.2. I've > built this same code with GCC 4.8.2 -O3 on GNU/Linux and it works fine. > It also works with GCC 4.9.2 with lower -O (-O2 for example). > > When I try a build with GCC 4.9.2 -O3 I'm seeing a segfault, because we > get confused and invoke code that we should be skipping. > > I've compressed the test down to a self-contained sample file that I've > attached here. Save it and run: > > g++-4.9.2 -g -O3 -o mystring mystring.cpp > > Then: > > ./mystring > Segmentation fault > > You can also add -fno-strict-aliasing etc. and it doesn't make any > difference. > > The seg fault happens in the implementation of operator +=() where we're > appending to an empty string, so this->data is NULL. That method starts > like this (after the standard pushes etc.): > >0x00400a51 <+17>:mov(%rdi),%r14 > > which puts this->data (null) into %r14. Later on, with no intervening > reset of r14, we see this: > >0x00400ac5 <+133>: cmp%r12,%r14 >0x00400ac8 <+136>: je 0x400b18 const*)+216> >0x00400aca <+138>: subl $0x1,-0x8(%r14) > > We don't take the jump, and this (+138) is where we get the segfault > because r14 is still 0. This is in the if-statement in the release() > method where it subtracts 1 from count... but it should never get here > because this->data (r14) is NULL! > > (gdb) i r rip > rip0x400aca 0x400aca > (gdb) i r r14 > r140x0 0 > > Anyone have any thoughts about this? I know the inline new/delete is > weird but it's required to repro the bug, and we need our own new/delete > because we're using our own allocator. gcc-help is more appropriate for this kind of question. If you compile with gcc-5 and -fsanitize=undefined you'll get: mystring.cpp:104:26: runtime error: null pointer passed as argument 2, which is declared to never be null So you should guard the memcpy() call. -- Markus
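A sketch of the suggested guard (function name is illustrative): passing a null pointer to memcpy is undefined even when the length is zero, which is what the ubsan diagnostic above reports, so the copy should be skipped when the string's buffer may be null.

```cpp
#include <cstring>
#include <cstddef>

// Guarded copy: memcpy must not be called with a null source, even
// for n == 0, so check the pointer before the call.
void guarded_append(char *dst, const char *src, std::size_t n) {
    if (src != nullptr && n != 0)   // guard the memcpy() call
        std::memcpy(dst, src, n);
}
```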
Re: future versions
On 2015.03.20 at 20:08 -0400, Jack Howarth wrote: > What was the final decision concerning future versioning of FSF > gcc post-5.0? In particular, I am confused about the designation of > maintenance releases of 5.0. Will the next maintenance release be 5.1 > or 5.0.1? I assume if it is 5.1, then after branching for release of > 5.0, trunk will become 6.0, no? http://gcc.gnu.org/develop.html#num_scheme -- Markus
Re: future versions
On 2015.03.21 at 12:11 -0400, Jack Howarth wrote: > On Sat, Mar 21, 2015 at 1:45 AM, Markus Trippelsdorf > wrote: > > On 2015.03.20 at 20:08 -0400, Jack Howarth wrote: > >> What was the final decision concerning future versioning of FSF > >> gcc post-5.0? In particular, I am confused about the designation of > >> maintenance releases of 5.0. Will the next maintenance release be 5.1 > >> or 5.0.1? I assume if it is 5.1, then after branching for release of > >> 5.0, trunk will become 6.0, no? > > > > http://gcc.gnu.org/develop.html#num_scheme > > So according to that webpage, trunk becomes 6.0 and the first > maintenance release of 5.0 becomes 5.1 (with 5.0.1 being the > pre-release state of the gcc-5_0-branch prior to the actual 5.1 > maintenance release). What is confusing me is all of these references > in the mailing list to postponing bug fixes until 5.2 instead of 5.1 > (https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01129.html for > example). What is that all about? The first release of gcc-5 will be 5.1.0. There will be no 5.0.0 release... -- Markus
Re: Change to C++11 by default?
On 2015.05.07 at 13:46 -0500, Jason Merrill wrote: > I think it's time to switch to C++11 as the default C++ dialect for GCC > 6. Any thoughts? Why not C++14? -- Markus
Re: Precompiled headers - still useful feature?
On 2015.05.27 at 10:14 +0200, Martin Liška wrote:
> I would like to ask folks what is their opinion about support of
> precompiled headers for future releases of GCC. From my point of view,
> the feature brings some speed-up, but the question is whether it's worth it?
>
> Last time I hit precompiled headers was when I was rewriting the memory
> allocation statistics infrastructure, where GGC memory is 'streamed'
> and loaded afterwards in usage of precompiled headers. Because of
> that I was unable to track some pointers that were allocated in the
> first phase of compilation.
>
> There are numbers related to the --disable-libstdcxx-pch option:
>
> Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz:
> Bootstrap time w/ precompiled headers enabled: 35m47s (100.00%)
> Bootstrap time w/ precompiled headers disabled: 36m27s (101.86%)
>
> make -j9 check-target-libstdc++-v3 -k time:
> precompiled headers enabled: 8m11s (100.00%)
> precompiled headers disabled: 8m42s (106.31%)
>
> Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz:
> Bootstrap time w/ precompiled headers enabled: 57m35s (100.00%)
> Bootstrap time w/ precompiled headers disabled: 57m12s (99.33%)

Measuring the impact on bigger projects that use pch, like Qt or Boost, would perhaps be more informative. And until C++ modules are implemented (unfortunately nobody is working on this AFAIK) pch is still the only option left. So deprecating them now seems premature.

-- Markus
asking for __attribute__((aligned()) clarification
All, this is my first post on these lists, so please bear with me.

My question is about gcc's __attribute__((aligned())). Please consider the following code:

#include <stdint.h>

typedef uint32_t uuint32_t __attribute__((aligned(1)));

uint32_t getuuint32(uint8_t p[])
{
    return *(uuint32_t*)p;
}

This is meant to prevent gcc from producing hard faults/address errors on architectures that do not support unaligned access to shorts/ints (e.g. some ARMs, some m68k). On these architectures, gcc is supposed to replace the 32 bit access with a series of 8 or 16 bit accesses.

I originally came from gcc 4.6.4 (yes, pretty old), where this did not work: gcc did not respect the aligned(1) attribute for its code generation (i.e. it generated a 'normal' pointer dereference, consequently crashing when the code executed). To be fair, it is my understanding that the gcc manuals never promised this *would* work.

As - at least as far as I can tell - the documentation didn't really change regarding lowering alignment for variables and does not appear to say anything specific regarding pointer dereference of single, misaligned variables, I was pretty astonished to see this working on newer gcc versions (tried 6.2 and 7.4), however. gcc even appears to know the differences within an architecture (68000 generates a bytewise copy while ColdFire - which supports unaligned access - copies a 32 bit value).

My question: is this now intended behaviour we can rely on? If yes, can we have the documentation upgraded to clearly state that this use case is valid?

Thank you. Markus
Aw: Re: asking for __attribute__((aligned()) clarification
Thank you (and others) for your answers. Now I'm just as smart as before, however. Is it a supported, documented, 'long term' feature we can rely on or not? If yes, I would expect it to be properly documented. If not, never mind.

> Sent: Monday, 19 August 2019 at 16:08
> From: "Alexander Monakov"
> To: "Richard Earnshaw (lists)"
> Cc: "Paul Koning" , "Markus Fröschle"
> , gcc@gcc.gnu.org
> Subject: Re: asking for __attribute__((aligned()) clarification
>
> On Mon, 19 Aug 2019, Richard Earnshaw (lists) wrote:
>
> > Correct, but note that you can only pack structs and unions, not basic
> > types.
> > there is no way of under-aligning a basic type except by wrapping it in a
> > struct.
>
> I don't think that's true. In GCC-9 the doc for the 'aligned' attribute has been
> significantly revised, and now ends with
>
> When used as part of a typedef, the aligned attribute can both increase and
> decrease alignment, and specifying the packed attribute generates a
> warning.
>
> (but I'm sure the de facto behavior of accepting and honoring reduced alignment on
> a typedef'ed scalar type goes way earlier than gcc-9)
>
> Alexander
Re: Warning annoyances in list_read.c
On 2017.03.26 at 19:30 -0700, Steve Kargl wrote: > On Sun, Mar 26, 2017 at 06:45:07PM -0700, Jerry DeLisle wrote: > > On 03/26/2017 11:45 AM, Steve Kargl wrote: > > > On Sun, Mar 26, 2017 at 11:27:59AM -0700, Jerry DeLisle wrote: > > >> > > >> +#pragma GCC diagnostic push > > >> +#pragma GCC diagnostic ignored "-Wimplicit-fallthrough" > > > > > > IMNSHO, the correct fix is to complain loudly to whomever > > > added -Wimplicit-fallthrough to compiler options. It should > > > be removed (especially if is has been added to -Wall). > > > > > > You can also probably add -Wno-implicit-fallthrough to > > > libgfortran/configure.ac at > > > > > > # Add -Wall -fno-repack-arrays -fno-underscoring if we are using GCC. > > > if test "x$GCC" = "xyes"; then > > > AM_FCFLAGS="-I . -Wall -Werror -fimplicit-none -fno-repack-arrays > > > -fno-underscoring" > > > > > > > Problem I have is I don't know who to complain to. I think there is a bit > > of a > > glass wall going on here anyway, so what would be the point of complaining > > if > > the retrievers of the message all have the ON-OFF switch in the OFF > > position. > > (After all, I do not have a PHD, I am not a computer science graduate, why > > bother looking down ones nose at a low life such as myself, OMG its an > > engineer, > > what the hell does he know.) > > > > Maybe these warnings are being turned on as a matter of policy, but truth > > is, > > when I build 50 times a day, the warnings flying by are masking the errors > > or > > other warnings that may be important. For example, I inadvertently passed a > > ptr > > to a function rather than the *ptr. > > > > The warning that ensued flew by mixed in with all the other crap warnings > > and I > > did not see it. That cost me wasted cycle time (remember, I am not an > > expert and > > should not be expected to see such things. Hell, for that matter I should > > not > > even be doing any of this work. 
> > :)
>
> This option is clearly enforcing someone's preferred markup of
> adding a comment to explicitly note a fall through. Candidate
> individual to complain to
>
> If he added a new option affecting libgfortran, then he should
> fix up libgfortran.

He didn't add the warning to specifically annoy fortran developers. It is trivial to add seven gcc_fallthrough() calls or breaks for someone who knows the code, and the person who added the warning obviously doesn't.

-- Markus
Re: Warning annoyances in list_read.c
On 2017.03.27 at 06:26 -0700, Steve Kargl wrote: > On Mon, Mar 27, 2017 at 08:58:43AM +0200, Markus Trippelsdorf wrote: > > On 2017.03.26 at 19:30 -0700, Steve Kargl wrote: > > > On Sun, Mar 26, 2017 at 06:45:07PM -0700, Jerry DeLisle wrote: > > > > On 03/26/2017 11:45 AM, Steve Kargl wrote: > > > > > On Sun, Mar 26, 2017 at 11:27:59AM -0700, Jerry DeLisle wrote: > > > > >> > > > > >> +#pragma GCC diagnostic push > > > > >> +#pragma GCC diagnostic ignored "-Wimplicit-fallthrough" > > > > > > > > > > IMNSHO, the correct fix is to complain loudly to whomever > > > > > added -Wimplicit-fallthrough to compiler options. It should > > > > > be removed (especially if is has been added to -Wall). > > > > > > > > > > You can also probably add -Wno-implicit-fallthrough to > > > > > libgfortran/configure.ac at > > > > > > > > > > # Add -Wall -fno-repack-arrays -fno-underscoring if we are using GCC. > > > > > if test "x$GCC" = "xyes"; then > > > > > AM_FCFLAGS="-I . -Wall -Werror -fimplicit-none -fno-repack-arrays > > > > > -fno-underscoring" > > > > > > > > > > > > > Problem I have is I don't know who to complain to. I think there is a > > > > bit of a > > > > glass wall going on here anyway, so what would be the point of > > > > complaining if > > > > the retrievers of the message all have the ON-OFF switch in the OFF > > > > position. > > > > (After all, I do not have a PHD, I am not a computer science graduate, > > > > why > > > > bother looking down ones nose at a low life such as myself, OMG its an > > > > engineer, > > > > what the hell does he know.) > > > > > > > > Maybe these warnings are being turned on as a matter of policy, but > > > > truth is, > > > > when I build 50 times a day, the warnings flying by are masking the > > > > errors or > > > > other warnings that may be important. For example, I inadvertently > > > > passed a ptr > > > > to a function rather than the *ptr. 
> > > > > > > > The warning that ensued flew by mixed in with all the other crap > > > > warnings and I > > > > did not see it. That cost me wasted cycle time (remember, I am not an > > > > expert and > > > > should not be expected to see such things. Hell, for that matter I > > > > should not > > > > even be doing any of this work. :) > > > > > > > > > > This option is clearly enforceing someone's preferred markup of > > > adding a comment to explicitly note a fall through. Candidate > > > individual to complain to > > > > > > If he added a new option affecting libgfortran, then he should > > > fix up libgfortran. > > > > He didn't add the warning to specifically annoy fortran developers. > > It is trivial to add seven gcc_fallthrough() or breaks for someone who > > knows the code and the person who added the warning obviously doesn't. > > > > I completely disagree with your viewpoint here. If someone turns > on a silly warning, that someone should fix all places within the > tree that triggers that warning. There is ZERO value to this warning, > but added work for others to clean up that someone's mess. Well, a missing break is a bug. No? This warning has already uncovered several bugs in the tree, so calling it silly makes no sense. -- Markus
Re: Warning annoyances in list_read.c
On 2017.03.27 at 07:44 -0700, Steve Kargl wrote: > On Mon, Mar 27, 2017 at 03:39:37PM +0200, Markus Trippelsdorf wrote: > > > > Well, a missing break is a bug. No? > > Every 'case' statement without exception must be accompanied by > a 'break' statement? Wasting others' time to "fix" working > correct code is acceptable? Sorry, I should have written "potential bug". For legacy code I would simply disable the warning. But to dismiss it utterly, as you do, is shortsighted, because it has the potential to point out real bugs. -- Markus
Re: Warning annoyances in list_read.c
On 2017.03.27 at 06:49 -0700, Steve Kargl wrote: > On Mon, Mar 27, 2017 at 02:36:27PM +0100, Jonathan Wakely wrote: > > On 27 March 2017 at 14:26, Steve Kargl wrote: > > > I completely disagree with your viewpoint here. If someone turns > > > on a silly warning, that someone should fix all places within the > > > tree that triggers that warning. There is ZERO value to this warning, > > > but added work for others to clean up that someone's mess. > > > > Your absolutist view is just an opinion and reasonable people disagree > > on the value of the warning. It's already found bugs in real code. > > > > You could continue being upset, or somebody who understands the code > > could just fix the warnings and everybody can get on with their lives. > > Go scan the gcc-patches mailing list for "fallthrough". I'll > note other have concerns. Here's one example: > > https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00300.html > >Without Bernd's patch to set the default to 1 you will drown >in false positives once you start using gcc-7 to build a whole >distro. On my Gentoo test box anything but level 1 is simply >unacceptable, because you will miss important other warnings >in the -Wimplicit-fallthrough noise otherwise. The quotation doesn't have anything to do with the current discussion, which is the general usefulness of the warning. It only talks about one of the (admittedly over-engineered) six different levels of the warning. -- Markus
Re: [PATCH] gcc 8: Implement -felide-function-bodies
On 2017.04.01 at 01:00 -0400, David Malcolm wrote: > The following patch implements a new function-body-elision > optimization, which can dramatically improve performance, > especially under certain benchmarks. > > gcc/ChangeLog: > * common.opt (felide-function-bodies): New option. > * gimplify.c (gimplify_body): Implement function-body > elision. > > diff --git a/gcc/common.opt b/gcc/common.opt > index 4021622..a32a56d 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -1299,6 +1299,10 @@ fipa-sra > Common Report Var(flag_ipa_sra) Init(0) Optimization > Perform interprocedural reduction of aggregates. > > +felide-function-bodies > +Common Optimization Var(flag_elide_function_bodies) > +Perform function body elision optimization > + > feliminate-unused-debug-symbols > Common Report Var(flag_debug_only_used_symbols) > Perform unused symbol elimination in debug info. > diff --git a/gcc/gimplify.c b/gcc/gimplify.c > index fbf136f..4853953 100644 > --- a/gcc/gimplify.c > +++ b/gcc/gimplify.c > @@ -12390,6 +12390,9 @@ gimplify_body (tree fndecl, bool do_parms) > the body so that DECL_VALUE_EXPR gets processed correctly. */ >parm_stmts = do_parms ? gimplify_parameters () : NULL; > > + if (flag_elide_function_bodies) > +DECL_SAVED_TREE (fndecl) = NULL_TREE; > + >/* Gimplify the function's body. */ >seq = NULL; >gimplify_stmt (&DECL_SAVED_TREE (fndecl), &seq); Haha, your option also has dramatic binary size saving effects. I would suggest to enable it unconditionally on every April Fools' Day. -- Markus
[RFA] update ggc_min_heapsize_heuristic()
The minimum size heuristic for the garbage collector's heap, before it starts collecting, was last updated over ten years ago. It currently has a hard upper limit of 128MB. This is too low for current machines where 8GB of RAM is normal. So, it seems to me, a new upper bound of 1GB would be appropriate. Compile times of large C++ projects improve by over 10% due to this change. What do you think? Thanks. diff --git a/gcc/ggc-common.c b/gcc/ggc-common.c index b4c36fb0bbd4..91e121d7dafe 100644 --- a/gcc/ggc-common.c +++ b/gcc/ggc-common.c @@ -810,7 +810,7 @@ ggc_min_heapsize_heuristic (void) phys_kbytes = MIN (phys_kbytes, limit_kbytes); phys_kbytes = MAX (phys_kbytes, 4 * 1024); - phys_kbytes = MIN (phys_kbytes, 128 * 1024); + phys_kbytes = MIN (phys_kbytes, 1000 * 1024); return phys_kbytes; } -- Markus
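For anyone who wants to experiment without patching GCC, the same threshold can already be overridden per invocation via --param, as the measurements later in this thread do (tramp3d-v4.cpp here stands in for any large C++ translation unit):

```shell
# Override the GC heap threshold for one compile; the value is in
# kilobytes, so 524288 = 512 MB.
g++ -O2 --param ggc-min-heapsize=524288 -c tramp3d-v4.cpp
```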
Re: [RFA] update ggc_min_heapsize_heuristic()
On 2017.04.09 at 21:25 +0300, Alexander Monakov wrote: > On Sun, 9 Apr 2017, Markus Trippelsdorf wrote: > > > The minimum size heuristic for the garbage collector's heap, before it > > starts collecting, was last updated over ten years ago. > > It currently has a hard upper limit of 128MB. > > This is too low for current machines where 8GB of RAM is normal. > > So, it seems to me, a new upper bound of 1GB would be appropriate. > > While amount of available RAM has grown, so has the number of available CPU > cores (counteracting RAM growth for parallel builds). Building under a > virtualized environment with less-than-host RAM got also more common I think. > > Bumping it all the way up to 1GB seems excessive, how did you arrive at that > figure? E.g. my recollection from watching a Firefox build is that most of > compiler instances need under 0.5GB (RSS). 1GB was just a number I've picked to get the discussion going. And you are right, 512MB looks like a good compromise. > > Compile times of large C++ projects improve by over 10% due to this > > change. > > Can you explain a bit more, what projects you've tested?.. 10+% looks > surprisingly high to me. I've checked LLVM build times on ppc64le and X86_64. But you can observe the effect also with a single big C++ file like tramp3d-v4.cpp. On my old machine: --param ggc-min-heapsize=131072 : 26.97 secs / 711MB peak memory (current default) --param ggc-min-heapsize=393216 : 26.04 secs / 886MB peak memory --param ggc-min-heapsize=524288 : 25.37 secs / 983MB peak memory --param ggc-min-heapsize=100 : 25.36 secs / 990MB peak memory > > What do you think? > > I wonder if it's possible to reap most of the compile time benefit with a bit > more modest gc threshold increase? 512MB looks like the sweet spot. And of course one is basically trading memory usage for compile time performance. -- Markus
Re: [RFA] update ggc_min_heapsize_heuristic()
On 2017.04.09 at 20:23 +0200, Richard Biener wrote: > On Sun, Apr 9, 2017 at 4:41 PM, Markus Trippelsdorf > wrote: > > The minimum size heuristic for the garbage collector's heap, before it > > starts collecting, was last updated over ten years ago. > > It currently has a hard upper limit of 128MB. > > This is too low for current machines where 8GB of RAM is normal. > > So, it seems to me, a new upper bound of 1GB would be appropriate. > > > > Compile times of large C++ projects improve by over 10% due to this > > change. > > How does memory use change? It increases e.g. 25% on tramp3d-v4.cpp when increasing ggc-min-heapsize from 131072 (default) to 524288. -- Markus
Re: [RFA] update ggc_min_heapsize_heuristic()
On 2017.04.09 at 21:10 +0200, Markus Trippelsdorf wrote: > On 2017.04.09 at 21:25 +0300, Alexander Monakov wrote: > > On Sun, 9 Apr 2017, Markus Trippelsdorf wrote: > > > > > The minimum size heuristic for the garbage collector's heap, before it > > > starts collecting, was last updated over ten years ago. > > > It currently has a hard upper limit of 128MB. > > > This is too low for current machines where 8GB of RAM is normal. > > > So, it seems to me, a new upper bound of 1GB would be appropriate. > > > > While amount of available RAM has grown, so has the number of available CPU > > cores (counteracting RAM growth for parallel builds). Building under a > > virtualized environment with less-than-host RAM got also more common I > > think. > > > > Bumping it all the way up to 1GB seems excessive, how did you arrive at that > > figure? E.g. my recollection from watching a Firefox build is that most of > > compiler instances need under 0.5GB (RSS). > > 1GB was just a number I've picked to get the discussion going. > And you are right, 512MB looks like a good compromise. > > > > Compile times of large C++ projects improve by over 10% due to this > > > change. > > > > Can you explain a bit more, what projects you've tested?.. 10+% looks > > surprisingly high to me. > > I've checked LLVM build times on ppc64le and X86_64. Here are the ppc64le numbers (llvm+clang+lld Release build): --param ggc-min-heapsize=131072 : ninja -j60 15951.08s user 256.68s system 5448% cpu 4:57.46 total --param ggc-min-heapsize=524288 : ninja -j60 14192.62s user 253.14s system 5527% cpu 4:21.34 total -- Markus
Re: [RFA] update ggc_min_heapsize_heuristic()
On 2017.04.10 at 10:56 +0100, Richard Earnshaw (lists) wrote: > On 09/04/17 21:06, Markus Trippelsdorf wrote: > > On 2017.04.09 at 21:10 +0200, Markus Trippelsdorf wrote: > >> On 2017.04.09 at 21:25 +0300, Alexander Monakov wrote: > >>> On Sun, 9 Apr 2017, Markus Trippelsdorf wrote: > >>> > >>>> The minimum size heuristic for the garbage collector's heap, before it > >>>> starts collecting, was last updated over ten years ago. > >>>> It currently has a hard upper limit of 128MB. > >>>> This is too low for current machines where 8GB of RAM is normal. > >>>> So, it seems to me, a new upper bound of 1GB would be appropriate. > >>> > >>> While amount of available RAM has grown, so has the number of available > >>> CPU > >>> cores (counteracting RAM growth for parallel builds). Building under a > >>> virtualized environment with less-than-host RAM got also more common I > >>> think. > >>> > >>> Bumping it all the way up to 1GB seems excessive, how did you arrive at > >>> that > >>> figure? E.g. my recollection from watching a Firefox build is that most of > >>> compiler instances need under 0.5GB (RSS). > >> > >> 1GB was just a number I've picked to get the discussion going. > >> And you are right, 512MB looks like a good compromise. > >> > >>>> Compile times of large C++ projects improve by over 10% due to this > >>>> change. > >>> > >>> Can you explain a bit more, what projects you've tested?.. 10+% looks > >>> surprisingly high to me. > >> > >> I've checked LLVM build times on ppc64le and X86_64. > > > > Here are the ppc64le numbers (llvm+clang+lld Release build): > > > > --param ggc-min-heapsize=131072 : > > ninja -j60 15951.08s user 256.68s system 5448% cpu 4:57.46 total > > > > --param ggc-min-heapsize=524288 : > > ninja -j60 14192.62s user 253.14s system 5527% cpu 4:21.34 total > > > > I think that's still too high. We regularly see quad-core boards with > 1G of ram, or octa-core with 2G. ie 256k/core. 
> > So even that would probably be touch and go after you've accounted for > system memory and other processes on the machine. Yes, the calculation in ggc_min_heapsize_heuristic() could be adjusted to take the number of "cores" into account. So that on an 8GB 4-core machine it would return 512k. And less than that for machines with less memory or higher core counts. > Plus, for big systems it's nice to have beefy ram disks as scratch > areas, it can save a lot of disk IO. > > What are the numbers with 256M? Here are the numbers from a 4core/8thread 16GB RAM Skylake machine. They look less stellar than the ppc64le ones (variability is smaller): --param ggc-min-heapsize=131072 11264.89user 311.88system 24:18.69elapsed 793%CPU (0avgtext+0avgdata 1265352maxresident)k --param ggc-min-heapsize=393216 10655.42user 347.92system 23:01.17elapsed 796%CPU (0avgtext+0avgdata 1280476maxresident)k --param ggc-min-heapsize=524288 10565.33user 352.90system 22:51.33elapsed 796%CPU (0avgtext+0avgdata 1506348maxresident)k -- Markus
Re: [RFA] update ggc_min_heapsize_heuristic()
On 2017.04.10 at 12:15 +0200, Markus Trippelsdorf wrote: > On 2017.04.10 at 10:56 +0100, Richard Earnshaw (lists) wrote: > > > > What are the numbers with 256M? > > Here are the numbers from a 4core/8thread 16GB RAM Skylake machine. > They look less stellar than the ppc64le ones (variability is smaller): > > --param ggc-min-heapsize=131072 > 11264.89user 311.88system 24:18.69elapsed 793%CPU (0avgtext+0avgdata > 1265352maxresident)k --param ggc-min-heapsize=262144 10778.52user 336.34system 23:15.71elapsed 796%CPU (0avgtext+0avgdata 1277468maxresident)k > --param ggc-min-heapsize=393216 > 10655.42user 347.92system 23:01.17elapsed 796%CPU (0avgtext+0avgdata > 1280476maxresident)k > > --param ggc-min-heapsize=524288 > 10565.33user 352.90system 22:51.33elapsed 796%CPU (0avgtext+0avgdata > 1506348maxresident)k -- Markus
Re: [RFA] update ggc_min_heapsize_heuristic()
On 2017.04.10 at 13:14 +0100, Richard Earnshaw (lists) wrote: > On 10/04/17 12:06, Segher Boessenkool wrote: > > On Mon, Apr 10, 2017 at 12:52:15PM +0200, Markus Trippelsdorf wrote: > >>> --param ggc-min-heapsize=131072 > >>> 11264.89user 311.88system 24:18.69elapsed 793%CPU (0avgtext+0avgdata > >>> 1265352maxresident)k > >> > >> --param ggc-min-heapsize=262144 > >> 10778.52user 336.34system 23:15.71elapsed 796%CPU (0avgtext+0avgdata > >> 1277468maxresident)k > >> > >>> --param ggc-min-heapsize=393216 > >>> 10655.42user 347.92system 23:01.17elapsed 796%CPU (0avgtext+0avgdata > >>> 1280476maxresident)k > >>> > >>> --param ggc-min-heapsize=524288 > >>> 10565.33user 352.90system 22:51.33elapsed 796%CPU (0avgtext+0avgdata > >>> 1506348maxresident)k > > > > So 256MB gets 70% of the speed gain of 512MB, but for only 5% of the cost > > in RSS. 384MB is an even better tradeoff for this testcase (but smaller > > is safer). > > > > Can the GC not tune itself better? Or, not cost so much in the first > > place ;-) > > > > > > Segher > > > > I think the idea of a fixed number is that it avoids the problem of bug > reproducibility in the case of memory corruption. Please note that you will get fixed numbers (defined in gcc/params.def) for all non-release compiler configs. For release builds the numbers already vary according to the host. They get calculated in ggc-common.c. -- Markus
Re: Machine problems at gcc.gnu.org?
On 2017.04.21 at 09:17 -0700, Steve Ellcey wrote: > > I am having problems getting to https://gcc.gnu.org this morning and > I have also had problems getting to the glibc mail archives though the > main web page for glibc seem available. Anyone else having problems? > Of course if this email goes through the machines that are having problems > it may not get anywhere Yes, looks like the sourceware server is having problems: https://check-host.net/check-ping?host=gcc.gnu.org -- Markus
Re: Missing git tags for released GCC
On 2017.05.03 at 09:30 +0200, Martin Liška wrote:
> Can someone add the 7.1 release to the git tags? I guess it's the following revision:
> f9105a38249fb57f7778acf3008025f2dcac2b1f

Everyone can add it:

% git tag gcc-7_1_0-release f9105a38249fb57f7778acf3008025f2dcac2b1f
% git push origin gcc-7_1_0-release

I've added the tag.

-- Markus
Re: git-svn error due to out-of-sync changes?
On 2017.05.18 at 12:41 -0600, Martin Sebor wrote:
> On 05/18/2017 11:59 AM, Jeff Law wrote:
> > On 05/18/2017 11:41 AM, Martin Sebor wrote:
> > > I just tried to push a change and got the error below. git
> > > pull says my tree is up to date. I wonder if it's caused by
> > > my commit conflicting with another commit (in this case
> > > r248244) that git-svn doesn't see because it lags behind SVN.
> > > I brushed this (and other strange errors) off before, not
> > > bothering to try to understand it but it's happened enough
> > > times that I'd like to bring it up. I expect some (maybe
> > > even most) of these issues would not exist if we were using
> > > Git directly rather than the git-svn wrapper. Has any more
> > > progress been made on the Git integration project? Is there
> > > something I/we can do to help get it done?
> > That just means something changed upstream between your last git svn
> > rebase and your local commit.
> >
> > Just "git svn rebase", resolve conflicts (the ChangeLogs are the most
> > common source of conflicts) and you should be good to go.
>
> The main issue is that there tend to be errors that wouldn't
> happen without the extra layer between Git and SVN. The two
> are out of sync by minutes (I don't know exactly how many but
> it seems like at least 10), so clearing these things up takes
> time. Some (I'd say most) of the errors I've seen out of
> Git-svn are also not completely intuitive so it's not always
> clear what or where the problem is.
>
> So I'd like to see if there's something that can be done to
> move the migration forward.

The same issue also happens with git when several people push at the same time.

-- Markus
Re: git-svn error due to out-of-sync changes?
On 2017.05.18 at 13:42 -0600, Martin Sebor wrote: > On 05/18/2017 12:55 PM, Markus Trippelsdorf wrote: > > On 2017.05.18 at 12:41 -0600, Martin Sebor wrote: > > > On 05/18/2017 11:59 AM, Jeff Law wrote: > > > > On 05/18/2017 11:41 AM, Martin Sebor wrote: > > > > > I just tried to push a change and got the error below. git > > > > > pull says my tree is up to date. I wonder if it's caused by > > > > > my commit conflicting with another commit (in this case > > > > > r248244) that git-svn doesn't see because it lags behind SVN. > > > > > I brushed this (and other strange errors) off before, not > > > > > bothering to try to understand it but it's happened enough > > > > > times that I'd like to bring it up. I expect some (maybe > > > > > even most) of these issues would not exist if we were using > > > > > Git directly rather than the git-svn wrapper. Has any more > > > > > progress been made on the Git integration project? Is there > > > > > something I/we can do to help get it done? > > > > That just means something changed upstream betwen your last git svn > > > > rebase and your local commit. > > > > > > > > Just "git svn rebase", resolve conflicts (the ChangeLogs are the most > > > > common source of conflicts) and you should be good to go. > > > > > > The main issue is that there tend to be errors that wouldn't > > > happen without the extra layer between Git and SVN. The two > > > are out of sync by minutes (I don't know exactly how many but > > > it seems like at least 10), so clearing these things up takes > > > time. Some (I'd say most) of the errors I've seen out of > > > Git-svn are also not completely intuitive so it's not always > > > clear what or where the problem is. > > > > > > So I'd like to see if there's something that can be done to > > > move the migration forward. > > > > The same issue also happen with git when several people push at the same > > time. > > Yes, it can. 
The major difference, I suspect, is due to Git-svn > asynchronous, delayed updates. My guess is that Git-svn pull > requests are based on updates from SVN that happen only every > few minutes, but pushes happen in real time. So when we pull, > we're likely to get outdated sources (changes committed since > the last Git update are not included). But when we push, we're > likely to run into (at a minimum) ChangeLog conflicts with those > already committed changes that Git-svn hasn't been updated with. > This is just a wild guess based on the errors I've seen and > their increased incidence since 7 has been released. "git svn dcommit" will run "git svn rebase" automatically, so you are already on the latest svn revision. It doesn't matter if git lags behind or not, only svn counts here. The situation will be exactly the same after the switch to git. If you are unlucky and several people push at the same time, you could be forced to pull/rebase and retry your push request several times. -- Markus
Re: git-svn error due to out-of-sync changes?
On 2017.05.23 at 05:26 -0400, Aldy Hernandez wrote:
> Jason Merrill writes:
>
> > Yes, the git mirror can lag the SVN repo by a few minutes, that's why
> > you need to 'git svn rebase' to pull directly from SVN before a
> > commit.
> >
> > Jason
>
> Markus just said upthread that:
>
> "git svn dcommit" will run "git svn rebase" automatically
>
> Is `git svn rebase' run automatically or from `git svn dcommit' or not?
> I'm trying to save keystrokes here :).

`git svn rebase' is run automatically _after_ "git svn dcommit". So to minimize potential conflicts, it is a good idea to run "git svn rebase" before committing.

-- Markus
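Spelled out, the workflow recommended in this thread is:

```shell
# Sync with SVN first, so ChangeLog conflicts surface before the push:
git svn rebase
# Push the local commits to SVN (runs another rebase automatically):
git svn dcommit
```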
Re: gcc behavior on memory exhaustion
On 2017.08.09 at 14:05 +0100, Andrew Roberts wrote: > I routinely build the weekly snapshots and RC's, on x64, arm and aarch64. > > The last gcc 8 snapshot and the two recent 7.2 RC's have failed to build on > aarch64 (Raspberry Pi 3, running Arch Linux ARM). I have finally traced this > to the system running out of memory. I guess a recent kernel update had > changed the memory page size and the swap file was no longer being used > because the page sizes didn't match. > > Obviously this is my issue, but the error's I was getting from gcc did not > help. I was getting ICE's, thus: > > /usr/local/gcc/bin/g++ -Wall -Wextra -Wno-ignored-qualifiers > -Wno-sign-compare -Wno-write-strings -std=c++14 -pipe -march=armv8-a > -mcpu=cortex-a53 -mtune=cortex-a53 -ftree-vectorize -O3 -DUNAME_S=\"linux\" > -DUNAME_M=\"aarch64\" -DOSMESA=1 -I../libs/include -DRASPBERRY_PI > -I/usr/include/freetype2 -I/usr/include/harfbuzz -I/usr/include/unicode -c > -o glerr.o glerr.cpp > {standard input}: Assembler messages: > {standard input}: Warning: end of file not at end of a line; newline > inserted > {standard input}:204: Error: operand 1 must be an integer register -- `mov' > {standard input}: Error: open CFI at the end of file; missing .cfi_endproc > directive > g++: internal compiler error: Killed (program cc1plus) > Please submit a full bug report, > with preprocessed source if appropriate. > See <https://gcc.gnu.org/bugs/> for instructions. > make: *** [: glerr.o] Error 4 > make: *** Waiting for unfinished jobs > > I was seeing the problem when building using make -j2. Both building gcc and > building large user projects. > > There are two issues here: > > 1) There was discussion about increasing the amount of memory gcc would > reserve to help speed up compilation of large source files, I wondered if > this could be a factor. > > 2) It would be nice to see some sort of out of memory error, rather than > just an ICE. 
"internal compiler error: Killed" is almost always an out of memory error. dmesg will show that the OOM killer kicked in and killed the cc1plus process. > The system has 858Mb of RAM without the swap file. > > Building a single source file seems to use up to 97% of the available memory > (for a 2522 line C++ source file). > > make -j2 is enough to cause the failure. Well, you should really use a cross compiler for this setup. -- Markus
Re: Quantitative analysis of -Os vs -O3
On 2017.08.26 at 01:39 -0700, Andrew Pinski wrote: > > First let me put into some perspective on -Os usage and some history: > 1) -Os is not useful for non-embedded users > 2) the embedded folks really need the smallest code possible and > usually will be willing to afford the performance hit > 3) -Os was a mistake for Apple to use in the first place; they used it > and then GCC got better for PowerPC to use the string instructions > which is why -Oz was added :) > 4) -Os is used heavily by the arm/thumb2 folks in bare metal applications. > > Comparing -O3 to -Os is not totally fair on x86 due to the many > different instructions and encodings. > Compare it on ARM/Thumb2 or MIPS/MIPS16 (or micromips) where size is a > big issue. > I soon have a need to keep overall (bare-metal) application size down > to just 256k. > Micro-controllers are places where -Os matters the most. > > This comment does not help my application usage. It rather hurts it > and goes against what -Os is really about. It is not about reducing > icache pressure but overall application code size. I really need the > code to fit into a specific size. For many applications using -flto does reduce code size more than just going from -O2 to -Os. -- Markus
Re: Quantitative analysis of -Os vs -O3
On 2017.08.26 at 12:40 +0200, Allan Sandfeld Jensen wrote: > On Samstag, 26. August 2017 10:56:16 CEST Markus Trippelsdorf wrote: > > On 2017.08.26 at 01:39 -0700, Andrew Pinski wrote: > > > First let me put into some perspective on -Os usage and some history: > > > 1) -Os is not useful for non-embedded users > > > 2) the embedded folks really need the smallest code possible and > > > usually will be willing to afford the performance hit > > > 3) -Os was a mistake for Apple to use in the first place; they used it > > > and then GCC got better for PowerPC to use the string instructions > > > which is why -Oz was added :) > > > 4) -Os is used heavily by the arm/thumb2 folks in bare metal applications. > > > > > > Comparing -O3 to -Os is not totally fair on x86 due to the many > > > different instructions and encodings. > > > Compare it on ARM/Thumb2 or MIPS/MIPS16 (or micromips) where size is a > > > big issue. > > > I soon have a need to keep overall (bare-metal) application size down > > > to just 256k. > > > Micro-controllers are places where -Os matters the most. > > > > > > This comment does not help my application usage. It rather hurts it > > > and goes against what -Os is really about. It is not about reducing > > > icache pressure but overall application code size. I really need the > > > code to fit into a specific size. > > > > For many applications using -flto does reduce code size more than just > > going from -O2 to -Os. > > I added the option to optimize with -Os in Qt, and it gives an average 15% > reduction in binary size, somtimes as high as 25%. Using lto gives almost the > same (slightly less), but the two options combine perfectly and using both > can > reduce binary size from 20 to 40%. And that is on a shared library, not even > a > statically linked binary. 
> > Only real minus is that some of the libraries especially QtGui would benefit > from a auto-vectorization, so it would be nice if there existed an -O3s > version which vectorized the most obvious vectorizable functions, a few > hundred bytes for an additional version here and there would do good. > Fortunately it doesn't too much damage as we have manually vectorized > routines > for to have good performance also on MSVC, if we relied more on auto- > vectorization it would be worse. In that case using profile guided optimizations will help. It will optimize cold functions with -Os and hot functions with -O3 (when using e.g.: "-flto -O3 -fprofile-use"). Of course you will have to compile twice and also collect training data from your library in between. -- Markus
Re: Regression with gcc 7.2 ? Undefined references ?
On 2017.08.26 at 13:04 +0200, Sylvestre Ledru wrote: > Hello, > > I have been trying to build the llvm toolchain with gcc 7.2 using the > Debian packages. > However, it is currently failing with some undefined reference. > Seems that some symbols are removed during the build phase (too strong > optim?) > > I haven't seen something relevant to this in the gcc release notes. > > More information here: > https://bugs.llvm.org/show_bug.cgi?id=34140 > > Does it ring a bell for anyone? Try building without -gsplit-dwarf? -- Markus
Re: Regression with gcc 7.2 ? Undefined references ?
On 2017.08.26 at 17:18 +0200, Sylvestre Ledru wrote: > > > On 26/08/2017 13:10, Markus Trippelsdorf wrote: > > On 2017.08.26 at 13:04 +0200, Sylvestre Ledru wrote: > >> Hello, > >> > >> I have been trying to build the llvm toolchain with gcc 7.2 using the > >> Debian packages. > >> However, it is currently failing with some undefined reference. > >> Seems that some symbols are removed during the build phase (too strong > >> optim?) > >> > >> I haven't seen something relevant to this in the gcc release notes. > >> > >> More information here: > >> https://bugs.llvm.org/show_bug.cgi?id=34140 > >> > >> Does it ring a bell for anyone? > > Try building without -gsplit-dwarf? > > > Indeed, this fixed the issue. > Should I report a bug or it is a know issue? I don't think the issue is already known. So please report a bug. Thanks. -- Markus
Re: Segfault generated by gcc-7
On 2017.08.29 at 12:31 +0200, Marco Varlese wrote: > Hi, > > I got a SEGFAULT in my program when compiling it with gcc-7 but it > is/was all good when using gcc-6. > > The SEGFAULT happens due to the line below: > d_point = *p; > > And a fix for it (with gcc-7) has been: > memcpy(&d_point, > p, > sizeof(d_point)); > > Does this make any sense to anybody? No. Please open a bug and attach the full program that causes the crash. Otherwise the issue is impossible to debug. -- Markus
Re: Segfault generated by gcc-7
On 2017.08.29 at 12:35 +0200, Markus Trippelsdorf wrote: > On 2017.08.29 at 12:31 +0200, Marco Varlese wrote: > > Hi, > > > > I got a SEGFAULT in my program when compiling it with gcc-7 but it > > is/was all good when using gcc-6. > > > > The SEGFAULT happens due to the line below: > > d_point = *p; > > > > And a fix for it (with gcc-7) has been: > > memcpy(&d_point, > > p, > > sizeof(d_point)); > > > > Does this make any sense to anybody? > > No. Please open a bug and attach the full program that causes the crash. > Otherwise the issue is impossible to debug. But my guess would be a misaligned address. Try building with -fsanitize=undefined and fix all issues the sanitizer points out. -- Markus
Re: Building on gcc112 is stuck in msgfmt
On 2017.08.29 at 12:42 +0200, Martin Liška wrote: > On 08/29/2017 12:39 PM, Martin Liška wrote: > > (gdb) bt > > #0 0x3fff950e58e4 in syscall () from /lib64/libc.so.6 > > #1 0x3fff94dbbdc4 in __cxxabiv1::__cxa_guard_acquire (g=0x3fff94f26d40 <guard variable for (anonymous > > namespace)::__future_category_instance()::__fec>) at > > ../../../../libstdc++-v3/libsupc++/guard.cc:302 > > #2 0x3fff94dfaf80 in __future_category_instance () at > > ../../../../../libstdc++-v3/src/c++11/future.cc:65 > > #3 std::future_category () at > > ../../../../../libstdc++-v3/src/c++11/future.cc:79 > > #4 0x3fff94da98e8 in __static_initialization_and_destruction_0 > > (__priority=65535, __initialize_p=1) at > > ../../../../libstdc++-v3/src/c++11/compatibility-thread-c++0x.cc:50 > > #5 _GLOBAL__sub_I_compatibility_thread_c__0x.cc(void) () at > > ../../../../libstdc++-v3/src/c++11/compatibility-thread-c++0x.cc:200 > > #6 0x3fff96105c74 in _dl_init_internal () from /lib64/ld64.so.2 > > #7 0x3fff960f19cc in _dl_start_user () from /lib64/ld64.so.2 > > strace says about it: > > futex(0x3fffa5226d40, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily > unavailable) To me the whole issue looks related to PR81931. Are you using a clean build directory? Make sure you have the r251328 fix. -- Markus
Re: Building on gcc112 is stuck in msgfmt
On 2017.08.29 at 12:53 +0200, Martin Liška wrote: > On 08/29/2017 12:47 PM, Markus Trippelsdorf wrote: > > On 2017.08.29 at 12:42 +0200, Martin Liška wrote: > >> On 08/29/2017 12:39 PM, Martin Liška wrote: > >>> (gdb) bt > >>> #0 0x3fff950e58e4 in syscall () from /lib64/libc.so.6 > >>> #1 0x3fff94dbbdc4 in __cxxabiv1::__cxa_guard_acquire > >>> (g=0x3fff94f26d40 <guard variable for (anonymous >>> namespace)::__future_category_instance()::__fec>) at > >>> ../../../../libstdc++-v3/libsupc++/guard.cc:302 > >>> #2 0x3fff94dfaf80 in __future_category_instance () at > >>> ../../../../../libstdc++-v3/src/c++11/future.cc:65 > >>> #3 std::future_category () at > >>> ../../../../../libstdc++-v3/src/c++11/future.cc:79 > >>> #4 0x3fff94da98e8 in __static_initialization_and_destruction_0 > >>> (__priority=65535, __initialize_p=1) at > >>> ../../../../libstdc++-v3/src/c++11/compatibility-thread-c++0x.cc:50 > >>> #5 _GLOBAL__sub_I_compatibility_thread_c__0x.cc(void) () at > >>> ../../../../libstdc++-v3/src/c++11/compatibility-thread-c++0x.cc:200 > >>> #6 0x3fff96105c74 in _dl_init_internal () from /lib64/ld64.so.2 > >>> #7 0x3fff960f19cc in _dl_start_user () from /lib64/ld64.so.2 > >> > >> strace says about it: > >> > >> futex(0x3fffa5226d40, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource > >> temporarily unavailable) > > > > To me the whole issue looks related to PR81931. > > Are you using a clean build directory? Make sure you have the r251328 > > fix.> > > Yep. Directory was cleaner. I don't have the revision. Let me re-test it. > How can it be related to the PR as msgfmt is taken from system? It may load a broken libstdc++.so.6 that was built without the fix. -- Markus
Re: Building on gcc112 is stuck in msgfmt
On 2017.08.29 at 15:22 +, Jason Mancini wrote: > Been doing stability testing on my x86_64 Ryzen cpu using openSUSE's > (Tumbleweed) "gcc7.1.1 20170802" + compiling Linux kernel source. > Every so often, the build curiously stalls on a futex between cc1 and > as. cc1 is on the futex. as is waiting to read. Could that hang be > related to what's being discussed here? No. Ryzen is buggy. If you have a chip that was produced before June this year, your only option is to RMA it. -- Markus
Re: RFC: Improving GCC8 default option settings
On 2017.09.14 at 11:57 +0200, Richard Biener wrote: > On Wed, Sep 13, 2017 at 6:11 PM, Nikos Chantziaras wrote: > > On 12/09/17 16:57, Wilco Dijkstra wrote: > >> > >> [...] As a result users are > >> required to enable several additional optimizations by hand to get good > >> code. > >> Other compilers enable more optimizations at -O2 (loop unrolling in LLVM > >> was > >> mentioned repeatedly) which GCC could/should do as well. > >> [...] > >> > >> I'd welcome discussion and other proposals for similar improvements. > > > > > > What's the status of graphite? It's been around for years. Isn't it mature > > enough to enable these: > > > > -floop-interchange -ftree-loop-distribution -floop-strip-mine -floop-block > > > > by default for -O2? (And I'm not even sure those are the complete set of > > graphite optimization flags, or just the "useful" ones.) > > It's not on by default at any optimization level. The main issue is the > lack of maintainance and a set of known common internal compiler errors > we hit. The other issue is that there's no benefit of turning those on for > SPEC CPU benchmarking as far as I remember but quite a bit of extra > compile-time cost. Not to mention the numerous wrong-code bugs. IMHO graphite should be deprecated as soon as possible. -- Markus
Re: RFC: Improving GCC8 default option settings
On 2017.09.14 at 14:48 +0200, Richard Biener wrote: > On Thu, Sep 14, 2017 at 12:42 PM, Martin Liška wrote: > > On 09/14/2017 12:37 PM, Bin.Cheng wrote: > >> On Thu, Sep 14, 2017 at 11:24 AM, Richard Biener > >> wrote: > >>> On Thu, Sep 14, 2017 at 12:18 PM, Martin Liška wrote: > >>>> On 09/14/2017 12:07 PM, Markus Trippelsdorf wrote: > >>>>> On 2017.09.14 at 11:57 +0200, Richard Biener wrote: > >>>>>> On Wed, Sep 13, 2017 at 6:11 PM, Nikos Chantziaras > >>>>>> wrote: > >>>>>>> On 12/09/17 16:57, Wilco Dijkstra wrote: > >>>>>>>> > >>>>>>>> [...] As a result users are > >>>>>>>> required to enable several additional optimizations by hand to get > >>>>>>>> good > >>>>>>>> code. > >>>>>>>> Other compilers enable more optimizations at -O2 (loop unrolling in > >>>>>>>> LLVM > >>>>>>>> was > >>>>>>>> mentioned repeatedly) which GCC could/should do as well. > >>>>>>>> [...] > >>>>>>>> > >>>>>>>> I'd welcome discussion and other proposals for similar improvements. > >>>>>>> > >>>>>>> > >>>>>>> What's the status of graphite? It's been around for years. Isn't it > >>>>>>> mature > >>>>>>> enough to enable these: > >>>>>>> > >>>>>>> -floop-interchange -ftree-loop-distribution -floop-strip-mine > >>>>>>> -floop-block > >>>>>>> > >>>>>>> by default for -O2? (And I'm not even sure those are the complete set > >>>>>>> of > >>>>>>> graphite optimization flags, or just the "useful" ones.) > >>>>>> > >>>>>> It's not on by default at any optimization level. The main issue is > >>>>>> the > >>>>>> lack of maintainance and a set of known common internal compiler errors > >>>>>> we hit. The other issue is that there's no benefit of turning those > >>>>>> on for > >>>>>> SPEC CPU benchmarking as far as I remember but quite a bit of extra > >>>>>> compile-time cost. > >>>>> > >>>>> Not to mention the numerous wrong-code bugs. IMHO graphite should > >>>>> deprecated as soon as possible. 
> >>>>> > >>>> > >>>> For wrong-code bugs we've got and I recently went through, I fully agree > >>>> with this > >>>> approach and I would do it for GCC 8. There are PRs where order of > >>>> simple 2 loops > >>>> is changed, causing wrong-code as there's a data dependence. > >>>> > >>>> Moreover, I know that Bin was thinking about selection whether to use > >>>> classical loop > >>>> optimizations or Graphite (depending on options provided). This would > >>>> simplify it ;) > >>> > >>> I don't think removing graphite is warranted, I still think it is the > >>> approach to use when > >>> handling non-perfect nests. > >> Hi, > >> IMHO, we should not be in a hurry to remove graphite, though we are > >> introducing some traditional transformations. It's a quite standalone > >> part in GCC and supports more transformations. Also as it gets more > >> attention, never know if somebody will find time to work on it. > > > > Ok. I just wanted to express that from user's perspective I would not > > recommend it to use. > > Even if it improves some interesting (and for classical loop optimization > > hard) loop nests, > > it can still blow up on a quite simple data dependence in between loops. > > That said, it's quite > > risky to use it. > > We only have a single wrong-code bug in bugzilla with a testcase and I > just fixed it (well, > patch in testing). We do have plenty of ICEs, yes. Even tramp3d-v4, which is cited in several graphite papers, gets miscompiled: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68823. -- Markus
Re: GCC Buildbot
On 2017.09.20 at 18:01 -0500, Segher Boessenkool wrote: > Hi! > > On Wed, Sep 20, 2017 at 05:01:55PM +0200, Paulo Matos wrote: > > This mail's intention is to gauge the interest of having a buildbot for > > GCC. > > +1. Or no, +100. > > > - which machines we can use as workers: we certainly need more worker > > (previously known as slave) machines to test GCC in different > > archs/configurations; > > I think this would use too much resources (essentially the full machines) > for the GCC Compile Farm. If you can dial it down so it only uses a > small portion of the machines, we can set up slaves there, at least on > the more unusual architectures. But then it may become too slow to be > useful. There is already a buildbot that uses GCC compile farm resources: http://toolchain.lug-owl.de/buildbot/ And it has the basic problem of all automatic testing: that in the long run everyone simply ignores it. The same thing would happen with the proposed new buildbot. It would use still more resources on the already overused machines without producing useful results. The same thing is true for the regression mailing list https://gcc.gnu.org/ml/gcc-regression/current/. It is obvious that nobody pays any attention to it, e.g. PGO bootstrap has been broken for several months on x86_64 and i686 bootstrap has been broken for a long time, too. Only a mandatory pre-commit hook that would reject commits that break anything would work. But running the testsuite takes much too long to make this approach feasible. -- Markus
Re: RFC: Update top level libtool files
On 2017.10.10 at 12:45 +0100, Nick Clifton wrote: > Hi Guys, > > I would like to update the top level libtool files (libtool.m4, > ltoptions.m4, ltsugar.m4, ltversion.m4 and lt~obsolete.m4) used by > gcc, gdb and binutils. Currently we have version 2.2.7a installed in > the source trees and I would like to switch to the latest official > version: 2.4.6. > > The motivation for doing this is an attempt to reduce the number of > patches being carried round by the Fedora binutils releases. > Currently one of the patches there is to fix a bug in the 2.2.7a > libtool which causes it to select /lib and /usr/lib as the system > library search paths even for 64-bit hosts. Rather than just bring > this patch into the sources however, I thought that it would be better > to upgrade to the latest official libtool release and use that > instead. > > I have successfully run an x86_64 gcc bootstrap, built and tested lots > of different binutils configurations, and built and run an x86_64 gdb. > One thing that worries me though, is why hasn't this been done before? > Ie is there a special reason for staying with the old 2.2.7a libtool ? > If not, then does anyone object to my upgrading the gcc, gdb and > binutils mainline sources ? Last time I've looked in 2011, libtool's "with_sysroot" was not compatible with gcc's. So a naive copy doesn't work. But reverting commit 3334f7ed5851ef1 in libtool before copying should work. -- Markus
Re: GCC Buildbot Update - Definition of regression
On 2017.10.10 at 21:45 +0200, Paulo Matos wrote: > Hi all, > > It's almost 3 weeks since I last posted on GCC Buildbot. Here's an update: > > * 3 x86_64 workers from CF are now installed; > * There's one scheduler for trunk doing fresh builds for every Daily bump; > * One scheduler doing incremental builds for each active branch; > * An IRC bot which is currently silent; Using -j8 for the bot on an 8/16 (core/thread) machine like gcc67 is not acceptable, because it will render it unusable for everybody else. Also gcc67 has a buggy Ryzen CPU that causes random gcc crashes. Not the best setup for a regression tester... -- Markus
Re: GCC Buildbot Update - Definition of regression
On 2017.10.11 at 08:22 +0200, Paulo Matos wrote: > > > On 11/10/17 06:17, Markus Trippelsdorf wrote: > > On 2017.10.10 at 21:45 +0200, Paulo Matos wrote: > >> Hi all, > >> > >> It's almost 3 weeks since I last posted on GCC Buildbot. Here's an update: > >> > >> * 3 x86_64 workers from CF are now installed; > >> * There's one scheduler for trunk doing fresh builds for every Daily bump; > >> * One scheduler doing incremental builds for each active branch; > >> * An IRC bot which is currently silent; > > > > Using -j8 for the bot on a 8/16 (core/thread) machine like gcc67 is not > > acceptable, because it will render it unusable for everybody else. > > I was going to correct you on that given what I read in > https://gcc.gnu.org/wiki/CompileFarm#Usage > > but it was my mistake. I assumed that for an N-thread machine, I could > use N/2 processes but the guide explicitly says N-core, not N-thread. > Therefore I should be using 4 processes for gcc67 (or 0 given what follows). > > I will fix also the number of processes used by the other workers. Thanks. And while you are at it please set the niceness to 19. > > Also gcc67 has a buggy Ryzen CPU that causes random gcc crashes. Not the > > best setup for a regression tester... > > > > Is that documented anywhere? I will remove this worker. https://community.amd.com/thread/215773 -- Markus
Re: GCC Buildbot Update
On 2017.12.14 at 21:32 +0100, Christophe Lyon wrote: > On 14 December 2017 at 09:56, Paulo Matos wrote: > > I got an email suggesting I add some aarch64 workers so I did: > > 4 workers from CF (gcc113, gcc114, gcc115 and gcc116); > > > Great, I thought the CF machines were reserved for developpers. > Good news you could add builders on them. I don't think this is good news at all. Once a buildbot runs on a CF machine it immediately becomes impossible to do any meaningful measurement on that machine. That is mainly because of the random I/O (untar, rm -fr, etc.) of the bot. As a result variance goes through the roof and all measurements drown in noise. So it would be good if there was a strict separation of machines used for bots and machines used by humans. In other words bots should only run on dedicated machines. -- Markus
Re: GCC Buildbot Update
On 2017.12.15 at 10:21 +0100, Paulo Matos wrote: > > > On 15/12/17 08:42, Markus Trippelsdorf wrote: > > > > I don't think this is good news at all. > > > > As I pointed out in a reply to Chris, I haven't seeked permission but I > am pretty sure something similar runs in the CF machines from other > projects. > > The downside is that if we can't use the CF, I have no extra machines to > run the buildbot on. > > > Once a buildbot runs on a CF machine it immediately becomes impossible > > to do any meaningful measurement on that machine. That is mainly because > > of the random I/O (untar, rm -fr, etc.) of the bot. As a result variance > > goes through the roof and all measurements drown in noise. > > > > So it would be good if there was a strict separation of machines used > > for bots and machines used by humans. In other words bots should only > > run on dedicated machines. > > > > I understand your concern though. Do you know who this issue could be > raised with? FSF? I think the best place would be the CF user mailing list. (All admins and users should be subscribed.) -- Markus
Re: Announce: GNU MPFR 4.0.0 is released
On 2017.12.25 at 13:27 +0100, Vincent Lefevre wrote: > GNU MPFR 4.0.0 ("dinde aux marrons"), a C library for > multiple-precision floating-point computations with correct rounding, > is now available for download from the MPFR web site: > > http://www.mpfr.org/mpfr-4.0.0/ Unfortunately it is incompatible with mpc-1.0.3. Once a new mpc version gets released contrib/download_prerequisites could be updated. -- Markus
Re: PATCH RFA: Build stages 2 and 3 with C++
On 2011.07.17 at 18:30 +0200, Richard Guenther wrote: > On Sun, Jul 17, 2011 at 1:30 PM, Eric Botcazou wrote: > >> I have measured it at some point and IIRC it was about 10% slower > >> (comparing C bootstrap with C++ in stag1 languages with C++ bootstrap, > >> not sure if that included bootstrapping libstdc++ for the former). > > > > IMO acceptable now that the build time of libjava has been halved. > > Actually the penalty for using C++ was only 1.5%, that of bootstrapping C++ > and > libstdc++ was 15%. For reference: I've tested the difference today on an average 4 CPU machine with 8GB RAM. This is the result of otherwise identical LTO+PGO builds: --enable-build-with-cxx make -j4 profiledbootstrap 3384.20s user 177.02s system 291% cpu 20:23.12 total make -j4 profiledbootstrap 3011.03s user 144.30s system 297% cpu 17:41.59 total That's a ~15% increase in build time. (I couldn't test --enable-build-poststage1-with-cxx, because it doesn't seem to work with this configuration. Maybe the patch needs to be updated to also cover LTO or PGO builds?) 
Configured with: ../gcc/configure --prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/4.7.0 --includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.0/include --datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.7.0 --mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.7.0/man --infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.7.0/info --with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.0/include/g++-v4 --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --disable-altivec --disable-fixed-point --without-ppl --without-cloog --enable-lto --enable-nls --without-included-gettext --with-system-zlib --disable-werror --with-gold --enable-secureplt --disable-multilib --enable-libmudflap --disable-libssp --enable-libgomp --enable-cld --with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/4.7.0/python --enable-checking=release --disable-libgcj --enable-languages=c,c++ --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --with-build-config=bootstrap-lto And built with: make -j4 BOOT_CFLAGS="-march=native -O2 -pipe" STAGE1_CFLAGS="-march=native -O2 -pipe" CFLAGS_FOR_TARGET="-march=native -O2 -pipe" profiledbootstrap -- Markus
Re: PATCH RFA: Build stages 2 and 3 with C++
On 2011.07.17 at 18:54 +0200, Markus Trippelsdorf wrote: > On 2011.07.17 at 18:30 +0200, Richard Guenther wrote: > > On Sun, Jul 17, 2011 at 1:30 PM, Eric Botcazou > > wrote: > > >> I have measured it at some point and IIRC it was about 10% slower > > >> (comparing C bootstrap with C++ in stag1 languages with C++ bootstrap, > > >> not sure if that included bootstrapping libstdc++ for the former). > > > > > > IMO acceptable now that the build time of libjava has been halved. > > > > Actually the penalty for using C++ was only 1.5%, that of bootstrapping C++ > > and > > libstdc++ was 15%. For reference: > > I've tested the difference today on an average 4 CPU machine with 8GB > RAM. This is the result of otherwise identical LTO+PGO builds: > > --enable-build-with-cxx make -j4 profiledbootstrap 3384.20s user 177.02s > system 291% cpu 20:23.12 total > make -j4 profiledbootstrap 3011.03s user 144.30s > system 297% cpu 17:41.59 total > > That's a ~15% increase in build time. And I guess that most of it comes from building libstdc++ to train the instrumented compiler. This doesn't happen in the default case AFAICS. -- Markus
Revision 176335 (removal of an #include in gthr-posix.h) causes numerous compile failures
Revision 176335 removed the traditional "#include" from gthr-posix.h. This breaks the build of many programs (Firefox, Chromium, etc.) that implicitly rely on it. I'm not sure that the gain is worth the pain in this case. -- Markus
Re: libtool.m4 update?
On 2011.10.25 at 06:39 +0200, Andreas Tobler wrote: > Is it preferred to sync libtool.m4 completely? Or do we want to shift > this update for a later time? I'm aware of the closing stage one. A libtool update is also needed for bootstrap-lto with slim lto object files. So a complete sync with upstream would be the best option IMO. -- Markus
Re: gcc-4.6.2 fails to build on fedora 17
On 2012.01.23 at 16:45 +0100, Ralf Corsepius wrote: > Hi, > > Crossbuilding gcc-4.6.2 for rtems targets succeeds on Fedora 15, 16, > openSUSE-11.4 and 12.1, but fails with this error on Fedora rawhide > (aka. Fedora 17): > > ... > # gcc -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions > -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c -g > -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wwrite-strings > -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes > -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros > -Wno-overlength-strings -Wold-style-definition -Wc++-compat > -DHAVE_CONFIG_H -I. -I. -I../../gcc-4.6.2/gcc -I../../gcc-4.6.2/gcc/. > -I../../gcc-4.6.2/gcc/../include -I../../gcc-4.6.2/gcc/../libcpp/include > -I../../gcc-4.6.2/gcc/../libdecnumber > -I../../gcc-4.6.2/gcc/../libdecnumber/dpd -I../libdecnumber > gtype-desc.c -o gtype-desc.o > gtype-desc.c:8735:18: error: subscripted value is neither array nor > pointer nor vector > gtype-desc.c:8854:36: error: subscripted value is neither array nor > pointer nor vector > gtype-desc.c:8938:31: error: subscripted value is neither array nor > pointer nor vector > gtype-desc.c:8959:31: error: subscripted value is neither array nor > pointer nor vector > gtype-desc.c:8966:31: error: subscripted value is neither array nor > pointer nor vector > gtype-desc.c:8973:31: error: subscripted value is neither array nor > pointer nor vector > make[2]: *** [gtype-desc.o] Error 1 > > > > Poking around inside of the build-trees, I found this difference in the > GCC-generated sources between Fedora 16 and rawhide, which I am inclined > to believe to be related to this breakdown: > > diff -u > /var/lib/mock/fedora-16-x86_64-rtems/root/builddir/build/BUILD/rtems-4.11-sparc-rtems4.11-gcc-4.6.2/build/gcc/gt-cp-mangle.h > > /var/lib/mock/fedora-17-x86_64-rtems/root/builddir/build/BUILD/rtems-4.11-sparc-rtems4.11-gcc > --- > 
/var/lib/mock/fedora-16-x86_64-rtems/root/builddir/build/BUILD/rtems-4.11-sparc-rtems4.11-gcc-4.6.2/build/gcc/gt-cp-mangle.h > > 2012-01-23 08:26:19.056369396 +0100 > +++ > /var/lib/mock/fedora-17-x86_64-rtems/root/builddir/build/BUILD/rtems-4.11-sparc-rtems4.11-gcc-4.6.2/build/gcc/gt-cp-mangle.h > > 2012-01-23 08:26:38.648665206 +0100 > @@ -39,7 +39,7 @@ > { > &G.substitutions, > 1, > -sizeof (G.substitutions), > +sizeof (G), > &gt_ggc_mx_VEC_tree_gc, > &gt_pch_nx_VEC_tree_gc > }, > diff -u > /var/lib/mock/fedora-16-x86_64-rtems/root/builddir/build/BUILD/rtems-4.11-sparc-rtems4.11-gcc-4.6.2/build/gcc/gtype-desc.c > > /var/lib/mock/fedora-17-x86_64-rtems/root/builddir/build/BUILD/rtems-4.11-sparc-rtems4.11-gcc-4 > --- > /var/lib/mock/fedora-16-x86_64-rtems/root/builddir/build/BUILD/rtems-4.11-sparc-rtems4.11-gcc-4.6.2/build/gcc/gtype-desc.c > > 2012-01-23 08:26:19.058369427 +0100 > +++ > /var/lib/mock/fedora-17-x86_64-rtems/root/builddir/build/BUILD/rtems-4.11-sparc-rtems4.11-gcc-4.6.2/build/gcc/gtype-desc.c > > 2012-01-23 08:26:38.652665266 +0100 > @@ -8375,7 +8375,7 @@ > { > &ipa_escaped_pt.vars, > 1, > -sizeof (ipa_escaped_pt.vars), > +sizeof (ipa_escaped_pt), > &gt_ggc_mx_bitmap_head_def, > &gt_pch_nx_bitmap_head_def > }, > ... > > > Any ideas about the cause? I've opened a bug for this issue: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51969 It only happens if you try to build gcc-4.6 with gcc-4.7 on my machine. -- Markus
Re: Memory corruption due to word sharing
On 2012.02.01 at 16:19 +0100, Jan Kara wrote: > we've spotted the following mismatch between what kernel folks expect > from a compiler and what GCC really does, resulting in memory corruption on > some architectures. Consider the following structure: > struct x { > long a; > unsigned int b1; > unsigned int b2:1; > }; > > We have two processes P1 and P2 where P1 updates field b1 and P2 updates > bitfield b2. The code GCC generates for b2 = 1 e.g. on ia64 is: >0: 09 00 21 40 00 21 [MMI] adds r32=8,r32 >6: 00 00 00 02 00 e0 nop.m 0x0 >c: 11 00 00 90 mov r15=1;; > 10: 0b 70 00 40 18 10 [MMI] ld8 r14=[r32];; > 16: 00 00 00 02 00 c0 nop.m 0x0 > 1c: f1 70 c0 47 dep r14=r15,r14,32,1;; > 20: 11 00 38 40 98 11 [MIB] st8 [r32]=r14 > 26: 00 00 00 02 00 80 nop.i 0x0 > 2c: 08 00 84 00 br.ret.sptk.many b0;; > > Note that gcc used 64-bit read-modify-write cycle to update b2. Thus if P1 > races with P2, update of b1 can get lost. BTW: I've just checked on x86_64 > and there GCC uses 8-bit bitop to modify the bitfield. > > We actually spotted this race in practice in btrfs on structure > fs/btrfs/ctree.h:struct btrfs_block_rsv where spinlock content got > corrupted due to update of following bitfield and there seem to be other > places in kernel where this could happen. > > I've raised the issue with our GCC guys and they said to me that: "C does > not provide such guarantee, nor can you reliably lock different > structure fields with different locks if they share naturally aligned > word-size memory regions. The C++11 memory model would guarantee this, > but that's not implemented nor do you build the kernel with a C++11 > compiler." > > So it seems what C/GCC promises does not quite match with what kernel > expects. I'm not really an expert in this area so I wanted to report it > here so that more knowledgeable people can decide how to solve the issue... FYI, the gcc bug can be found here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080 -- Markus
Modelling of multiple condition code registers
Hi,

I have a question about modelling of condition codes in GCC. The target I am
considering has the following characteristics: associated with each register is
a set of CC flags that are updated whenever that register is used as the
destination of an operation that would normally update the CC register on a
single-CC machine. Example:

  add r0, r1, r2    // updates r2's condition codes
  cmp r0, r1        // updates r1's condition codes

Any instruction can be predicated on these condition codes, e.g.

  if (r2:ge) ld [r0], r1
  if (r1:ne) mpy r1, r2, r3

So we have several CC registers, and the one that will be updated for a given
instruction depends directly on the destination register chosen for that
instruction. My question is: would it be possible for GCC to take advantage of
these "extra" CC registers? And if that should be the case, how would I go
about modelling it?

BR
/Markus
Address as HImode when Pmode is QImode?
ecl=0xb7d18400) at ../../trunk/gcc-4.4.1/gcc/tree-optimize.c:420
...

thanks and best regards
/Markus
Re: Address as HImode when Pmode is QImode?
Hi Adam,

Looks like you were right! My SIZE_TYPE was undefined, so it defaulted to
"long unsigned int". Setting it to "unsigned int" solved my problems.

Thank you very much!
/Markus

2009/8/13 Adam Nemet :
> Markus L writes:
>> I run into an assert in convert_memory_address not being able to
>> convert the address rtx (being HImode) into Pmode (i.e. QImode). A
>> few frames up I can dump the tree node, and it looks like the
>> address calculations are done in HImode. Why is the address being
>> calculated as unsigned long int and not as unsigned int, which would be
>> Pmode for my target? Is this expected (and my problem originates
>> elsewhere), or am I missing something obvious here?
>
> What's your sizetype? This could be related to:
>
> http://gcc.gnu.org/ml/gcc/2008-05/msg00313.html
>
> Adam
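For the archives, the fix boils down to a one-line definition in the target's .h file. A sketch of what that fragment might look like (the PTRDIFF_TYPE line is my addition, since the same width mismatch can show up in pointer subtraction; check your target's needs before copying):

```c
/* In the target's <target>.h: make sizetype match the 16-bit Pmode.
   Left undefined, SIZE_TYPE defaults to "long unsigned int", and
   address arithmetic gets computed in HImode instead of QImode.  */
#define SIZE_TYPE "unsigned int"

/* Assumed companion definition: keep ptrdiff_t the signed
   counterpart of size_t.  */
#define PTRDIFF_TYPE "int"
```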
Storing 16bit values in upper part of 32bit registers
Hi,

I am working with an architecture where the 32-bit registers (rN) are divided
into high (rNh) and low (rNl) 16-bit sub-registers that can in principle be
individually accessed by the instructions in the IS. However, the IS is
designed so that it is beneficial to store 16-bit values in the high part of
the registers (rNh), and the calling conventions that we want to follow also
require 16-bit values to be passed and returned in rNh. What would be the
"proper way" to make the compiler use the upper parts of these registers for
the 16-bit operands?

Currently this is done by having the registers in two register classes ('full'
and 'high_only') and printing the 'h' in the output template when the
constraint matches the 'high_only' class. This however causes problems when
converting between 16- and 32-bit operands. One annoying example is returning
scalar values. E.g. assume that a 32-bit variable (long) is assigned to a
16-bit variable (int) as in

int foo(void)
{
    long sum;
    ...
    return (int)sum;
}

then we want the low part of sum to be moved to the high part of the return
register, r0h. However, TARGET_FUNCTION_VALUE only seems to allow me to return
the RTX for register r0, but not the subreg for r0h, so the compiler will not
emit the necessary RTL to move the value from the low part of sum to r0h
before the return.

This (and probably many other issues that I am about to discover) makes me
think that maybe this is not the ideal way to do this. I have searched the
available ports but have not been able to find any which seem to use their
registers in a similar way. Any advice or pointers to code to look into would
be much appreciated.

Thanks in advance.
/Markus
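One alternative to the two-register-class scheme would be to number each 16-bit half as its own hard register, so that r0h is a first-class register and the function-value hook can hand it back directly instead of needing a subreg of r0. A rough sketch of what the hook might look like under that model — R0H_REGNUM, R0_REGNUM, and the size test are hypothetical names for illustration, not code from any existing port:

```c
/* Sketch only: assumes the port defines separate hard register
   numbers for each 16-bit half (r0h, r0l, r1h, ...), with the
   32-bit view described via HARD_REGNO_NREGS over the pair.  */
static rtx
mytarget_function_value (const_tree valtype,
                         const_tree fn_decl_or_type ATTRIBUTE_UNUSED,
                         bool outgoing ATTRIBUTE_UNUSED)
{
  enum machine_mode mode = TYPE_MODE (valtype);

  /* 16-bit (and narrower) scalars are returned in r0h, per the
     calling convention; wider values use the full r0 pair.  */
  if (GET_MODE_SIZE (mode) <= 2)
    return gen_rtx_REG (mode, R0H_REGNUM);
  return gen_rtx_REG (mode, R0_REGNUM);
}
```

With halves as individual hard registers, the (int)sum truncation also becomes an ordinary subreg/move between hard registers, which the existing machinery knows how to emit; the cost is a larger register file description and careful HARD_REGNO_MODE_OK handling.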
Problems with selective scheduling
Hi,

I recently read the articles about the selective scheduling implementation and
found them quite interesting; I especially liked how neatly software
pipelining is integrated. The target I am working on is a VLIW DSP, so
obviously these things are very important for good code generation. However,
when compiling the following C function with -fselective-scheduling2 and
-fsel-sched-pipelining, I face a few problems.

long dotproduct2(int *a, int *b)
{
    int i;
    long s = 0;
    for (i = 0; i < 256; i++)
        s += (long)*a++ * *b++;
    return s;
}

The output I get from the sched2 pass is:

...
Scheduling region 0
Scheduling on fences: (uid:32;seqno:6;)
scanning new insn with uid = 80.
deleting insn with uid = 80.
Scheduled 0 bookkeeping copies, 0 insns needed bookkeeping, 0 insns renamed, 0 insns substituted
Scheduling region 1
Scheduling on fences: (uid:72;seqno:1;)
scanning new insn with uid = 81.
deleting insn with uid = 81.
Scheduled 0 bookkeeping copies, 0 insns needed bookkeeping, 0 insns renamed, 0 insns substituted
Scheduling region 2
Scheduling on fences: (uid:65;seqno:1;)
scanning new insn with uid = 82.
deleting insn with uid = 82.
Scheduled 0 bookkeeping copies, 0 insns needed bookkeeping, 0 insns renamed, 0 insns substituted

(note 26 27 65 2 NOTE_INSN_FUNCTION_BEG)
(insn:TI 65 26 30 2 dotprod2.c:2 (set (mem:QI (pre_dec (reg/f:QI 32 sp)) [0 S1 A16])
        (reg/f:QI 32 sp)) 12 {pushqi1} (nil))
(insn 30 65 62 2 dotprod2.c:2 (set (reg/v:HI 16 a0l [orig:62 s ] [62])
        (const_int 0 [0x0])) 6 {*zero_load_hi} (expr_list:REG_EQUAL (const_int 0 [0x0])
        (nil)))
(insn 62 30 66 2 dotprod2.c:2 (set (reg:QI 2 r2 [70])
        (const_int 256 [0x100])) 5 {*constant_load_qi} (expr_list:REG_EQUAL (const_int 256 [0x100])
        (nil)))
(insn:TI 66 62 67 2 dotprod2.c:2 (set (mem:QI (pre_dec (reg/f:QI 32 sp)) [0 S1 A16])
        (reg/f:QI 33 dp)) 12 {pushqi1} (nil))
(insn:TI 67 66 69 2 dotprod2.c:2 (set (reg/f:QI 33 dp)
        (reg/f:QI 32 sp)) 10 {*move_regs_qi} (nil))
(note 69 67 39 2 NOTE_INSN_PROLOGUE_END)
(code_label 39 69 31 3 2 "" [1 uses])
(note 31 39 34 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(note 34 31 32 3 NOTE_INSN_DELETED)
(insn:TI 32 34 33 3 dotprod2.c:10 (set (reg:QI 19 a1h [67])
        (mem:QI (post_inc:QI (reg/v/f:QI 1 r1 [orig:65 b ] [65])) [2 S1 A16])) 3 {*load_word_qi_with_post_inc}
        (expr_list:REG_INC (reg/v/f:QI 1 r1 [orig:65 b ] [65]) (nil)))
(insn 33 32 35 3 dotprod2.c:10 (set (reg:QI 18 a1l [68])
        (mem:QI (post_inc:QI (reg/v/f:QI 0 r0 [orig:64 a ] [64])) [2 S1 A16])) 3 {*load_word_qi_with_post_inc}
        (expr_list:REG_INC (reg/v/f:QI 0 r0 [orig:64 a ] [64]) (nil)))
(insn 35 33 61 3 dotprod2.c:10 (set (reg/v:HI 16 a0l [orig:62 s ] [62])
        (plus:HI (mult:HI (sign_extend:HI (reg:QI 19 a1h [67]))
                (sign_extend:HI (reg:QI 18 a1l [68])))
            (reg/v:HI 16 a0l [orig:62 s ] [62]))) 23 {multacc}
        (expr_list:REG_DEAD (reg:QI 19 a1h [67])
        (expr_list:REG_DEAD (reg:QI 18 a1l [68])
        (nil))))
(jump_insn:TI 61 35 75 3 dotprod2.c:8 (parallel [
            (set (pc) (if_then_else (ne (reg:QI 2 r2 [70])
                        (const_int 1 [0x1]))
                    (label_ref:QI 39)
                    (pc)))
            (set (reg:QI 2 r2 [70])
                (plus:QI (reg:QI 2 r2 [70])
                    (const_int -1 [0x])))
            (use (const_int 255 [0xff]))
            (use (const_int 255 [0xff]))
            (use
                (const_int 1 [0x1]))
        ]) 43 {doloop_end_internal} (expr_list:REG_BR_PROB (const_int 9899 [0x26ab])
        (nil)))
(note 75 61 70 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(note 70 75 72 4 NOTE_INSN_EPILOGUE_BEG)
...

The loop body is not correctly scheduled: the TImode flags indicate that the
entire loop body will be executed in a single cycle as one VLIW packet, and
this will not work since no loop-prologue code has been emitted.

My (probably quite limited) understanding of what should happen is that:

1. The fence is placed at (before) uid 32.
2. Instructions uid 32 and uid 33 are scheduled in this VLIW group.
3. The fence is advanced to uid 35.
4. Instruction uid 35 is scheduled, and instructions uid 32 and 33 are moved
   up and scheduled in this group as well. In the process of moving up uid 32
   and 33, bookkeeping copies are created on the loop entry edge.

I've tried to debug this without much success and would very much appreciate
any comments on what to look for or what I might be doing wrong. The GCC
version I am using is 4.4.1.

BR
/Markus