Re: building gcc with macro support for gdb?

2015-12-03 Thread Andreas Schwab
Ryan Burn  writes:

> Is there any way to easily build a stage1 gcc with macro support for 
> debugging?

Set STAGE1_CFLAGS.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: Identifying a pointer to a structure

2015-12-03 Thread Richard Biener
On Thu, Dec 3, 2015 at 8:54 AM, Uday P. Khedker  wrote:
> We are implementing points-to analysis in GCC 4.7.2 and need to distinguish
> between pointers to scalars and pointers to structures. We make this
> distinction by using the TYPE (TREE_TYPE) hierarchy of the tree node of the
> pointer. We have two questions:
>
> (a) Is it sufficient to check for the presence of RECORD_TYPE in type
> hierarchy?
> (b) Is it safe to assume that the RECORD_TYPE always appears as a leaf node
> in
> the type description of any pointer to structure?
>
> As an example, the tree nodes of a pointer to an integer (y) and a pointer
> to a structure (f) are shown below. They seem to support our hunch.

Yes, your observations are correct with respect to types.

But you can't rely on the pointer type for determining what kind of
object a pointer points to.
First, because with struct { int i; ... } s; the pointer int *p = &s.i; points
to an int but also points to the start of an aggregate (and so can be
trivially cast to a pointer to s).  Second, because GCC's middle-end (thus
GIMPLE) ignores pointer types completely, so you can have an SSA name a_1 of
type int *X and a dereference a_1->b.c (thus a_1 points to a structure object
even though it is of type int *).
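
A minimal sketch of the first point, assuming a hypothetical struct S that is
not part of the original question. The declared type of the pointer says
"pointer to int", yet its value is also the address of the whole aggregate,
so the type alone does not tell an analysis what object is reachable.

```c
/* Hypothetical aggregate, chosen only for illustration. */
struct S { int i; int j; };

/* Returns a pointer whose static type is int *, but whose value is also
   the address of the enclosing struct, because i is the first member. */
int *first_member(struct S *s)
{
    return &s->i;
}

/* Converts the int * back to a struct pointer and reads another field.
   This is only valid when p really points at the first member of a
   struct S object. */
int read_j_through_int_pointer(int *p)
{
    struct S *s = (struct S *) p;
    return s->j;
}
```

Calling read_j_through_int_pointer (first_member (&s)) returns s.j, even
though only an int * was passed around.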

Richard.

> For example, the tree node of y which is a pointer to an integer is as
> follows:
>
>  type  type  size 
> unit size 
> align 32 symtab 0 alias set 3 canonical type 0x40524360
> precision 32 min  max  0x405148c0 2147483647>
> pointer_to_this >
> unsigned SI
> size 
> unit size 
> align 32 symtab 0 alias set 2 canonical type 0x40524a80
> pointer_to_this >
> used static unsigned SI file struct0.c line 15 col 11
> size 
> constant 32>
> unit size  sizetype> constant 4>
> align 32 context 
> (mem/f/c:SI (symbol_ref:SI ("y") [flags 0x2]  )
> [2 y+0 S4 A32])>
>
> Similarly, the tree node of f which is a pointer to a structure is as
> follows:
>
>  type  type <*record_type* 0x405cf060 list BLK
> size 
> unit size 
> align 32 symtab 0 alias set 4 canonical type 0x405cf060 fields
>  context  D.2319>
> pointer_to_this  chain  0x40529870 list>>
> public unsigned SI
> size 
> unit size 
> align 32 symtab 0 alias set 2 canonical type 0x405cf1e0
> pointer_to_this >
> used static unsigned SI file struct0.c line 14 col 25
> size 
> constant 32>
> unit size  sizetype> constant 4>
> align 32 context 
> (mem/f/c:SI (symbol_ref:SI ("f") [flags 0x2]  )
> [2 f+0 S4 A32])>
>
> Thanks and regards,
>
> Uday Khedker.
>


Re: basic asm and memory clobbers - Proposed solution

2015-12-03 Thread Paul_Koning

> On Dec 3, 2015, at 12:29 AM, Bernd Edlinger  wrote:
> 
>> ...
>> If the goal is to order things wrt x, why wouldn't you just reference x?
>> 
>>   x = 1;
>>   asm volatile("nop":"+m"(x));
>>   x = 0;
>> 
> 
> Exactly, that is what I mean.  Either the asm can use memory clobber
> or it can use output and/or input clobbers or any combination of that.
> 
> The problem with basic asm is not that it is basic, but that it does not
> have any outputs nor any inputs and it has no way to specify what it
> might clobber.
> 
> Therefore I think the condition for the warning should not be that the
> asm is "basic", but that it has zero outputs, zero inputs and zero clobbers.
> That would make more sense to me.

I don't think so.

Basic asm has a somewhat documented specification; in particular, it is defined 
to be volatile.  Some revs of GCC got this wrong, but such cases are obviously 
bugs.  It does omit any statement about clobbers, true.  And the difficulty is 
that people have made assumptions, not necessarily supported by documentation.  
And those assumptions may have been invalidated on some platforms in some revs 
by new optimizations.  The prevalence of confusion about basic asm is the 
reason why warning about it is potentially useful.

On the other hand, asm volatile ("foo":::) has a different meaning.  That 
specifically says that "foo" doesn't clobber anything (and, implicitly, that it 
does not rely on any program variable being up to date in memory).  While that 
isn't all that common, it is certainly possible.  For example, if I want to 
turn on an LED on the device front panel, I might use such a statement (on 
machines where the instruction set allows me to do this without using 
registers).  Since I explicitly stated "this doesn't clobber anything" I would 
not expect or want a warning.
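
A minimal sketch of the forms being compared, assuming GCC-style extended asm.
The function names are hypothetical, and the templates are left empty so that
the code does not depend on any particular instruction set.

```c
/* Hypothetical variable standing in for the "x" of the quoted example. */
static int x;

/* Orders the two stores by naming x as an in/out memory operand, as in
   the quoted "+m"(x) example.  The empty template emits no instruction. */
int ordered_stores(void)
{
    x = 1;
    __asm__ volatile ("" : "+m" (x));
    x = 0;
    return x;
}

/* Extended asm with empty output, input and clobber lists: the author
   states explicitly that nothing is clobbered. */
void no_clobber_statement(void)
{
    __asm__ volatile ("" : : : );
}

/* The same statement with a "memory" clobber, for contrast: the compiler
   must assume that any memory may be read or written. */
void memory_clobber_statement(void)
{
    __asm__ volatile ("" : : : "memory");
}
```

Only the second form is the "I explicitly stated this doesn't clobber
anything" case described above.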

paul


Re: building gcc with macro support for gdb?

2015-12-03 Thread Martin Sebor

On 12/02/2015 06:48 PM, Peter Bergner wrote:
> On Wed, 2015-12-02 at 20:05 -0500, Ryan Burn wrote:
>> Is there any way to easily build a stage1 gcc with macro support for debugging?
>>
>> I tried setting CFLAGS, and CXXFLAGS to specify "-O0 -g3" via the
>> command line before running configure, but that only includes those
>> flags for some of the compilation steps.
>>
>> I was only successful after I manually edited the makefile to replace
>> "-g" with "-g3".
>
> Try CFLAGS_FOR_TARGET='-O0 -g3 -fno-inline' and
> CXXFLAGS_FOR_TARGET='-O0 -g3 -fno-inline'


I've been using STAGE1_CFLAGS as Andreas suggested.  The tree
checks in GCC make heavy use of macros that GDB unfortunately
often has trouble with.  See GDB bugs 19111, 1888, and 18881
for some of the problems.  To get around these, I end up using
info macro to print the macro definition and using whatever it
expands to instead.  I wonder if someone has found a more
convenient workaround.
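
A minimal sketch of that workaround, assuming simplified, hypothetical
stand-ins for the tree accessors (the real macros in tree.h are considerably
more involved) and a build with -g3 so that macro definitions are recorded in
the debug information.

```c
/* Hypothetical, simplified stand-ins for GCC's tree accessor macros. */
struct node { int code; struct node *type; };

#define NODE_CODE(n)  ((n)->code)
#define NODE_TYPE(n)  ((n)->type)

/* Returns the code of the node that n's type field points to. */
int pointee_code(struct node *n)
{
    /* In gdb, stopped here in a -g3 build:
         (gdb) info macro NODE_TYPE
         (gdb) macro expand NODE_CODE (NODE_TYPE (n))
         (gdb) print ((n)->type)->code
       The last command uses the expansion by hand instead of the macro. */
    return NODE_CODE (NODE_TYPE (n));
}
```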

Martin



Instruction scheduler rewriting instructions?

2015-12-03 Thread Steve Ellcey
Can the instruction scheduler actually rewrite instructions?  I didn't
think so but when I compile some code on MIPS with:

-O2 -fno-ivopts -fno-peephole2 -fno-schedule-insns2

I get:

$L4:
lbu $3,0($4)
addiu   $4,$4,1
lbu $2,0($5)
beq $3,$0,$L7
addiu   $5,$5,1

beq $3,$2,$L4
subu    $2,$3,$2

When I changed -fno-schedule-insns2 to -fschedule-insns2, I get:

$L4:
lbu $3,0($4)
addiu   $5,$5,1
lbu $2,-1($5)
beq $3,$0,$L7
addiu   $4,$4,1

beq $3,$2,$L4
subu    $2,$3,$2

I.e., the addiu of $5 and the load using $5 have been swapped around
and the load uses a different offset to compensate.  I can't see
where in the instruction scheduler that this would happen.  Any 
help?  This is on MIPS if that matters, though I didn't see any
MIPS specific code for this.  This issue is related to my earlier
question about PR 48814 and ivopts (thus the -fno-ivopts option).
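
A minimal sketch of why the two sequences are equivalent, written in C rather
than MIPS assembly. The function names are hypothetical. The first form
matches "lbu $2,0($5)" followed by "addiu $5,$5,1"; the second matches
"addiu $5,$5,1" followed by "lbu $2,-1($5)".

```c
/* Form 1: load through the pointer at offset 0, then advance it. */
unsigned char load_then_increment(const unsigned char **pp)
{
    unsigned char c = (*pp)[0];
    *pp += 1;
    return c;
}

/* Form 2: advance the pointer first, then load at offset -1 to
   compensate for the increment that has already happened. */
unsigned char increment_then_load(const unsigned char **pp)
{
    *pp += 1;
    return (*pp)[-1];
}
```

Both functions return the same byte and leave the pointer in the same place;
they differ only in the order of the add and the load, and that order is what
changes between the two listings above.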

The C code I am looking at is the strcmp function from glibc:

int
strcmp (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;

  do
    {
      c1 = (unsigned char) *s1++;
      c2 = (unsigned char) *s2++;
      if (c1 == '\0')
        return c1 - c2;
    }
  while (c1 == c2);

  return c1 - c2;
}


Steve Ellcey
sell...@imgtec.com


Re: Instruction scheduler rewriting instructions?

2015-12-03 Thread Ramana Radhakrishnan
On Thu, Dec 3, 2015 at 7:01 PM, Steve Ellcey  wrote:
> Can the instruction scheduler actually rewrite instructions?  I didn't
> think so but when I compile some code on MIPS with:
>
> -O2 -fno-ivopts -fno-peephole2 -fno-schedule-insns2
>
> I get:
>
> $L4:
> lbu $3,0($4)
> addiu   $4,$4,1
> lbu $2,0($5)
> beq $3,$0,$L7
> addiu   $5,$5,1
>
> beq $3,$2,$L4
> subu    $2,$3,$2
>
> When I changed -fno-schedule-insns2 to -fschedule-insns2, I get:
>
> $L4:
> lbu $3,0($4)
> addiu   $5,$5,1
> lbu $2,-1($5)
> beq $3,$0,$L7
> addiu   $4,$4,1
>
> beq $3,$2,$L4
> subu    $2,$3,$2
>
> I.e. The addiu of $5 and the load using $5 have been swapped around
> and the load uses a different offset to compensate.  I can't see
> where in the instruction scheduler that this would happen.  Any
> help?  This is on MIPS if that matters, though I didn't see any
> MIPS specific code for this.  This issue is related to my earlier
> question about PR 48814 and ivopts (thus the -fno-ivopts option).

IIRC it's because the scheduler *thinks* it can get a tighter schedule
- probably because it thinks it can dual issue the lbu from $4 and the
addiu to $5. Can it think so ? This may be related -
https://gcc.gnu.org/ml/gcc-patches/2012-08/msg00155.html

regards
Ramana



>
> The C code I am looking at is the strcmp function from glibc:
>
> int
> strcmp (const char *p1, const char *p2)
> {
>   const unsigned char *s1 = (const unsigned char *) p1;
>   const unsigned char *s2 = (const unsigned char *) p2;
>   unsigned char c1, c2;
>
>   do
>     {
>       c1 = (unsigned char) *s1++;
>       c2 = (unsigned char) *s2++;
>       if (c1 == '\0')
>         return c1 - c2;
>     }
>   while (c1 == c2);
>
>   return c1 - c2;
> }
>
>
> Steve Ellcey
> sell...@imgtec.com


Re: basic asm and memory clobbers - Proposed solution

2015-12-03 Thread Bernd Edlinger


Am 03.12.2015 um 16:24 schrieb paul_kon...@dell.com:
> On the other hand, asm volatile ("foo":::) has a different meaning. 
> That specifically says that "foo" doesn't clobber anything. 


Well, not exactly, see the md_asm_adjust target callback.

On i386, rs6000, visium, cris and mn10300 targets asm volatile ("foo":::)
implicitly clobbers the "cc" register, even if that is not in the 
clobber list
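
A minimal sketch of stating that clobber explicitly instead of relying on the
target to add it, assuming GCC-style extended asm. The function name is
hypothetical and the template is left empty so the code is not tied to one
instruction set.

```c
/* Passes v through an asm statement that lists "cc" as clobbered, so the
   compiler does not assume the condition codes survive the statement on
   any target, not only on those that add the clobber implicitly. */
int pass_through_with_cc_clobber(int v)
{
    __asm__ volatile ("" : "+r" (v) : : "cc");
    return v;
}
```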


Bernd.


Proposal to deprecate: mep (Toshiba Media Processor)

2015-12-03 Thread DJ Delorie

Given a combination of "I have new responsibilities" and "nothing has
happened with mep for a long time" I would like to step down as mep
maintainer.

If someone would like to pick up maintainership of this target, please
contact me and/or the steering committee.  Otherwise, I propose this
target be deprecated in GCC 6 and removed in 7.

DJ


Re: Instruction scheduler rewriting instructions?

2015-12-03 Thread Steve Ellcey
On Thu, 2015-12-03 at 19:56 +, Ramana Radhakrishnan wrote:

> IIRC it's because the scheduler *thinks* it can get a tighter schedule
> - probably because it thinks it can dual issue the lbu from $4 and the
> addiu to $5. Can it think so ? This may be related -
> https://gcc.gnu.org/ml/gcc-patches/2012-08/msg00155.html
> 
> regards
> Ramana

No, the system I am tuning for (MIPS 24k) is single issue according to
its description.  At least I do see now where the instruction is getting
rewritten in the instruction scheduler, so that is helpful.  I am no
longer sure the scheduler is where the problem lies though.  If I
compile with -O2 -mtune=24kc I get this loop:

addiu   $4,$4,1
$L8:
addiu   $5,$5,1
lbu $3,-1($4)
beq $3,$0,$L7
lbu $2,-1($5)

beq $3,$2,$L8
addiu   $4,$4,1

If I use -O2 -fno-ivopts -mtune=24kc I get:

lbu $3,0($4)
$L8:
lbu $2,0($5)
addiu   $4,$4,1
beq $3,$0,$L7
addiu   $5,$5,1

beql    $3,$2,$L8
lbu $3,0($4)

This second loop is better because there is more time between the loads
and where the loaded values are used in the beq instructions.  So I
think there is something missing or wrong in the cost analysis that
ivopts is doing that it decides to do the adds before the loads instead
of vice versa.

I have tried tweaking the cost of loads in mips_rtx_costs and in the
instruction descriptions in 24k.md but that didn't seem to have any
effect on the ivopts code.

Steve Ellcey
sell...@imgtec.com




Re: Question about PR 48814 and ivopts and post-increment

2015-12-03 Thread Bin.Cheng
On Wed, Dec 2, 2015 at 5:11 AM, Steve Ellcey  wrote:
>
> I have a question involving ivopts and PR 48814, which was a fix for
> the post increment operation.  Prior to the fix for PR 48814, MIPS
> would generate this loop for strcmp (C code from glibc):
>
> $L4:
> lbu $3,0($4)
> lbu $2,0($5)
> addiu   $4,$4,1
> beq $3,$0,$L7
> addiu   $5,$5,1  # This is a branch delay slot
> beq $3,$2,$L4
> subu    $2,$3,$2  # This is a branch delay slot (only used after loop)
>
>
> With the current top-of-tree we now generate:
>
> addiu   $4,$4,1
> $L8:
> lbu $3,-1($4)
> addiu   $5,$5,1
> beq $3,$0,$L7
> lbu $2,-1($5)  # This is a branch delay slot
> beq $3,$2,$L8
> addiu   $4,$4,1  # This is a branch delay slot
>
> subu    $2,$3,$2  # Done only once now after exiting loop.
>
> The main problem with the new loop is that the beq comparing $2 and $3
> is right before the load of $2 so there can be a delay due to the time
> that the load takes.  The ideal code would probably be:
>
> addiu   $4,$4,1
> $L8:
> lbu $3,-1($4)
> lbu $2,0($5)  # This is a branch delay slot
> beq $3,$0,$L7
> addiu   $5,$5,1
> beq $3,$2,$L8
> addiu   $4,$4,1  # This is a branch delay slot
>
> subu    $2,$3,$2  # Done only once now after exiting loop.
>
> Where we load $2 earlier (using a 0 offset instead of a -1 offset) and
> then do the increment of $5 after using it in the load.  The problem
> is that this isn't something that can just be done in the instruction
> scheduler because we are changing one of the instructions (to modify the
> offset) in addition to rearranging them and I don't think the instruction
> scheduler supports that.
Hmm, I think Bernd introduced sched_flag !DONT_BREAK_DEPENDENCIES to
resolve dependence by modifying address expression.  I think this is
the same problem, what's needed is to model dependence using that
framework.  Maybe delay slot is special here?

>
> It looks like it is the ivopts code that decided to increment the registers
> first and use the -1 offsets in the loads after instead of using 0 offsets
> and then incrementing the offsets after the loads but I can't figure out
> how or why ivopts made that decision.
>
> Does anyone have any ideas on how I could 'fix' GCC to make it generate
> the ideal code?  Is there some way to do it in the instruction scheduler?
> Is there some way to modify ivopts to fix this by modifying the cost
It's likely IVO just picks the first candidate when it runs into a
tie.  Could you please post preprocessed source code so that I can
have a look?  I am not familiar with glibc.  Thanks.

> analysis somehow?  Could I (partially) undo the fix for PR 48814?
> According to the final comment in that bugzilla report the change is
> really only needed for C11 and that the change does degrade the optimizer
> so could we go back to the old behaviour for C89/C99?  The code in ivopts
I saw this change caused code size regression on arm embedded processors.

Thanks,
bin

> has changed enough since the patch was applied I couldn't immediately see
> how to do that in the ToT sources.
>
> Steve Ellcey
> sell...@imgtec.com


Re: Question about PR 48814 and ivopts and post-increment

2015-12-03 Thread Bin.Cheng
On Fri, Dec 4, 2015 at 10:48 AM, Bin.Cheng  wrote:
> On Wed, Dec 2, 2015 at 5:11 AM, Steve Ellcey  wrote:
>>
>> I have a question involving ivopts and PR 48814, which was a fix for
>> the post increment operation.  Prior to the fix for PR 48814, MIPS
>> would generate this loop for strcmp (C code from glibc):
>>
>> $L4:
>> lbu $3,0($4)
>> lbu $2,0($5)
>> addiu   $4,$4,1
>> beq $3,$0,$L7
>> addiu   $5,$5,1  # This is a branch delay slot
>> beq $3,$2,$L4
>> subu    $2,$3,$2  # This is a branch delay slot (only used after loop)
>>
>>
>> With the current top-of-tree we now generate:
>>
>> addiu   $4,$4,1
>> $L8:
>> lbu $3,-1($4)
>> addiu   $5,$5,1
>> beq $3,$0,$L7
>> lbu $2,-1($5)  # This is a branch delay slot
>> beq $3,$2,$L8
>> addiu   $4,$4,1  # This is a branch delay slot
>>
>> subu    $2,$3,$2  # Done only once now after exiting loop.
>>
>> The main problem with the new loop is that the beq comparing $2 and $3
>> is right before the load of $2 so there can be a delay due to the time
>> that the load takes.  The ideal code would probably be:
>>
>> addiu   $4,$4,1
>> $L8:
>> lbu $3,-1($4)
>> lbu $2,0($5)  # This is a branch delay slot
>> beq $3,$0,$L7
>> addiu   $5,$5,1
>> beq $3,$2,$L8
>> addiu   $4,$4,1  # This is a branch delay slot
>>
>> subu    $2,$3,$2  # Done only once now after exiting loop.
>>
>> Where we load $2 earlier (using a 0 offset instead of a -1 offset) and
>> then do the increment of $5 after using it in the load.  The problem
>> is that this isn't something that can just be done in the instruction
>> scheduler because we are changing one of the instructions (to modify the
>> offset) in addition to rearranging them and I don't think the instruction
>> scheduler supports that.
> Hmm, I think Bernd introduced sched_flag !DONT_BREAK_DEPENDENCIES to
> resolve dependence by modifying address expression.  I think this is
> the same problem, what's needed is to model dependence using that
> framework.  Maybe delay slot is special here?
>
>>
>> It looks like it is the ivopts code that decided to increment the registers
>> first and use the -1 offsets in the loads after instead of using 0 offsets
>> and then incrementing the offsets after the loads but I can't figure out
>> how or why ivopts made that decision.
>>
>> Does anyone have any ideas on how I could 'fix' GCC to make it generate
>> the ideal code?  Is there some way to do it in the instruction scheduler?
>> Is there some way to modify ivopts to fix this by modifying the cost
> It's likely IVO just picks the first candidate when it runs into a
> tie.  Could you please post preprocessed source code so that I can
> have a look?  I am not familiar with glibc.  Thanks.
Oh, I saw the example in another thread of yours.

Thanks,
bin