Re: Undefined behavior due to 6.5.16.1p3

2015-03-10 Thread Richard Biener
On Mon, Mar 9, 2015 at 8:26 PM, Robbert Krebbers wrote:
> I was wondering whether GCC uses 6.5.16.1p3 of the C11 standard as a license
> to perform certain optimizations. If so, could anyone provide me an example
> program.
>
> In particular, I am interested in the "then the overlap shall be exact"
> part of 6.5.16.1p3:
>
>   If the value being stored in an object is read from another
>   object that overlaps in any way the storage of the first
>   object, then the overlap shall be exact and the two objects
>   shall have qualified or unqualified versions of a compatible
>   type; otherwise, the behavior is undefined.

struct X { int i; int j; };

int foo (struct X *p, struct X *q)
{
  q->j = 1;
  p->i = 0;
  return q->j;
}

will optimize to return 1.  If *p and *q were allowed to overlap
(&p->i == &q->j)
this would invoke undefined behavior.
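
As a minimal sketch of the calls that remain well defined (the helpers
call_disjoint and call_same are made up for illustration and are not part
of the original example): two struct X objects are either disjoint or the
same object, and in both cases p->i and q->j are different ints, so
returning 1 is correct.

```c
struct X { int i; int j; };

int foo(struct X *p, struct X *q)
{
    q->j = 1;
    p->i = 0;
    return q->j;
}

/* Hypothetical helper: p and q point to two separate objects. */
int call_disjoint(void)
{
    struct X a = { 5, 6 }, b = { 7, 8 };
    return foo(&a, &b);
}

/* Hypothetical helper: p and q point to the same object, so the overlap
   is exact and p->i is still a different member than q->j. */
int call_same(void)
{
    struct X a = { 5, 6 };
    return foo(&a, &a);
}
```

Both helpers return 1 whether or not the load of q->j is folded away.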

Richard.


Re: Undefined behavior due to 6.5.16.1p3

2015-03-10 Thread Robbert Krebbers

Dear Richard,

On 03/10/2015 09:51 AM, Richard Biener wrote:
> struct X { int i; int j; };
>
> int foo (struct X *p, struct X *q)
> {
>   q->j = 1;
>   p->i = 0;
>   return q->j;
> }
>
> will optimize to return 1.  If *p and *q were allowed to overlap
> (&p->i == &q->j)
> this would invoke undefined behavior.

Thanks for the example!

I guess you are considering the case where q->j and p->i overlap. For example:

  int main() {
    assert(sizeof(struct X) == 2 * sizeof(int));
    unsigned char *p = malloc(3 * sizeof(int));
    return foo ((struct X*)(p + sizeof(int)), (struct X*)p);
  }

In a naive memory model, one would indeed expect this program to return 
0 rather than the 1 that GCC's optimized code returns.


However, this program already invokes undefined behavior due to C11's 
notion of effective types, namely 6.5p6 and 6.5p7.


So, let me rephrase my question. Is anyone aware of situations in which 
GCC uses 6.5.16.1p3 as a license to perform certain optimizations where 
effective types alone do not suffice?


Robbert


Re: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-10 Thread Richard Biener
On Mon, Mar 9, 2015 at 11:59 PM, Steven Bosscher  wrote:
> On Mon, Mar 9, 2015 at 7:59 PM, vax mzn wrote:
>> w.r.t, https://gcc.gnu.org/wiki/Speedup_areas where we want to improve the 
>> performance of splay trees.
>>
>> The function `splay_tree_node splay_tree_lookup (splay_tree, 
>> splay_tree_key);'
>> updates the nodes every time a lookup is done.
>>
>> IIUC, There are places where we call this function in a loop i.e., we lookup 
>> different elements every time.
>> e.g.,
>> In this example we are looking for a different `t' in each iteration.
>
>
> If that's really true, then a splay tree is a horrible choice of data
> structure. The splay tree will simply degenerate to a linked list. The
> right thing to do would be, not to "break" one of the key features of
> splay trees (i.e. the latest lookup is always on top), but to use
> another data structure.

I agree with Steven here and wanted to say the same.  If you don't
benefit from the splay tree's LRU scheme then use a different data structure.

Richard.

> Ciao!
> Steven


Re: Undefined behavior due to 6.5.16.1p3

2015-03-10 Thread Richard Biener
On Tue, Mar 10, 2015 at 10:11 AM, Robbert Krebbers wrote:
> Dear Richard,
>
> On 03/10/2015 09:51 AM, Richard Biener wrote:
>>
>> struct X { int i; int j; };
>>
>> int foo (struct X *p, struct X *q)
>> {
>>   q->j = 1;
>>   p->i = 0;
>>   return q->j;
>> }
>>
>> will optimize to return 1.  If *p and *q were allowed to overlap
>> (&p->i == &q->j)
>> this would invoke undefined behavior.
>
> Thanks for the example!
>
> I guess you are considering the case where q.j and p.i overlap. For example:
>
>   int main() {
>     assert(sizeof(struct X) == 2 * sizeof(int));
>     unsigned char *p = malloc(3 * sizeof(int));
>     return foo ((struct X*)(p + sizeof(int)), (struct X*)p);
>   }
>
> In a naive memory model, one would indeed expect this program to return 0
> instead of 1 (which GCC does).
>
> However, this program already invokes undefined behavior due to C11's notion
> of effective types, namely 6.5p6 and 6.5p7.
>
> So, let me rephrase my question. Is anyone aware of situations in which GCC
> uses 6.5.16.1p3 as a license to perform certain optimizations where
> effective types alone do not suffice.

No, both are quite tied together (well, you might run into interpretation
issues with respect to "effective type" and the overlap constraint might
save you).

Richard.

> Robbert


try_merge_delay_insn with delay list > 1

2015-03-10 Thread BELBACHIR Selim
Hi,

I'm still working on a private backend on gcc 4.9.2. My processor provides 
instructions with 2 delay slots. I'm well aware that this feature is very 
uncommon and not fully tested. Nevertheless I submit the problem and the 
solution I've found.

The bug is located in the function try_merge_delay_insns(INSN, THREAD) in 
reorg.c. In there, gcc " tries to merge insns starting at THREAD which match 
exactly the insns in INSN's delay list ".


Suppose INSN is a 'delayed jump' filled with 2 delayed insns :

jmp_if_EQ_delayed .L1  # jump to L1 if condition code indicates equality
   mov r1, r2
   mov r3, mem(r1++)  # move r3 to the memory pointed to by r1,
                      # then post-increment r1


and TARGET is a 'delayed jump if zero' filled with 1 delayed insn :

jmpz_delayed  --r7, L2   # decrement r7 and jump to L2 if r7 == 0
   mov r1, r2


The current implementation of try_merge_delay_insns(INSN, THREAD) will delete 
'mov r1,r2' from the delay slot of TARGET because it matches the 'mov r1,r2' 
inside INSN delay list. No check verifies that r1 has changed between the 2 
'mov r1,r2' insns.
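
The missing check can be sketched with a toy model. This is not GCC's
reorg.c code or its data structures: struct toy_insn, REG and
still_redundant are hypothetical names, registers are modelled as bits in
a mask, and the "mov src, dst" operand order is an assumption. The idea is
only that a duplicate insn may be dropped if nothing executed in between
writes a register the duplicate reads or writes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction: bit n of each mask stands for register r<n>. */
struct toy_insn {
    unsigned reads;   /* registers the instruction reads  */
    unsigned writes;  /* registers the instruction writes */
};

#define REG(n) (1u << (n))

/* Returns true if re-executing `dup` after the instructions
   between[0..n) would recompute the same result, i.e. none of them
   writes a register that `dup` reads or writes. */
bool still_redundant(struct toy_insn dup,
                     const struct toy_insn *between, size_t n)
{
    unsigned touched = dup.reads | dup.writes;

    for (size_t i = 0; i < n; i++)
        if (between[i].writes & touched)
            return false;
    return true;
}
```

In this model 'mov r1, r2' is { REG(1), REG(2) } and the post-incrementing
store is { REG(3) | REG(1), REG(1) }, so still_redundant reports that the
second 'mov r1,r2' must be kept.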

I attached a patch that tries to solve this problem.

   Regards,

Selim




try_merge_patch
Description: try_merge_patch


RE: try_merge_delay_insn with delay list > 1

2015-03-10 Thread BELBACHIR Selim
Me again :)

I enhanced my patch because it was not generalized for instructions with N 
delay_slots.

Selim


try_merge_patch2
Description: try_merge_patch2


Newlib/Cygwin now under GIT

2015-03-10 Thread Corinna Vinschen
Hi fellow developers,


I'm happy to inform you that the move of Newlib/Cygwin from the src CVS
repository to the new, combined GIT repository is now final.

Here's how to access the new GIT repository:

Read-only:

  git clone git://sourceware.org/git/newlib-cygwin.git

Read/Write:  

  git clone sourceware.org:/git/newlib-cygwin.git

Web view:

  https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git

Commit messages go to the newlib-cvs and/or cygwin-cvs mailing lists,
just as before.  Commit messages also create a message to the freenode
IRC channel #cygwin-developers.

If you find problems, don't hesitate to report them, preferably on the
mailing list

  newlib AT sourceware DOT org
  
I'm not a git wizard, rather a wizzard (pardon the discworld reference)
so help in case of problems is highly appreciated.

Newlib list:  Jeff is on the road for the next couple of days so a
discussion of this point has to wait a while, but for the time being,
patches should continue to be accompanied by a ChangeLog entry.


Have fun,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat




Re: Undefined behavior due to 6.5.16.1p3

2015-03-10 Thread Martin Sebor

On 03/09/2015 01:26 PM, Robbert Krebbers wrote:

I was wondering whether GCC uses 6.5.16.1p3 of the C11 standard as a
license to perform certain optimizations. If so, could anyone provide me
an example program.

In particular, I am interested about the "then the overlap shall be
exact" part of 6.5.16.1p3:

   If the value being stored in an object is read from another
   object that overlaps in any way the storage of the first
   object, then the overlap shall be exact and the two objects
   shall have qualified or unqualified versions of a compatible
   type; otherwise, the behavior is undefined.


I suspect every compiler relies on this requirement in certain
cases otherwise copying would require making use of temporary
storage. Here's an example:

  struct A {
int a [32];
  };

  union {
struct A a;
struct {
  char b1;
  struct A b2;
} b;
  } u;

  void foo (void) {
u.b.b2 = u.a;
  }

Martin


Re: Newlib/Cygwin now under GIT

2015-03-10 Thread Joel Sherrill
Thank you for doing this! It cloned for me on the first try.

Is there any particular reason the repo is called newlib-cygwin.git
and not the more general newlib.git?  Cygwin isn't the
only user of newlib.

--joel

On 3/10/2015 10:38 AM, Corinna Vinschen wrote:
> Hi fellow developers,
>
>
> I'm happy to inform you that the move of Newlib/Cygwin from the src CVS
> repository to the new, combined GIT repository is now final.
>
> Here's how to access the new GIT repository:
>
> Read-only:
>
>   git clone git://sourceware.org/git/newlib-cygwin.git
>
> Read/Write:  
>
>   git clone sourceware.org:/git/newlib-cygwin.git
>
> Web view:
>
>   https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git
>
> Commit messages go to the newlib-cvs and/or cygwin-cvs mailing lists,
> just as before.  Commit messages also create a message to the freenode
> IRC channel #cygwin-developers.
>
> If you find problems, don't hesitate to report them, preferably on the
> mailing list
>
>   newlib AT sourceware DOT org
>   
> I'm not a git wizard, rather a wizzard (pardon the discworld reference)
> so help in case of problems is highly appreciated.
>
> Newlib list:  Jeff is on the road for the next couple of days so a
> discussion of this point has to wait a while, but for the time being,
> patches should continue to be accompanied by a ChangeLog entry.
>
>
> Have fun,
> Corinna
>

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985



Re: Undefined behavior due to 6.5.16.1p3

2015-03-10 Thread Robbert Krebbers

On 03/10/2015 05:18 PM, Martin Sebor wrote:
> I suspect every compiler relies on this requirement in certain
> cases otherwise copying would require making use of temporary
> storage. Here's an example:

Thanks, this example is indeed not already undefined by effective types, 
nor 6.2.6.1p6.


An entirely worked out example is:

#include <stdio.h>

struct A { int a [32]; };

union {
  struct A a;
  struct { char b1; struct A b2; } b;
} u;

void init(struct A *p) {
  for (int i = 0; i < 32; i++) { p->a[i] = i; }
}

int test(struct A *p) {
  int b = 0;
  for (int i = 0; i < 32; i++) {
    printf("%d=%d\n", i, p->a[i]);
    if (p->a[i] != i) b = 1;
  }
  return b;
}

int main() {
  init(&u.a);
  u.b.b2 = u.a;
  return test(&u.b.b2);
}

The return value is 1 instead of 0 when compiled with GCC and clang with 
optimizations enabled.
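
For comparison, a minimal sketch of a copy between the same two overlapping
members that does not go through a struct assignment. The union tag U and
the function names copy_overlapping and check are assumptions added for
illustration. memmove is specified to behave as if the bytes went through
a temporary array, so partial overlap is allowed.

```c
#include <string.h>

struct A { int a[32]; };

union U {
    struct A a;
    struct { char b1; struct A b2; } b;
};

/* Copy u->a into the partially overlapping u->b.b2.  Unlike
   "u->b.b2 = u->a", memmove permits source and destination to overlap. */
void copy_overlapping(union U *u)
{
    memmove(&u->b.b2, &u->a, sizeof u->a);
}

/* Returns 0 if every element survived the copy, 1 otherwise. */
int check(void)
{
    static union U u;

    for (int i = 0; i < 32; i++)
        u.a.a[i] = i;
    copy_overlapping(&u);
    for (int i = 0; i < 32; i++)
        if (u.b.b2.a[i] != i)
            return 1;
    return 0;
}
```

With this variant check() is expected to return 0 regardless of
optimization level.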


Re: Undefined behavior due to 6.5.16.1p3

2015-03-10 Thread Robbert Krebbers

On 03/10/2015 05:44 PM, Robbert Krebbers wrote:
> On 03/10/2015 05:18 PM, Martin Sebor wrote:
>> I suspect every compiler relies on this requirement in certain
>> cases otherwise copying would require making use of temporary
>> storage. Here's an example:
>
> Thanks, this example is indeed not already undefined by effective types,
> nor 6.2.6.1p6.

Now to make it more subtle. As far as I understand 6.5.16.1p3, undefined 
behavior can already occur without the use of union types or malloc. For 
example:


  struct S { int x, y; };
  int main() {
    struct S p = (struct S){ .x = 10, .y = 12 };
    p = (struct S){ .x = p.x, .y = 13 };
    return p.x;
  }

is undefined AFAIK.

Is anyone aware of an example program that does not use unions or 
malloc, but where GCC performs optimizations justified by only 
6.5.16.1p3 of C11?


Re: [gomp4] Questions about "declare target" and "target update" pragmas

2015-03-10 Thread Ilya Verbin
Hi Jakub,

I have one more question :)
This testcase seems to be correct... or not?

#pragma omp declare target
extern int G;
#pragma omp end declare target

int G;

int main ()
{
  #pragma omp target update to(G)

  return 0;
}

If yes, then we have a problem: the decl of G in varpool_node::get_create
doesn't have the "omp declare target" attribute.

Thanks,
  -- Ilya


Re: Newlib/Cygwin now under GIT

2015-03-10 Thread Joseph Myers
On Tue, 10 Mar 2015, Corinna Vinschen wrote:

> Hi fellow developers,
> 
> 
> I'm happy to inform you that the move of Newlib/Cygwin from the src CVS
> repository to the new, combined GIT repository is now final.

I note that this repository includes the include/ directory, in its larger 
binutils-gdb form rather than the smaller GCC form.

How much of this is actually relevant for newlib?  Mostly it relates to 
libiberty and object file formats, for use of code that's not included in 
this repository (which does not include libiberty).  If little or none of 
this code is actually used in newlib, it might make sense to remove the 
unused files so it's clear they do not need merging from the other 
repositories.

(Apart from include/, various shared toplevel files and directories are 
out of sync between the three repositories - GCC, binutils-gdb, 
newlib-cygwin - and could do with someone identifying unmerged changes and 
applying them to the repositories missing them.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Newlib/Cygwin now under GIT

2015-03-10 Thread Corinna Vinschen
On Mar 10 11:20, Joel Sherrill wrote:
> Thank you for doing this! It cloned for me on the first try.
> 
> Is there any particular reason the repo is called newlib-cygwin.git
> and not the more general newlib.git?  Cygwin isn't the
> only user of newlib.

No, but Cygwin is part of the repo.  It's a *combined* repo.  Along the
same lines the Cygwin folks might ask why it's not called cygwin.git or
cygwin-newlib.git.


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat




Re: Newlib/Cygwin now under GIT

2015-03-10 Thread DJ Delorie

> This is a common problem.  I guess newlib/cygwin got the oldest set
> and, afaik, the GCC toplevel stuff is kind of the master.  It would
> be nice if we had some automatism in place to keep all former src
> repos in sync.

There was never any agreement on who the "master" was for toplevel
sources - no repo was willing to give up control to the other, so no
automatic mirroring was ever done, unlike the libiberty/include
mirror, where src agreed to let gcc be the master.

Also, for the record, I do not wish to, nor do I intend to, provide
any automated merging services for git repos.  I don't like git and
I'd rather not use it if I don't have to.


Potential builtin memcpy bug in 4.9

2015-03-10 Thread Zan Lynx

I am trying to track down a bug that I only see on Fedora 21 with the
GCC 4.9.2 compiler building x86_64 code. It might have started happening
earlier. GCC 4.8 built without this problem.

I am building the c-ares library as part of a larger project and getting
malloc failures. Valgrind claims that code is writing outside its
allocated blocks. I traced it to the memcpy call

memcpy(query->tcpbuf + 2, qbuf, qlen);

In that call qlen == 35. I checked the malloc and it allocates 37 bytes
for tcpbuf. And it has worked on older compilers for a long time.
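
The sizes above are consistent with each other: a 2-byte prefix plus 35
bytes of query is exactly 37 bytes. A minimal sketch of that layout,
assuming a 2-byte big-endian length prefix; make_tcpbuf is a hypothetical
helper, not the actual c-ares code:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds a buffer holding a 2-byte big-endian length
   prefix followed by qlen bytes of query data.  Returns NULL on failure;
   the caller frees the result. */
unsigned char *make_tcpbuf(const unsigned char *qbuf, int qlen)
{
    if (qlen < 0 || qlen > 0xffff)
        return NULL;

    unsigned char *tcpbuf = malloc((size_t)qlen + 2);
    if (tcpbuf == NULL)
        return NULL;

    tcpbuf[0] = (unsigned char)((qlen >> 8) & 0xff);  /* high byte */
    tcpbuf[1] = (unsigned char)(qlen & 0xff);         /* low byte  */
    memcpy(tcpbuf + 2, qbuf, (size_t)qlen);   /* writes bytes 2 .. qlen+1 */
    return tcpbuf;
}
```

A correct memcpy here touches only offsets 2 through qlen+1, so any write
beyond offset 36 for qlen == 35 would be out of bounds.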

As best I can tell the builtin memcpy that is being used here (and it is
definitely the builtin because turning off builtins builds working code)
is writing way past the end of the buffer.

But for whatever reason I can't seem to build a stand-alone example.

Looking for some ideas. Maybe someone could audit the ASM code for the
memcpy builtin, see if anything jumps out at you. I haven't tried that
yet. Is it all one piece, or is it multiple chunks? Could it have bad
ASM specifications which are allowing the optimizer to write into a
register that should be preserved?

Here's the asm for the function along with some commentary:

001b1a57 :
  1b1a57:  41 57                 push   %r15
  1b1a59:  41 56                 push   %r14
  1b1a5b:  41 55                 push   %r13
  1b1a5d:  41 54                 push   %r12
  1b1a5f:  55                    push   %rbp
  1b1a60:  53                    push   %rbx
  1b1a61:  48 83 ec 28           sub    $0x28,%rsp
  1b1a65:  89 d5                 mov    %edx,%ebp
  1b1a67:  49 89 ce              mov    %rcx,%r14
  1b1a6a:  4d 89 c5              mov    %r8,%r13
  1b1a6d:  8d 42 f4              lea    -0xc(%rdx),%eax
  1b1a70:  3d f3 ff 00 00        cmp    $0xfff3,%eax
  1b1a75:  76 21                 jbe    1b1a98
  1b1a77:  45 31 c0              xor    %r8d,%r8d
  1b1a7a:  31 c9                 xor    %ecx,%ecx
  1b1a7c:  31 d2                 xor    %edx,%edx
  1b1a7e:  be 07 00 00 00        mov    $0x7,%esi
  1b1a83:  4c 89 ef              mov    %r13,%rdi
  1b1a86:  41 ff d6              callq  *%r14
  1b1a89:  48 83 c4 28           add    $0x28,%rsp
  1b1a8d:  5b                    pop    %rbx
  1b1a8e:  5d                    pop    %rbp
  1b1a8f:  41 5c                 pop    %r12
  1b1a91:  41 5d                 pop    %r13
  1b1a93:  41 5e                 pop    %r14
  1b1a95:  41 5f                 pop    %r15
  1b1a97:  c3                    retq
  1b1a98:  49 89 fc              mov    %rdi,%r12
  1b1a9b:  49 89 f7              mov    %rsi,%r15
  1b1a9e:  bf c8 00 00 00        mov    $0xc8,%edi
  1b1aa3:  e8 78 f4 ea ff        callq  60f20
  1b1aa8:  48 89 c3              mov    %rax,%rbx
  1b1aab:  48 85 c0              test   %rax,%rax
  1b1aae:  0f 84 a0 02 00 00     je     1b1d54
  1b1ab4:  8d 45 02              lea    0x2(%rbp),%eax
  1b1ab7:  89 44 24 0c           mov    %eax,0xc(%rsp)
  1b1abb:  48 63 f8              movslq %eax,%rdi
  1b1abe:  e8 5d f4 ea ff        callq  60f20
  1b1ac3:  48 89 43 78           mov    %rax,0x78(%rbx)
  1b1ac7:  48 85 c0              test   %rax,%rax
  1b1aca:  0f 84 7c 02 00 00     je     1b1d4c
  1b1ad0:  48 89 04 24           mov    %rax,(%rsp)
  1b1ad4:  49 63 bc 24 98 00 00  movslq 0x98(%r12),%rdi
  1b1adb:  00
  1b1adc:  89 7c 24 08           mov    %edi,0x8(%rsp)
  1b1ae0:  48 c1 e7 03           shl    $0x3,%rdi
  1b1ae4:  e8 37 f4 ea ff        callq  60f20
  1b1ae9:  48 89 c7              mov    %rax,%rdi
  1b1aec:  48 89 83 b0 00 00 00  mov    %rax,0xb0(%rbx)
  1b1af3:  48 85 c0              test   %rax,%rax
  1b1af6:  8b 4c 24 08           mov    0x8(%rsp),%ecx
  1b1afa:  48 8b 14 24           mov    (%rsp),%rdx
  1b1afe:  0f 84 40 02 00 00     je     1b1d44
  1b1b04:  41 0f b6 07           movzbl (%r15),%eax
  1b1b08:  c1 e0 08              shl    $0x8,%eax
  1b1b0b:  45 0f b6 47 01        movzbl 0x1(%r15),%r8d
  1b1b10:  44 09 c0              or     %r8d,%eax
  1b1b13:  66 89 03              mov    %ax,(%rbx)
  1b1b16:  48 c7 43 08 00 00 00  movq   $0x0,0x8(%rbx)
  1b1b1d:  00
  1b1b1e:  48 c7 43 10 00 00 00  movq   $0x0,0x10(%rbx)
  1b1b25:  00
  1b1b26:  89 e8                 mov    %ebp,%eax
  1b1b28:  c1 f8 08              sar    $0x8,%eax
  1b1b2b:  88 02                 mov    %al,(%rdx)
  1b1b2d:  40 88 6a 01           mov    %bpl,0x1(%rdx)



*** HERE IS WHERE IT STARTS: Getting query->tcpbuf + 2 ***
*** ebp has the length value 35 ***
*** memcpy(query->tcpbuf + 2, qbuf, qlen); ***

  1b1b31:  4c 8d 4a 02           lea    0x2(%rdx),%r9
  1b1b35:  89 e8                 mov    %ebp,%eax
  1b1b37:  4d 89 c8

Re: Undefined behavior due to 6.5.16.1p3

2015-03-10 Thread Joseph Myers
On Tue, 10 Mar 2015, Robbert Krebbers wrote:

> On 03/10/2015 05:44 PM, Robbert Krebbers wrote:
> > On 03/10/2015 05:18 PM, Martin Sebor wrote:
> > > I suspect every compiler relies on this requirement in certain
> > > cases otherwise copying would require making use of temporary
> > > storage. Here's an example:
> > Thanks, this example is indeed not already undefined by effective types,
> > nor 6.2.6.1p6.
> Now to make it more subtle. As far as I understand 6.5.16.1p3, undefined
> behavior can already occur without the use of union types or malloc. For
> example:
> 
>   struct S { int x, y; };
>   int main() {
> struct S p = (struct S){ .x = 10, .y = 12 };
> p = (struct S){ .x = p.x, .y = 13 };
> return p.x;
>   }
> 
> is undefined AFAIK.

I don't see why that would be undefined.  The value stored in p isn't read 
from something overlapping with p, it's read from the compound literal 
object that in turn was initialized with { .x = p.x, .y = 13 }.
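
A minimal sketch of that reading, with the example wrapped in a
hypothetical function replace_y so the result can be inspected: the
compound literal is a separate unnamed object that is fully initialized
(including the read of p.x) before its value is stored into p.

```c
struct S { int x, y; };

/* Hypothetical wrapper around the example; encodes both members in the
   return value as x * 100 + y. */
int replace_y(void)
{
    struct S p = (struct S){ .x = 10, .y = 12 };

    /* The right-hand side is its own object; p.x is read while
       initializing it, before the assignment to p takes place. */
    p = (struct S){ .x = p.x, .y = 13 };
    return p.x * 100 + p.y;
}
```

Under this interpretation replace_y() returns 1013, i.e. x is still 10 and
y is 13.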

Similarly, in the union case discussed earlier in this thread, if you had

struct A f (struct A arg) { return arg; }

and did "u.b.b2 = f (u.a);" instead of "u.b.b2 = u.a;", that would not be 
undefined (see 6.8.6.4 and GCC PR 43784).

-- 
Joseph S. Myers
jos...@codesourcery.com


Questions about dynamic stack realignment

2015-03-10 Thread Steve Ellcey
This email is a follow-up to some earlier email I sent about
alignment of spills and fills but did not get any replies to.

https://gcc.gnu.org/ml/gcc/2015-03/msg00028.html

After looking into that I have decided to look more into dynamically
realigning the stack so that my spills and fills would be aligned and I have
done some experiments with stack realignment and I am trying to understand
what hooks already exist and how to use them.

Currently mips just has:

#define STACK_BOUNDARY (TARGET_NEWABI ? 128 : 64)

I added:

#define MAX_STACK_ALIGNMENT 128
#define PREFERRED_STACK_BOUNDARY (TARGET_MSA ? 128 : STACK_BOUNDARY)
#define INCOMING_STACK_BOUNDARY STACK_BOUNDARY

To try and get GCC to realign the stack to 128 bits if we are compiling
with the -mmsa option.  After doing this I found I needed to create a
TARGET_GET_DRAP_RTX that would return a register rtx when a drap was
needed so I did that and I got things to compile but I don't see any
code that actually realigned the stack.  It is not clear to me from the
documentation if there is shared code somewhere that should be trying to
realign the stack by changing the stack pointer given these definitions
or if I also need to add my own code to expand_prologue to do the stack
realignment myself.
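
For reference, the arithmetic a realigning prologue has to perform can be
sketched as below. This is plain C, not GCC code, and it says nothing
about which hook or pass would emit it; align_sp_down is a hypothetical
name, and a downward-growing stack and a power-of-two boundary are
assumed. The boundary is given in bits, the unit used by STACK_BOUNDARY
and PREFERRED_STACK_BOUNDARY.

```c
#include <stdint.h>

/* Round a stack pointer value down to a boundary given in bits.
   Assumes boundary_bits is a power of two and at least 8. */
uintptr_t align_sp_down(uintptr_t sp, unsigned boundary_bits)
{
    uintptr_t bytes = boundary_bits / 8;

    return sp & ~(bytes - 1);
}
```

An incoming stack pointer of 0x1008 that is only 64-bit aligned becomes
0x1000 when realigned to 128 bits, which is why the original value has to
be kept somewhere to reach the incoming arguments.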

I am also not sure if I understand the drap (Dynamic Realign Argument Pointer)
register functionality correctly.  My guess/understanding was that the drap
was used to access arguments in cases where the regular stack pointer may have
been changed in order to be aligned.  Is that correct?

Any help/advice on how the hooks for dynamically realigned stack are supposed
to all work together would be appreciated.

Steve Ellcey