fp emulation libraries in assembly for armv6-m architecture

2014-01-30 Thread Mallikarjun Goudar
Hi,
I notice that libgcc's fp emulation libraries for the armv6-m
architecture are written in C and are quite big in size. I also see that
other architectures, like armv7-m, have these libraries written in
assembly for a smaller library size.
I wanted to know: is anybody working on supporting the fp library in
assembly language for the armv6-m arch? If not, please let me
know if I can contribute this feature.

Thanks,
Mallikarjun


Spelling Error in gcc/README.Portability

2014-01-30 Thread Alangi Derick
GCC Version 4.9.0
email: alangider...@gmail.com

Index: gcc/README.Portability
===
--- gcc/README.Portability  (revision 206579)
+++ gcc/README.Portability  (working copy)
@@ -6,7 +6,7 @@

 The problem is that many ISO-standard constructs are not accepted by
 either old or buggy compilers, and we keep getting bitten by them.
-This knowledge until know has been sparsely spread around, so I
+This knowledge until now has been sparsely spread around, so I
 thought I'd collect it in one useful place.  Please add and correct
 any problems as you come across them.


proposal: remove thread_local from supported C++11 features

2014-01-30 Thread Conrad S
The page covering C++0x/C++11 support in GCC, i.e.
http://gcc.gnu.org/projects/cxx0x.html
states that the "thread_local" keyword is supported since GCC 4.8.

However, thread_local is currently (gcc 4.8.2) too broken to be of real use:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59364
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58672
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55800
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57163

Unless there are plans to fix it in gcc 4.8.3, I propose that
thread_local be removed from the list of supported C++11 features, or
at the very least changed to "partial". Having the feature as
"supported" is plainly false advertising.


Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-30 Thread Paulo Matos
Hello,

I am tracking a performance and size regression from 4.5.4 present in trunk.
Consider the following function:
==
extern short delayLength;
typedef int Sample;
extern Sample *temp_ptr;
extern Sample x;

void
foo (short blockSize)
{
  short i;
  unsigned short loopCount;

  loopCount = (unsigned short) (blockSize + delayLength) % 8;
  for (i = 0; i < loopCount; i++)
    *temp_ptr++ = x ^ *temp_ptr++;
}
==

For v850, before the commit
commit e0ae2fe2a0bebe9de31e3d8eb4feace4909ef009
Author: vries 
Date:   Fri May 20 19:32:30 2011 +

2011-05-20  Tom de Vries  

PR target/45098
* tree-ssa-loop-ivopts.c: Include expmed.h.
(get_shiftadd_cost): New function.
(force_expr_to_var_cost): Declare forward.  Use get_shiftadd_cost.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@173976 138bc75d-0d04-0410-961f-82ee72b054a4

gcc generated for -O2:
_foo:
movhi hi(_delayLength),r0,r10
ld.h lo(_delayLength)[r10],r10
add r10,r6
andi 7,r6,r10
be .L1
movhi hi(_temp_ptr),r0,r16
ld.w lo(_temp_ptr)[r16],r15
mov r10,r17
shl 2,r17
mov r15,r14
movhi hi(_x),r0,r13
mov r15,r10
add r17,r14
movea lo(_x),r13,r13
.L3:
ld.w 0[r10],r11
ld.w 0[r13],r12
xor r12,r11
st.w r11,0[r10]
add 4,r10
cmp r14,r10
bne .L3
mov r15,r10
add r17,r10
st.w r10,lo(_temp_ptr)[r16]
.L1:
jmp [r31]

After the commit it generates:
_foo:
movhi hi(_delayLength),r0,r10
ld.h lo(_delayLength)[r10],r16
add r16,r6
andi 7,r6,r16
be .L1
movhi hi(_temp_ptr),r0,r17
ld.w lo(_temp_ptr)[r17],r18
movhi hi(_x),r0,r14
mov r18,r11
mov r16,r15
mov 0,r10
movea lo(_x),r14,r14
.L3:
ld.w 0[r11],r12
ld.w 0[r14],r13
add 1,r10
xor r13,r12
shl 16,r10
st.w r12,0[r11]
sar 16,r10
add 4,r11
cmp r15,r10
bne .L3
shl 2,r16
add r18,r16
st.w r16,lo(_temp_ptr)[r17]
.L1:
jmp [r31]

The problem is inside the loop: 
shl 16,r10
st.w r12,0[r11]
sar 16,r10
add 4,r11
cmp r15,r10


The shl followed by sar sign-extends r10. Previous gcc versions did not
do this, and it is unnecessary.
At the time of the commit v850 had neither e3v5 support nor zero-overhead
loops, but now it has both, and the sign extend blocks generation of
zero-overhead loops. (With trunk and -mv850e3v5, gcc generates an sxh
instruction instead of the shift pattern, but the point is the same.)
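
To make the redundancy concrete, here is a small C model of what the
shl 16 / sar 16 pair computes on a 32-bit register, and of why it cannot
change a counter that stays within 0..7. This is only a sketch, not GCC
code: sign_extend16 and count_iterations are names made up for
illustration, and the bound of 7 is taken from the "% 8" in foo.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 bits of v, i.e. the effect of "shl 16; sar 16"
   (or a single sxh/exth).  Written without shifting negative values so
   that the C itself is fully defined.  */
static int32_t
sign_extend16 (uint32_t v)
{
  int32_t low = (int32_t) (v & 0xFFFFu);   /* 0 .. 65535 */
  return (low ^ 0x8000) - 0x8000;          /* -32768 .. 32767 */
}

/* Hypothetical model of the loop counter in foo: count the iterations
   and check that extending the counter never changes its value.  */
static int
count_iterations (unsigned short loopCount)
{
  int32_t i = 0;
  int n = 0;

  assert (loopCount <= 7);   /* result of "% 8" on an unsigned short */
  while (i < loopCount)
    {
      n++;
      i = i + 1;
      assert (sign_extend16 ((uint32_t) i) == i);   /* a no-op here */
    }
  return n;
}
```

For example, sign_extend16 (0x8000) gives -32768, while
count_iterations (7) returns 7 without the extension ever changing the
counter.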

For mep the situation repeats. mep generates:
foo:
# frame: 8   8 regs
lh   $10, %sdaoff(delayLength)($gp)
add  $sp, -8
add3 $1, $1, $10
and3 $10, $1, 0x7
beqz $10, .L1
lw   $2, %sdaoff(temp_ptr)($gp)
mov  $1, 0
add3 $11, $gp, %sdaoff(x)
bra  .L5
.L3:
mov  $2, $9
.L5:
lw   $0, ($11)
lw   $3, 4($2)
add  $1, 1
exth $1
xor  $3, $0
slt3 $0, $1, $10
add3 $9, $2, 8
sw   $3, ($2)
bnez $0, .L3
sw   $9, %sdaoff(temp_ptr)($gp)
.L1:
add  $sp, 8
ret


Again exth sign-extends $1 and blocks generation of the zero-overhead loop
because the loop is suddenly no longer simple. Unfortunately I cannot test
mep before the patch, as mep was not in mainline at the time.

Does anyone understand why the mentioned patch is forcing the generation of the 
sign extend inside the loop? Is this just a problem with cost calculation in 
the backends or some issue lurking in tree-ssa-loop-ivopts?

Thanks,

Paulo Matos




Re: proposal: remove thread_local from supported C++11 features

2014-01-30 Thread Jonathan Wakely
On 30 January 2014 13:40, Conrad S wrote:
> The page covering C++0x/C++11 support in GCC, ie.
> http://gcc.gnu.org/projects/cxx0x.html
> states that the "thread_local" keyword is supported since GCC 4.8.
>
> However, thread_local is currently (gcc 4.8.2) too broken to be of real use:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59364
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58672
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55800
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57163
>
> Unless there are plans to fix it in gcc 4.8.3, I propose that
> thread_local be removed from the list of supported C++11 features, or
> at the very least changed to "partial". Having the feature as
> "supported" is plainly false advertising.

Only if you don't read the pages properly.

"Important: GCC's support for C++11 is still experimental. "

"GCC provides experimental support for the 2011 ISO C++ standard."

Anyway, removing it from the list would achieve nothing.


Re: Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-30 Thread Andreas Schwab
Paulo Matos  writes:

> void
> foo (short blockSize)
> {
>   short i;
>   unsigned short loopCount;
>
>   loopCount = (unsigned short) (blockSize + delayLength) % 8;
>   for (i = 0; i < loopCount; i++)
>   *temp_ptr++ = x ^ *temp_ptr++;
> }

You know that this is undefined code?

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: proposal: remove thread_local from supported C++11 features

2014-01-30 Thread Paolo Carlini
.. if you are willing to concretely help, please open a meta-bug with 
"[meta-bug] thread_local" in the summary and blocked by all the issues 
you mentioned.


Thanks,
Paolo.


RE: Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-30 Thread Paulo Matos
> -Original Message-
> From: Andreas Schwab [mailto:sch...@linux-m68k.org]
> Sent: 30 January 2014 14:29
> To: Paulo Matos
> Cc: gcc@gcc.gnu.org
> Subject: Re: Regression [v850,mep...]: sign_extend in loop breaks 
> zero-overhead
> loop generation
> 
> Paulo Matos  writes:
> 
> > void
> > foo (short blockSize)
> > {
> >   short i;
> >   unsigned short loopCount;
> >
> >   loopCount = (unsigned short) (blockSize + delayLength) % 8;
> >   for (i = 0; i < loopCount; i++)
> >   *temp_ptr++ = x ^ *temp_ptr++;
> > }
> 
> You know that this is undefined code?
>

Correct, my apologies. I didn't notice the undefined behaviour that the reducer 
introduced in the original code.
However, the issue persists.
If instead I write:
void
foo (short blockSize)
{
  short i;
  unsigned short loopCount;
  loopCount = (unsigned short) (blockSize + delayLength) % 8;
  for (i = 0; i < loopCount; i++)
    *temp_ptr++ = x ^ *temp_ptr;
}

the sign extend is still generated from the referenced commit onward and
is still present in trunk. This causes the zero overhead loop not to be
generated.

Thanks,

Paulo Matos
 
> Andreas.
> 
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."


Re: proposal: remove thread_local from supported C++11 features

2014-01-30 Thread Conrad S
Paolo Carlini wrote:
> .. if you are willing to concretely help, please open a meta-bug with
> "[meta-bug] thread_local" in the summary and blocked by all the issues you
> mentioned.

Done:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59994


Re: proposal: remove thread_local from supported C++11 features

2014-01-30 Thread Conrad S
Jonathan Wakely wrote:
> Only if you don't read the pages properly.
> "Important: GCC's support for C++11 is still experimental. "
> "GCC provides experimental support for the 2011 ISO C++ standard."
> Anyway, removing it from the list would achieve nothing.

Eh?  thread_local doesn't work. Stating that it works (as the C++11
support page states) is not helpful either.  Removing it will at the very
least communicate that this feature is not ready.


Re: proposal: remove thread_local from supported C++11 features

2014-01-30 Thread Richard Biener
On Thu, Jan 30, 2014 at 3:57 PM, Conrad S  wrote:
> Jonathan Wakely wrote:
>> Only if you don't read the pages properly.
>> "Important: GCC's support for C++11 is still experimental. "
>> "GCC provides experimental support for the 2011 ISO C++ standard."
>> Anyway, removing it from the list would achieve nothing.
>
> Eh?  thread_local doesn't work. Stating that it works (as the C++11
> support page states) is not helpful either.  Removing will at the very
> least communicate that this feature is not ready.

"doesn't work" doesn't seem to be accurate, we have at least two dozen
testcases exercising thread_local that work.

The cited bugs seem to boil down to a single issue which makes it not
work for you.

Richard.


Re: Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-30 Thread Andreas Schwab
Paulo Matos  writes:

> If instead I write:
> void
> foo (short blockSize)
> {
>   short i;
>   unsigned short loopCount;
>   loopCount = (unsigned short) (blockSize + delayLength) % 8;
>   for (i = 0; i < loopCount; i++)
>   *temp_ptr++ = x ^ *temp_ptr;
> }

This is still undefined.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


RE: Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-30 Thread Paulo Matos
> -Original Message-
> From: Andreas Schwab [mailto:sch...@linux-m68k.org]
> Sent: 30 January 2014 15:15
> To: Paulo Matos
> Cc: gcc@gcc.gnu.org
> Subject: Re: Regression [v850,mep...]: sign_extend in loop breaks 
> zero-overhead
> loop generation
> 
> Paulo Matos  writes:
> 
> > If instead I write:
> > void
> > foo (short blockSize)
> > {
> >   short i;
> >   unsigned short loopCount;
> >   loopCount = (unsigned short) (blockSize + delayLength) % 8;
> >   for (i = 0; i < loopCount; i++)
> >   *temp_ptr++ = x ^ *temp_ptr;
> > }
> 
> This is still undefined.
> 

OK, of course. Don't know what I am doing today.
It's undefined because 'i' might overflow... I will get back to this. Thanks 
for pointing this out.
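
The thread does not say which construct is meant by "undefined". One
candidate is that temp_ptr is modified by temp_ptr++ on the left of the
assignment and read on the right with nothing sequencing the two. Below
is a sketch of the loop with that removed, assuming the intent is to XOR
each element in place and then advance. The definitions and values of
delayLength, x and temp_ptr, and the samples array, are stand-ins for the
extern declarations and are not from the original test case. Whether this
form still shows the sign extend has not been checked here.

```c
#include <assert.h>

typedef int Sample;

/* Stand-ins for the extern declarations; the values are arbitrary.  */
short delayLength = 3;
Sample x = 0x0F;
static Sample samples[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
Sample *temp_ptr = samples;

void
foo (short blockSize)
{
  short i;
  unsigned short loopCount;

  loopCount = (unsigned short) (blockSize + delayLength) % 8;
  assert (loopCount < 8);          /* trip count is bounded by the "% 8" */
  for (i = 0; i < loopCount; i++)
    {
      *temp_ptr = x ^ *temp_ptr;   /* read and write the same element */
      temp_ptr++;                  /* advance in a separate statement */
    }
}
```

With these stand-in values, foo (2) runs five iterations and leaves
temp_ptr pointing at samples + 5.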

> Andreas.
> 
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."


Re: Spelling Error in gcc/README.Portability

2014-01-30 Thread Paolo Carlini

On 01/30/2014 12:41 PM, Alangi Derick wrote:

GCC Version 4.9.0
email: alangider...@gmail.com

Index: gcc/README.Portability
===
--- gcc/README.Portability  (revision 206579)
+++ gcc/README.Portability  (working copy)
@@ -6,7 +6,7 @@

  The problem is that many ISO-standard constructs are not accepted by
  either old or buggy compilers, and we keep getting bitten by them.
-This knowledge until know has been sparsely spread around, so I
+This knowledge until now has been sparsely spread around, so I
  thought I'd collect it in one useful place.  Please add and correct
  any problems as you come across them.

Applied, thanks.

Paolo.


Re: Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-30 Thread Jeff Law

On 01/30/14 08:19, Paulo Matos wrote:

-Original Message-
From: Andreas Schwab [mailto:sch...@linux-m68k.org]
Sent: 30 January 2014 15:15
To: Paulo Matos
Cc: gcc@gcc.gnu.org
Subject: Re: Regression [v850,mep...]: sign_extend in loop breaks zero-overhead
loop generation

Paulo Matos  writes:


If instead I write:
void
foo (short blockSize)
{
   short i;
   unsigned short loopCount;
   loopCount = (unsigned short) (blockSize + delayLength) % 8;
   for (i = 0; i < loopCount; i++)
   *temp_ptr++ = x ^ *temp_ptr;
}


This is still undefined.



OK, of course. Don't know what I am doing today.
It's undefined because 'i' might overflow... I will get back to this. Thanks 
for pointing this out.

When you've got it sorted out, go ahead and file a BZ and include the
regression markers so that it shows up in the searches most of us are
paying the most attention to right now.


jeff



Fwd: [gomp4] Questions about "declare target" and "target update" pragmas

2014-01-30 Thread Ilya Verbin
One more question.  Is it valid to use arr[MAX/2..MAX] on target?

#define MAX 20
void foo ()
{
  int arr[MAX];
  #pragma omp target map(from: arr[0:MAX/2])
{
  int i;
  for (i = 0; i < MAX; i++)
arr[i] = i;
}
}

In this case GOMP_target gets sizes[0]==40 as input.  Due to this,
gomp_map_vars allocates 40 bytes of memory on target for 'arr',
instead of 80 bytes.
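
The 40 and 80 figures follow from section length times element size,
assuming a 4-byte int. The sketch below spells out that arithmetic and
shows a variant that maps every element the region writes. It does not
settle whether touching arr[MAX/2..MAX] is valid. section_bytes, foo_full
and the out parameter are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX 20

/* Hypothetical helper: bytes covered by an array section of `length`
   elements of `elem_size` bytes each.  */
static size_t
section_bytes (size_t length, size_t elem_size)
{
  return length * elem_size;
}

/* Variant of foo that maps the whole array.  `out` is an illustrative
   parameter so that the caller can inspect the result.  */
void
foo_full (int *out)
{
  int arr[MAX];
  int i;

  assert (out != NULL);

  #pragma omp target map(from: arr[0:MAX])
  {
    int j;
    for (j = 0; j < MAX; j++)
      arr[j] = j;
  }

  for (i = 0; i < MAX; i++)
    out[i] = arr[i];
}
```

Here section_bytes (MAX / 2, 4) is 40 and section_bytes (MAX, 4) is 80,
matching the two sizes mentioned above.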

  -- Ilya


proposal to add -Wheader-guard option

2014-01-30 Thread Prathamesh Kulkarni
Hi, I was wondering if it's a good idea to add a -Wheader-guard option
that warns on mismatches between the #ifndef and #define lines
in a header guard, similar to -Wheader-guard in clang-3.4?
(http://llvm.org/releases/3.4/tools/clang/docs/ReleaseNotes.html)

I have implemented a patch for -Wheader-guard (please find it attached).
Consider a file having the following format:
#ifndef cmacro (or #if !defined(cmacro) )
#define dmacro
// rest of the stuff
#endif

The warning is triggered if the edit distance
(http://en.wikipedia.org/wiki/Levenshtein_distance) between cmacro
and dmacro is <= max (len(cmacro), len(dmacro)) / 2.
If the edit distance is more than half, I assume that cmacro
and dmacro are "very different", and the intent
was probably not to define a header guard (this is what clang does too).
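
The threshold test can be sketched in standalone C as follows. This is
not the code from the patch: edit_distance, looks_like_mistyped_guard and
MAX_MACRO_LEN are illustrative names, the division is assumed to be
integer division, and identical names are assumed never to warn.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_MACRO_LEN 255   /* illustrative cap on identifier length */

static size_t
min3 (size_t a, size_t b, size_t c)
{
  size_t m = a < b ? a : b;
  return m < c ? m : c;
}

/* Levenshtein distance using two rows of the dynamic-programming table.  */
static size_t
edit_distance (const char *s, const char *t)
{
  size_t ls = strlen (s), lt = strlen (t);
  size_t prev[MAX_MACRO_LEN + 1], cur[MAX_MACRO_LEN + 1];
  size_t i, j;

  assert (ls <= MAX_MACRO_LEN && lt <= MAX_MACRO_LEN);

  for (j = 0; j <= lt; j++)
    prev[j] = j;                 /* distance from the empty prefix of s */

  for (i = 1; i <= ls; i++)
    {
      cur[0] = i;
      for (j = 1; j <= lt; j++)
        {
          size_t subst = prev[j - 1] + (s[i - 1] != t[j - 1]);
          cur[j] = min3 (prev[j] + 1, cur[j - 1] + 1, subst);
        }
      memcpy (prev, cur, (lt + 1) * sizeof prev[0]);
    }
  return prev[lt];
}

/* True if the two names differ, but by no more than half the length of
   the longer one.  */
static bool
looks_like_mistyped_guard (const char *cmacro, const char *dmacro)
{
  size_t lc = strlen (cmacro), ld = strlen (dmacro);
  size_t longest = lc > ld ? lc : ld;
  size_t d = edit_distance (cmacro, dmacro);

  return d != 0 && d <= longest / 2;
}
```

For FOO_H and _FOO_H the distance is 1 and the limit is 6 / 2 = 3, so the
check fires. For FOO and BAR the distance is 3 and the limit is 3 / 2 = 1,
so it does not.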

Example:
#ifndef FOO_H
#define _FOO_H
#endif

foo.h:1:0: warning: FOO_H used as header guard followed by #define of
different macro [-Wheader-guard]
 #ifndef FOO_H
 ^
foo.h:2:0: note: FOO_H is defined here, did you mean _FOO_H ?
 #define _FOO_H
 ^

Warning is not triggered in the following cases:

1] The edit distance between the #ifndef (or #if !defined) macro
and the #define macro is more than half the length of the longer macro

Example:
#ifndef FOO
#define BAR
#endif

2] #ifndef and #define are not on consecutive lines (blank lines/ comment lines
are not ignored).

3] dmacro gets undefined
Example:
#ifndef cmacro
#define dmacro
#undef dmacro
#endif

However the following warning gets generated during the build:
../../src/libcpp/directives.c: In function 'void _cpp_pop_buffer(cpp_reader*)':
../../src/libcpp/directives.c:2720:59: warning: 'inc_type' may be used
uninitialized in this function [-Wmaybe-uninitialized]
   _cpp_pop_file_buffer (pfile, inc, to_free, inc_type);
   ^
_cpp_pop_buffer(): http://pastebin.com/aLYLnXJa
I have defined inc_type only if inc is not null (i.e. the buffer is a file
buffer) in the 1st if(), and used it (passed it to _cpp_pop_file_buffer())
only if inc is not null in the 2nd if(). I guess this warning could be
considered harmless?
How should I rewrite it to avoid the warning?
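
The pastebin code is not reproduced here, so the following is only a
sketch of the pattern described (set under one "inc is not null" test,
used under a second one) with one common way to quiet the warning:
initialize the variable at its declaration. Every name below is a
hypothetical stand-in, not libcpp's.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical stand-ins, not libcpp's.  */
enum inc_kind { INC_NONE, INC_NORMAL, INC_NEXT };

struct file_rec { enum inc_kind kind; };

static enum inc_kind last_popped = INC_NONE;

static void
pop_file_buffer (const struct file_rec *inc, enum inc_kind type)
{
  assert (inc != NULL);
  last_popped = type;
}

void
pop_buffer (const struct file_rec *inc)
{
  /* Initialized at the declaration, so every path to the use below sees
     a defined value even if the compiler cannot relate the two
     "inc != NULL" tests.  */
  enum inc_kind inc_type = INC_NONE;

  if (inc != NULL)
    inc_type = inc->kind;

  /* ... work that does not change inc ... */

  if (inc != NULL)
    pop_file_buffer (inc, inc_type);
}
```

Another option, where the intervening work allows it, is to merge the two
if() blocks so that the assignment and the use share one scope.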

Thanks and Regards,
Prathamesh
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 207299)
+++ gcc/c-family/c-common.c	(working copy)
@@ -9558,6 +9558,7 @@ static const struct reason_option_codes_
   {CPP_W_WARNING_DIRECTIVE,		OPT_Wcpp},
   {CPP_W_LITERAL_SUFFIX,		OPT_Wliteral_suffix},
   {CPP_W_DATE_TIME,			OPT_Wdate_time},
+  {CPP_W_HEADER_GUARD,  OPT_Wheader_guard},
   {CPP_W_NONE,0}
 };
 
Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c	(revision 207299)
+++ gcc/c-family/c-opts.c	(working copy)
@@ -430,6 +430,10 @@ c_common_handle_option (size_t scode, co
   cpp_opts->cpp_warn_traditional = value;
   break;
 
+case OPT_Wheader_guard:
+  cpp_opts->cpp_warn_header_guard = value;
+  break;
+
 case OPT_Wtrigraphs:
   cpp_opts->warn_trigraphs = value;
   break;
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt	(revision 207299)
+++ gcc/c-family/c.opt	(working copy)
@@ -736,6 +736,10 @@ Wtraditional
 C ObjC Var(warn_traditional) Warning
 Warn about features not present in traditional C
 
+Wheader-guard
+C ObjC C++ ObjC++ Warning
+Warn of header guard
+
 Wtraditional-conversion
 C ObjC Var(warn_traditional_conversion) Warning
 Warn of prototypes causing type conversions different from what would happen in the absence of prototype
Index: libcpp/directives.c
===
--- libcpp/directives.c	(revision 207299)
+++ libcpp/directives.c	(working copy)
@@ -565,6 +565,27 @@ lex_macro_node (cpp_reader *pfile, bool
   return NULL;
 }
 
+// return true if top of if_stack is cmacro
+static bool
+cmacro_ifs_top_p(cpp_reader *pfile)
+{
+  struct if_stack *ifs = pfile->buffer->if_stack;
+  return ifs && (ifs->next == NULL) && (ifs->mi_cmacro != NULL);
+}
+
+static linenum_type
+linediff (struct line_maps *maps, source_location loc1, source_location loc2)
+{
+  linenum_type temp;
+
+  if (loc1 < loc2)
+temp = loc1, loc1 = loc2, loc2 = temp;
+
+  const struct line_map *m1 = linemap_lookup (maps, loc1);
+  const struct line_map *m2 = linemap_lookup (maps, loc2);
+  return SOURCE_LINE (m1, loc1) - SOURCE_LINE (m2, loc2);
+}
+
 /* Process a #define directive.  Most work is done in macro.c.  */
 static void
 do_define (cpp_reader *pfile)
@@ -586,6 +607,15 @@ do_define (cpp_reader *pfile)
 	  pfile->cb.define (pfile, pfile->directive_line, node);
 
   node->flags &= ~NODE_USED;
+  
+  // possibly #define following #ifndef in the include guard
+  if (pfile->buffer->dmacro == NULL && cmacro_ifs_top_p (pfile)
+  && linediff (pfile->line_table, pfile->directive_

gcc-4.8-20140130 is now available

2014-01-30 Thread gccadmin
Snapshot gcc-4.8-20140130 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20140130/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch 
revision 207325

You'll find:

 gcc-4.8-20140130.tar.bz2 Complete GCC

  MD5=418121766ce02c324820b384deb6a288
  SHA1=fca745a13b37efac945d046b3e7726942ac69aaa

Diffs from 4.8-20140123 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.