Re: libmvec simd math functions in fortran

2017-11-03 Thread Richard Biener
On Thu, Nov 2, 2017 at 7:42 PM, Toon Moene  wrote:
> On 11/01/2017 05:26 PM, Jakub Jelinek wrote:
>
>> On Wed, Nov 01, 2017 at 04:23:11PM +, Szabolcs Nagy wrote:
>>>
>>> is there a way to get vectorized math functions in fortran?
>>>
>>> in c code there is attribute simd declarations or openmp
>>> declare simd pragma to tell the compiler which functions
>>> have simd variant, but i see no such thing in fortran.
>>
>>
>> !$omp declare simd should work fine in fortran (with -fopenmp
>> or -fopenmp-simd).
>
>
> Note that - if you don't want to change the Fortran code, this - almost two
> years old - proposal would work:
>
> https://gcc.gnu.org/ml/gcc/2016-01/msg00025.html
>
> Obviously, I'll only be able to implement this once retirement comes around
> (i.e., after 2023).

I think a better user-interface would be

gfortran ... -include mathvec

which pulls in a pre-def for the vectorized math functions as available.
That mathvec "module" needs to reside somewhere; ideally it would come
from the local glibc install (but module files are GCC version dependent?).

Not sure if gfortran supports -include for module files; OTOH, a
-include of Fortran source should also work in all contexts(?).

The implementation challenges are

1) actually being able to declare the relevant math functions appropriately
(say, for -ffixed-form f77 style source)
2) making gfortran pick the FUNCTION_DECLs from those declarations
instead of those generated from mathbuiltins.def.

Solving 2) would be nice to make this work "manually" anyway.
I'm not quite sure we properly override / merge (for other attributes)
with the builtins.

Richard.

> Kind regards,
>
> --
> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
> Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news


GCC 8.0.0 Status Report (2017-11-03)

2017-11-03 Thread Richard Biener

Status
======

The feature development phase of GCC 8, Stage 1, is coming to its end
on Friday, Nov. 17th (as usual you can use your local timezone to your
own advantage).

This means that from Saturday, Nov. 18th we will be in Stage 3, which
allows for general bugfixing.  Feature implementations posted
before Stage 3 may still be considered during a brief period after we
have entered Stage 3.

As usual at this time, not all regressions have been prioritized;
the usual rules apply -- regressions new in GCC 8 will end up as P1
unless they do not affect primary or secondary targets or languages.
Regressions that we shipped with in GCC 7.2 can be at most P2.
Regressions that only affect non-primary/secondary targets or
languages will be demoted to P4/5.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               14   +   8
P2              163   +  44
P3              134   + 126
P4              135   -  11
P5               27   -   3
--------        ---   -----------------------
Total P1-P3     311   + 178
Total           473   + 164


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2017-04/msg00084.html


Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-03 Thread Qing Zhao
Hi, 

This is the first time I am asking for a design review for fixing a GCC 
enhancement request. Let me know if I need to send this email to other mailing 
lists as well.

I have been studying PR 78809 for some time:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 

With a lot of help from Wilco and other people, and after a detailed study of 
the previous discussion and current GCC behavior, I was able to come up with 
the following writeup (which basically serves as a design doc), and I am ready 
for implementation. 

Please take a look at it, and let me know of any comments and suggestions.

Thanks a lot.

Qing


str(n)cmp and memcmp optimization in gcc
-- A design document for PR78809

11/01/2017

Qing Zhao
===

0. Summary:

   For PR 78809 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809),
   we will add the following str(n)cmp and memcmp optimizations to GCC 8:

   A. for strncmp (s1, s2, n)
      if one of "s1" or "s2" is a constant string, "n" is a constant, and
      "n" is larger than the length of the constant string:
      change strncmp (s1, s2, n) to strcmp (s1, s2);

   B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
      if the result is ONLY used in a simple equality test against zero, one
      of "s1" or "s2" is a small constant string, "n" is a constant, and the
      other, non-constant string is guaranteed not to be read beyond the end
      of the string:
      change strncmp (s1, s2, n) or strcmp (s1, s2) to the corresponding
      memcmp (s1, s2, n);

      (NOTE: GCC already optimizes memcmp (s1, s2, N) (!)= 0 into a simple
      sequence that accesses all bytes and accumulates the overall result;
      see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171.)
      As a result, such str(n)cmp calls will in turn be replaced by simple
      sequences that access all bytes and accumulate the overall result.

   C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
      if the result is NOT used in a simple equality test against zero, one
      of "s1" or "s2" is a small constant string, "n" is a constant, and the
      minimum of the length of the constant string and "n" is smaller than a
      predefined threshold T:
      inline the call as a byte-by-byte comparison sequence to avoid call
      overhead.

   A would be put in the gimple fold phase (in routine 
"gimple_fold_builtin_string_compare" of gimple-fold.c).
   B would be put in the strlen phase (adding new "handle_builtin_str(n)cmp" 
routines in tree-ssa-strlen.c).
   C would be put in the expand phase (tree-ssa-strlen.c or builtins.c): 
run-time performance testing is needed to decide the predefined threshold T. 

   The reasons for placing the optimizations in these phases are:

   * A needs only very simple source-level information, and A should run 
BEFORE B to simplify the algorithm in B, so the gimple-fold phase is a good 
place for it. (Similar transformations, such as replacing strncat with strcat, 
are currently done in this phase.)
   * B needs type information (size and alignment) for the safety check, and 
B should run BEFORE the expand phase to make use of the memcmp optimization 
available there, so the strlen phase is a good place for it. 
   * C replaces the call with a byte-by-byte comparison. Since we want 
high-level optimizations to be applied to the calls first, the inlining of 
the call is better done late in the tree optimization stage, so the expand 
phase is a good place for it. 
 
   These 3 optimizations can be implemented separately; 3 patches might be 
needed. 
 
The following are details explaining the above.

1. Some background:

   #include <string.h>
   int strcmp(const char *s1, const char *s2);
   int strncmp(const char *s1, const char *s2, size_t n);
   int memcmp(const void *s1, const void *s2, size_t n);

   • strcmp compares null-terminated C strings
   • strncmp compares at most N characters of null-terminated C strings
   • memcmp compares binary byte buffers of N bytes.

The main thing these three have in common is:

   * they all return an integer less than, equal to, or greater than zero if s1 
is found, respectively, to be less than, to match, or be greater than s2.

The main differences among these three are:

   * both strcmp and strncmp may stop early at the NUL terminator of the 
compared strings, but memcmp will NOT stop early: it is meant to compare 
exactly N bytes of both buffers.
   * strcmp compares the whole string, but strncmp only compares the first n 
chars (or fewer, if the string ends sooner) of the string.  

So, when optimizing memcmp and str(n)cmp, we need to consider the following:

   * The compiler can compare multiple bytes at the same time and doesn't have 
to worry about reading beyond the end of a string for memcmp, but has to worry 
about reading beyond the end of a string for str(n)cmp when comparing multiple 
bytes at the same time becau

testsuite question

2017-11-03 Thread Steve Kargl
One of the tests for gfortran has been XPASSing 
on newer versions of FreeBSD.  The testcase has
the line

! { dg-xfail-if "" { "*-*-freebsd*" } { "*" }  { "" } }

I know the test passes on *-*-freebsd12.0.  It should
pass on *-*-freebsd11.* and perhaps *-*-freebsd10.*.
I don't know if it passes on older FreeBSD.  So,
should I modify the dejagnu pattern?

-- 
Steve


Re: testsuite question

2017-11-03 Thread Jeff Law

On 11/03/2017 12:34 PM, Steve Kargl wrote:

> One of the tests for gfortran has been XPASSing
> on newer versions of FreeBSD.  The testcase has
> the line
>
> ! { dg-xfail-if "" { "*-*-freebsd*" } { "*" }  { "" } }
>
> I know the test passes on *-*-freebsd12.0.  It should
> pass on *-*-freebsd11.* and perhaps *-*-freebsd10.*.
> I don't know if it passes on older FreeBSD.  So,
> should I modify the dejagnu pattern?


Given that FreeBSD 9.x has reached EOL, I'd just remove the xfail.

jeff


Re: testsuite question

2017-11-03 Thread Steve Kargl
On Fri, Nov 03, 2017 at 12:41:30PM -0600, Jeff Law wrote:
> On 11/03/2017 12:34 PM, Steve Kargl wrote:
> > One of the tests for gfortran has been XPASSing
> > on newer versions of FreeBSD.  The testcase has
> > the line
> > 
> > ! { dg-xfail-if "" { "*-*-freebsd*" } { "*" }  { "" } }
> > 
> > I know the test passes on *-*-freebsd12.0.  It should
> > pass on *-*-freebsd11.* and perhaps *-*-freebsd10.*.
> > I don't know if it passes on older FreeBSD.  So,
> > should I modify the dejagnu pattern?
> > 
> Given that FreeBSD 9.x has reached EOL, I'd just remove the xfail.
> 

Thanks for the quick response and guidance.

-- 
Steve


GCC testing, precompiled headers, and CFLAGS_FOR_TARGET question

2017-11-03 Thread Steve Ellcey
I have a question about gcc testing, precompiled header tests and the
CFLAGS_FOR_TARGET option to RUNTESTFLAGS.

I am building a complete native aarch64 toolchain (binutils, gcc, glibc) in
a non-standard location.  I configure binutils and gcc with
--sysroot=/mylocation, and I want to run the gcc testsuite.

Everything mostly works, but when gcc tests are actually run, some tests
fail because the default dynamic linker, libc, and libm are used instead
of the ones I have in /mylocation/lib64.

I can work around this with:

make check RUNTESTFLAGS="CFLAGS_FOR_TARGET='-Wl,--dynamic-linker=/mylocation/lib/ld-linux-aarch64.so.1 -Wl,-rpath=/mylocation/lib64'"

But when I do this, a number of pch tests fail.  What I found
is that when I run the pch testsuite, it executes:

/home/sellcey/tot/obj/gcc/gcc/xgcc -B/home/sellcey/tot/obj/gcc/gcc/ ./common-1.h -fno-diagnostics-show-caret -fdiagnostics-color=never -O0 -g -Wl,--dynamic-linker=/mylocation/lib/ld-linux-aarch64.so.1 -Wl,-rpath=/mylocation/lib64 -o common-1.h.gch

And this tries to create an executable instead of a pre-compiled header.
If I run the same command without the -Wl flags then GCC creates the
pre-compiled header that I need for testing.

Is it expected that GCC changes from creating a pch to creating an executable
when it sees -Wl flags?  Is there a flag that we can use to explicitly tell GCC
that we want to create a precompiled header in this instance?

Steve Ellcey
sell...@cavium.com