[Bug fortran/99345] [11 Regression] ICE in doloop_contained_procedure_code, at fortran/frontend-passes.c:2464 since r11-2578-g27eac9ee6137a6b5

2021-03-14 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99345

Thomas Koenig  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |tkoenig at gcc dot 
gnu.org

--- Comment #11 from Thomas Koenig  ---
Harald, thanks for reducing it!

[Bug fortran/99345] [11 Regression] ICE in doloop_contained_procedure_code, at fortran/frontend-passes.c:2464 since r11-2578-g27eac9ee6137a6b5

2021-03-15 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99345

Thomas Koenig  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #12 from Thomas Koenig  ---
Fixed with https://gcc.gnu.org/g:52654036a544389fb66855bf3972f2a8013bec59 .

Thanks for the bug report!

[Bug web/99598] New: Commits are not transferred to bugzilla

2021-03-15 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99598

Bug ID: 99598
   Summary: Commits are not transferred to bugzilla
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

See https://gcc.gnu.org/pipermail/gcc-cvs/2021-March/343081.html ,
which is not distributed to bugzilla and the gcc-bugs mailing list,
despite the ChangeLog entry reading


Handle EXEC_IOLENGTH in doloop_contained_procedure_code.

This rather obvious patch fixes an ICE on valid which came about
because I did not handle EXEC_IOLENGTH as start of an I/O statement
when checking for the DO loop variable.  This is an 11 regression.

gcc/fortran/ChangeLog:

PR fortran/99345
* frontend-passes.c (doloop_contained_procedure_code):
Properly handle EXEC_IOLENGTH.

gcc/testsuite/ChangeLog:

PR fortran/99345
* gfortran.dg/do_check_16.f90: New test.
* gfortran.dg/do_check_17.f90: New test.

[Bug target/100045] New: Precomputing division

2021-04-12 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100045

Bug ID: 100045
   Summary: Precomputing division
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

Created attachment 50567
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50567&action=edit
Test case

We use the method given in "Division by Invariant Integers using
Multiplication"
by Granlund and Montgomery for optimizing division by divisors known to
be constant at compile time.

There can also be an advantage if many numbers are divided by the
same number; in this case, the invariant inverse can be moved out of
the loop.  This is target-dependent.

The attached test case performs 1000 unsigned divisions of uint32_t
values read in randomly by a constant randomly chosen to be 12345678

- using the method from figure 4.1 from the publication cited above
  (timing in seconds given as pre_divide)

- using a simple loop with divisions (timing in seconds given as divide).

On an AMD Ryzen 7 1700X, the timings are

pre_divide: t = 0.013330 s
divide: t = 0.052511 s

OTOH, on POWER (gcc135), the difference is so small that it is very probably
not worth the bother:

pre_divide: t = 0.015183 s
divide: t = 0.017454 s
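
To make the loop-invariant variant concrete, here is a minimal sketch of the unsigned case in the spirit of figure 4.1 of the paper, specialised to 32-bit operands. It is not the code from the attached test case: the names udiv_t, udiv_prepare and udiv_apply are made up for illustration, and the divisor is assumed to be nonzero.

```c
#include <stdint.h>

/* Hypothetical helper type: precomputed data for dividing by a fixed d.  */
typedef struct
{
  uint32_t m;      /* multiplier m' = floor (2^32 * (2^l - d) / d) + 1 */
  unsigned sh1;    /* min (l, 1) */
  unsigned sh2;    /* max (l - 1, 0) */
} udiv_t;

/* Do the expensive part once, outside the loop.  Assumes d != 0.  */
static udiv_t
udiv_prepare (uint32_t d)
{
  udiv_t p;
  unsigned l = 0;

  /* l = ceil (log2 (d)).  */
  while (l < 32 && ((uint64_t) 1 << l) < d)
    l++;

  p.m = (uint32_t) ((((uint64_t) 1 << 32) * (((uint64_t) 1 << l) - d)) / d + 1);
  p.sh1 = l < 1 ? l : 1;
  p.sh2 = l > 1 ? l - 1 : 0;
  return p;
}

/* n / d using one widening multiply, two shifts, an add and a subtract.  */
static uint32_t
udiv_apply (uint32_t n, udiv_t p)
{
  uint32_t t1 = (uint32_t) (((uint64_t) p.m * n) >> 32);
  return (t1 + ((n - t1) >> p.sh1)) >> p.sh2;
}
```

In a loop, udiv_prepare would be called once before the loop and udiv_apply once per element; whether that beats the hardware divider is exactly the target-dependent question raised above.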

[Bug fortran/94978] [8/9/10/11 Regression] Bogus warning "Array reference at (1) out of bounds in loop beginning at (2)"

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94978

Thomas Koenig  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org

[Bug libfortran/98076] Increase speed of integer I/O

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98076

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|ASSIGNED|NEW

[Bug fortran/82215] Feature request to better support two pass compiling with gfortran

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82215

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|ASSIGNED|NEW

[Bug fortran/92913] Add argument-mismatch check for INTERFACE for non-module procedures in the same file

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92913

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|ASSIGNED|NEW

[Bug fortran/97345] FE passes do_subscript leaks gmp memory

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97345

Thomas Koenig  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org

[Bug fortran/93114] Use span passing components of derived types

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93114

Thomas Koenig  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org

[Bug fortran/96216] Gap in interface checking

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96216

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|ASSIGNED|NEW

[Bug fortran/30609] Calculating masks twice

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30609

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|ASSIGNED|NEW

[Bug fortran/97454] Decls for Fortran library procedures

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97454

Thomas Koenig  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org

[Bug fortran/83927] Type-Bound Procedure on element of Derived Type PARAMETER Array

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83927

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|ASSIGNED|NEW

[Bug fortran/67202] Fortran FE should load scalar pass-by-reference intent-in arguments at the beginning of a function

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67202

Thomas Koenig  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org

[Bug fortran/90536] Spurious (?) warning when using -Wconversion with -fno-range-check

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90536

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|ASSIGNED|NEW

[Bug fortran/93956] Wrong array creation with p => array_dt(1:n)%component

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93956

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|REOPENED|NEW

[Bug libfortran/95101] Optimize libgfortran library handling of arrays

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95101

Thomas Koenig  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org

[Bug fortran/40976] Merge DECL of procedure call with DECL of gfc_get_function_type

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40976

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|ASSIGNED|NEW

[Bug fortran/68289] Missing diagnostic pragmas

2021-04-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68289

Thomas Koenig  changed:

   What|Removed |Added

   Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot 
gnu.org
 Status|ASSIGNED|NEW

[Bug fortran/92422] [9 Regression] Warning with character and optimisation flags

2020-10-14 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92422

Thomas Koenig  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|WAITING |RESOLVED

--- Comment #4 from Thomas Koenig  ---
Well, it's not fixed on gcc9, but I don't think it makes sense
to try to find out what fixed this.

Hence, closing as FIXED.

[Bug fortran/97454] New: Decls for Fortran library procedures

2020-10-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97454

Bug ID: 97454
   Summary: Decls for Fortran library procedures
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

Currently, the decls for Fortran library procedures are inconsistent,
which causes, among other things, segfaults on Darwin for ARM
(PR96168).

We should fix them all.  For maxval, findloc and friends, I
am working on a patch (see

https://gcc.gnu.org/pipermail/fortran/2020-October/055170.html

). For cshift etc, we have to be more general, because we
use the same routines for different types.

[Bug fortran/97454] Decls for Fortran library procedures

2020-10-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97454

Thomas Koenig  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=96168
   Last reconfirmed||2020-10-16
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |tkoenig at gcc dot 
gnu.org

[Bug rtl-optimization/97459] New: __uint128_t remainder for division by 3

2020-10-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

Bug ID: 97459
   Summary: __uint128_t remainder for division by 3
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

The following two functions are equivalent:

unsigned r3_128u_v1 (__uint128_t n)
{
  unsigned long a;
  a = (n >> 64) + (n & 0xffffffffffffffff);
  return a % 3;
}

unsigned r3_128u_v2 (__uint128_t n)
{
  return (unsigned) (n%3);
}

and the first one is definitely faster.

(The approach is due to Hacker's Delight, 2nd edition, "Remainder by
Summing Digits". There are also other interesting approaches there.)
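
As a sketch of the digit-summing idea: since 2^64 ≡ 1 (mod 3), the two 64-bit halves of n can be treated as base-2^64 digits and simply added. The helper below (the name rem3_by_halves is made up here) also folds the carry of that addition back in. That carry handling is an addition that is not in the functions above; it is included because a plain 64-bit sum of the two halves can wrap around.

```c
#include <stdint.h>

unsigned
rem3_by_halves (unsigned __int128 n)
{
  uint64_t hi = (uint64_t) (n >> 64);
  uint64_t lo = (uint64_t) n;
  uint64_t sum;

  /* A carry out of the 64-bit add is worth 2^64, which is 1 modulo 3.  */
  if (__builtin_add_overflow (hi, lo, &sum))
    sum++;

  return (unsigned) (sum % 3);
}
```

The increment after an overflow cannot itself overflow, because the wrapped sum of two 64-bit values is at most 2^64 - 2.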

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-10-17 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #4 from Thomas Koenig  ---
Here's a complete program for benchmarks on x86_64, using Jakub's
functions (so they are indeed correct):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <x86intrin.h>

unsigned r3_128u_v2 (__uint128_t n)
{
  return (unsigned) (n%3);
}

unsigned r3_128u_v3 (__uint128_t n)
{
  unsigned long a;
  a = (n >> 88);
  a += (n >> 44) & 0xfffffffffffULL;
  a += (n & 0xfffffffffffULL);
  return a % 3;
}

unsigned r3_128u_v4 (__uint128_t n)
{
  unsigned long a;
  a = (n >> 96);
  a += (n >> 64) & 0xffffffffULL;
  a += (n >> 32) & 0xffffffffULL;
  a += (n & 0xffffffffULL);
  return a % 3;
}

#define N 100

int main()
{
  __uint128_t *a;
  unsigned int s;
  unsigned long t1, t2;
  int fd;
  int i;
  a = malloc (sizeof (*a) * N);
  fd = open ("/dev/random", O_RDONLY);
  read (fd, a, sizeof (*a) * N);
  s = 0;
  t1 = __rdtsc();
  for (i=0; i

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-10-18 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #5 from Thomas Koenig  ---
OK, so here is a benchmark with its function names corrected. It also
includes one version (_v5) which is a bit faster.

(Note I increased the number of iterations to get more accuracy out
of the cycle count, which leads to numbers not being comparable
to the previous benchmark.)

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <x86intrin.h>

unsigned r3_128u_v2 (__uint128_t n)
{
  return (unsigned) (n%3);
}

unsigned r3_128u_v3 (__uint128_t n)
{
  unsigned long a;
  a = (n >> 88);
  a += (n >> 44) & 0xfffffffffffULL;
  a += (n & 0xfffffffffffULL);
  return a % 3;
}

unsigned r3_128u_v4 (__uint128_t n)
{
  unsigned long a;
  a = (n >> 96);
  a += (n >> 64) & 0xffffffffULL;
  a += (n >> 32) & 0xffffffffULL;
  a += (n & 0xffffffffULL);
  return a % 3;
}

unsigned r3_128u_v5 (__uint128_t n)
{
  unsigned long a, b, c;
  b = n >> 64;
  c = n;
  if (__builtin_add_overflow (b, c, &a))
    a++;

  return a%3;
}

#define N 1

int main()
{
  __uint128_t *a;
  unsigned int s;
  unsigned long t1, t2;
  int fd;
  int i;
  a = malloc (sizeof (*a) * N);
  fd = open ("/dev/random", O_RDONLY);
  read (fd, a, sizeof (*a) * N);
  s = 0;
  t1 = __rdtsc();
  for (i=0; i

[Bug fortran/95119] [9/10 Regression] CLOSE hangs when -fopenmp is specified in compilation

2020-10-18 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95119

--- Comment #13 from Thomas Koenig  ---
(In reply to Bill Long from comment #12)
> Original submitter asking which GCC version(s) have / will have the fix.

10.2 already has been released with the fix. 9.4 and 11.1 will have it in when
they are released.

[Bug libfortran/95104] [9/10 Regression] Segfault on a legal WAIT statement

2020-10-18 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95104

Thomas Koenig  changed:

   What|Removed |Added

 CC||tkoenig at gcc dot gnu.org

--- Comment #19 from Thomas Koenig  ---
Fixed for 10.2. 9.4 and 11.1 will have the fix in.

[Bug fortran/95037] gfortran fails to compile a simple subroutine, issues an opaque message

2020-10-18 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95037

--- Comment #5 from Thomas Koenig  ---
Fixed in 10.2; 9.4 and 11.1 will have it.

[Bug fortran/97491] New: Wrong restriction for VALUE arguments of pure procedures

2020-10-19 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97491

Bug ID: 97491
   Summary: Wrong restriction for VALUE arguments of pure
procedures
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

$ cat pure.f90 
pure function foo(x) result (ret)
  integer :: ret
  integer, value :: x
  x = x / 2
  ret = x
end function foo
$ gfortran pure.f90 
pure.f90:4:2:

4 |   x = x / 2
  |  1
Error: Variable 'x' cannot appear in a variable definition context (assignment)
at (1) in PURE procedure

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-10-20 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #9 from Thomas Koenig  ---
(In reply to Jakub Jelinek from comment #7)
> So, can we use this for anything but modulo 3, or 5, or 17, or 257 (all of
> those have 2^32 mod N == 2^64 mod N == 2^128 mod N == 1)

I think so, too.

> probably also
> keyed on the target having corresponding uaddv4_optab handler, normal
> expansion not being able to handle it and emitting a libcall?

Again, yes.

This can also be used as a building block for handling division
and remainder base 10.

Here's a benchmark for this (it uses the sum of digits base 10
instead). qsum_v1 uses the standard method, which you can find
(for example) in libgfortran.

div_rem_5_v2 first calculates the remainder of the division by 5 using this
method, then does an exact division by multiplying with the modular inverse
of 5 modulo 2^128.

div_rem_10_v2 then uses div_rem_5_v2 to calculate the quotient and
remainder of the division by 10, and qsum_v2 uses that to
calculate the sum of digits.

The timings are about a factor of 2 faster than the straightforward
libcall version:

s = 360398898 qsum_v1: 1.091621 s
s = 360398898 qsum_v2: 0.485509 s
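
The exact-division step can be illustrated on 64-bit values, where it is easier to check by hand. This is only a sketch of the idea and not part of the benchmark: div5_exact is a hypothetical name, the remainder is taken with % for brevity rather than by digit summing, and the constant is the inverse of 5 modulo 2^64 (5 * 0xCCCCCCCCCCCCCCCD = 4 * 2^64 + 1).

```c
#include <stdint.h>

/* Inverse of 5 modulo 2^64: 5 * INV5_64 wraps around to exactly 1.  */
#define INV5_64 UINT64_C(0xCCCCCCCCCCCCCCCD)

/* Quotient and remainder of n / 5; the quotient needs no division.  */
uint64_t
div5_exact (uint64_t n, unsigned *rem)
{
  *rem = (unsigned) (n % 5);
  /* n - *rem is a multiple of 5, so multiplying by the inverse
     modulo 2^64 yields the exact quotient.  */
  return (n - *rem) * INV5_64;
}
```

The 128-bit version has the same structure, with the inverse taken modulo 2^128 instead.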


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>

#define ONE ((__uint128_t) 1)
#define TWO_64 (ONE << 64)

typedef __uint128_t mytype;

double this_time ()
{
  struct timeval tv;
  gettimeofday (&tv, NULL);
  return tv.tv_sec + tv.tv_usec * 1e-6;
}

unsigned
qsum_v1 (mytype n)
{
  unsigned ret;
  ret = 0;
  while (n > 0)
    {
      ret += n % 10;
      n = n / 10;
    }
  return ret;
}

static void inline __attribute__((always_inline))
div_rem_5_v2 (mytype n, mytype *div, unsigned *rem)
{
  unsigned long a, b, c;
  /* The modular inverse to 5 modulo 2^128  */
  const mytype magic = (0xCCCCCCCCCCCCCCCCULL * TWO_64
                        + 0xCCCCCCCCCCCCCCCDULL * ONE);
  b = n >> 64;
  c = n;
  if (__builtin_add_overflow (b, c, &a))
    a++;

  *rem = a % 5;
  *div = (n-*rem) * magic;
}

static void inline __attribute__((always_inline))
div_rem_10_v2 (mytype n, mytype *div, unsigned *rem)
{
  mytype n5;
  unsigned rem5;
  div_rem_5_v2 (n, &n5, &rem5);
  *rem = rem5 + (n5 % 2) * 5;
  *div = n5/2;
}

unsigned
qsum_v2 (mytype n)
{
  unsigned ret;
  unsigned rem;
  mytype n_new;
  ret = 0;
  while (n > 0)
    {
      div_rem_10_v2 (n, &n_new, &rem);
      ret += rem;
      n = n_new;
    }
  return ret;
}

#define N 1000

int main()
{
  mytype *a;
  unsigned long int s;
  double t1, t2;
  int fd;
  long int i;
  a = malloc (sizeof (*a) * N);
  fd = open ("/dev/urandom", O_RDONLY);
  read (fd, a, sizeof (*a) * N);

  s = 0;
  t1 = this_time();
  for (i=0; i

[Bug bootstrap/97527] New: OpenBSD bootstrap fails with error: C++ preprocessor "/lib/cpp" fails sanity check

2020-10-21 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527

Bug ID: 97527
   Summary: OpenBSD bootstrap fails with error: C++ preprocessor
"/lib/cpp" fails sanity check
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

Created attachment 49418
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49418&action=edit
config.log from the gcc subdirectory

Bootstrap on OpenBSD fails with a strange error: 

/home/tkoenig/trunk-bin/./prev-gcc/xg++ -B/home/tkoenig/trunk-bin/./prev-gcc/
-B/home/tkoenig/x86_64-unknown-openbsd6.8/bin/ -nostdinc++
-B/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/src/.libs
-B/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/libsupc++/.libs

-I/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/include/x86_64-unknown-openbsd6.8
 -I/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/include 
-I/home/tkoenig/trunk/libstdc++-v3/libsupc++
-L/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/src/.libs
-L/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/libsupc++/.libs
 -fno-PIE -c   -g -O2 -fchecking=1 -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -fno-common -Wno-unused -DHAVE_CONFIG_H -I. -I.
-I../../trunk/gcc -I../../trunk/gcc/. -I../../trunk/gcc/../include
-I../../trunk/gcc/../libcpp/include -I/home/tkoenig/trunk-bin/./gmp
-I/home/tkoenig/trunk/gmp -I/home/tkoenig/trunk-bin/./mpfr/src
-I/home/tkoenig/trunk/mpfr/src -I/home/tkoenig/trunk/mpc/src 
-I../../trunk/gcc/../libdecnumber -I../../trunk/gcc/../libdecnumber/dpd
-I../libdecnumber -I../../trunk/gcc/../libbacktrace
-I/home/tkoenig/trunk-bin/./isl/include -I/home/tkoenig/trunk/isl/include  -o
gimple-match.o -MT gimple-match.o -MMD -MP -MF ./.deps/gimple-match.TPo
gimple-match.c

/home/tkoenig/x86_64-unknown-openbsd6.8/bin/as: out of memory allocating 8
bytes after a total of 0 bytes

[Bug bootstrap/97527] OpenBSD bootstrap fails with error: C++ preprocessor "/lib/cpp" fails sanity check

2020-10-21 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527

--- Comment #1 from Thomas Koenig  ---
Created attachment 49419
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49419&action=edit
Preprocessed source of gimple-match.ii (compressed)

[Bug bootstrap/97527] OpenBSD bootstrap fails with error: C++ preprocessor "/lib/cpp" fails sanity check

2020-10-21 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527

--- Comment #2 from Thomas Koenig  ---
Created attachment 49420
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49420&action=edit
Resulting assembler file (which is incomplete)

[Bug bootstrap/97527] OpenBSD bootstrap fails with error: C++ preprocessor "/lib/cpp" fails sanity check

2020-10-21 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527

--- Comment #3 from Thomas Koenig  ---
Created attachment 49421
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49421&action=edit
config.log from the main build directory

[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes

2020-10-21 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527

Thomas Koenig  changed:

   What|Removed |Added

 Target||x86_64-unknown-openbsd6.8

--- Comment #4 from Thomas Koenig  ---
Bootstrapping compiler is

gcc220$ egcc -v
Using built-in specs.
COLLECT_GCC=egcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-openbsd6.8/8.4.0/lto-wrapper
Target: x86_64-unknown-openbsd6.8
Configured with: /usr/obj/ports/gcc-8.4.0/gcc-8.4.0/configure
--with-stage1-ldflags=-L/usr/obj/ports/gcc-8.4.0/bootstrap/lib --verbose
--program-transform-name='s,^,e,' --disable-nls --with-system-zlib
--disable-libmudflap --disable-libgomp --disable-libssp --disable-tls
--with-gnu-ld --with-gnu-as --enable-threads=posix --enable-wchar_t
--with-gmp=/usr/local --enable-languages=c,c++,fortran,objc,ada
--disable-libstdcxx-pch --enable-default-ssp --enable-default-pie --without-isl
--enable-cpp --prefix=/usr/local --sysconfdir=/etc --mandir=/usr/local/man
--infodir=/usr/local/info --localstatedir=/var --disable-silent-rules
--disable-gtk-doc
Thread model: posix
gcc version 8.4.0 (GCC) 
gcc220$ eg++ -v
Using built-in specs.
COLLECT_GCC=eg++
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-openbsd6.8/8.4.0/lto-wrapper
Target: x86_64-unknown-openbsd6.8
Configured with: /usr/obj/ports/gcc-8.4.0/gcc-8.4.0/configure
--with-stage1-ldflags=-L/usr/obj/ports/gcc-8.4.0/bootstrap/lib --verbose
--program-transform-name='s,^,e,' --disable-nls --with-system-zlib
--disable-libmudflap --disable-libgomp --disable-libssp --disable-tls
--with-gnu-ld --with-gnu-as --enable-threads=posix --enable-wchar_t
--with-gmp=/usr/local --enable-languages=c,c++,fortran,objc,ada
--disable-libstdcxx-pch --enable-default-ssp --enable-default-pie --without-isl
--enable-cpp --prefix=/usr/local --sysconfdir=/etc --mandir=/usr/local/man
--infodir=/usr/local/info --localstatedir=/var --disable-silent-rules
--disable-gtk-doc
Thread model: posix
gcc version 8.4.0 (GCC) 

Assembler is

gcc220$ as --version
GNU assembler (GNU Binutils) 2.35
Copyright (C) 2020 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-unknown-openbsd6.8'.

(self-compiled)

[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes

2020-10-21 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527

--- Comment #5 from Thomas Koenig  ---
Created attachment 49422
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49422&action=edit
Generated gimple-match.c

All the temporary files were generated by manually adding -save-temps
to the Makefile in the gcc subdirectory, then re-compiling.

The error is repeatable.

Here's the output of vmstat while the machine is not compiling:

gcc220$ vmstat
 procs    memory       page                   disk  traps          cpu
 r   s   avm     fre  flt  re  pi  po  fr  sr sd0  int   sys   cs us sy id
 1  60   35M   2616M 6785   0   0   0   0   0   2   87  5296  432  8  2 90

[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes

2020-10-21 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527

--- Comment #6 from Thomas Koenig  ---
The machine is gcc220.fsffrance.org ; if anybody has an account there
and wants to peek into /home/tkoenig to look into more details, be my
guest.

[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes

2020-10-22 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527

--- Comment #8 from Thomas Koenig  ---
The *.s file generated with -save-temps is attached, but it
is truncated for a reason that I do not understand.

The binutils is indeed self-compiled from source (because the LLVM
linker cannot handle gcc compilation), using the system compiler, clang.
I'll recompile this with gcc 8.4 (which is installed in /usr/local/bin
as egcc) and see what happens then.

[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes

2020-10-22 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527

--- Comment #9 from Thomas Koenig  ---
Created attachment 49423
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49423&action=edit
config.log from libgomp using binutils compiled with gcc 8.4.0

Using the binutils compiled with gcc 8.4 now leads to an error in
libgomp configure, apparently because of some collision with LTO
symbols (???)

gmake[4]: Leaving directory
'/home/tkoenig/trunk-bin/x86_64-unknown-openbsd6.8/libgcc'
gmake[3]: Leaving directory
'/home/tkoenig/trunk-bin/x86_64-unknown-openbsd6.8/libgcc'
Checking multilib configuration for libgomp...
Configuring stage 1 in x86_64-unknown-openbsd6.8/libgomp
configure: loading cache ./config.cache
checking for --enable-version-specific-runtime-libs... no
checking for --enable-generated-files-in-srcdir... no
checking build system type... x86_64-unknown-openbsd6.8
checking host system type... x86_64-unknown-openbsd6.8
checking target system type... x86_64-unknown-openbsd6.8
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /usr/local/bin/gmkdir -p
checking for gawk... gawk
checking whether gmake sets $(MAKE)... yes
checking whether gmake supports nested variables... yes
checking for x86_64-unknown-openbsd6.8-gcc...
/home/tkoenig/trunk-bin/./gcc/xgcc -B/home/tkoenig/trunk-bin/./gcc/
-B/home/tkoenig/x86_64-unknown-openbsd6.8/bin/
-B/home/tkoenig/x86_64-unknown-openbsd6.8/lib/ -isystem
/home/tkoenig/x86_64-unknown-openbsd6.8/include -isystem
/home/tkoenig/x86_64-unknown-openbsd6.8/sys-include   -fno-checking
checking whether the C compiler works... no
configure: error: in
`/home/tkoenig/trunk-bin/x86_64-unknown-openbsd6.8/libgomp':
configure: error: C compiler cannot create executables
See `config.log' for more details
gmake[2]: *** [Makefile:24794: configure-stage1-target-libgomp] Error 77
gmake[2]: Leaving directory '/home/tkoenig/trunk-bin'
gmake[1]: *** [Makefile:27002: stage1-bubble] Error 2
gmake[1]: Leaving directory '/home/tkoenig/trunk-bin'
gmake: *** [Makefile:1004: all] Error 2

The suspicious part is

configure:3910: checking whether the C compiler works
configure:3932: /home/tkoenig/trunk-bin/./gcc/xgcc
-B/home/tkoenig/trunk-bin/./gcc/ -B/home/tkoenig/x86_64-unknown-openbsd6.8/bin/
-B/home/tkoenig/x86_64-unknown-openbsd6.8/lib/ -isystem
/home/tkoenig/x86_64-unknown-openbsd6.8/include -isystem
/home/tkoenig/x86_64-unknown-openbsd6.8/sys-include   -fno-checking -g -O2  
conftest.c  >&5
Wrong dl symbols!
/home/tkoenig/x86_64-unknown-openbsd6.8/bin/ld:
/home/tkoenig/trunk-bin/./gcc/liblto_plugin.so: error loading plugin: Wrong dl
symbols!

collect2: error: ld returned 1 exit status
configure:3936: $? = 1
configure:3974: result: no
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "GNU Offloading and Multi Processing Runtime Library"
| #define PACKAGE_TARNAME "libgomp"
| #define PACKAGE_VERSION "1.0"
| #define PACKAGE_STRING "GNU Offloading and Multi Processing Runtime Library
1.0"
| #define PACKAGE_BUGREPORT ""
| #define PACKAGE_URL "http://www.gnu.org/software/libgomp/";
| #define PACKAGE "libgomp"
| #define VERSION "1.0"
| /* end confdefs.h.  */
| 
| int
| main ()
| {
| 
|   ;
|   return 0;
| }
configure:3979: error: in
`/home/tkoenig/trunk-bin/x86_64-unknown-openbsd6.8/libgomp':
configure:3981: error: C compiler cannot create executables
See `config.log' for more details

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-10-23 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #10 from Thomas Koenig  ---
There are a couple more constants for which this could be tried.

Base 7:

static unsigned 
rem_7_v2 (mytype n)
{
  unsigned long a, b, c, d;
  a = n & MASK_48;
  b = (n >> 48) & MASK_48;
  c = n >> 96;
  return (a+b+c) % 7;
}

gives the remainder with respect to 7.

The reason is that 2^48-1 = 3*3*5*7*13*17*97*241*257*673, so a shift
of 48 bits works for any combination of these factors. However, for 15,
I would have to check if it would be better to use the 64-bit shift.

For 19, it's a shift of 54 that would work (2 has order 18 modulo 19).

I think I'd better make a table.
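
For reference, a self-contained sketch of the 48-bit variant with the mask spelled out. The definition of MASK_48 below is an assumption, since it is not shown above, and rem_by_48bit_digits is a made-up name. The same function body works for any divisor that divides 2^48-1.

```c
#include <stdint.h>

/* Assumed definition: the low 48 bits set.  */
#define MASK_48 ((UINT64_C(1) << 48) - 1)

/* Remainder of n modulo d, valid for any d that divides 2^48 - 1
   (3, 5, 7, 9, 13, 17, ...), because then 2^48 == 1 (mod d).  */
unsigned
rem_by_48bit_digits (unsigned __int128 n, unsigned d)
{
  uint64_t a = (uint64_t) (n & MASK_48);          /* bits 0..47   */
  uint64_t b = (uint64_t) ((n >> 48) & MASK_48);  /* bits 48..95  */
  uint64_t c = (uint64_t) (n >> 96);              /* bits 96..127 */

  /* Each term is below 2^48, so the sum cannot overflow 64 bits.  */
  return (unsigned) ((a + b + c) % d);
}
```

Unlike the two-term 64-bit split, this form needs no carry handling, at the price of one more mask and add.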

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-10-24 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #11 from Thomas Koenig  ---
Created attachment 49438
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49438&action=edit
Numbers a, b so that 2^b  ≡ 1 mod a up to b=64, larger b taken if several
solutions exist

Here is the promised table of divisors for which this optimization
is possible.

For example, the line 

9 60

means that the remainder of a division by 9 can be calculated by

#define MASK60 ((1ul << 60) - 1)

unsigned rem_9 (__uint128_t n)
{
  __uint64_t a, b, c;
  a = n & MASK60;
  b = (n >> 60) & MASK60;
  c = (n >> 120);
  return (a+b+c) % 9;
}

The number of terms varies; for b=64, it is two terms; for
63 >= b >= 43, it is three terms, and for the rest, it is four terms.

67 is the first odd divisor for which there is no such shortcut, the
next one is 83. Those are the only gaps below 100.

Of course, if a is in the list, then a*2^n can be treated by shifting
(like it was shown for a=10).
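
A table of this kind can be regenerated with a few lines of C. This is a sketch, not the program that produced the attachment: best_shift and print_table are hypothetical names, and best_shift returns 0 when no b up to 64 satisfies 2^b ≡ 1 (mod a).

```c
#include <stdio.h>

/* Largest b in 1..64 with 2^b == 1 (mod a), or 0 if there is none.
   Intended for odd a >= 3 that fit in 32 bits.  */
unsigned
best_shift (unsigned a)
{
  unsigned long long p = 1;   /* 2^b mod a, updated incrementally */
  unsigned best = 0;

  for (unsigned b = 1; b <= 64; b++)
    {
      p = (p * 2) % a;
      if (p == 1)
        best = b;
    }
  return best;
}

/* Print "a b" lines for all odd a below limit that have a usable b.  */
void
print_table (unsigned limit)
{
  for (unsigned a = 3; a < limit; a += 2)
    {
      unsigned b = best_shift (a);
      if (b != 0)
        printf ("%u %u\n", a, b);
    }
}
```

Calling print_table (100) lists every odd divisor below 100 except the two gaps mentioned above.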

Now, the interesting question is what to make of it.

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-10-24 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #12 from Thomas Koenig  ---
(In reply to Thomas Koenig from comment #11)
> Created attachment 49438 [details]
> Numbers a, b so that 2^b  ≡ 1 mod a up to b=64, larger b taken if several
> solutions exist
>

A quick check that all numbers are correct is

awk ' { print 2 "^" $2 "%" $1 } ' divisiontable.dat | bc

which shows 1 as output only.

[Bug fortran/97530] Segmentation fault compiling coarray program with option -fcoarray=shared (not with -fcoarray={lib,single})

2020-10-24 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97530

Thomas Koenig  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2020-10-24
 Ever confirmed|0   |1

--- Comment #2 from Thomas Koenig  ---
Here's a reduced version of the reduced version.

module types
  type local_model_state
 real, allocatable :: ps(:,:)  ! Surface pressure
 real, allocatable :: t(:,:,:) ! Temperature
  end type local_model_state
contains
  function int_mult(ms, ifactor)
type(local_model_state) :: int_mult
type(local_model_state), intent(in) :: ms
integer, intent(in) :: ifactor
int_mult % ps = ms % ps * ifactor
  end function int_mult
end module types

[Bug fortran/97530] Segmentation fault compiling coarray program with option -fcoarray=shared (not with -fcoarray={lib,single})

2020-10-24 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97530

--- Comment #3 from Thomas Koenig  ---
A little bit more reduced.

module types
  type local_model_state
 real, allocatable :: ps(:,:)  ! Surface pressure
  end type local_model_state
contains
  function int_mult(ms, ifactor)
type(local_model_state) :: int_mult
type(local_model_state), intent(in) :: ms
integer, intent(in) :: ifactor
int_mult % ps = ms % ps * ifactor
  end function int_mult
end module types

[Bug fortran/97530] Segmentation fault compiling coarray program with option -fcoarray=shared (not with -fcoarray={lib,single})

2020-10-25 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97530

Thomas Koenig  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Thomas Koenig  ---
Fixed by 

https://gcc.gnu.org/g:9dca1f29608df4bda70b33be735373ac18b8714b

Thanks a lot for the bug report!

I think we need a way to run the whole testsuite with -fcoarray=shared
to spot any other issues like this.

[Bug fortran/88076] Shared Memory implementation for Coarrays

2020-10-25 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88076
Bug 88076 depends on bug 97530, which changed state.

Bug 97530 Summary: Segmentation fault compiling coarray program with option 
-fcoarray=shared (not with -fcoarray={lib,single})
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97530

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-10-25 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

Thomas Koenig  changed:

   What|Removed |Added

  Attachment #49438|divisiontable.dat   |divisiontable.txt
   filename||

--- Comment #13 from Thomas Koenig  ---
Comment on attachment 49438
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49438
Numbers a, b so that 2^b  ≡ 1 mod a up to b=64, larger b taken if several
solutions exist

Seems the bugzilla system decided this was an MPEG file.

Well, it is not; hopefully renaming it to txt will help.

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-10-29 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

Thomas Koenig  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Thomas Koenig  ---
Seems fixed by https://gcc.gnu.org/g:23856d2f29fd87edf724ade48ee30c869a3b1ea3 .

Thanks for the report!

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-10-29 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

Thomas Koenig  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
   Last reconfirmed||2020-10-29
 CC||koenigni at gcc dot gnu.org
 Ever confirmed|0   |1
 Resolution|FIXED   |---

--- Comment #2 from Thomas Koenig  ---
Correction - you were referring to a runtime error, so this is not
yet fixed.

[Bug middle-end/97656] New: Specify that there is no address arithmetic on a pointer

2020-10-31 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97656

Bug ID: 97656
   Summary: Specify that there is no address arithmetic on a
pointer
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

This involves Fortran, but possibly also other languages.

Consider

$ cat alias.f90
program main
  interface
 subroutine foo(a)
   integer, intent(inout) :: a
 end subroutine foo
  end interface
  integer, dimension(2) :: x
  x(1) = 42
  x(2) = 42
  call foo(x(1))
  if (x(2) /= 42) stop "Error!"
end program main

This program specifies that foo has a scalar argument, passed by reference.
Fortran's rules forbid foo from accessing the element x(2), so
the call to stop could be optimized away, but it isn't:

$ gfortran -O3 -S alias.f90 
$ grep _gfortran_stop alias.s 
        call    _gfortran_stop_string
$ 

It would be good if there was a TREE_NO_POINTER_ARITHMETIC (or similar) flag
that could be set by the Fortran front end.
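
For contrast, here is a hypothetical C counterpart (the names foo_c and
caller_c are made up for illustration). In C the callee may step from
&x[0] to x[1], so for C the check on the second element cannot be
dropped; this is presumably why the information would have to come from
the Fortran front end.

```c
#include <assert.h>

/* In C, a pointer to one array element may legally be used to reach
   its neighbours within the same array.  */
void
foo_c (int *a)
{
  a[1] = 0;
}

/* Returns the second element after the call; it has been overwritten.  */
int
caller_c (void)
{
  int x[2] = { 42, 42 };

  foo_c (&x[0]);
  assert (x[0] == 42);   /* the element actually passed is untouched */
  return x[1];
}
```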

[Bug bootstrap/96735] --enable-maintainer-mode broken

2020-11-01 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96735

Thomas Koenig  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|WAITING |RESOLVED

--- Comment #4 from Thomas Koenig  ---
Didn't happen again, so the best bet is that make was indeed run
from the gcc subdirectory.

[Bug fortran/97320] False positive "Array reference out of bounds in loop" in a protecting if block

2020-11-02 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97320

Thomas Koenig  changed:

   What|Removed |Added

 CC||tkoenig at gcc dot gnu.org
 Status|RESOLVED|NEW
 Resolution|DUPLICATE   |---
   Severity|normal  |enhancement
 Depends on||90302

--- Comment #6 from Thomas Koenig  ---
It's not an exact duplicate of PR 94978; that bug is about
a false positive without -Wdo-subscript, whereas this one is
about a false positive with -Wdo-subscript.

The reason why this is rather difficult to resolve is one
of translation phases.

In the gfortran front end, we create a syntax tree from the
Fortran source code.  On the basis of that syntax tree (where
we still know a lot about the language) we issue that warning.

The next step is conversion to an intermediate language, which
gets handed to the main part of gcc for further processing
(known as the "middle end").

It is the middle end which does most of the optimizations, and
which has the tools to do so.  In this particular instance, we
would need "range propagation" (where the compiler can infer the
range of variables from previous statements).  We don't do that
in the front end, because a) it would be a major piece of work, and
b) it would duplicate a lot of what the middle end already does.

The most elegant solution would be support from the middle and
back end to put in a pseudo statement, like a __builtin_warning
"function".

Code like

integer :: a(12)
do i=1,10
   a(i-1) = 1

could then be annotated like

   do i=1,10
 if (0 < lbound(a)) call __builtin_warning ("index out of bounds")
 if (9 > ubound(a)) call __builtin_warning ("index out of bounds")
 a(i-1) = 1

and if the compiler could not prove that these statements get removed
by dead code elimination, it would issue the warning in the final phase of
translation.

This would pretty much eliminate false positives, and would be
far superior to what we currently do.

Unfortunately, this is a part of a compiler with which I am almost
totally unfamiliar, so I cannot help there. Some preliminary work
has been done (see PR 90302), but I don't know how far it has
progressed in the meantime.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90302
[Bug 90302] Implement __builtin_warning

[Bug rtl-optimization/97738] New: Optimizing division by value & - value for HAKMEM 175

2020-11-06 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738

Bug ID: 97738
   Summary: Optimizing division by value & - value for HAKMEM 175
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

A straightforward implementation of HAKMEM 175 (returning
the next number with the same number of bits) is

unsigned int
next_same_bit (unsigned int value)
{
  unsigned int lowest_bit;
  unsigned int left_bits;
  unsigned int changed_bits;
  unsigned int right_bits;

  lowest_bit = value & - value;
  left_bits = value + lowest_bit;
  changed_bits = value ^ left_bits;
  right_bits = (changed_bits / lowest_bit) >> 2;
  return left_bits | right_bits;
}

In two's complement, this can be replaced by

unsigned int
next_s_bit (unsigned int value)
{
  unsigned int lowest_bit;
  unsigned int ctz;
  unsigned int left_bits;
  unsigned int changed_bits;
  unsigned int right_bits;

  ctz = __builtin_ctz (value);
  lowest_bit = 1u << ctz;
  left_bits = value + lowest_bit;
  changed_bits = value ^ left_bits;
  right_bits = changed_bits >> (ctz + 2);
  return left_bits | right_bits;
}

to replace the expensive division by what is known to be a
power of two by a shift.

That transformation is counter-productive (and might be done
the other way) if there is no division by lowest_bit.
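
The equivalence of the two forms can be checked with a small harness like
the one below. This is a sketch: the names next_div, next_shift and
count_mismatches are made up, a 32-bit unsigned int is assumed, and the
inputs are restricted so that ctz + 2 stays below the type width.

```c
#include <assert.h>

unsigned int
next_div (unsigned int value)     /* form with the division */
{
  unsigned int lowest_bit = value & -value;
  unsigned int left_bits = value + lowest_bit;
  unsigned int changed_bits = value ^ left_bits;
  return left_bits | ((changed_bits / lowest_bit) >> 2);
}

unsigned int
next_shift (unsigned int value)   /* form with the shift */
{
  unsigned int ctz = (unsigned int) __builtin_ctz (value);
  unsigned int lowest_bit = 1u << ctz;
  unsigned int left_bits = value + lowest_bit;
  unsigned int changed_bits = value ^ left_bits;
  return left_bits | (changed_bits >> (ctz + 2));
}

/* Counts the inputs in 1..limit for which the two forms differ.  */
unsigned int
count_mismatches (unsigned int limit)
{
  unsigned int bad = 0;

  assert (limit < (1u << 29));   /* keeps ctz + 2 below 32 */
  for (unsigned int v = 1; v <= limit; v++)
    if (next_div (v) != next_shift (v))
      bad++;
  return bad;
}
```

For example, both forms map 6 (binary 0110) to 9 (binary 1001).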

[Bug middle-end/97738] Optimizing division by value & - value for HAKMEM 175

2020-11-06 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738

--- Comment #2 from Thomas Koenig  ---
Created attachment 49516
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49516&action=edit
Small benchmark

Here's a small benchmark for counting all 32-bit numbers with 16 bits set
according to the HAKMEM source.

Timing is (first float is elapsed time in seconds for version with division,
second float is for the shift):

2.319526 601080391
1.147284 601080391

with -O3 -march=native on an AMD Ryzen 7 1700X,

4.539288 601080391
2.700514 601080391

on POWER9.

[Bug middle-end/97738] Optimizing division by value & - value for HAKMEM 175

2020-11-06 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738

--- Comment #3 from Thomas Koenig  ---
Even faster code:

  ctz = __builtin_ctz (value);
  lowest_bit = value & - value;
  left_bits = value + lowest_bit;
  changed_bits = value ^ left_bits;
  right_bits = changed_bits >> (ctz + 2);
  return left_bits | right_bits;

The first two instructions get compiled directly (with -march=native)
to

        blsi    %edi, %edx
        tzcntl  %edi, %eax

[Bug middle-end/97738] Optimizing division by value & - value for HAKMEM 175

2020-11-07 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738

--- Comment #5 from Thomas Koenig  ---
(In reply to Jakub Jelinek from comment #4)
> What about a version that still sets lowest_bit to value & -value; rather
> than 1 < ctz?

I think this would be ideal, or close to it.

> Also, I'm not sure you can safely do the (changed_bits >> ctz) >> 2 to
> changed_bits >> (ctz + 2) transformation, while because of the division one
> can count on value not being 0 (otherwise UB), value & -value can still be
> e.g. 1U << 31 and then ctz 31 too, and changed_bits >> (31 + 2) being UB,
> while
> (changed_bits >> 31) >> 2 well defined returning 0.

OK.

> So, I think we could e.g. during expansion (or isel) based on target cost
> optimize
> x / (y & -y) to x >> __builtin_ctz (y) (also assuming the optab for ctz
> exists), but anything else looks complicated.

I think this would solve the issue for the original code (which is
what people will find on the web if they google for HAKMEM 175).
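
A sketch of the variant discussed here, keeping value & -value and
leaving the two shifts separate so that no shift count reaches the type
width. The function name is made up and a 32-bit unsigned int is assumed.

```c
#include <assert.h>

unsigned int
next_same_bit_two_shifts (unsigned int value)
{
  unsigned int lowest_bit, left_bits, changed_bits, right_bits;

  assert (value != 0);   /* ctz of 0 is undefined, as is the division */
  lowest_bit = value & -value;
  left_bits = value + lowest_bit;
  changed_bits = value ^ left_bits;
  /* Shift by ctz (at most 31) and then by 2, never by ctz + 2.  */
  right_bits = (changed_bits >> __builtin_ctz (value)) >> 2;
  return left_bits | right_bits;
}
```

For value = 1U << 31 this returns 0, the same as the division form,
without ever shifting by 33.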

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #14 from Thomas Koenig  ---
Created attachment 49520
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49520&action=edit
Numbers a, b so that 2^b  ≡ 1 mod a up to b=64, larger b taken if several
solutions exist, plus the multiplicative inverse for 2^128

I've added the multiplicative inverse to the table, calculated with
maxima by inv_mod(x,2^128). Output is in hex, to make it easier to
break down into two numbers.

Is there any more info that I could provide?

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #3 from Thomas Koenig  ---
Simplified test case:

program main
  type foo
 real, allocatable, dimension(:) :: a[:]
  end type foo
  type (foo) :: x
  sync all
  allocate (x%a(10)[*])
end program main

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #16 from Thomas Koenig  ---
(In reply to Jakub Jelinek from comment #15)
> I plan to work on this early in stage3.
> And we really shouldn't use any tables, GCC should figure it all out.
> So, for double-word modulo by constant that would be expanded using a
> libcall, go for x from the word bitsize to double-word bitsize and check if
> (1max << x) % cst
> is 1

It's probably better to search from high to low, to reduce the number
of necessary shifts for division by constants like 9 or 13.

> (and prefer what we've agreed on for 3), and fall back to
> multiplications (see #c8) if there aren't any other options and the costs
> don't say it is too costly.

I think for variants where the constants aren't powers of two,

#define ONE ((__uint128_t) 1)
#define TWO_64 (ONE << 64)
#define MASK60 ((1ul << 60) - 1)

void
div_rem_13 (mytype n, mytype *div, unsigned int *rem)
{
  const mytype magic = TWO_64 * 14189803133622732012u + 5675921253449092805u *
ONE; /* 0xC4EC4EC4EC4EC4EC4EC4EC4EC4EC4EC5 */
  __uint64_t a, b, c;
  unsigned int r;

  a = n & MASK60;
  b = (n >> 60);
  b = b & MASK60;
  c = (n >> 120);
  r = (a+b+c) % 13;
  n = n - r;
  *div = n * magic;
  *rem = r;
}

should be pretty efficient; there is only one shift which spans two
words.  (The assembly generated from the function looks weird
because of quite a few move instructions, but that should not be
an issue for code generated inline).

Regarding the approach in comment #8, I think I'll run some benchmarks
to see how well that works for other constants which don't fit
the pattern of being divisors for 2^n-1.
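
The reason the multiplication gives the quotient: n - r is an exact
multiple of 13, and the magic constant is the multiplicative inverse of
13 modulo 2^128, so (n - r) * magic wraps around to exactly (n - r) / 13.
A sketch that checks this against the plain operators (the names u128,
INV13, div_rem_13_sketch and div_rem_13_agrees are made up here):

```c
#include <assert.h>
#include <stdint.h>

typedef __uint128_t u128;

/* Inverse of 13 modulo 2^128, assembled from the two 64-bit halves
   given in the comment above.  */
#define INV13 ((((u128) 0xC4EC4EC4EC4EC4ECull) << 64) | 0x4EC4EC4EC4EC4EC5ull)
#define MASK60 ((((uint64_t) 1) << 60) - 1)

void
div_rem_13_sketch (u128 n, u128 *div, unsigned int *rem)
{
  uint64_t a = (uint64_t) (n & MASK60);
  uint64_t b = (uint64_t) ((n >> 60) & MASK60);
  uint64_t c = (uint64_t) (n >> 120);
  /* 2^60 == 1 (mod 13), so the chunk sum has the same remainder.  */
  unsigned int r = (unsigned int) ((a + b + c) % 13);

  assert ((u128) 13 * INV13 == 1);   /* product wraps modulo 2^128 */
  *div = (n - r) * INV13;            /* exact division by 13 */
  *rem = r;
}

/* Returns 1 if quotient and remainder match n / 13 and n % 13.  */
int
div_rem_13_agrees (u128 n)
{
  u128 q;
  unsigned int r;

  div_rem_13_sketch (n, &q, &r);
  return q == n / 13 && r == (unsigned int) (n % 13);
}
```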

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #17 from Thomas Koenig  ---

To be compilable, my previous code lacks

typedef __uint128_t mytype;

> #define ONE ((__uint128_t) 1)

[Bug rtl-optimization/97756] New: Inefficient handling of 128-bit arguments

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756

Bug ID: 97756
   Summary: Inefficient handling of 128-bit arguments
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

This is an offshoot from PR 97459.

The code

#define ONE ((__uint128_t) 1)
#define TWO_64 (ONE << 64)
#define MASK60 ((1ul << 60) - 1)

typedef __uint128_t mytype;

void
div_rem_13_v2 (mytype n, mytype *div, unsigned int *rem)
{
  const mytype magic = TWO_64 * 14189803133622732012u + 5675921253449092805u *
ONE;
  unsigned long a, b, c;
  unsigned int r;

  a = n & MASK60;
  b = (n >> 60);
  b = b & MASK60;
  c = (n >> 120);
  r = (a+b+c) % 13;
  n = n - r;
  *div = n * magic;
  *rem = r;
}

when compiled on x86_64 on Zen with -O3 -march=native has quite
some register shuffling at the beginning:

   0:   49 89 f0                mov    %rsi,%r8
   3:   48 89 fe                mov    %rdi,%rsi
   6:   49 89 d1                mov    %rdx,%r9
   9:   48 ba ff ff ff ff ff    movabs $0xfffffffffffffff,%rdx
  10:   ff ff 0f
  13:   4c 89 c7                mov    %r8,%rdi
  16:   48 89 f0                mov    %rsi,%rax
  19:   49 89 c8                mov    %rcx,%r8
  1c:   48 89 f1                mov    %rsi,%rcx
  1f:   49 89 fa                mov    %rdi,%r10
  22:   48 0f ac f8 3c          shrd   $0x3c,%rdi,%rax
  27:   48 21 d1                and    %rdx,%rcx
  2a:   41 56                   push   %r14
  2c:   49 c1 ea 38             shr    $0x38,%r10
  30:   48 21 d0                and    %rdx,%rax
  33:   53                      push   %rbx
  34:   48 bb c5 4e ec c4 4e    movabs $0x4ec4ec4ec4ec4ec5,%rbx
  3b:   ec c4 4e
  3e:   4c 01 d1                add    %r10,%rcx
  41:   45 31 db                xor    %r11d,%r11d
  44:   48 01 c1                add    %rax,%rcx
  47:   48 89 c8                mov    %rcx,%rax
  4a:   48 f7 e3                mul    %rbx
  4d:   48 c1 ea 02             shr    $0x2,%rdx
  51:   48 8d 04 52             lea    (%rdx,%rdx,2),%rax
  55:   48 8d 04 82             lea    (%rdx,%rax,4),%rax
  59:   48 89 ca                mov    %rcx,%rdx
  5c:   48 b9 ec c4 4e ec c4    movabs $0xc4ec4ec4ec4ec4ec,%rcx
  63:   4e ec c4
  66:   48 29 c2                sub    %rax,%rdx
  69:   48 29 d6                sub    %rdx,%rsi
  6c:   49 89 d6                mov    %rdx,%r14
  6f:   4c 19 df                sbb    %r11,%rdi
  72:   48 0f af ce             imul   %rsi,%rcx
  76:   48 89 f2                mov    %rsi,%rdx
  79:   48 89 f8                mov    %rdi,%rax
  7c:   c4 e2 cb f6 fb          mulx   %rbx,%rsi,%rdi
  81:   48 0f af c3             imul   %rbx,%rax
  85:   49 89 31                mov    %rsi,(%r9)
  88:   48 01 c8                add    %rcx,%rax
  8b:   48 01 c7                add    %rax,%rdi
  8e:   49 89 79 08             mov    %rdi,0x8(%r9)
  92:   45 89 30                mov    %r14d,(%r8)
  95:   5b                      pop    %rbx
  96:   41 5e                   pop    %r14
  98:   c3                      retq

[Bug libstdc++/97759] Could std::has_single_bit be faster?

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759

Thomas Koenig  changed:

   What|Removed |Added

   Keywords||missed-optimization
   Severity|normal  |enhancement
 CC||tkoenig at gcc dot gnu.org

--- Comment #1 from Thomas Koenig  ---
Could you post the benchmark and the exact architecture where the arithmetic
version is faster?

[Bug rtl-optimization/97756] Inefficient handling of 128-bit arguments

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756

--- Comment #1 from Thomas Koenig  ---
Actually, it was on a Ryzen 1700 (for the -march=native).

I'm at odds with architecture names...

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-09 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #4 from Thomas Koenig  ---
(In reply to Thomas Koenig from comment #3)
> Simplified test case:
> 
> program main
>   type foo
>  real, allocatable, dimension(:) :: a[:]
>   end type foo
>   type (foo) :: x
>   sync all
>   allocate (x%a(10)[*])
> end program main

Correction: That does not always segfault.

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-09 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

Thomas Koenig  changed:

   What|Removed |Added

 Status|REOPENED|NEW

--- Comment #5 from Thomas Koenig  ---
Really reduced test case, no derived types are needed to expose the bug.

program random_weather
  implicit none
  real, allocatable :: my_ps(:,:)  [:,:]
  integer :: i, npx, nxlocal, nylocal, nzglobal
  nxlocal = 23
  nylocal = 23
  nzglobal = 30
  npx = 4
  allocate (my_ps(0:nxlocal-1, 0:nylocal-1) [0:npx-1,0:*])
end program random_weather

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-09 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

Thomas Koenig  changed:

   What|Removed |Added

 Status|NEW |WAITING

--- Comment #6 from Thomas Koenig  ---
After

https://gcc.gnu.org/pipermail/gcc-cvs/2020-November/336787.html

the simplified test case seems to be fixed.

What I now get with the original test case is

Decomposition information on image   4 : there are   2 *   4 slabs; the slab on
this image has  45 *  21 grid cells.
Decomposition information on image   6 : there are   2 *   4 slabs; the slab on
this image has  45 *  23 grid cells.
Decomposition information on image   2 : there are   2 *   4 slabs; the slab on
this image has  45 *  23 grid cells.
Decomposition information on image   8 : there are   2 *   4 slabs; the slab on
this image has  45 *  21 grid cells.
Decomposition information on image   5 : there are   2 *   4 slabs; the slab on
this image has  45 *  23 grid cells.
Decomposition information on image   1 : there are   2 *   4 slabs; the slab on
this image has  45 *  23 grid cells.
Decomposition information on image   7 : there are   2 *   4 slabs; the slab on
this image has  45 *  23 grid cells.
Decomposition information on image   3 : there are   2 *   4 slabs; the slab on
this image has  45 *  23 grid cells.
Size mismatch for coarray allocation id 0x608460: found = 30240 != size = 33120

where the error message seems to be correct; I think all coarrays
on different images must have the same size.

Toon, could you maybe comment?

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-10 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

Thomas Koenig  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Thomas Koenig  ---
So, this one is fixed then.

Thanks a lot for the bug report - handling coarrays as components
was actually broken in more than one way.

If you have anything else, don't hesitate to throw it at the branch :-)

[Bug target/97770] [ICELAKE]Missing vectorization for vpopcnt

2020-11-10 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770

Thomas Koenig  changed:

   What|Removed |Added

 CC||tkoenig at gcc dot gnu.org

--- Comment #7 from Thomas Koenig  ---
Some literature:

https://arxiv.org/pdf/1611.07612

[Bug tree-optimization/21046] move memory allocation out of a loop

2020-11-11 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21046

Thomas Koenig  changed:

   What|Removed |Added

   Last reconfirmed|2014-12-25 00:00:00 |2020-11-11

--- Comment #6 from Thomas Koenig  ---
Just thought to see if this has been fixed in the meantime;
it's not optimized with current trunk.

[Bug tree-optimization/30398] memmove for string operations

2020-11-11 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30398

Thomas Koenig  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Thomas Koenig  ---
This is what we generate now for

program main
  character(len=1) :: s
  character(len=2) :: c
  s = 'a'
  c = repeat(s,2)
  call foo(c)
end program main

:

;; Function main (main, funcdef_no=1, decl_uid=3926, cgraph_uid=2,
symbol_order=1) (executed once)

__attribute__((externally_visible))
main (integer(kind=4) argc, character(kind=1) * * argv)
{
  character(kind=1) c[1:2];
  static integer(kind=4) options.3[7] = {2116, 4095, 0, 1, 1, 0, 31};

   [local count: 1073741825]:
  _gfortran_set_args (argc_2(D), argv_3(D));
  _gfortran_set_options (7, &options.3[0]);
  MEM  [(c_char * {ref-all})&c] = 24929;
  foo (&c, 2);
  c ={v} {CLOBBER};
  return 0;

}

So, everything that should be optimized is now optimized.

Fixed.
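
The constant 24929 in the dump is simply the two characters 'a' 'a'
stored as one 16-bit value: 97 * 256 + 97 = 24929. A minimal sketch of
that arithmetic (the helper name two_copies is made up for illustration,
and ASCII is assumed):

```c
#include <assert.h>

/* Value of a 16-bit store holding the same character in both bytes.
   Since both bytes are equal, byte order does not matter.  */
unsigned int
two_copies (unsigned char ch)
{
  return (unsigned int) ch * 256u + (unsigned int) ch;
}
```

So assert (two_copies ('a') == 24929); holds, i.e. repeat(s,2) has been
folded into a single two-byte store.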

[Bug tree-optimization/38592] Optimize memmove / memcmp combination

2020-11-11 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38592

Thomas Koenig  changed:

   What|Removed |Added

   Last reconfirmed|2017-08-15 00:00:00 |2020-11-11

--- Comment #10 from Thomas Koenig  ---
For the C test case, we now get

yes:
.LFB0:
        .cfi_startproc
        movl    $25977, %eax
        movb    $115, -1(%rsp)
        movw    %ax, -3(%rsp)
        movzbl  -2(%rsp), %eax
        subl    $101, %eax
        jne     .L1
        movzbl  -1(%rsp), %eax
        subl    $115, %eax
.L1:
        ret

So, optimized further, but not folded.

clang 7 folds this completely:

yes:                                    # @yes
        .cfi_startproc
# %bb.0:
        xorl    %eax, %eax
        retq
.Lfunc_end0:
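
The C test case itself is not quoted in this comment, so the following
only decodes the constants in the gcc output. On little-endian x86-64,
the 16-bit store of 25977 at -3(%rsp) writes 'y' and 'e', and the byte
store of 115 at -1(%rsp) writes 's'; the bytes are then compared against
101 ('e') and 115 ('s'). The helper name le16 is made up, and ASCII is
assumed.

```c
#include <assert.h>

/* Value of a little-endian 16-bit store whose low-address byte is lo
   and whose high-address byte is hi.  */
unsigned int
le16 (unsigned char lo, unsigned char hi)
{
  return (unsigned int) lo | ((unsigned int) hi << 8);
}
```

Here assert (le16 ('y', 'e') == 25977); holds, so gcc stores known
bytes and then reloads them for the comparison instead of folding it.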

[Bug fortran/97799] Passing CHARACTER() var(*) through ENTRY causes segfaults

2020-11-11 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97799

Thomas Koenig  changed:

   What|Removed |Added

 CC||tkoenig at gcc dot gnu.org

--- Comment #8 from Thomas Koenig  ---
Comment on attachment 49548
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49548
bugtest.f -- program evincing bug

So, commit the test case to guard against regressions
(since it is not immediately obvious if this is already
covered).

I'll do so in a short while.

[Bug fortran/97799] Passing CHARACTER() var(*) through ENTRY causes segfaults

2020-11-11 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97799

--- Comment #9 from Thomas Koenig  ---
(In reply to Thomas Koenig from comment #8)
> Comment on attachment 49548 [details]
> bugtest.f -- program evincing bug
> 
> So, commit the test case to guard against regressions
> (since it is not immediately obvious if this is already
> covered).
> 
> I'll do so in a short while.

Or as soon as bootstrap works again.

[Bug fortran/97799] Passing CHARACTER() var(*) through ENTRY causes segfaults

2020-11-12 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97799

Thomas Koenig  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|WAITING |RESOLVED

--- Comment #10 from Thomas Koenig  ---
Test case committed to master as
https://gcc.gnu.org/g:3c3beb1a8137460bc485f9fbe3be8b21ee7f91a2 and
to gcc 10 as https://gcc.gnu.org/g:910250c360291074d0908feb111403e6bb3b32ee .

Thanks for the report!

[Bug fortran/97799] [10/11 Regression] Passing CHARACTER() var(*) through ENTRY causes segfaults

2020-11-13 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97799

Thomas Koenig  changed:

   What|Removed |Added

   Target Milestone|--- |10.3
Summary|Passing CHARACTER*(*)   |[10/11 Regression] Passing
   |var(*) through ENTRY causes |CHARACTER*(*) var(*)
   |segfaults   |through ENTRY causes
   ||segfaults
 Status|VERIFIED|RESOLVED

--- Comment #12 from Thomas Koenig  ---
(In reply to George Hockney from comment #11)
> We've verified a large-scale legacy build against 
> 
> GNU Fortran (gcc8.2) 11.0.0 2020 (experimental)
> 
> and
> 
> GNU Fortran (GCC) 10.2.1 20201017
> 
> All our regressions pass these compilers.

Thanks for letting us know.

> Therefore, I'm changing the status to verified (this is per our bugzilla
> workflow; if it's not your workflow please fix)

It's not usually done, so I'll just change this back (there are
a few search masks which don't have VERIFIED in).

> Unfortunately, 10.2.0 was released with this bug.

Yep.

I have looked over the changes to the gfortran front end in gcc 10
since the 10.2 release and haven't found anything obvious that could
have fixed this. I don't think we need to do a bisection; having the
test case should be enough.

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-15 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

Thomas Koenig  changed:

   What|Removed |Added

 Status|REOPENED|WAITING

--- Comment #10 from Thomas Koenig  ---
(In reply to Toon Moene from comment #9)
> Unfortunately, I now get the following error on the original code in the
> attachment:
> 
> (export
> LD_LIBRARY_PATH=/home/toon/compilers/install/coarray_native/lib/gcc/x86_64-
> pc-linux-gnu/11.0.0; export GFORTRAN_NUM_IMAGES=1; echo ' &config / ' |
> ./a.out)
> Decomposition information on image   1 : there are   1 *   1 slabs; the slab
> on this image has  90 *  90 grid cells.
> Fortran runtime error: Integer overflow when calculating the amount of
> memory to allocate

That should be fixed with

https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=649754c5b4a888c2c69c1a9cbeb1c356899934c1

(which just removed the overflow checks).

Anything else? :-)

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

Thomas Koenig  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #12 from Thomas Koenig  ---
Reduced test case:

program main
  type global_model_state
 real, allocatable :: ps(:)  [:]
  end type global_model_state
  type (global_model_state) :: ms_full
  allocate (ms_full % ps(100) [*])
  ms_full %ps = 42.
end program main

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #13 from Thomas Koenig  ---
(In reply to Thomas Koenig from comment #12)
> Reduced test case:
> 
> program main
>   type global_model_state
>  real, allocatable :: ps(:)  [:]
>   end type global_model_state
>   type (global_model_state) :: ms_full
>   allocate (ms_full % ps(100) [*])
>   ms_full %ps = 42.
> end program main

That one is now fixed, but the original test case still segfaults.

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #14 from Thomas Koenig  ---
(In reply to Thomas Koenig from comment #13)
> (In reply to Thomas Koenig from comment #12)
> > Reduced test case:
> > 
> > program main
> >   type global_model_state
> >  real, allocatable :: ps(:)  [:]
> >   end type global_model_state
> >   type (global_model_state) :: ms_full
> >   allocate (ms_full % ps(100) [*])
> >   ms_full %ps = 42.
> > end program main
> 
> That one is now fixed, but the original test case still segfaults.

... with 16 images:

Decomposition information on image   1 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image  13 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image  15 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image  16 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image   6 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image  14 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image   8 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image   2 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image   3 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image   4 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image  11 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image   7 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image  10 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image   5 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image   9 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.
Decomposition information on image  12 : there are   8 *   2 slabs; the slabs
are   9 *  35 grid cells in size.

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7fb21745659f in ???
at
/usr/src/debug/glibc-2.26-lp151.19.19.1.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
#1  0x41195b in __types_MOD_global_init
at /home/ig25/Krempel/Nico/random_weather.f90:154
#2  0x4148e7 in random_weather
at /home/ig25/Krempel/Nico/random_weather.f90:494
#3  0x41576d in image_main_wrapper
at ../../../coarray_native/libgfortran/caf_shared/coarraynative.c:183
#4  0x4153d2 in main
at /home/ig25/Krempel/Nico/random_weather.f90:413
ERROR: Image 16(0x5a1d) failed

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #16 from Thomas Koenig  ---
program random_weather
  implicit none
  type global_model_state
 real, allocatable :: ps(:,:)  [:,:]
  end type global_model_state
  integer :: nxslab, nyslab

  type(global_model_state) :: ms_full
  integer :: i, time, np1, np2, npx, npy, npxy
  real, parameter :: PS = 10.0, T = 300.0, U = 0.0, V = 0.0, W = 0.0, Q = 0.002 ! Mean value

  npxy = num_images()
  nxslab = 72
  nyslab = 70
  npx = 1
  allocate( ms_full % ps(0:nxslab-1, 0:nyslab-1)[0:npx-1, 0:*] )

  ms_full % ps = PS
end program random_weather

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-16 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #15 from Thomas Koenig  ---
Next reduced test-case:

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-17 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #17 from Thomas Koenig  ---
A bit more reduced (no derived types necessary):

program random_weather
  implicit none
  real, allocatable :: ps(:,:)  [:,:]
  integer :: nxslab, nyslab

  integer :: npx
  integer :: i, j
  real, parameter :: PS1 = 10.0

  nxslab = 72
  nyslab = 70
  npx = 1
  allocate( ps(nxslab, nyslab)[npx, *] )
  ps(1,1) = PS1
end program random_weather

So, it appears that the offset for multi-co-dimensional
allocated coarrays is miscalculated.
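
For reference, the mapping that any such offset computation has to respect can be sketched in C. This only illustrates the Fortran rule that cosubscripts map to image indices in column-major order; the function name image_index_from_cosubscripts and its array arguments are made up for this sketch and are not taken from libgfortran.

```c
#include <assert.h>

/* Hypothetical helper, not libgfortran code: map cosubscripts to a
   1-based image index in column-major order, as Fortran does.
   lcobound[k] is the lower cobound and extent[k] the coextent of
   codimension k; the extent of the last codimension is never used.  */
int
image_index_from_cosubscripts (int corank, const int *sub,
                               const int *lcobound, const int *extent)
{
  int index = 0;
  int stride = 1;

  assert (corank >= 1);

  for (int k = 0; k < corank; k++)
    {
      index += (sub[k] - lcobound[k]) * stride;
      if (k < corank - 1)
        stride *= extent[k];
    }
  return index + 1;
}
```

For ps(nxslab, nyslab)[npx, *] with npx = 1, lower cobounds { 1, 1 } and first coextent 1, the cosubscripts [1, j] map to image j under this rule.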

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-22 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

Thomas Koenig  changed:

   What|Removed |Added

 Status|NEW |WAITING

--- Comment #18 from Thomas Koenig  ---
Hi Toon,

with https://gcc.gnu.org/pipermail/gcc-cvs/2020-November/337586.html ,
your program seems to work (at least the values look reasonable):

Decomposition information on image   2 : there are   3 *   2 slabs; the slabs
are  24 *  35 grid cells in size.
Decomposition information on image   5 : there are   3 *   2 slabs; the slabs
are  24 *  35 grid cells in size.
Decomposition information on image   3 : there are   3 *   2 slabs; the slabs
are  24 *  35 grid cells in size.
Decomposition information on image   1 : there are   3 *   2 slabs; the slabs
are  24 *  35 grid cells in size.
Decomposition information on image   4 : there are   3 *   2 slabs; the slabs
are  24 *  35 grid cells in size.
Decomposition information on image   6 : there are   3 *   2 slabs; the slabs
are  24 *  35 grid cells in size.
 Time   0  Image   5  PS=   99978.4531  T=300.364166   
  U=19.3067131  V=15.9685030  W=   0.138491884  Q=   
2.17480748E-03
 Time   0  Image   1  PS=   99985.0938  T=300.027161   
  U=   -9.06420994  V=5.92245483  W=   0.137841657  Q=   
2.10389541E-03
 Time   0  Image   3  PS=   9.3828  T=300.014618   
  U=   -4.48150349  V=   -1.37469864  W=   -8.73371959E-02  Q=   
1.81287562E-03
 Time   0  Image   2  PS=   99986.4141  T=300.200836   
  U=   -3.47342205  V=16.5930214  W=   0.205771178  Q=   
1.97321200E-03
 Time   0  Image   6  PS=   99980.4141  T=300.424133   
  U=12.8092175  V=11.5236654  W=6.01452552E-02  Q=   
1.87643641E-03
 Time   0  Image   4  PS=   100010.516  T=300.005005   
  U=11.4250631  V=3.44926071  W=  -0.227272436  Q=   
2.07653991E-03
 Time 240  Image   6  PS=   0.5781  T=300.666931   
  U=22.8395500  V=   -11.9721365  W=3.66642363E-02  Q=   
1.70292379E-03
 Time 240  Image   2  PS=   99980.1484  T=300.538757   
  U=19.1216316  V=34.7150421  W=3.16514075E-03  Q=   
2.09417334E-03
 Time 240  Image   1  PS=   99969.6641  T=300.400970   
  U=3.65581894  V=16.8670387  W=2.10290849E-02  Q=   
2.06003617E-03
 Time 240  Image   3  PS=   5.2734  T=300.354370   
  U=4.84142876  V=4.59838200  W=1.12933442E-02  Q=   
1.67453510E-03
 Time 240  Image   5  PS=   99959.9141  T=300.308228   
  U=35.2094879  V=26.3194275  W=6.13999888E-02  Q=   
2.24495190E-03
 Time 240  Image   4  PS=   100024.211  T=300.642700   
  U=   -21.4838848  V=   -5.71874714  W=   0.123860441  Q=   
1.77718676E-03
 Time 480  Image   1  PS=   99988.9688  T=300.262726   
  U=   -1.2006  V=13.3446560  W=   -1.83758438E-02  Q=   
1.98666588E-03
 Time 480  Image   5  PS=   100030.500  T=300.034546   
  U=8.11599827  V=49.5809326  W=   -1.16332620E-02  Q=   
2.18066899E-03
 Time 480  Image   3  PS=   99974.3828  T=300.171265   
  U=   -12.1284695  V=13.2599001  W=  -0.132261544  Q=   
1.64680334E-03
 Time 480  Image   6  PS=   99983.6328  T=299.253204   
  U=   -16.0964108  V=   -7.74500656  W=  -0.392248750  Q=   
1.88040221E-03
 Time 480  Image   2  PS=   99969.3672  T=299.095215   
  U=   -5.18578625  V=36.8412170  W=  -0.231359661  Q=   
1.97951938E-03
 Time 480  Image   4  PS=   100016.453  T=300.540619   
  U=   -1.72649384  V=38.7740860  W=  -0.185899958  Q=   
2.31738412E-03

Thanks again for the test case, it certainly showed up a lot of bugs :-)

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-22 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

Thomas Koenig  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #21 from Thomas Koenig  ---
Hi Toon,

yes, I can replicate this.

[Bug fortran/98016] Host association problem

2020-11-26 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98016

Thomas Koenig  changed:

   What|Removed |Added

   Last reconfirmed||2020-11-26
 Ever confirmed|0   |1
 Status|UNCONFIRMED |WAITING
 CC||tkoenig at gcc dot gnu.org

--- Comment #1 from Thomas Koenig  ---
Seems to be fixed on current trunk:

$ cat bug.f90 
program p
  real :: y(3)
  n=3
  y = func(0.)
  stop
contains
  function func(x) result (y)
    real y(n)
    y=x
  end function func
end program p
$ gfortran bug.f90 
$ gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/home/ig25/lib/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../trunk/configure --prefix=/home/ig25 --enable-languages=c,c++,fortran
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20201112 (experimental) [master revision d33bc98f5bc:79fa060941e:87b7d45e358e4df93b6a93b2e7a55b123ea76f5d] (GCC)

Can you confirm that?  If so, we can commit a test case and close.

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-26 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #22 from Thomas Koenig  ---
Hi Toon,

it took some time, but we finally figured out that it is actually
a bug in your program that is causing problems.

It has (shortened)

nxglobal = 72;

This sets the coarray nxglobal to 72 on every image, including image 2.

if (this_image() == 1) then
   read(*,config)

This optionally reads in nxglobal[1].

! Why won't this work as nxglobal[:] = nx ?
   do i = 2, num_images()
  nxglobal[i] = nxglobal;

This sets nxglobal[2] on image 2 from image 1. This is a race
condition: It is not clear which store comes first.

To fix this, you would need a "sync all" before the
if (this_image() == 1) statement, or you could set the default
value on image 1 only.

(Incidentally, looking at this code led to finding a bug in
namelist handling, where the implied this_image() was not honored
(namelist I/O only worked on image 1), so the time looking at this
bug was not wasted :-)

By the way, there is also a segfault with GFORTRAN_NUM_IMAGES=64, which
will need to be investigated.

[Bug fortran/97589] Segementation fault when allocating coarrays.

2020-11-28 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #26 from Thomas Koenig  ---
After 00c2e5d1c15c67fc2c9d9ed86bfa1f5aa13848cc ,
the segfault for too many images is now also fixed,
and your program runs as expected.

I'd say an important milestone has been reached :-)

[Bug testsuite/26183] setting environment variables in test cases

2020-11-29 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26183

Thomas Koenig  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #2 from Thomas Koenig  ---
This is now possible using dg-set-target-env-var .

[Bug fortran/98053] New: Add Fortran tests for behavior from environment variables

2020-11-29 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98053

Bug ID: 98053
   Summary: Add Fortran tests for behavior from environment
variables
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

There is dg-set-target-env-var , which we could use to check that
the runtime behavior which depends on environment variables is
indeed OK.

[Bug libfortran/98076] New: Increase speed of integer I/O

2020-11-30 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98076

Bug ID: 98076
   Summary: Increase speed of integer I/O
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libfortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

Currently, we use GFC_INTEGER_LARGEST for a lot of integer I/O.

One place where things could be improved is gfc_itoa:

const char *
gfc_itoa (GFC_INTEGER_LARGEST n, char *buffer, size_t len)
{
...
  GFC_UINTEGER_LARGEST t;
...
  t = n;
  if (n < 0)
    {
      negative = 1;
      t = -n; /*must use unsigned to protect from overflow*/
    }
...

  while (t != 0)
    {
      *--p = '0' + (t % 10);
      t /= 10;
    }

Currently, the quotient / remainder calculation is expanded into
a libcall.  This could be sped up by using the improved remainder
calculation from PR97459 together with computing the quotient by
multiplicative inverse, and by switching to a smaller datatype once
the value fits into one.  Both can and should be combined, of course.
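
The second idea, switching to a smaller datatype once the value fits, could look roughly like the following C sketch. It is not the libgfortran gfc_itoa: the name itoa_narrowing is hypothetical, and int64_t / uint32_t merely stand in for GFC_INTEGER_LARGEST and a narrower type, so the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the libgfortran gfc_itoa: convert N to
   decimal, writing digits backwards from the end of BUFFER (LEN bytes),
   and return a pointer to the first character.  int64_t stands in for
   GFC_INTEGER_LARGEST here.  */
const char *
itoa_narrowing (int64_t n, char *buffer, size_t len)
{
  /* 19 digits, a sign and the terminating NUL must fit.  */
  assert (len >= 21);

  char *p = buffer + len - 1;
  *p = '\0';

  if (n == 0)
    {
      *--p = '0';
      return p;
    }

  int negative = n < 0;
  /* Negate in unsigned arithmetic so the most negative value is safe.  */
  uint64_t t = negative ? -(uint64_t) n : (uint64_t) n;

  /* Wide loop: only runs while the value does not fit in 32 bits.  */
  while (t > UINT32_MAX)
    {
      *--p = (char) ('0' + t % 10);
      t /= 10;
    }

  /* Narrow loop: division on the smaller type for the remaining digits.  */
  uint32_t s = (uint32_t) t;
  while (s != 0)
    {
      *--p = (char) ('0' + s % 10);
      s /= 10;
    }

  if (negative)
    *--p = '-';
  return p;
}
```

With a 128-bit GFC_UINTEGER_LARGEST the wide loop would be the part that ends up in the libcall, so only the leading digits of very large values would pay for it; how much that gains in practice would have to be measured.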

[Bug libfortran/98076] Increase speed of integer I/O

2020-11-30 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98076

Thomas Koenig  changed:

   What|Removed |Added

Version|unknown |11.0
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |tkoenig at gcc dot 
gnu.org
   Severity|normal  |enhancement
   Target Milestone|--- |11.0
   Last reconfirmed||2020-12-01
 Status|UNCONFIRMED |ASSIGNED
   Keywords||missed-optimization

[Bug libfortran/95293] Fortran not passing array by reference

2020-12-02 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95293

--- Comment #11 from Thomas Koenig  ---
(In reply to Dominique d'Humieres from comment #10)
> Could this PR be closed as INVALID?

Yes, I think so.

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-12-03 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #26 from Thomas Koenig  ---
Yep, it's implemented and works great.

For a simple "sum of digits" program in base ten, it's an acceleration
by more than a factor of two.

Thanks!

[Bug libfortran/98129] New: Failure on reading big chunk of /dev/urandom

2020-12-03 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

Bug ID: 98129
   Summary: Failure on reading big chunk of /dev/urandom
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libfortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

The following program

program main
  implicit none
  integer, parameter :: n = 10**7
  integer :: u,v
  integer, dimension(:), allocatable :: a
  open (newunit=u,file="/dev/urandom",form="unformatted",access="stream")
  open (newunit=v,file="/dev/null",form="unformatted",access="stream")
  allocate (a(n))
  read (u) a
  write (v) a
end program main

fails on Linux with

At line 9 of file read.f90
Fortran runtime error: End of file

Error termination. Backtrace:
#0  0x7fa9c7be372f in read_block_direct
at ../../../trunk/libgfortran/io/transfer.c:664
#1  0x7fa9c7be372f in unformatted_read
at ../../../trunk/libgfortran/io/transfer.c:1127
#2  0x400bf3 in ???
#3  0x400c9c in ???
#4  0x7fa9c6e64349 in __libc_start_main
at ../csu/libc-start.c:308
#5  0x400909 in ???
at ../sysdeps/x86_64/start.S:120
#6  0x in ???

[Bug libfortran/98129] Failure on reading big chunk of /dev/urandom

2020-12-04 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

--- Comment #2 from Thomas Koenig  ---
The problem seems to be related to an early return from the read system call:

strace -e trace=open,read,close ./a.out
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\v\2\0\0\0\0\0"..., 832) = 832
close(3)= 0
close(3)= 0
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\217\0\0\0\0\0\0"..., 832) = 832
close(3)= 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260/\0\0\0\0\0\0"..., 832) = 832
close(3)= 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@*\0\0\0\0\0\0"..., 832) = 832
close(3)= 0
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`D\2\0\0\0\0\0"..., 832) = 832
close(3)= 0
read(3, ":\312\275\302\373|\204[c`\20\275\230\205\326@\255Mj\263-qd\336\30\300\2G\326\215\333J"..., 4000) = 33554431
At line 9 of file read.f90
Fortran runtime error: End of file

Error termination. Backtrace:
close(5)= 0
close(5)= 0
close(6)= 0
close(5)= 0
close(5)= 0
close(5)= 0
close(6)= 0
close(5)= 0
close(6)= 0
close(5)= 0
#0  0x7fd9847c572f in read_block_direct
at ../../../trunk/libgfortran/io/transfer.c:664
#1  0x7fd9847c572f in unformatted_read
at ../../../trunk/libgfortran/io/transfer.c:1127
#2  0x400bf3 in ???
#3  0x400c9c in ???
#4  0x7fd983a46349 in __libc_start_main
at ../csu/libc-start.c:308
#5  0x400909 in ???
at ../sysdeps/x86_64/start.S:120
#6  0x in ???
close(4)= 0
close(3)= 0
+++ exited with 2 +++

[Bug libfortran/98129] Failure on reading big chunk of /dev/urandom

2020-12-04 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

--- Comment #3 from Thomas Koenig  ---
The problem seems to be that we assume that a short read is always
an EOF, in read_block_direct:

  if (unlikely ((ssize_t) nbytes != have_read_record))
    {
      /* Short read,  e.g. if we hit EOF.  For stream files,
         we have to set the end-of-file condition.  */
      hit_eof (dtp);
    }
  return;
}

[Bug libfortran/98129] Failure on reading big chunk of /dev/urandom

2020-12-04 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

Thomas Koenig  changed:

   What|Removed |Added

 Target||x86_64-pc-linux-gnu

--- Comment #5 from Thomas Koenig  ---
Question... on your respective systems, could you strace or truss it and find
out whether there is a short read?

On Linux, there seems to be a limitation of how many bytes
a read from /dev/urandom returns, and we assume that this is
an end of file.

However, this is not correct - we can only safely assume eof if
read() returns zero bytes.

[Bug libfortran/98129] Failure on reading big chunk of /dev/urandom

2020-12-04 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

--- Comment #10 from Thomas Koenig  ---
(In reply to anlauf from comment #9)
> The patch seems to regtest ok, but certainly needs some wider testing.

Actually, I think the bug is in io/unix.c:raw_read. That should take
care of repeating the reads as needed.

It seems that, if nbyte <= MAX_CHUNK, we do not take the possibility
of a short read into account.
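
A minimal C sketch of what "repeating the reads as needed" could mean is given below. It is not the libgfortran raw_read; the name read_fully is hypothetical, and the sketch assumes a POSIX read() where only a return value of zero signals end of file.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the libgfortran raw_read: read up to NBYTE
   bytes from FD into BUF, repeating after short reads.  Returns the
   number of bytes read, which is less than NBYTE only at end of file,
   or -1 on error with errno set by read().  */
ssize_t
read_fully (int fd, void *buf, size_t nbyte)
{
  char *p = buf;
  size_t left = nbyte;

  assert (buf != NULL || nbyte == 0);

  while (left > 0)
    {
      ssize_t got = read (fd, p, left);
      if (got == 0)
        break;                  /* Only a zero return means EOF.  */
      if (got < 0)
        {
          if (errno == EINTR)
            continue;           /* Interrupted, try again.  */
          return -1;
        }
      p += got;
      left -= (size_t) got;
    }
  return (ssize_t) (nbyte - left);
}
```

With a loop of this shape, a caller such as read_block_direct would see a short count only at a real end of file. Whether looping like this is appropriate for every kind of unit (terminals, for instance) is a separate question that the sketch does not address.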

[Bug testsuite/98156] New: [Coarray] alloc_comp_1.f90 tests for wrong condition

2020-12-05 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98156

Bug ID: 98156
   Summary: [Coarray] alloc_comp_1.f90 tests for wrong condition
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

alloc_comp_1.f90 has

! { dg-do run }
!
! Allocatable scalar corrays were mishandled (ICE)
!
type t
  integer, allocatable :: caf[:]
end type t
type(t) :: a
allocate (a%caf[3:*])
a%caf = 7
if (a%caf /= 7) STOP 1
print *,ucobound(a%caf,dim=1)
if (any (lcobound (a%caf) /= [ 3 ]) &
.or. ucobound (a%caf, dim=1) /= this_image ()+2)  &
  STOP 2
deallocate (a%caf)
end

The second test about ucobound is clearly bogus - it should be
num_images() instead of this_image().
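
The arithmetic behind this can be written down as a small C sketch; the helper name expected_ucobound is made up for illustration. For a corank-1 coarray allocated as [lcobound:*], the upper cobound depends on the number of images and not on the executing image.

```c
#include <assert.h>

/* Hypothetical helper: upper cobound of a corank-1 coarray allocated as
   [lcobound:*] when the program runs on NUM_IMAGES images.  */
int
expected_ucobound (int lcobound, int num_images)
{
  assert (num_images >= 1);
  return lcobound + num_images - 1;
}
```

For caf[3:*] this gives num_images() + 2, which equals this_image() + 2 only on the last image, so the existing check can only pass everywhere when the program runs on a single image.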
