Re: [PATCH, v3] Fortran: ordering of hidden procedure arguments [PR107441]

2022-11-08 Thread Mikael Morin

Hello,

Le 07/11/2022 à 22:45, Harald Anlauf via Fortran a écrit :

Dear all,

Am 04.11.22 um 10:53 schrieb Mikael Morin:

Le 03/11/2022 à 23:03, Harald Anlauf a écrit :

I've spent some time not only staring at create_function_arglist,
but trying several variations handling the declared hidden parms,
and applying the necessary adjustments to gfc_get_function_type.
(Managing linked trees is not the issue, just understanding them.)
I've been unable to get the declarations in sync, and would need
help how to debug the mess I've created.  Dropping my patch for
the time being.


If you want, we can meet on IRC somewhen (tonight?).


armed with the new knowledge, I could now understand what
(more or less) trivially went wrong with my previous patch.

The attached patch remedies that: gfc_get_function_type() now
properly separates the types of the hidden parameters so that
optional+value comes before string length and caf stuff,
while in create_function_arglist we simply need to split
the walking over the typelists so that the optional+value
stuff, which is basically just booleans, is done separately
from the other parts.

Looking at the tree-dumps, the function decls now seem to be
fine at least for the given testcases.  I've adjusted one of
the testcases to validate this.

Regtests fine on x86_64-pc-linux-gnu.  OK for mainline?


this is mostly good.
There is one last corner case that is not properly handled:


diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index 63515b9072a..94988b8690e 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc

(...)

@@ -2619,6 +2620,15 @@ create_function_arglist (gfc_symbol * sym)
 if (f->sym != NULL) /* Ignore alternate returns.  */
   hidden_typelist = TREE_CHAIN (hidden_typelist);
 
+  /* Advance hidden_typelist over optional+value argument presence flags.  */

+  optval_typelist = hidden_typelist;
+  for (f = gfc_sym_get_dummy_args (sym); f; f = f->next)
+if (f->sym != NULL
+   && f->sym->attr.optional && f->sym->attr.value
+   && !f->sym->attr.dimension && f->sym->ts.type != BT_CLASS
+   && !gfc_bt_struct (f->sym->ts.type))
+  hidden_typelist = TREE_CHAIN (hidden_typelist);
+


This new loop copies the condition guarding the handling of optional 
value presence arguments, except that the condition is in an "else if", 
and the complement of the condition in the corresponding "if" is 
missing, to have strictly the same conditions.


Admittedly, it only makes a difference for character optional value 
arguments, which are hardly working.  At least they work as long as one 
doesn't try to query their presence.  Below is a case regressing with 
your patch.


With that fixed, I think it's good for mainline.
Thanks for your patience.


! { dg-do compile }
!
! PR fortran/107441
! Check that procedure types and procedure decls match when the procedure
! has both character-typed and character-typed optional value args.
!
! Contributed by M.Morin

program p
  interface
subroutine i(c, o)
  character(*) :: c
  character(3), optional, value :: o
end subroutine i
  end interface
  procedure(i), pointer :: pp
  pp => s
  call pp("abcd", "xyz")
contains
  subroutine s(c, o)
character(*) :: c
character(3), optional, value :: o
if (o /= "xyz") stop 1
if (c /= "abcd") stop 2
  end subroutine s
end program p




New member

2022-11-08 Thread Federico Perini via Fortran
Dear all, 

 

First of all thank you for letting me in the GNU Fortran mailing list. 

 

I've been an avid gfortran user for the past 10+ years and have posted quite
a few 
issues on Bugzilla recently (one of them is the Finalization issue I see in
the recent posts: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107489)

 

I would like to begin devoting more of my free time to improving it and have
been introduced to the basics by Steve Kargl. 

I'm trying to start from this DTIO bug
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106731 ) 

but the learning curve looks pretty steep, so I need to ask for your help. 

 

So far I've found that: 

 

*   Derived types with DTIO are always SAVEd, so, when used as automatic
arrays, they cause a segfault. In trans-decl.cc it reads" 

 

  /* If derived-type variables with DTIO procedures are not made static

 some bits of code referencing them get optimized away.

 TODO Understand why this is so and fix it.  */

 

*   If I comment that chunk of code (i.e. do not force variables to be
static), all tests run fine, except for dtio_4.f90 and dtio_14.f90, only
when run with -O3.

 

In dtio_4 for example: 

 

*   If I set the failed variable to SAVE, the test no longer fails: 

 

 type (udt), save :: udt1 ! success!

 

*   The issue never arises with the extended class variable, even if not
saved.

 

Now, I need guidance to understand where should I look at in the gfortran
code to dig deeper into this problem, 

Any help from you will be greatly appreciated!

 

Thank you and I hope I can be meaningfully useful,

 

Federico Perini

 



[newlib] Generally make all 'long double complex' methods available in

2022-11-08 Thread Thomas Schwinge
..., not just '#if defined(__CYGWIN__)'.  (Exception: 'clog10l' which currently
indeed is for Cygwin only.)

This completes 2017-07-05 commit be3ca3947402827aa52709e677369bc7ad30aa1d
"Fixed warnings for some long double complex methods" after Aditya Upadhyay's
work on importing "Long double complex methods" from NetBSD.

For example, this changes GCC/nvptx libgfortran 'configure' output as follows:

[...]
checking for ccosf... yes
checking for ccos... yes
checking for ccosl... [-no-]{+yes+}
[...]

..., and correspondingly GCC/nvptx 'nvptx-none/libgfortran/config.h' as
follows:

[...]
 /* Define to 1 if you have the `ccosl' function. */
-/* #undef HAVE_CCOSL */
+#define HAVE_CCOSL 1
[...]

Similarly for 'ccoshl', 'cexpl', 'cpowl', 'csinl', 'csinhl', 'ctanl', 'ctanhl',
'cacoshl', 'cacosl', 'casinhl', 'catanhl'.  ('conjl', 'cprojl' are not
currently being used in libgfortran.)

This in turn simplifies GCC/nvptx 'libgfortran/intrinsics/c99_functions.c'
compilation such that this files doesn't have to provide its own
"Implementation of various C99 functions" for those, when in fact they're
available in newlib libm.
---

A few more words on why this is relevant for GCC.

For example, 'cexpl' usually is provided by libm, but if it isn't, the
open-coded replacement function in
'libgfortran/intrinsics/c99_functions.c' is effective if it holds that
'defined(HAVE_COSL) && defined(HAVE_SINL) && defined(HAVE_EXPL)':

long double complex
cexpl (long double complex z)
{
  long double a, b;
  long double complex v;

  a = REALPART (z);
  b = IMAGPART (z);
  COMPLEX_ASSIGN (v, cosl (b), sinl (b));
  return expl (a) * v;
}

This replacement code is active for current GCC/nvptx (... if no longer
compiling GCC/nvptx libgfortran in "minimal" mode, 'LIBGFOR_MINIMAL',
which I'm currently working on).

Comparing the preceeding to the 'c99_functions.c.188t.sincos' dump, we see for
that function:

 __attribute__((nothrow, leaf, const))
 complex long double cexpl (complex long double z)
 {
   long double b;
   long double a;
   long double _1;
   long double _2;
   long double _4;
   long double _5;
   long double _11;
+  complex long double sincostmp_13;

[local count: 1073741824]:
   a_7 = REALPART_EXPR ;
   b_8 = IMAGPART_EXPR ;
-  _1 = cosl (b_8);
-  _2 = sinl (b_8);
+  sincostmp_13 = __builtin_cexpil (b_8);
+  _1 = REALPART_EXPR ;
+  _2 = IMAGPART_EXPR ;
   _11 = expl (a_7);
   _4 = _1 * _11;
   _5 = _2 * _11;
   REALPART_EXPR <> = _4;
   IMAGPART_EXPR <> = _5;
   return ;

 }

That is, the 'cosl (b)', 'sinl (b)' sequence is replaced by
'__builtin_cexpil'.  That '__builtin_cexpil' is then later mapped back
into: 'cexpl'.  We've now got an infinitely-recursive 'cexpl' replacement
function, "implemented via itself"; GCC/nvptx libgfortran assumes there
is no 'cexpl' in libm, whereas this 'sincos' transformation does assume
that there is.  (..., which looks like an additional bug on its own.)

At the PTX-level, this leads to the following:

[...]
// BEGIN GLOBAL FUNCTION DECL: cexpl
.visible .func cexpl (.param.u64 %in_ar0, .param.f64 %in_ar1, .param.f64 
%in_ar2);

// BEGIN GLOBAL FUNCTION DEF: cexpl
.visible .func cexpl (.param.u64 %in_ar0, .param.f64 %in_ar1, .param.f64 
%in_ar2)
{
[...]
call cexpl, (%out_arg1, %out_arg2, %out_arg3);
[...]
ret;
}

[...]
// BEGIN GLOBAL FUNCTION DECL: cexpl
.extern .func cexpl (.param.u64 %in_ar0, .param.f64 %in_ar1, .param.f64 
%in_ar2);
[...]

We see the '.visible .func cexpl' declaration and definition for the
libgfortran replacement function and in the same compilation unit also
the '.extern .func cexpl' declaration that implicitly gets introduced via
the 'sincos' transformation (via the GCC/nvptx back end emitting an
explicit declaration of any function referenced), and 'ptxas' then
(rightfully so) complains about that mismatch:

ptxas c99_functions.o, line 35; error   : Inconsistent redefinition of 
variable 'cexpl'
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status
make[2]: *** [c99_functions.lo] Error 1

---
 newlib/libc/include/complex.h | 35 ---
 1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/newlib/libc/include/complex.h b/newlib/libc/include/complex.h
index 0a3ea97ed..ad3028e4c 100644
--- a/newlib/libc/include/complex.h
+++ b/newlib/libc/include/complex.h
@@ -20,6 +20,7 @@ __BEGIN_DECLS
 /* 7.3.5.1 The cacos functions */
 double complex cacos(double complex);
 float complex cacosf(float complex);
+long double complex cacosl(long double complex);

 /* 7.3.5.2 The casin functions */
 double complex casin(double complex);
@@ -34,44 +35,54 @@ long double complex catanl(long double complex);
 /* 7.3.5.1

Re: [newlib] Generally make all 'long double complex' methods available in

2022-11-08 Thread Jeff Johnston via Fortran
LGTM.

-- Jeff J.

On Tue, Nov 8, 2022 at 1:31 PM Thomas Schwinge 
wrote:

> ..., not just '#if defined(__CYGWIN__)'.  (Exception: 'clog10l' which
> currently
> indeed is for Cygwin only.)
>
> This completes 2017-07-05 commit be3ca3947402827aa52709e677369bc7ad30aa1d
> "Fixed warnings for some long double complex methods" after Aditya
> Upadhyay's
> work on importing "Long double complex methods" from NetBSD.
>
> For example, this changes GCC/nvptx libgfortran 'configure' output as
> follows:
>
> [...]
> checking for ccosf... yes
> checking for ccos... yes
> checking for ccosl... [-no-]{+yes+}
> [...]
>
> ..., and correspondingly GCC/nvptx 'nvptx-none/libgfortran/config.h' as
> follows:
>
> [...]
>  /* Define to 1 if you have the `ccosl' function. */
> -/* #undef HAVE_CCOSL */
> +#define HAVE_CCOSL 1
> [...]
>
> Similarly for 'ccoshl', 'cexpl', 'cpowl', 'csinl', 'csinhl', 'ctanl',
> 'ctanhl',
> 'cacoshl', 'cacosl', 'casinhl', 'catanhl'.  ('conjl', 'cprojl' are not
> currently being used in libgfortran.)
>
> This in turn simplifies GCC/nvptx 'libgfortran/intrinsics/c99_functions.c'
> compilation such that this files doesn't have to provide its own
> "Implementation of various C99 functions" for those, when in fact they're
> available in newlib libm.
> ---
>
> A few more words on why this is relevant for GCC.
>
> For example, 'cexpl' usually is provided by libm, but if it isn't, the
> open-coded replacement function in
> 'libgfortran/intrinsics/c99_functions.c' is effective if it holds that
> 'defined(HAVE_COSL) && defined(HAVE_SINL) && defined(HAVE_EXPL)':
>
> long double complex
> cexpl (long double complex z)
> {
>   long double a, b;
>   long double complex v;
>
>   a = REALPART (z);
>   b = IMAGPART (z);
>   COMPLEX_ASSIGN (v, cosl (b), sinl (b));
>   return expl (a) * v;
> }
>
> This replacement code is active for current GCC/nvptx (... if no longer
> compiling GCC/nvptx libgfortran in "minimal" mode, 'LIBGFOR_MINIMAL',
> which I'm currently working on).
>
> Comparing the preceeding to the 'c99_functions.c.188t.sincos' dump, we see
> for
> that function:
>
>  __attribute__((nothrow, leaf, const))
>  complex long double cexpl (complex long double z)
>  {
>long double b;
>long double a;
>long double _1;
>long double _2;
>long double _4;
>long double _5;
>long double _11;
> +  complex long double sincostmp_13;
>
> [local count: 1073741824]:
>a_7 = REALPART_EXPR ;
>b_8 = IMAGPART_EXPR ;
> -  _1 = cosl (b_8);
> -  _2 = sinl (b_8);
> +  sincostmp_13 = __builtin_cexpil (b_8);
> +  _1 = REALPART_EXPR ;
> +  _2 = IMAGPART_EXPR ;
>_11 = expl (a_7);
>_4 = _1 * _11;
>_5 = _2 * _11;
>REALPART_EXPR <> = _4;
>IMAGPART_EXPR <> = _5;
>return ;
>
>  }
>
> That is, the 'cosl (b)', 'sinl (b)' sequence is replaced by
> '__builtin_cexpil'.  That '__builtin_cexpil' is then later mapped back
> into: 'cexpl'.  We've now got an infinitely-recursive 'cexpl' replacement
> function, "implemented via itself"; GCC/nvptx libgfortran assumes there
> is no 'cexpl' in libm, whereas this 'sincos' transformation does assume
> that there is.  (..., which looks like an additional bug on its own.)
>
> At the PTX-level, this leads to the following:
>
> [...]
> // BEGIN GLOBAL FUNCTION DECL: cexpl
> .visible .func cexpl (.param.u64 %in_ar0, .param.f64 %in_ar1,
> .param.f64 %in_ar2);
>
> // BEGIN GLOBAL FUNCTION DEF: cexpl
> .visible .func cexpl (.param.u64 %in_ar0, .param.f64 %in_ar1,
> .param.f64 %in_ar2)
> {
> [...]
> call cexpl, (%out_arg1, %out_arg2, %out_arg3);
> [...]
> ret;
> }
>
> [...]
> // BEGIN GLOBAL FUNCTION DECL: cexpl
> .extern .func cexpl (.param.u64 %in_ar0, .param.f64 %in_ar1,
> .param.f64 %in_ar2);
> [...]
>
> We see the '.visible .func cexpl' declaration and definition for the
> libgfortran replacement function and in the same compilation unit also
> the '.extern .func cexpl' declaration that implicitly gets introduced via
> the 'sincos' transformation (via the GCC/nvptx back end emitting an
> explicit declaration of any function referenced), and 'ptxas' then
> (rightfully so) complains about that mismatch:
>
> ptxas c99_functions.o, line 35; error   : Inconsistent redefinition of
> variable 'cexpl'
> ptxas fatal   : Ptx assembly aborted due to errors
> nvptx-as: ptxas returned 255 exit status
> make[2]: *** [c99_functions.lo] Error 1
>
> ---
>  newlib/libc/include/complex.h | 35 ---
>  1 file changed, 16 insertions(+), 19 deletions(-)
>
> diff --git a/newlib/libc/include/complex.h b/newlib/libc/include/complex.h
> index 0a3ea97ed..ad3028e4c 100644
> --- a/newlib/libc/include/complex.h
> +++ b/newlib/libc/include/complex.h
> @@ -20,6 +2

Re: [PATCH, v3] Fortran: ordering of hidden procedure arguments [PR107441]

2022-11-08 Thread Harald Anlauf via Fortran

Hi Mikael,

Am 08.11.22 um 11:32 schrieb Mikael Morin:

this is mostly good.
There is one last corner case that is not properly handled:


diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index 63515b9072a..94988b8690e 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc

(...)

@@ -2619,6 +2620,15 @@ create_function_arglist (gfc_symbol * sym)
 if (f->sym != NULL)    /* Ignore alternate returns.  */
   hidden_typelist = TREE_CHAIN (hidden_typelist);

+  /* Advance hidden_typelist over optional+value argument presence
flags.  */
+  optval_typelist = hidden_typelist;
+  for (f = gfc_sym_get_dummy_args (sym); f; f = f->next)
+    if (f->sym != NULL
+    && f->sym->attr.optional && f->sym->attr.value
+    && !f->sym->attr.dimension && f->sym->ts.type != BT_CLASS
+    && !gfc_bt_struct (f->sym->ts.type))
+  hidden_typelist = TREE_CHAIN (hidden_typelist);
+


This new loop copies the condition guarding the handling of optional
value presence arguments, except that the condition is in an "else if",
and the complement of the condition in the corresponding "if" is
missing, to have strictly the same conditions.


I know, and I left that intentionally, as it is related to
PR107444, assuming that it doesn't lead to a new ICE.  Bad idea.


Admittedly, it only makes a difference for character optional value
arguments, which are hardly working.  At least they work as long as one
doesn't try to query their presence.  Below is a case regressing with
your patch.



With that fixed, I think it's good for mainline.
Thanks for your patience.


! { dg-do compile }
!
! PR fortran/107441
! Check that procedure types and procedure decls match when the procedure
! has both character-typed and character-typed optional value args.
!
! Contributed by M.Morin

program p
   interface
     subroutine i(c, o)
   character(*) :: c
   character(3), optional, value :: o
     end subroutine i
   end interface
   procedure(i), pointer :: pp
   pp => s
   call pp("abcd", "xyz")
contains
   subroutine s(c, o)
     character(*) :: c
     character(3), optional, value :: o
     if (o /= "xyz") stop 1
     if (c /= "abcd") stop 2
   end subroutine s
end program p


Well, that testcase may compile with 12-branch, but it gives
wrong code.  Furthermore, it is arguably invalid, as you are
currently unable to check the presence of the optional argument
due to PR107444.  I am therefore reluctant to have that testcase
now.

To fix that, we may have to bite the bullet and break the
documented ABI, or rather update it, as character,value,optional
is broken in all current gfortran versions, and the documentation
is not completely consistent.  I had planned to do this with the
fix for PR107444, which want to keep separate from the current
patch for good reasons.

I have modified my patch so that your testcase above compiles
and runs.  But as explained, I don't want to add it now.

Regtested again.  What do you think?

Thanks,
Harald

From 8694d1d2cbd19b5148b5d1d891b182cc3e718f40 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Fri, 28 Oct 2022 21:58:08 +0200
Subject: [PATCH] Fortran: ordering of hidden procedure arguments [PR107441]

The gfortran argument passing conventions specify a certain order for
procedure arguments that should be followed consequently: the hidden
presence status flags of optional+value scalar arguments of intrinsic type
shall come before the hidden character length, coarray token and offset.
Clarify that in the documentation.

gcc/fortran/ChangeLog:

	PR fortran/107441
	* gfortran.texi (Argument passing conventions): Clarify the gfortran
	argument passing conventions with regard to OPTIONAL dummy arguments
	of intrinsic type.
	* trans-decl.cc (create_function_arglist): Adjust the ordering of
	automatically generated hidden procedure arguments to match the
	documented ABI for gfortran.
	* trans-types.cc (gfc_get_function_type): Separate hidden parameters
	so that the presence flag for optional+value arguments come before
	string length, coarray token and offset, as required.

gcc/testsuite/ChangeLog:

	PR fortran/107441
	* gfortran.dg/coarray/pr107441-caf.f90: New test.
	* gfortran.dg/optional_absent_6.f90: New test.
	* gfortran.dg/optional_absent_7.f90: New test.
---
 gcc/fortran/gfortran.texi |  3 +-
 gcc/fortran/trans-decl.cc | 31 +++---
 gcc/fortran/trans-types.cc| 25 
 .../gfortran.dg/coarray/pr107441-caf.f90  | 27 +
 .../gfortran.dg/optional_absent_6.f90 | 60 +++
 .../gfortran.dg/optional_absent_7.f90 | 31 ++
 6 files changed, 157 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/coarray/pr107441-caf.f90
 create mode 100644 gcc/testsuite/gfortran.dg/optional_absent_6.f90
 create mode 100644 gcc/testsuite/gfortran.dg/optional_absent_7.f90

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 4b4ecd528a7..

Re: [PATCH, v3] Fortran: ordering of hidden procedure arguments [PR107441]

2022-11-08 Thread Mikael Morin

Le 08/11/2022 à 21:31, Harald Anlauf a écrit :

Hi Mikael,

Am 08.11.22 um 11:32 schrieb Mikael Morin:

this is mostly good.
There is one last corner case that is not properly handled:


diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index 63515b9072a..94988b8690e 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc

(...)

@@ -2619,6 +2620,15 @@ create_function_arglist (gfc_symbol * sym)
 if (f->sym != NULL)    /* Ignore alternate returns.  */
   hidden_typelist = TREE_CHAIN (hidden_typelist);

+  /* Advance hidden_typelist over optional+value argument presence
flags.  */
+  optval_typelist = hidden_typelist;
+  for (f = gfc_sym_get_dummy_args (sym); f; f = f->next)
+    if (f->sym != NULL
+    && f->sym->attr.optional && f->sym->attr.value
+    && !f->sym->attr.dimension && f->sym->ts.type != BT_CLASS
+    && !gfc_bt_struct (f->sym->ts.type))
+  hidden_typelist = TREE_CHAIN (hidden_typelist);
+


This new loop copies the condition guarding the handling of optional
value presence arguments, except that the condition is in an "else if",
and the complement of the condition in the corresponding "if" is
missing, to have strictly the same conditions.


I know, and I left that intentionally, as it is related to
PR107444, assuming that it doesn't lead to a new ICE.  Bad idea.


Admittedly, it only makes a difference for character optional value
arguments, which are hardly working.  At least they work as long as one
doesn't try to query their presence.  Below is a case regressing with
your patch.



With that fixed, I think it's good for mainline.
Thanks for your patience.


! { dg-do compile }
!
! PR fortran/107441
! Check that procedure types and procedure decls match when the procedure
! has both character-typed and character-typed optional value args.
!
! Contributed by M.Morin

program p
   interface
 subroutine i(c, o)
   character(*) :: c
   character(3), optional, value :: o
 end subroutine i
   end interface
   procedure(i), pointer :: pp
   pp => s
   call pp("abcd", "xyz")
contains
   subroutine s(c, o)
 character(*) :: c
 character(3), optional, value :: o
 if (o /= "xyz") stop 1
 if (c /= "abcd") stop 2
   end subroutine s
end program p


Well, that testcase may compile with 12-branch, but it gives
wrong code.  Furthermore, it is arguably invalid, as you are
currently unable to check the presence of the optional argument
due to PR107444.  I am therefore reluctant to have that testcase
now.

To fix that, we may have to bite the bullet and break the
documented ABI, or rather update it, as character,value,optional
is broken in all current gfortran versions, and the documentation
is not completely consistent.  I had planned to do this with the
fix for PR107444, which want to keep separate from the current
patch for good reasons.

I have modified my patch so that your testcase above compiles
and runs.  But as explained, I don't want to add it now.

Regtested again.  What do you think?

Let's proceed with the v3 then.  Character optional value arguments are 
corner cases anyway.


[PATCH v3 0/3] RFC: P1689R5 support

2022-11-08 Thread Ben Boeckel via Fortran
Hi,

This patch adds initial support for ISO C++'s [P1689R5][], a format for
describing C++ module requirements and provisions based on the source
code. This is required because compiling C++ with modules is not
embarrassingly parallel and need to be ordered to ensure that `import
some_module;` can be satisfied in time by making sure that the TU with
`export import some_module;` is compiled first.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html

I'd like feedback on the approach taken here with respect to the
user-visible flags. I'll also note that header units are not supported
at this time because the current `-E` behavior with respect to `import
;` is to search for an appropriate `.gcm` file which is not
something such a "scan" can support. A new mode will likely need to be
created (e.g., replacing `-E` with `-fc++-module-scanning` or something)
where headers are looked up "normally" and processed only as much as
scanning requires.

For the record, Clang has patches with similar flags and behavior by
Chuanqi Xu here:

https://reviews.llvm.org/D134269

with the same flags.

Thanks,

--Ben

---
v2 -> v3:

- changelog entries moved to commit messages
- documentation updated/added in the UTF-8 routine editing

v1 -> v2:

- removal of the `deps_write(extra)` parameter to option-checking where
  ndeeded
- default parameter of `cpp_finish(fdeps_stream = NULL)`
- unification of libcpp UTF-8 validity functions from v1
- test cases for flag parsing states (depflags-*) and p1689 output
  (p1689-*)

Ben Boeckel (3):
  libcpp: reject codepoints above 0x10
  libcpp: add a function to determine UTF-8 validity of a C string
  p1689r5: initial support

 gcc/c-family/c-opts.cc|  40 +++-
 gcc/c-family/c.opt|  12 +
 gcc/cp/module.cc  |   3 +-
 gcc/doc/invoke.texi   |  15 ++
 gcc/testsuite/g++.dg/modules/depflags-f-MD.C  |   2 +
 gcc/testsuite/g++.dg/modules/depflags-f.C |   1 +
 gcc/testsuite/g++.dg/modules/depflags-fi.C|   3 +
 gcc/testsuite/g++.dg/modules/depflags-fj-MD.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-fj.C|   4 +
 .../g++.dg/modules/depflags-fjo-MD.C  |   4 +
 gcc/testsuite/g++.dg/modules/depflags-fjo.C   |   5 +
 gcc/testsuite/g++.dg/modules/depflags-fo-MD.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-fo.C|   4 +
 gcc/testsuite/g++.dg/modules/depflags-j-MD.C  |   2 +
 gcc/testsuite/g++.dg/modules/depflags-j.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-jo-MD.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-jo.C|   4 +
 gcc/testsuite/g++.dg/modules/depflags-o-MD.C  |   2 +
 gcc/testsuite/g++.dg/modules/depflags-o.C |   3 +
 gcc/testsuite/g++.dg/modules/modules.exp  |   1 +
 gcc/testsuite/g++.dg/modules/p1689-1.C|  18 ++
 gcc/testsuite/g++.dg/modules/p1689-1.exp.json |  27 +++
 gcc/testsuite/g++.dg/modules/p1689-2.C|  16 ++
 gcc/testsuite/g++.dg/modules/p1689-2.exp.json |  16 ++
 gcc/testsuite/g++.dg/modules/p1689-3.C|  14 ++
 gcc/testsuite/g++.dg/modules/p1689-3.exp.json |  16 ++
 gcc/testsuite/g++.dg/modules/p1689-4.C|  14 ++
 gcc/testsuite/g++.dg/modules/p1689-4.exp.json |  14 ++
 gcc/testsuite/g++.dg/modules/p1689-5.C|  14 ++
 gcc/testsuite/g++.dg/modules/p1689-5.exp.json |  14 ++
 gcc/testsuite/g++.dg/modules/test-p1689.py| 222 ++
 gcc/testsuite/lib/modules.exp |  71 ++
 libcpp/charset.cc |  28 ++-
 libcpp/include/cpplib.h   |  12 +-
 libcpp/include/mkdeps.h   |  17 +-
 libcpp/init.cc|  13 +-
 libcpp/internal.h |   2 +
 libcpp/mkdeps.cc  | 149 +++-
 38 files changed, 773 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-f-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-f.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fi.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fj-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fj.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fjo-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fjo.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fo-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fo.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-j-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-j.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-jo-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-jo.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-o-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-o.C
 create mode 100644 gcc/testsuite/g++.dg/modules/p1689-1.C
 create mode 100644 gcc/testsuite/g++.dg/modules/p1689-1.exp.json
 create mode 100644 gcc/testsu

[PATCH v3 1/3] libcpp: reject codepoints above 0x10FFFF

2022-11-08 Thread Ben Boeckel via Fortran
Unicode does not support such values because they are unrepresentable in
UTF-16.

libcpp/

* charset.cc: Reject encodings of codepoints above 0x10.
UTF-16 does not support such codepoints and therefore all
Unicode rejects such values.

Signed-off-by: Ben Boeckel 
---
 libcpp/charset.cc | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index 12a398e7527..324b5b19136 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -158,6 +158,10 @@ struct _cpp_strbuf
encoded as any of DF 80, E0 9F 80, F0 80 9F 80, F8 80 80 9F 80, or
FC 80 80 80 9F 80.  Only the first is valid.
 
+   Additionally, Unicode declares that all codepoints above 0010 are
+   invalid because they cannot be represented in UTF-16. As such, all 5- and
+   6-byte encodings are invalid.
+
An implementation note: the transformation from UTF-16 to UTF-8, or
vice versa, is easiest done by using UTF-32 as an intermediary.  */
 
@@ -216,7 +220,7 @@ one_utf8_to_cppchar (const uchar **inbufp, size_t 
*inbytesleftp,
   if (c <= 0x3FF && nbytes > 5) return EILSEQ;
 
   /* Make sure the character is valid.  */
-  if (c > 0x7FFF || (c >= 0xD800 && c <= 0xDFFF)) return EILSEQ;
+  if (c > 0x10 || (c >= 0xD800 && c <= 0xDFFF)) return EILSEQ;
 
   *cp = c;
   *inbufp = inbuf;
@@ -320,7 +324,7 @@ one_utf32_to_utf8 (iconv_t bigend, const uchar **inbufp, 
size_t *inbytesleftp,
   s += inbuf[bigend ? 2 : 1] << 8;
   s += inbuf[bigend ? 3 : 0];
 
-  if (s >= 0x7FFF || (s >= 0xD800 && s <= 0xDFFF))
+  if (s > 0x10 || (s >= 0xD800 && s <= 0xDFFF))
 return EILSEQ;
 
   rval = one_cppchar_to_utf8 (s, outbufp, outbytesleftp);
-- 
2.38.1



[PATCH v3 2/3] libcpp: add a function to determine UTF-8 validity of a C string

2022-11-08 Thread Ben Boeckel via Fortran
This simplifies the interface for other UTF-8 validity detections when a
simple "yes" or "no" answer is sufficient.

libcpp/

* charset.cc: Add `_cpp_valid_utf8_str` which determines whether
a C string is valid UTF-8 or not.
* internal.h: Add prototype for `_cpp_valid_utf8_str`.

Signed-off-by: Ben Boeckel 
---
 libcpp/charset.cc | 20 
 libcpp/internal.h |  2 ++
 2 files changed, 22 insertions(+)

diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index 324b5b19136..e130bc01f48 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1868,6 +1868,26 @@ _cpp_valid_utf8 (cpp_reader *pfile,
   return true;
 }
 
+/*  Detect whether a C-string is a valid UTF-8-encoded set of bytes. Returns
+`false` if any contained byte sequence encodes an invalid Unicode codepoint
+or is not a valid UTF-8 sequence. Returns `true` otherwise. */
+
+extern bool
+_cpp_valid_utf8_str (const char *name)
+{
+  const uchar* in = (const uchar*)name;
+  size_t len = strlen(name);
+  cppchar_t cp;
+
+  while (*in)
+{
+  if (one_utf8_to_cppchar(&in, &len, &cp))
+   return false;
+}
+
+  return true;
+}
+
 /* Subroutine of convert_hex and convert_oct.  N is the representation
in the execution character set of a numeric escape; write it into the
string buffer TBUF and update the end-of-string pointer therein.  WIDE
diff --git a/libcpp/internal.h b/libcpp/internal.h
index badfd1b40da..4f2dd4a2f5c 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -834,6 +834,8 @@ extern bool _cpp_valid_utf8 (cpp_reader *pfile,
 struct normalize_state *nst,
 cppchar_t *cp);
 
+extern bool _cpp_valid_utf8_str (const char *str);
+
 extern void _cpp_destroy_iconv (cpp_reader *);
 extern unsigned char *_cpp_convert_input (cpp_reader *, const char *,
  unsigned char *, size_t, size_t,
-- 
2.38.1



[PATCH v3 3/3] p1689r5: initial support

2022-11-08 Thread Ben Boeckel via Fortran
This patch implements support for [P1689R5][] to communicate to a build
system the C++20 module dependencies to build systems so that they may
build `.gcm` files in the proper order.

Support is communicated through the following three new flags:

- `-fdeps-format=` specifies the format for the output. Currently named
  `p1689r5`.

- `-fdeps-file=` specifies the path to the file to write the format to.

- `-fdep-output=` specifies the `.o` that will be written for the TU
  that is scanned. This is required so that the build system can
  correlate the dependency output with the actual compilation that will
  occur.

CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
experimental features.

Future work may include using this format for Fortran module
dependencies as well, however this is still pending work.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
[cmake-experimental]: 
https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst

TODO:

- header-unit information fields

Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.

- non-utf8 paths

The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously represetable in UTF-8" case).

- figure out why junk gets placed at the end of the file

Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.

libcpp/

* include/cpplib.h: Add cpp_deps_format enum.
(cpp_options): Add format field
(cpp_finish): Add dependency stream parameter.
* include/mkdeps.h (deps_add_module_target): Add new preprocessor
parameter used for C++ module tracking.
* init.cc (cpp_finish): Add new preprocessor parameter used for C++
module tracking.
* mkdeps.cc (mkdeps): Implement P1689R5 output.

gcc/

* doc/invoke.texi: Document -fdeps-format=, -fdep-file=, and
-fdep-output= flags.

gcc/c-family/

* c-opts.cc (c_common_handle_option): Add fdeps_file variable and
-fdeps-format=, -fdep-file=, and -fdep-output= parsing.
* c.opt: Add -fdeps-format=, -fdep-file=, and -fdep-output= flags.

gcc/cp/

* module.cc (preprocessed_module): Pass whether the module is
exported to dependency tracking.

gcc/testsuite/

* g++.dg/modules/depflags-f-MD.C: New test.
* g++.dg/modules/depflags-f.C: New test.
* g++.dg/modules/depflags-fi.C: New test.
* g++.dg/modules/depflags-fj-MD.C: New test.
* g++.dg/modules/depflags-fj.C: New test.
* g++.dg/modules/depflags-fjo-MD.C: New test.
* g++.dg/modules/depflags-fjo.C: New test.
* g++.dg/modules/depflags-fo-MD.C: New test.
* g++.dg/modules/depflags-fo.C: New test.
* g++.dg/modules/depflags-j-MD.C: New test.
* g++.dg/modules/depflags-j.C: New test.
* g++.dg/modules/depflags-jo-MD.C: New test.
* g++.dg/modules/depflags-jo.C: New test.
* g++.dg/modules/depflags-o-MD.C: New test.
* g++.dg/modules/depflags-o.C: New test.
* g++.dg/modules/p1689-1.C: New test.
* g++.dg/modules/p1689-1.exp.json: New test expectation.
* g++.dg/modules/p1689-2.C: New test.
* g++.dg/modules/p1689-2.exp.json: New test expectation.
* g++.dg/modules/p1689-3.C: New test.
* g++.dg/modules/p1689-3.exp.json: New test expectation.
* g++.dg/modules/p1689-4.C: New test.
* g++.dg/modules/p1689-4.exp.json: New test expectation.
* g++.dg/modules/p1689-5.C: New test.
* g++.dg/modules/p1689-5.exp.json: New test expectation.
* g++.dg/modules/modules.exp: Load new P1689 library routines.
* g++.dg/modules/test-p1689.py: New tool for validating P1689 output.
* lib/modules.exp: Support for validating P1689 outputs.

Signed-off-by: Ben Boeckel 
---
 gcc/c-family/c-opts.cc|  40 +++-
 gcc/c-family/c.opt|  12 +
 gcc/cp/module.cc  |   3 +-
 gcc/doc/invoke.texi   |  15 ++
 gcc/testsuite/g++.dg/modules/depflags-f