Re: GCC GSoC 2022: Call for project ideas and mentors

2022-01-07 Thread David Malcolm via Fortran
On Thu, 2022-01-06 at 17:20 +0100, Martin Jambor wrote:
> Hello,
> 
> another year is upon us and Google has announced there will be again
> Google Summer of Code 2022 (though AFAIK there is no specific timeline
> yet).  I'd like to volunteer to be the main Org Admin for GCC again so
> let me know if you think I shouldn't or that someone else should, but
> otherwise I'll assume that I will.
> 
> There will be a few important changes to the GSoC this year.  The most
> important for us is that there will be two project sizes: medium-sized
> projects which are expected to take about 175 hours to complete and
> large projects expected to take approximately 350 hours (the size from
> 2020 and earlier).  I expect that most of our projects will be large but
> I think we can offer one or two medium-sized ideas too.
> 
> Google will also increase timing flexibility, so the projects can run
> for longer (up to 22 weeks) allowing mentors to go on vacation and
> students to pause and focus on exams.  Talking about students, Google is
> going to open the program to all adults, so from now on, the
> participants working on the projects will be called GSoC contributors.
> 
> Slightly more information about these changes can be found at
> https://opensource.googleblog.com/2021/11/expanding-google-summer-of-code-in-2022.html
> I am sure we will learn more when the actual timeline is announced too.
> 
>  The most important bit:
> 
> 
> Even before that happens, I would like to ask all (moderately) seasoned
> GCC contributors to consider mentoring a student this year and ideally
> also come up with a project that they would like to lead.  I'm
> collecting proposals on our wiki page
> https://gcc.gnu.org/wiki/SummerOfCode - feel free to add yours to the
> top list there.  Or, if you are unsure, post your offer and project idea
> as a reply here to the mailing list.

How did it get to be 2022 already?

Thanks for organizing this.

I'd like to (again) mentor a project relating to the GCC static
analyzer:
  https://gcc.gnu.org/wiki/DavidMalcolm/StaticAnalyzer

I've updated the analyzer task ideas on:
  https://gcc.gnu.org/wiki/SummerOfCode
but the ideas there are suggestions; if any prospective candidate has
other good ideas for things worth working on within the analyzer, let
me know.

Alternatively, I'm also up for mentoring relating to diagnostics or
libgccjit, if someone can think of an idea of suitable size and scope
for a GSoC project.

Dave

> 
> =====
> 
> Eventually, each listed project idea should have a) a project
> title/description, b) more detailed description of the project (2-5
> sentences), c) expected outcomes, d) skills required/preferred, e)
> project size and difficulty and f) expected mentors.
> 
> Project ideas that come without an offer to also mentor them are always
> fun to discuss, by all means feel free to reply to this email with yours
> and I will attempt to find a mentor, but please be aware that we can
> only use the suggestion if we actually find one.
> 
> Everybody in the GCC community is invited to go over
> https://gcc.gnu.org/wiki/SummerOfCode and remove any outdated or
> otherwise bad project suggestions and help improve viable ones.
> 
> Finally, please continue helping (prospective) students figure stuff out
> about GCC like you always do.  So far I think all of them enjoyed
> working with us, even if many sometimes struggled with GCC's
> complexity.
> 
> I will update you as more details about GSoC 2022 become available.
> 
> Thank you, let's hope we attract some new talent again this year.
> 
> Martin
> 




Question about Fortran bounds and -Wanalyzer-use-of-uninitialized-value

2022-10-12 Thread David Malcolm via Fortran
Sorry in advance if this is a silly question; my knowledge of Fortran
is next to nothing, I'm afraid.

PR analyzer/107210 reports an ICE in -fanalyzer on this reproducer:


! { dg-additional-options "-O1" }

subroutine check_int (j)
  INTEGER(4) :: i, ia(5), ib(5,4), ip, ipa(:)
  target :: ib
  POINTER :: ip, ipa
  logical :: l(5)

  l = (/ sizeof(i) == 4, sizeof(ia) == 20, sizeof(ib) == 80, &
   sizeof(ip) == 4, sizeof(ipa) == 8 /) ! { dg-warning "use of uninitialized value" }

  if (any(.not.l)) STOP 4

end subroutine check_int


The fix for the ICE is trivial (a missing tree_fits_uhwi_p check),
but after the fix, I see these warnings from the analyzer:


   10 |sizeof(ip) == 4, sizeof(ipa) == 8 /)
  |   ^
Warning: use of uninitialized value ‘ipa.dim[0].ubound’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
  ‘check_int’: events 1-3
|
|4 |   INTEGER(4) :: i, ia(5), ib(5,4), ip, ipa(:)
|  | ^
|  | |
|  | (1) region created on stack here
|  | (2) capacity: 8 bytes
|..
|   10 |sizeof(ip) == 4, sizeof(ipa) == 8 /)
|  |   ~  
|  |   |
|  |   (3) use of uninitialized value ‘ipa.dim[0].ubound’ here
|
../../src/gcc/testsuite/gfortran.dg/analyzer/pr107210.f90:10:43:

   10 |sizeof(ip) == 4, sizeof(ipa) == 8 /)
  |   ^
Warning: use of uninitialized value ‘ipa.dim[0].lbound’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
  ‘check_int’: events 1-3
|
|4 |   INTEGER(4) :: i, ia(5), ib(5,4), ip, ipa(:)
|  | ^
|  | |
|  | (1) region created on stack here
|  | (2) capacity: 8 bytes
|..
|   10 |sizeof(ip) == 4, sizeof(ipa) == 8 /)
|  |   ~  
|  |   |
|  |   (3) use of uninitialized value ‘ipa.dim[0].lbound’ here
|


The gimple in question is:

__attribute__((fn spec (". w ")))
void check_int (integer(kind=4) & restrict j)
{
  integer(kind=8) ipa$dim$0$lbound;
  integer(kind=8) ipa$dim$0$ubound;
  logical(kind=4) A.1[5];
  logical(kind=4) l[5];
  integer(kind=8) _1;
  logical(kind=4) _3;
  logical(kind=4) _4;
  integer(kind=8) _5;
  logical(kind=4) _6;
  integer(kind=8) S.5_7;
  logical(kind=4) test.6_8;
  integer(kind=8) S.7_9;
  integer(kind=8) S.5_16;
  integer(kind=8) S.7_18;

  <bb 2> [local count: 178992760]:
  MEM  [(c_char * {ref-all})&A.1] = 0x1000100010001;
  _1 = ipa$dim$0$ubound_2(D) - ipa$dim$0$lbound_12(D);
  _3 = _1 == 1;
  MEM[(logical(kind=4) *)&A.1 + 16B] = _3;

[...snip...]

where the analyzer is complaining about this gimple statement:
  _1 = ipa$dim$0$ubound_2(D) - ipa$dim$0$lbound_12(D);
where both:
  ipa$dim$0$ubound_2(D)
and:
  ipa$dim$0$lbound_12(D)
are considered by it to be uninitialized.
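
One possible reading of the `_1 == 1` comparison, as a sketch only: assuming 4-byte elements and the usual rank-1 extent formula `ubound - lbound + 1` (an assumption, not something the dump states), `sizeof(ipa) == 8` reduces to `ubound - lbound == 1`.  The helper names below are hypothetical and only illustrate the arithmetic:

```python
ELEM_SIZE = 4  # bytes per INTEGER(4) element


def sizeof_rank1(lbound: int, ubound: int, elem_size: int = ELEM_SIZE) -> int:
    """Byte size of a rank-1 array with the given bounds (assumed formula)."""
    extent = max(ubound - lbound + 1, 0)  # empty if ubound < lbound
    return extent * elem_size


def folded_check(lbound: int, ubound: int) -> bool:
    """The comparison as it appears in the gimple: (ubound - lbound) == 1."""
    return (ubound - lbound) == 1


def forms_agree(lo: int = -3, hi: int = 6) -> bool:
    """Check that both forms agree for every pair of bounds in [lo, hi)."""
    return all(
        (sizeof_rank1(lb, ub) == 8) == folded_check(lb, ub)
        for lb in range(lo, hi)
        for ub in range(lo, hi)
    )


print(forms_agree())
```

Under that reading, the value of the comparison depends entirely on the two bounds, which is why their initialization matters to the analyzer.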

Is the analyzer correct here, or is there an aspect of Fortran and/or
gimple that I'm missing?

Thanks
Dave



Re: [PATCH RESEND 0/1] RFC: P1689R5 support

2022-10-13 Thread David Malcolm via Fortran
On Mon, 2022-10-10 at 16:21 -0400, Jason Merrill wrote:
> On 10/4/22 11:11, Ben Boeckel wrote:
> > This patch adds initial support for ISO C++'s [P1689R5][], a format for
> > describing C++ module requirements and provisions based on the source
> > code. This is required because compiling C++ with modules is not
> > embarrassingly parallel: compilations need to be ordered to ensure that
> > `import some_module;` can be satisfied in time by making sure that the
> > TU with `export module some_module;` is compiled first.
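
The ordering problem described above can be sketched as a topological sort.  This is a minimal illustration, not the patch's implementation: the `SCAN` dictionary is a simplified, hypothetical stand-in for per-TU scan results (it is not the P1689R5 schema), and `compile_order` is a made-up helper name.  It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical scan results: which modules each TU provides and requires.
SCAN = {
    "main.cc": {"provides": [], "requires": ["A", "B"]},
    "b.cc": {"provides": ["B"], "requires": ["A"]},
    "a.cc": {"provides": ["A"], "requires": []},
}


def compile_order(scan):
    """Return TUs ordered so every provider precedes its importers."""
    # Map each module name to the TU that provides it.
    provider = {}
    for tu, info in scan.items():
        for module in info["provides"]:
            provider[module] = tu
    # Each TU depends on the TUs providing the modules it requires;
    # modules with no known provider are skipped here.
    graph = {
        tu: {provider[m] for m in info["requires"] if m in provider}
        for tu, info in scan.items()
    }
    return list(TopologicalSorter(graph).static_order())


ORDER = compile_order(SCAN)
print(ORDER)  # ['a.cc', 'b.cc', 'main.cc']
```

A build tool would run the scan first, derive an order like this, and only then schedule the real compilations.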
> > 
> > [P1689R5]: https://isocpp.org/files/papers/P1689R5.html
> > 
> > I'd like feedback on the approach taken here with respect to the
> > user-visible flags. I'll also note that header units are not supported
> > at this time because the current `-E` behavior with respect to
> > `import ;` is to search for an appropriate `.gcm` file which is not
> > something such a "scan" can support. A new mode will likely need to be
> > created (e.g., replacing `-E` with `-fc++-module-scanning` or something)
> > where headers are looked up "normally" and processed only as much as
> > scanning requires.
> > 
> > Testing is currently happening in CMake's CI using a prior revision of
> > this patch (the differences are basically the changelog, some style, and
> > `trtbd` instead of `p1689r5` as the format name).
> > 
> > For testing within GCC, I'll work on the following:
> > 
> > - scanning non-module source
> > - scanning module-importing source (`import X;`)
> > - scanning module-exporting source (`export module X;`)
> > - scanning module implementation unit (`module X;`)
> > - flag combinations?
> > 
> > Are there existing tools for handling JSON output for testing purposes?
> 
> David Malcolm would probably know best about JSON wrangling.

Unfortunately our JSON output doesn't make any guarantees about the
ordering of keys within an object, so the precise textual output
changes from run to run.  I've coped with that in my test cases by
limiting myself to simple regexes of fragments of the JSON output.

Martin Liska [CCed] went much further in
4e275dccfc2467b3fe39012a3dd2a80bac257dd0 by adding a run-gcov-pytest
DejaGnu directive, allowing for test cases for gcov to be written in
Python, which can thus test much more interesting assertions about the
generated JSON.
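
As a rough sketch of what a Python-side check could look like (this is not the run-gcov-pytest machinery; the helper names, the `/abs/build` prefix, and the sample documents are all made up for illustration): parsing both documents and comparing the resulting objects ignores key order and whitespace, and a small normalization pass can stand in for absolute paths.

```python
import json


def normalize(value, root="/abs/build"):
    """Recursively replace a (hypothetical) absolute root prefix in strings."""
    if isinstance(value, dict):
        return {key: normalize(item, root) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize(item, root) for item in value]
    if isinstance(value, str) and value.startswith(root):
        return "<ROOT>" + value[len(root):]
    return value


def same_structure(actual_text, expected_text, root="/abs/build"):
    """Compare two JSON documents, ignoring key order and whitespace."""
    actual = normalize(json.loads(actual_text), root)
    expected = normalize(json.loads(expected_text), root)
    return actual == expected


# Same content, different key order and whitespace.
RUN_1 = '{"version": 1, "rules": [{"primary-output": "/abs/build/a.o"}]}'
RUN_2 = '{ "rules": [ { "primary-output": "<ROOT>/a.o" } ],\n  "version": 1 }'

print(same_structure(RUN_1, RUN_2))  # True
```

Note that list order still matters in this comparison; only the ordering of keys within an object is ignored.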

Dave

> 
> > Basically, something that I can add to the test suite that doesn't care
> > about whitespace, but checks the structure (with sensible replacements
> > for absolute paths where relevant)?
> 
> Various tests in g++.dg/debug/dwarf2 handle that sort of thing with
> regexps.
> 
> > For the record, Clang has patches with similar flags and behavior by
> > Chuanqi Xu here:
> > 
> >  https://reviews.llvm.org/D134269
> > 
> > with the same flags (though using my old `trtbd` spelling for the
> > format name).
> > 
> > Thanks,
> > 
> > --Ben
> > 
> > Ben Boeckel (1):
> >    p1689r5: initial support
> > 
> >   gcc/ChangeLog           |   9 ++
> >   gcc/c-family/ChangeLog  |   6 +
> >   gcc/c-family/c-opts.cc  |  40 ++-
> >   gcc/c-family/c.opt      |  12 ++
> >   gcc/cp/ChangeLog        |   5 +
> >   gcc/cp/module.cc        |   3 +-
> >   gcc/doc/invoke.texi     |  15 +++
> >   gcc/fortran/ChangeLog   |   5 +
> >   gcc/fortran/cpp.cc      |   4 +-
> >   gcc/genmatch.cc         |   2 +-
> >   gcc/input.cc            |   4 +-
> >   libcpp/ChangeLog        |  11 ++
> >   libcpp/include/cpplib.h |  12 +-
> >   libcpp/include/mkdeps.h |  17 ++-
> >   libcpp/init.cc          |  14 ++-
> >   libcpp/mkdeps.cc        | 235 ++--
> >   16 files changed, 368 insertions(+), 26 deletions(-)
> > 
> > 
> > base-commit: d812e8cb2a920fd75768e16ca8ded59ad93c172f
> 



Re: [PATCH v2 1/3] libcpp: reject codepoints above 0x10FFFF

2022-10-28 Thread David Malcolm via Fortran
On Thu, 2022-10-27 at 19:16 -0400, Ben Boeckel wrote:
> Unicode does not support such values because they are unrepresentable in
> UTF-16.

Wikipedia pointed me to RFC 3629, which is where UTF-8 gained this
restriction, whereas libcpp was implementing the higher upper limit
from the earlier, superseded RFC 2279.
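
For reference, the 0x10FFFF limit falls out of UTF-16 surrogate-pair arithmetic.  A small sketch of that calculation and of the resulting validity rule (the function names are illustrative only and are not libcpp's):

```python
MAX_CODEPOINT = 0x10FFFF


def utf16_surrogate_pair_max() -> int:
    """Largest code point a UTF-16 surrogate pair can encode.

    Each surrogate carries 10 payload bits, and the pair is offset by
    0x10000 (the first code point outside the Basic Multilingual Plane).
    """
    return 0x10000 + ((0x3FF << 10) | 0x3FF)


def is_valid_scalar(c: int) -> bool:
    """True if c is in range and is not a UTF-16 surrogate."""
    return 0 <= c <= MAX_CODEPOINT and not (0xD800 <= c <= 0xDFFF)


print(hex(utf16_surrogate_pair_max()))  # 0x10ffff
```

The older RFC 2279 limit of 0x7FFFFFFF fails `is_valid_scalar`, as does any surrogate in the range 0xD800 to 0xDFFF.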

The patch looks good to me, assuming it bootstraps and passes usual
regression testing, but...
> 
> Signed-off-by: Ben Boeckel 
> ---
>  libcpp/ChangeLog  | 6 ++
>  libcpp/charset.cc | 4 ++--
>  2 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/libcpp/ChangeLog b/libcpp/ChangeLog
> index 18d5bcceaf0..4d707277531 100644
> --- a/libcpp/ChangeLog
> +++ b/libcpp/ChangeLog
> @@ -1,3 +1,9 @@
> +2022-10-27  Ben Boeckel  
> +
> +   * include/charset.cc: Reject encodings of codepoints above 0x10FFFF.
> +   UTF-16 does not support such codepoints and therefore all Unicode
> +   rejects such values.
> +
>  2022-10-19  Lewis Hyatt  

...AIUI we now put ChangeLog entries in the blurb part of the patch, so
that server-side git scripts add them to the actual ChangeLog file.

Does the patch pass:
  ./contrib/gcc-changelog/git_check_commit.py
?
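
As a rough illustration of that layout (a sketch only: the function names are taken from the hunk headers in this patch, the file is named relative to the libcpp directory, the entry lines are indented with a tab, and the wording is made up), the commit message would carry the entry itself, with no edit to libcpp/ChangeLog in the diff:

```
libcpp: reject codepoints above 0x10FFFF

Unicode does not support such values because they are unrepresentable in
UTF-16.

libcpp/ChangeLog:

	* charset.cc (one_utf8_to_cppchar): Reject codepoints above
	0x10FFFF.
	(one_utf32_to_utf8): Likewise.
```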

Thanks
Dave

>  
> * include/cpplib.h (struct cpp_string): Use new
> "string_length" GTY.
> diff --git a/libcpp/charset.cc b/libcpp/charset.cc
> index 12a398e7527..e9da6674b5f 100644
> --- a/libcpp/charset.cc
> +++ b/libcpp/charset.cc
> @@ -216,7 +216,7 @@ one_utf8_to_cppchar (const uchar **inbufp, size_t *inbytesleftp,
>    if (c <= 0x3FFFFFF && nbytes > 5) return EILSEQ;
>  
>    /* Make sure the character is valid.  */
> -  if (c > 0x7FFFFFFF || (c >= 0xD800 && c <= 0xDFFF)) return EILSEQ;
> +  if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return EILSEQ;
>  
>    *cp = c;
>    *inbufp = inbuf;
> @@ -320,7 +320,7 @@ one_utf32_to_utf8 (iconv_t bigend, const uchar **inbufp, size_t *inbytesleftp,
>    s += inbuf[bigend ? 2 : 1] << 8;
>    s += inbuf[bigend ? 3 : 0];
>  
> -  if (s >= 0x7FFFFFFF || (s >= 0xD800 && s <= 0xDFFF))
> +  if (s > 0x10FFFF || (s >= 0xD800 && s <= 0xDFFF))
>  return EILSEQ;
>  
>    rval = one_cppchar_to_utf8 (s, outbufp, outbytesleftp);



Re: [PATCH v2 2/3] libcpp: add a function to determine UTF-8 validity of a C string

2022-10-28 Thread David Malcolm via Fortran
On Thu, 2022-10-27 at 19:16 -0400, Ben Boeckel wrote:
> This simplifies the interface for other UTF-8 validity detections when a
> simple "yes" or "no" answer is sufficient.
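
For comparison, the same kind of yes/no interface can be sketched in Python using the standard strict UTF-8 decoder.  This is only an analogy to the idea, not the libcpp function; `is_valid_utf8` is a hypothetical name.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(b"caf\xc3\xa9"))       # True: U+00E9 in two bytes
print(is_valid_utf8(b"\xf4\x90\x80\x80"))  # False: encodes 0x110000
```

The strict decoder also rejects encoded surrogates such as `b"\xed\xa0\x80"` and overlong forms such as `b"\xc0\x80"`, so callers get one boolean without having to inspect an error code.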
> 
> Signed-off-by: Ben Boeckel 
> ---
>  libcpp/ChangeLog  |  6 ++
>  libcpp/charset.cc | 18 ++
>  libcpp/internal.h |  2 ++
>  3 files changed, 26 insertions(+)
> 
> diff --git a/libcpp/ChangeLog b/libcpp/ChangeLog
> index 4d707277531..4e2c7900ae2 100644
> --- a/libcpp/ChangeLog
> +++ b/libcpp/ChangeLog
> @@ -1,3 +1,9 @@
> +2022-10-27  Ben Boeckel  
> +
> +   * include/charset.cc: Add `_cpp_valid_utf8_str` which determines
> +   whether a C string is valid UTF-8 or not.
> +   * include/internal.h: Add prototype for `_cpp_valid_utf8_str`.
> +
>  2022-10-27  Ben Boeckel  
>  
> * include/charset.cc: Reject encodings of codepoints above 0x10FFFF.

The patch looks good to me, with the same potential caveat that you
might need to move the ChangeLog entry from the patch "body" to the
leading blurb, to satisfy:
  ./contrib/gcc-changelog/git_check_commit.py

Thanks
Dave



Re: [pushed] wwwdocs: readings: Remove Herman D. Knoble's Fortran Resources

2023-02-01 Thread David Malcolm via Fortran
On Wed, 2023-02-01 at 10:56 +0100, Gerald Pfeifer wrote:
> The original page is gone, and search engines did not reveal a new
> location.
> 
> If any of you has a new location, feel free to add that (or let me
> know and I'll take care).

FWIW the most recent version in archive.org is here:

https://web.archive.org/web/20220624224837/http://www.personal.psu.edu/faculty/h/d/hdk/fortran.html

Dave

> 
> Gerald
> ---
>  htdocs/readings.html | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/htdocs/readings.html b/htdocs/readings.html
> index 0a978e8f..6e640af1 100644
> --- a/htdocs/readings.html
> +++ b/htdocs/readings.html
> @@ -463,11 +463,6 @@ names.
>    Michel Olagnon's Fortran 90 List contains a "Tests and
>    Benchmarks" section mentioning commercial testsuites.
>  
> -    
> -   href="http://www.personal.psu.edu/faculty/h/d/hdk/fortran.html">Herman
> -  D. Knoble's Fortran Resources contain some sections on compiler
> -  validation and benchmarking.
> -  validation and benchmarking.
> -    
>  
>    Complying with Fortran 90, How does the current crop
> of
>    Fortran 90 compilers measure up to the standard?,
> Steven