gcc 13 Build error

2023-04-02 Thread Damian Tometzki
Hello together,

I have the following build error with GCC 13 (current git):

/home/damian/data/gcc13built/./prev-gcc/xg++
-B/home/damian/data/gcc13built/./prev-gcc/
-B/usr/riscv64-linux-gnu/bin/ -nostdinc++
-B/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/src/.libs
-B/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/libsupc++/.libs
 
-I/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/include/riscv64-linux-gnu
 -I/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/include
 -I/home/damian/data/gcc/libstdc++-v3/libsupc++
-L/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/src/.libs
-L/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/libsupc++/.libs
 -fno-PIE -c   -g -O2 -fno-checking -gtoggle -DIN_GCC
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror
-DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/.
-I../../gcc/gcc/../include  -I../../gcc/gcc/../libcpp/include
-I../../gcc/gcc/../libcody  -I../../gcc/gcc/../libdecnumber
-I../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber
-I../../gcc/gcc/../libbacktrace   -o riscv-common.o -MT riscv-common.o
-MMD -MP -MF ./.deps/riscv-common.TPo
../../gcc/gcc/common/config/riscv/riscv-common.cc
../../gcc/gcc/common/config/riscv/riscv-common.cc: In static member
function 'static riscv_subset_list* riscv_subset_list::parse(const
char*, location_t)':
../../gcc/gcc/common/config/riscv/riscv-common.cc:1158:48: error:
unquoted keyword 'float' in format [-Werror=format-diag]
 1158 | "%<-march=%s%>: z*inx is conflict with float extensions",

Damian


Re: [GSoC] Interest and initial proposal for project on reimplementing cpychecker as -fanalyzer plugin

2023-04-02 Thread Sun Steven via Gcc
Hi, Eric, Malcolm,

Sorry that I didn't check this thread before.

It sounds like there are a lot of things to do. I want to offer some help.

Let me add some background on memory management in Python here.


## Intro (for people unfamiliar with CPython)

Unlike programs written in C++, where the compiler automatically adds
destructors on all exit paths, CPython requires manual memory management
on PyObject*.

CPython currently has two memory management mechanisms: reference
counting and a mark-and-sweep gc for cyclic references. The former is the
primary mechanism: a PyObject is destroyed when its refcount drops to
zero.

## CPython has made great efforts to reduce memory errors.

With specific compile flags on, the CPython interpreter records the total
refcount, and it aborts when a refcount drops below zero (a double free).
This helps to discover memory leaks. PEP 683 (implemented in 3.12) also
introduced "immortal objects" (such as small integers), whose refcount is
set to a special large value so that they are never accidentally freed.

Even with these features, memory management in CPython extensions is still
a problem, since most errors occur on error-handling paths, which are less
likely to be covered by tests. Most users also do not run a debug build of
CPython, so these errors tend to stay hidden.
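
To illustrate the error-path problem, here is a minimal sketch. It uses a
toy refcounted struct rather than the real CPython API: ToyObject, toy_new,
toy_decref and live_objects are hypothetical names made up for this example.
It only shows how an early return on an error path can skip the decref that
the success path performs.

```cpp
#include <cassert>
#include <cstdlib>

// Toy stand-in for a refcounted object; NOT the real PyObject layout.
struct ToyObject {
    long refcount;
};

static long live_objects = 0;  // objects created but not yet destroyed

ToyObject *toy_new() {
    ToyObject *obj = static_cast<ToyObject *>(std::malloc(sizeof(ToyObject)));
    if (!obj) return nullptr;
    obj->refcount = 1;  // the caller owns the single initial reference
    ++live_objects;
    return obj;
}

void toy_decref(ToyObject *obj) {
    // Destroy the object once the last reference is dropped.
    if (--obj->refcount == 0) {
        --live_objects;
        std::free(obj);
    }
}

// Buggy: the early return on the error path forgets to release `tmp`.
int buggy(bool fail) {
    ToyObject *tmp = toy_new();
    if (!tmp) return -1;
    if (fail) return -1;  // leak: `tmp` still holds its reference
    toy_decref(tmp);
    return 0;
}

// Fixed: every exit path releases the reference.
int fixed(bool fail) {
    ToyObject *tmp = toy_new();
    if (!tmp) return -1;
    int rc = fail ? -1 : 0;
    toy_decref(tmp);
    return rc;
}
```

In this sketch the leak only shows up when buggy() is called with fail set,
which is exactly the kind of path that tests tend to miss.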

## Why do I want to participate?

I am currently working on the initial implementation of PEP 701 (a new
f-string parser). During testing, I discovered (and fixed) 3 memory leaks.
As you can see, even the most experienced CPython developers sometimes
forget to properly decrease refs. I think it would be inspiring if a new
analysis tool were made available as part of the compiler. It would lead to
a better CPython.


I do not know if GSoC allows collaborations. Maybe the headcount is limited,
or maybe I am too senior for GSoC. But I think I am still a rookie in front of
GCC.


I want to contribute, in whatever form.

Yours



Re: gcc 13 Build error

2023-04-02 Thread Damian Tometzki
Hello together,

I found a possible fix:

diff --git a/gcc/common/config/riscv/riscv-common.cc
b/gcc/common/config/riscv/riscv-common.cc
index b3c6ec97e7a..32ba1d52556 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1155,7 +1155,7 @@ riscv_subset_list::parse (const char *arch,
location_t loc)

   if (subset_list->lookup("zfinx") && subset_list->lookup("f"))
error_at (loc,
-   "%<-march=%s%>: z*inx is conflict with float extensions",
+   "%<-march=%s%>: z*inx is conflict with \float extensions",
arch);

   return subset_list;

On Sun, Apr 2, 2023 at 7:13 PM Damian Tometzki  wrote:
>
> Hello together,
>
> i have the following build error gcc 13 current git:
>
> /home/damian/data/gcc13built/./prev-gcc/xg++
> -B/home/damian/data/gcc13built/./prev-gcc/
> -B/usr/riscv64-linux-gnu/bin/ -nostdinc++
> -B/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/src/.libs
> -B/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/libsupc++/.libs
>  
> -I/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/include/riscv64-linux-gnu
>  -I/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/include
>  -I/home/damian/data/gcc/libstdc++-v3/libsupc++
> -L/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/src/.libs
> -L/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/libsupc++/.libs
>  -fno-PIE -c   -g -O2 -fno-checking -gtoggle -DIN_GCC
> -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
> -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
> -Wconditionally-supported -Woverloaded-virtual -pedantic
> -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror
> -DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/.
> -I../../gcc/gcc/../include  -I../../gcc/gcc/../libcpp/include
> -I../../gcc/gcc/../libcody  -I../../gcc/gcc/../libdecnumber
> -I../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber
> -I../../gcc/gcc/../libbacktrace   -o riscv-common.o -MT riscv-common.o
> -MMD -MP -MF ./.deps/riscv-common.TPo
> ../../gcc/gcc/common/config/riscv/riscv-common.cc
> ../../gcc/gcc/common/config/riscv/riscv-common.cc: In static member
> function 'static riscv_subset_list* riscv_subset_list::parse(const
> char*, location_t)':
> ../../gcc/gcc/common/config/riscv/riscv-common.cc:1158:48: error:
> unquoted keyword 'float' in format [-Werror=format-diag]
>  1158 | "%<-march=%s%>: z*inx is conflict with float extensions",
>
> Damian


Re: gcc 13 Build error

2023-04-02 Thread Jonathan Wakely via Gcc
On Sun, 2 Apr 2023, 18:31 Damian Tometzki,  wrote:

> Hello together,
>
> i found a possible fix
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc
> b/gcc/common/config/riscv/riscv-common.cc
> index b3c6ec97e7a..32ba1d52556 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -1155,7 +1155,7 @@ riscv_subset_list::parse (const char *arch,
> location_t loc)
>
>if (subset_list->lookup("zfinx") && subset_list->lookup("f"))
> error_at (loc,
> -   "%<-march=%s%>: z*inx is conflict with float extensions",
> +   "%<-march=%s%>: z*inx is conflict with \float extensions",
> arch);
>
>return subset_list;
>


You can also use --disable-werror but please report this to bugzilla,
because it should build without errors.



> On Sun, Apr 2, 2023 at 7:13 PM Damian Tometzki 
> wrote:
> >
> > Hello together,
> >
> > i have the following build error gcc 13 current git:
> >
> > /home/damian/data/gcc13built/./prev-gcc/xg++
> > -B/home/damian/data/gcc13built/./prev-gcc/
> > -B/usr/riscv64-linux-gnu/bin/ -nostdinc++
> >
> -B/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/src/.libs
> >
> -B/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/libsupc++/.libs
> >
> -I/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/include/riscv64-linux-gnu
> >
> -I/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/include
> >  -I/home/damian/data/gcc/libstdc++-v3/libsupc++
> >
> -L/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/src/.libs
> >
> -L/home/damian/data/gcc13built/prev-riscv64-linux-gnu/libstdc++-v3/libsupc++/.libs
> >  -fno-PIE -c   -g -O2 -fno-checking -gtoggle -DIN_GCC
> > -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
> > -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
> > -Wconditionally-supported -Woverloaded-virtual -pedantic
> > -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror
> > -DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/.
> > -I../../gcc/gcc/../include  -I../../gcc/gcc/../libcpp/include
> > -I../../gcc/gcc/../libcody  -I../../gcc/gcc/../libdecnumber
> > -I../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber
> > -I../../gcc/gcc/../libbacktrace   -o riscv-common.o -MT riscv-common.o
> > -MMD -MP -MF ./.deps/riscv-common.TPo
> > ../../gcc/gcc/common/config/riscv/riscv-common.cc
> > ../../gcc/gcc/common/config/riscv/riscv-common.cc: In static member
> > function 'static riscv_subset_list* riscv_subset_list::parse(const
> > char*, location_t)':
> > ../../gcc/gcc/common/config/riscv/riscv-common.cc:1158:48: error:
> > unquoted keyword 'float' in format [-Werror=format-diag]
> >  1158 | "%<-march=%s%>: z*inx is conflict with float extensions",
> >
> > Damian
>


gcc-13-20230402 is now available

2023-04-02 Thread GCC Administrator via Gcc
Snapshot gcc-13-20230402 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/13-20230402/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 13 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch master 
revision 8f989fe879ddc2753ac9fa580b2b0a1024c98f0f

You'll find:

 gcc-13-20230402.tar.xz   Complete GCC

  SHA256=59369450ebbd4474e0a4339bb95ce77ad913f660a277a2eb3e13421d846d5ab8
  SHA1=231a06d531156ea9821e1d461bb9e398af262e91

Diffs from 13-20230326 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-13
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: [GSoC][analyzer-c++] Enable enough C++ support for self-analysis

2023-04-02 Thread David Malcolm via Gcc
On Sat, 2023-04-01 at 01:33 +0200, Benjamin Priour wrote:
> Hi David,
> 
> 
> On Thu, Mar 30, 2023 at 2:04 AM David Malcolm 
> wrote:
> > I think working on the C++ enablement prerequisites in the analyzer
> > would make more sense.  I'd planned to do this myself for GCC 14,
> > but
> > there are plenty of other tasks I could work on if you want to
> > tackle
> > C++ support as a GSoC project for GCC 14.
> > 
> 
> Yes, I gladly would.

Great.

> 
> 
> > A good C++ project might be: enable enough C++ support in the
> > analyzer
> > for it to be able to analyze itself.  This could be quite a large,
> > difficult project, though it sidesteps having to support exception-
> > handling, since we build ourselves with exception-handling
> > disabled.
> > 
> > Hope this is helpful
> > Dave
> > 
> > 
> To that end, the following order of work would make sense:
> 0. Reducing the number of exploded nodes will have to be a focus from
> the beginning. All of my C++ samples produce quite dense graphs.

To be fair, C ones can be as well; the analyzer's exploded graphs tend
to get very big on anything but the most trivial examples.


> 1. First things first: extend the current C tests to cover C++
> (PR96395)
> 

I wonder to what extent doing this would uncover issues that we don't
yet know about with C++ support, so yes - a good one to do first.

> (1.bis )
> 2. I would then go with supporting the options relative to sm-malloc:
>   - -Wanalyzer-double-free should behave properly (cf the fresh}

[...snip...]

> Emits nothing. The 'nonnull' state is properly tracked, though, along
> the constructors back to foo, so I will have to dive deeper into this
> tomorrow.

Indeed - you'll have to do a lot of looking at gimple IR dumps, what
the supergraph looks like, etc, for all of this.


> 3. Improve the scope of -Wanalyzer-null-dereference
>    - For the analyzer, -Wanalyzer[-possible]-null-dereference should
> fully support smart pointers. That is not the case currently (see
> PR109366), even though shared pointers are promising.

Good point.

>   - For smart pointers, it might be necessary to review the
> diagnostic
> path, as for shared_ptr they are quite long already.

Yeah.

> 4. Extension of out-of-bounds
>  ( - Extending -Wout-of-bounds to the many vec<...> might be a
> requirement.
> However I have to look into more details for that one, I don't see
> yet how
> it could be done without a similar reuse of the assertions as for the
> libstdc++.)
> 
> From what I saw, despite the bugs not being FIXED, vfuncs seem to be
> working nicely enough after the fix from GSoC 2021.

IIRC I was keeping those bugs open because there's still a little room
for making the analyzer smarter about the C++ type system e.g. if we
"know" that a foo * is pointing at a particular subclass, maybe we know
things about what vfunc implementations could be called.

We could even try an analysis mode where we split the analysis path at
a vfunc call, where we could create an out-edge in the egraph for each
known concrete subclass for foo *, so that we can consider all the
possible subclasses and the code that would be called.  (I'm not sure
if this is a *good* idea, but it intrigues me)
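
As a rough illustration of that path-splitting idea, here is a toy sketch.
ToyState and split_at_vfunc_call are hypothetical names invented for this
example and do not correspond to anything in the analyzer's sources; the
sketch only shows the shape of "one out-edge per known concrete subclass".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of an analysis state; not a GCC type.
struct ToyState {
    std::string path;          // description of the path taken so far
    std::string dynamic_type;  // assumed concrete type of the `foo *`
};

// Given the state at a virtual call and the list of known concrete
// subclasses, create one successor state per subclass, each assuming
// that subclass's implementation of `method` is the one called.
std::vector<ToyState> split_at_vfunc_call(
    const ToyState &before,
    const std::vector<std::string> &known_subclasses,
    const std::string &method)
{
    std::vector<ToyState> successors;
    for (const std::string &cls : known_subclasses) {
        ToyState next = before;
        next.dynamic_type = cls;
        next.path += " -> " + cls + "::" + method;
        successors.push_back(next);
    }
    return successors;
}
```

The obvious cost in this model is that every virtual call multiplies the
number of paths by the number of subclasses, which is why it may or may not
be a good idea in practice.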

> 
> Unfortunately I couldn't devote as much time as I wanted to gcc
> yesterday,
> I plan to send a proposal draft tomorrow evening. Sincerely sorry for
> the
> short time frame before the deadline.

Sounds promising.  Note that the deadline for submitting proposals to
the official GSoC website is April 4 - 18:00 UTC (i.e. this coming
Tuesday) and that Google are very strict about that deadline; see:
https://developers.google.com/open-source/gsoc/timeline

I believe you've already had a go at posting gcc patches to our mailing
list: that's a great thing to mention in your application.

Good luck!
Dave



Re: [GSoC][Static Analyzer] First proposal draft and a few more questions/requests

2023-04-02 Thread David Malcolm via Gcc
On Sat, 2023-04-01 at 16:19 +0200, Shengyu Huang wrote:
> Hi Dave,
> 
> > > 
> > > I had looked into compiling those files with the patch some time
> > > ago;
> > > looking at my notes, one issue was with this on-stack buffer:
> > >    char extra[1024];
> > > declared outside the loop.  Inside the loop, it gets modified in
> > > various ways:
> > >    extra[0] = '\0';
> > > and
> > >    if (fread(extra, 1, extsize, fpZip) == extsize) {
> > > where the latter means "extra" becomes tainted.
> > > 
> > > However "extra" is barely used, and is effectively reset each
> > > time
> > > through the loop - but the analyzer doesn't figure that out.  So
> > > the
> > > loop analysis explodes, as it tries to keep track of the
> > > possibility
> > > that "extra" is still tainted from previous iteration(s), despite
> > > the
> > > fact that it's going to be clobbered before it ever gets used.
> > > 
> > > So one fix might be to extend the state-purging code so that it
> > > somehow
> > > "sees" that "extra" gets clobbered before it gets used, and thus
> > > we can
> > > purge the tainted state from it.
> > 
> > Thanks for your notes. I think we may be talking about the same
> > thing? If you look at the updated proposal (I have changed it quite
> > a lot since I first sent it out), you’ll see there is one relevant
> > paper for state merging (although it is slightly different from
> > state purging, I think the goal and general methodology is
> > similar): https://dslab.epfl.ch/pubs/stateMerging.pdf 
> > 
> > I was trying to say if some similar situation happened for other
> > types of checkers, I expected state explosion would also happen. I
> > tried to construct a similar example (with the same kind of reset
> > and nested conditionals + a loop) but for double-free, so far no
> > success yet. I’ll pick it up afterwards, at latest by next
> > Saturday, because I need to prepare for a coming midterm on Friday.
> > I will also put this test case to the proposal because it seems
> > like a very good starting point for the project.
> 
> As promised, below is a small example that causes state explosion
> without taint state machine involved.
> 
> #include <stdio.h>
> #include <stdlib.h>
>
> void test()
> {
>   void *p;
>   int a;
>   scanf("%d", &a);
>
>   while (a > 0)
>   {
>     p = malloc (1024);
>     if (a > 1)
>       free(p);
>     a--;
>   }
>
>   if (a > 0)
>     free(p);
> }
> 
> This example not only causes state explosion, but also produces a
> false-positive double-free report.

(nods)

Yeah, our handling of loops isn't great.  There's plenty of opportunity
within a GSoC project for tackling that.
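
One way to see that the double-free report on test() is a false positive is
to replay its control flow concretely. The sketch below is only an
illustration: RunStats and replay are hypothetical helpers, and the real
malloc/free calls are replaced by a flag that records whether the block
currently held in p has already been freed.

```cpp
#include <cassert>

// Counters describing one concrete run of the control flow in test().
struct RunStats {
    int allocs = 0;        // simulated calls to malloc
    int frees = 0;         // simulated calls to free
    int double_frees = 0;  // frees of a block that was already freed
};

// Replays test() for a given input value `a`.
RunStats replay(int a) {
    RunStats stats;
    bool p_freed = false;  // has the block currently held in p been freed?

    auto do_free = [&]() {
        if (p_freed) ++stats.double_frees;
        p_freed = true;
        ++stats.frees;
    };

    while (a > 0) {
        p_freed = false;  // p = malloc(1024): p now holds a fresh block
        ++stats.allocs;
        if (a > 1) do_free();
        a--;
    }

    // The loop only exits once a <= 0, so this branch is never taken.
    if (a > 0) do_free();
    return stats;
}
```

For any input, the final free is unreachable because the loop exits only
when a <= 0, so no block is freed twice. Note that in this replay the block
allocated in the last iteration (when a == 1) is never freed, so a leak
report for that block would be accurate.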

> 
> By the way, do you have any feedback regarding my proposal
> (https://docs.google.com/document/d/1MRI1R5DaX8kM6DaqRQsEri5Mx2FvHmWv13qe1W0Bj0g/edit)?
> I am happy to allocate more time polishing the proposal if you find
> anything off there. If you prefer me sending it via email again (for
> ease of reference in the future maybe?), I am happy to do so as well.

Thanks for the proposal. 

Overall, it looks great.  Some notes:
- maybe specify the *GCC* static analyzer when you first mention it
- you talk about "timeout" warnings.  The analyzer already can emit a
"timeout" warning of sorts, via -Wanalyzer-too-complex, though this is
based on the complexity of the exploded graph (e.g. # of nodes), rather
than actual timings.  Is the latter the kind of thing you had in mind,
or were you thinking about ways of making the "too complex" heuristics
smarter?  (I confess that you seem much more familiar with the theory
of this than I am!)
- the numbering of your references seems to have gotten out-of-sync; I
see references to [3] as a paper "Schwartz et al", but that's a link to
one of my blog posts.
- do you have a link to a github account, or somewhere else that
demonstrates code you've written?  In particular, how is your C++ ?
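
To make the distinction between a node-count budget and a wall-clock
timeout concrete, here is a toy sketch. ExploreResult and explore are
hypothetical names, and the state space is just a binary tree standing in
for a branch at every step; it is not how the analyzer's exploded graph is
actually built.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Result of a bounded exploration in this toy model.
struct ExploreResult {
    std::size_t nodes_visited;
    bool too_complex;  // true if the node budget ran out before finishing
};

// Explores a binary tree of states of the given depth (two successors per
// state), stopping once `max_nodes` states have been visited. The budget
// is a count of nodes, not an elapsed-time limit.
ExploreResult explore(unsigned depth, std::size_t max_nodes) {
    std::deque<unsigned> worklist;  // each entry: remaining depth of a state
    worklist.push_back(depth);
    std::size_t visited = 0;

    while (!worklist.empty()) {
        if (visited >= max_nodes) return {visited, true};
        unsigned remaining = worklist.front();
        worklist.pop_front();
        ++visited;
        if (remaining > 0) {
            worklist.push_back(remaining - 1);  // "true" branch
            worklist.push_back(remaining - 1);  // "false" branch
        }
    }
    return {visited, false};
}
```

A node budget like this gives the same answer on every machine, whereas a
timing-based limit would depend on how fast the host happens to be.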

Note that the deadline for submitting proposals to the official GSoC
website is April 4 - 18:00 UTC (i.e. this coming Tuesday) and that
Google are very strict about that deadline; see:
https://developers.google.com/open-source/gsoc/timeline

Good luck
Dave



Re: [GSoC] Interest and initial proposal for project on reimplementing cpychecker as -fanalyzer plugin

2023-04-02 Thread David Malcolm via Gcc
On Sun, 2023-04-02 at 17:24 +, Sun Steven via Gcc wrote:
> Hi, Eric, Malcolm,

Hi - and welcome to the GCC community.

> 
> Sorry that I didn't check this thread before.
> 
> It sounds like there are a lot of things to do. I want to offer some
> help.
> 
> Let me add some backgrounds of memory management in python here.
> 
> 
> ## Intro (for people unfamiliar with CPython)
> 
> Unlike programs written in C++, where the compiler automatically adds
> destructors on all exit paths, CPython requires manual memory
> management
> on PyObject*.
> 
> The current CPython has 2 major memory management mechanisms,
> including reference counting and a mark-and-sweep gc for cyclic
> references.
> The former acts as the major mechanism. PyObject gets destructed when
> the refcount drops to zero.

(nods)

FWIW I wrote the original cpychecker code because I was sick of buggy
extension modules crashing /usr/bin/python... and all the bug reports
about it that were landing in my inbox (I was the maintainer of Python
within Fedora/RHEL back then).

> 
> ## CPython has made great efforts to reduce memory errors.
> 
> With specific compile flags on, the CPython interpreter records the
> total refcount, and it aborts when a refcount drops below zero (a
> double free).
> This helps to discover memory leaks. PEP 683 (implemented in 3.12)
> also introduced "immortal objects" (such as small integers), whose
> refcount is set to a special large value so that they are never
> accidentally freed.

That feature is new to me (I last worked on CPython internals back in
the 2 to 3 transition era); thanks!

[...snip...]

> 
> ## Why do I want to participate?
> 
> I am currently working on the initial implementations of PEP 701 (a
> new
> f-string parser). During the testing, I discovered (and fixed) 3
> memory leaks.
> As you can see, even the most experienced CPython developers
> sometimes
> forget to properly decrease refs. I think it will be inspiring if a
> new analysis
> tool was made available as a compiler builtin. It will lead to a
> better CPython.
> 
> 
> I do not know if GSoC allows collaborations. Maybe the headcount is
> limited,
> or maybe I am too senior for GSoC. But I think I am still a rookie in
> front of
> GCC.

I'd be up for a collaboration between you and Eric (assuming Eric is,
of course), as the project is very large and there are several logical
components to it that you could carve up between the two of you.  That
said it may be out of our hands, it depends on:

(a) how many slots we get allocated to us from Google

(b) GSoC is meant for newcomers to open source development; it sounds
like you might already have significant experience.  I don't know what
GSoC's threshold is.

> I want to contribute, no matter the forms.

That's great.  It sounds like you have considerable knowledge of
CPython internals (a decade more up-to-date than mine!), and hopefully
you have recent contacts within the CPython community in terms of
getting traction there.

Some notes about a GSoC application:

Do you have a link to your github so we can see your CPython
contributions?

How's your C++?  Have you tried building GCC from source yet?  FWIW I
felt intimidated back when I first started working on GCC itself; I
wrote this guide to help people get started:
  https://gcc-newbies-guide.readthedocs.io/en/latest/


Note that the deadline for submitting proposals to the official GSoC
website is April 4 - 18:00 UTC (i.e. this coming Tuesday) and that
Google are very strict about that deadline; see:
https://developers.google.com/open-source/gsoc/timeline

Hope this is helpful
Dave



Re: [GSoC] Interest and initial proposal for project on reimplementing cpychecker as -fanalyzer plugin

2023-04-02 Thread David Malcolm via Gcc
On Sat, 2023-04-01 at 19:49 -0400, Eric Feng wrote:
> > For the task above, I think it's almost all there, it's "just" a
> > case
> > of implementing the special-case knowledge about the CPython API,
> > mostly via known_function subclasses.
> 
> Sounds good.
> 
> 
> > In cpychecker I added some custom function attributes:
> >  
> > https://gcc-python-plugin.readthedocs.io/en/latest/cpychecker.html
> > which were:
> >   __attribute__((cpychecker_returns_borrowed_ref))
> >   __attribute__((cpychecker_steals_reference_to_arg(n)))
> > 
> [...]
> > 
> > But exactly what these macros would look like would be a decision
> > for
> > the CPython community (hence do it via PEP, based on a sample
> > implementation).
> 
> Ok, I see what you mean now. Thanks for clarifying!
> 
> 
> > Yeah, this sounds like a big project.  Fortunately there are a lot
> > of
> > possible subtasks in this one, and the project has benefits to GCC
> > and
> > to CPython even if you only get a subset of the ideas done in the
> > time
> > available (refcount checking being probably the highest-value
> > subtask).
> 
> Sounds good.
> 
> I refactored the project description and timeline sections of the
> proposal according to our conversation. Notably, I moved format
> string
> checking to task #2 in the timeline since its subtasks are
> particularly beneficial. I also suggest in the timeline section to
> reach out to the CPython community via PEP about the specifics of new
> attributes in week 9/10 since I think we should have a somewhat
> mature
> prototype by that point. Let me know if you think it should be done
> earlier/later. Please find the changed sections below (I omitted
> unchanged sections for brevity)
> ___
> 
> Describe the project and clearly define its goals:
> One pertinent use case of the gcc-python plugin was as a static
> analysis tool for CPython extension modules. The main goal of the
> plugin was to help programmers writing extensions identify common
> coding errors. The gcc-python-plugin has bitrotted over the years
> and,
> in particular, cpychecker stopped working some GCC releases ago.
> Broadly, the goal of this project is to port the functionalities of
> cpychecker to a -fanalyzer plugin.
> 
> Below is a brief description of the functionalities of the static
> analysis tool for which I will work on porting over to a -fanalyzer
> plugin. The structure of the objectives is based on the
> gcc-python-plugin documentation:
> 
> Reference count checking: <unchanged from original proposal>
> 
> Format string checking: Some CPython APIs such as PyArg_ParseTuple,
> PyArg_ParseTupleAndKeywords, etc take format strings as arguments.
> This check involves verifying that the format strings taken in by
> these APIs are correct with respect to the number and types of
> arguments passed in. In particular, I will work on integrating the
> analyzer with -Wformat
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107017) and adding
> plugin support for -Wformat
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100121) . We should
> then
> be able to specify our own archetype which reflects the format string
> syntax for the relevant CPython APIs and take advantage of the
> integrated analyzer to check them.
> 
> Associating PyTypeObject instances with compile-time-types:
> <unchanged from original proposal>
> 
> Error-handling checking (including errors in exception handling):
> Common errors such as dereferencing a NULL value are already checked
> by the analyzer. I will extend this functionality by implementing
> special-case knowledge about the CPython API.
> 
> Verification of PyMethodDef tables: <unchanged from original
> proposal>
> 
> Provide an expected timeline:
> Please find a rough estimate of the weekly progress in relation to
> the
> features described below. Tasks that I expect to take longer than one
> week are broken down in more detail. In addition to what’s described,
> each task also involves adding test coverage for its specific
> feature to a regression test suite.
> 
> Week 1 - 7: Reference counting checking
>     Week 1: Set up the overall infrastructure of the plugin and begin
> building core functionality
>     Week 1 - 6: Core reference counting functionality
>     Week 7: Refine prototype
> Week 8 - 10.5: Format string checking (including associating
> PyTypeObject instances with compile-time-types)
>     Week 8 - ~9: RFE: support printf-style formatted functions in -
> fanalyzer
>     Week ~9 - 10.5: RFE: plugin support for -Wformat via
> __attribute__((format()))
>     Additionally, begin conversing with CPython community via PEP
> about the exact form of new attributes on CPython headers which may
> be
> helpful for both humans and the static analyzer. Present ideas based
> on work done so far.
> Week 10.5 - 12: Error-handling checking, errors in exception
> handling,
> and verification of PyMethodDef tables
> 

Sounds great.
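
As a small illustration of the format string check described in the
proposal above, here is a sketch that compares the number of arguments a
format string expects against the number actually passed. It is
deliberately simplified: expected_arg_count and call_matches_format are
hypothetical helpers, and only a handful of format units that consume
exactly one argument each (i, l, f, d, s, O) are handled, plus the '|', ':'
and ';' markers. A real check would also need to verify the argument types
and the multi-argument units.

```cpp
#include <cassert>
#include <string>

// Counts how many output arguments a format string expects, for a small
// subset of format units that each consume exactly one argument.
// Returns -1 if the string contains a character outside that subset.
int expected_arg_count(const std::string &format) {
    const std::string one_arg_units = "ilfdsO";
    int count = 0;
    for (char c : format) {
        if (c == ':' || c == ';') break;  // function name / message follows
        if (c == '|') continue;           // remaining arguments are optional
        if (one_arg_units.find(c) == std::string::npos) return -1;
        ++count;
    }
    return count;
}

// True if a call passing `actual_args` output pointers matches the format.
bool call_matches_format(const std::string &format, int actual_args) {
    int expected = expected_arg_count(format);
    return expected >= 0 && expected == actual_args;
}
```

For example, under these assumptions a call using the format "is|O:func"
would be expected to pass three output pointers after the format string.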

Note that the deadline for submitting proposals to the official GSoC
website is April 4 - 18:00 UTC (i.e. this coming Tuesday) and that
Goo

Re: GSoC: want to take part in `Extend the static analysis pass for CPython Extension`

2023-04-02 Thread David Malcolm via Gcc
On Sat, 2023-04-01 at 20:32 +, Sun Steven via Gcc wrote:
> Hello,

Hi!

I just replied to your other email in the "[GSoC] Interest and initial
proposal for project on reimplementing cpychecker as -fanalyzer plugin"
thread.

> 
> I want to take part in this project.
> 
> b. Write a plugin to add checking for usage of the CPython API (e.g.
> reference-counting); see
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107646
> 
> 
> I know the deadline is arriving, but this idea just came to me now.

Indeed; the deadline for submitting proposals to the official GSoC
website is April 4 - 18:00 UTC (i.e. this coming Tuesday); see:
https://developers.google.com/open-source/gsoc/timeline

Google are very strict about that deadline.

> 
> Self-intro:
> I am a fan of C++, and have expertise in writing low-latency code. I
> previously worked at a high-frequency trading company, mainly writing
> C++ and Python on Linux.
> 
> Familiarity with GCC:
> I have an overall idea of how the compiler works. I have debugged
> several GCC C++ frontend bugs (e.g. 108218, 99686, 99019, ...).

Thanks; I just took a look at those.


> But I have only studied the C++ frontend code in detail, not the
> middle-end or backend code. I am able to work with large codebases.
> 
> Familiarity with CPython:
> I use CPython a lot. Recently, I have been contributing to the CPython
> interpreter for PEP 701 (mainly on the parser, which I am familiar
> with).
> 
> 
> I have always wanted to contribute major changes to GCC, but did not
> know whether a suitable project existed. I understand how the middle
> end works, but have never really worked with GIMPLE. This project would
> let me take a real look at how GCC's middle end works.

Given your knowledge of both C++ and of CPython internals, this project
sounds like a good way for you to get involved.

> 
> I want to know if anyone is already working on this project. I would
> prefer a large-sized project (350hrs).

I see you've already posted to the thread Eric started.

> 
> If b. was already taken, I also accept a. and c. 

I had to check the wiki page to see which ones (a) and (c) were; 

(a) is "Add format-string support to -fanalyzer."

(c) is "Add a checker for some API or project of interest to the
contributor (e.g. the Linux kernel, a POSIX API that we're not yet
checking, or something else), either as a plugin, or as part of the
analyzer core for e.g. POSIX."

Do you have specific ideas for (c)?

(a) would make a great project, in that it's reasonably self-contained.
Eric's proposal for (b) plans to eventually tackle it, but there's a
huge amount of potential work in (b) already.

> By the way, I don't really care about the GSoC. If we miss the
> deadline, we can still push forward this project without the support
> of GSoC, as long as I get coached.

I'm keen on helping new GCC contributors, with or without GSoC.  A good
next step is to build GCC from source, and try hacking in a new
warning.  See:
  https://gcc-newbies-guide.readthedocs.io/en/latest/

But remember that the GSoC deadline is April 4 - 18:00 UTC (i.e. this
coming Tuesday), so if you're going to apply, you need to act fast.

Good luck
Dave



Re: [GSoC][Static Analyzer] First proposal draft and a few more questions/requests

2023-04-02 Thread Shengyu Huang via Gcc
Hi Dave,

> Overall, it looks great.  Some notes:
> - maybe specify the *GCC* static analyzer when you first mention it

Done.

> - you talk about "timeout" warnings.  The analyzer already can emit a
> "timeout" warning of sorts, via -Wanalyzer-too-complex, though this is
> based on the complexity of the exploded graph (e.g. # of nodes), rather
> than actual timings.  Is the latter the kind of thing you had in mind,
> or were you thinking about ways of making the "too complex" heuristics
> smarter?  (I confess that you seem much more familiar with the theory
> of this than I am!)

I was not aware of `-Wanalyzer-too-complex` when I wrote that proposal, and I 
forgot to rewrite this part. I planned to ask you why we did not turn on this 
flag by default. To avoid state explosion altogether, it is for sure that we 
need to bear with false positives in some cases. I am not yet sure what is a 
good approach to balance the soundness and completeness in symbolic execution, 
but my intuition (just based on my limited experience with other kinds of 
formal methods) is that we don’t want to avoid state explosion in all cases 
because we want to have more precision (that is, we don’t want too many false 
positives). Imagine a dummy static analyzer that just reports warnings 
regardless of the program. It will not have any state explosion problems, but it 
will have lots of false positives. Therefore, I think we should consider 
turning it on by default. Maybe you have other considerations that I missed?

Another point but irrelevant for this project is that we will surely encounter 
timeout when we integrate SMT solvers in the future (I don’t know whether it is 
the plan for GCC14). It is just unavoidable…the current approach does not sound 
transferable to the timeout issued by, say, Z3. Maybe we want a unified 
approach at some point?

Anyway, this part does not seem too urgent anymore after I know the flag 
-Wanalyzer-too-complex exists…if you have some working solution in terms of how 
to handle timeout from SMT solvers, I’d be happy to know.

> - the numbering of your references seems to have gotten out-of-sync; I
> see references to [3] as a paper "Schwartz et al", but that's a link to
> one of my blog posts.

Thanks for letting me know that. Indeed I forgot to fix the numbering after 
adding your blog to the references.

> - do you have a link to a github account, or somewhere else that
> demonstrates code you've written?  In particular, how is your C++ ?
> 

My Github account is https://github.com/kumom, but I would not post any code 
from my course projects there since it will violate honor code and promote 
plagiarism (I will attach a small? lab project to you in another private 
email). I have taken courses like systems programming and computer 
architecture, where I wrote plenty of C code and some C++ code. For C++, I’ve 
written maybe just a few thousand lines of code. Unfortunately, in all my 
previous jobs as student assistant where I coded mainly in Python and 
TypeScript, my code was neither open source nor owned by me…Now I am working on 
a semester project (on formal verification) using Dafny and a course project 
(on compiler design) using Scala, but I admit they are a bit far from C++. I 
have planned to read Effective C++ after the Easter break before you raised 
this question, but maybe you can recommend something else that you find 
helpful. Since I am relatively familiar with programming language concepts in 
general, I believe I will get more fluent at C++ within a short amount of time 
once I get my hands dirty.

> Note that the deadline for submitting proposals to the official GSoC
> website is April 4 - 18:00 UTC (i.e. this coming Tuesday) and that
> Google are very strict about that deadline; see:
> https://developers.google.com/open-source/gsoc/timeline

Thanks for the reminder. I have kept this in mind and will submit it before the 
deadline.

Best,
Shengyu