Re: GSoC: Working on the static analyzer

2022-01-29 Thread Mir Immad via Gcc
Thank you for the detailed information.

I've been looking into the integer POSIX file descriptor APIs and decided
to write a proof-of-concept checker for them (ignoring errno for now). The
checker tracks the fd returned by open(). It warns if dup() is called with
a closed fd; otherwise it tracks the fd returned by dup(). It also warns if
read() or write() is called on a closed fd. I'm attaching a text file that
lists some C sources and the warnings the static analyzer emits for them.
I've used the diagnostic metadata from sm-file. Is this something that
could also be added to the analyzer?
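
To illustrate the cases described above, a test input might look like the following. This is a hypothetical sketch, not the attached file: the function names are invented, and each function contains a deliberate bug. It is meant to be compiled only, for example with gcc -fanalyzer -c.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical test input: each function contains a deliberate bug of
   the kind the proof-of-concept checker is described as detecting.  */

/* read() on a descriptor that has already been closed.  */
ssize_t read_after_close (const char *path, char *buf, size_t len)
{
  int fd = open (path, O_RDONLY);
  if (fd == -1)
    return -1;
  close (fd);
  return read (fd, buf, len);   /* bug: fd is closed at this point */
}

/* dup() on a descriptor that has already been closed.  */
int dup_after_close (const char *path)
{
  int fd = open (path, O_RDONLY);
  if (fd == -1)
    return -1;
  close (fd);
  return dup (fd);              /* bug: fd is closed at this point */
}
```

At run time both calls would normally fail with EBADF and return -1; the point of a checker is to report the mistake at compile time instead.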

About the fd leak, that's the next thing I'll try to get working. Since
you've mentioned that it could be a GSoC project, this is what I'm going to
focus on.
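
A minimal example of the kind of leak in question, assuming hypothetical helpers named read_header() and read_header_fixed(). In the first function the early return on the short-read path neither closes fd nor stores it anywhere, which is the pattern a leak check would need to flag. The second function shows one possible fix.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical example: returns 0 on success, -1 on failure.  */
int read_header (const char *path, char *buf, size_t len)
{
  int fd = open (path, O_RDONLY);
  if (fd == -1)
    return -1;

  if (read (fd, buf, len) != (ssize_t) len)
    return -1;          /* bug: fd is neither closed nor stored, so it leaks */

  close (fd);
  return 0;
}

/* Same logic, but every path that follows a successful open() closes fd.  */
int read_header_fixed (const char *path, char *buf, size_t len)
{
  int fd = open (path, O_RDONLY);
  if (fd == -1)
    return -1;

  ssize_t n = read (fd, buf, len);
  close (fd);
  return n == (ssize_t) len ? 0 : -1;
}
```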

Regards.



On Wed, Jan 26, 2022 at 7:56 PM David Malcolm wrote:

> On Mon, 2022-01-24 at 01:41 +0530, Mir Immad wrote:
> > Hi, sir.
> >
> > I've been trying to understand the static analyzer's code. I spent most
> > of my time learning the state machine's API. I learned how state
> > machine's on_stmt is supposed to "recognize" specific functions and how
> > on_transition takes a specific tree from one state to another, and how
> > the captured states are used by pending_diagnostics to report the
> > errors. Furthermore, I was able to create a dummy checker that mimicked
> > the behaviour of sm-file's double_fclose and compile GCC with these
> > changes. Is this the right way of learning?
>
> This sounds great.
>
> >
> > You mentioned on the projects page that you would like to add more
> > support for some POSIX APIs. Could you please write (or refer me to) a
> > simple C program that uses such an API, and say what the analyzer
> > should have done, so that I can attempt to add such a checker to the
> > analyzer?
>
> A couple of project ideas:
>
> (i) treat data coming from a network connection as tainted, by somehow
> teaching the analyzer about networking APIs.  Ideally: look at some
> subset of historical CVEs involving network-facing attacks on
> user-space daemons, and find a way to detect them in the analyzer (need
> to find a way to mark the incoming data as tainted, so that the analyzer
> "knows" about the trust boundary - that the incoming data needs to be
> sanitized and treated with extra caution; see
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584372.html
> for my attempts to do this for the Linux kernel).
>
> Obviously this is potentially a huge project, so maybe just picking a
> tiny subset and getting that working as a proof-of-concept would be a
> good GSoC project.  Maybe find an old CVE that someone has written a
> good write-up for, and think about "how could GCC/-fanalyzer have
> spotted it?"
>
> (ii) add leak-detection for POSIX file descriptors: i.e. the integer
> values returned by "open", "dup", etc.  It would be good to have a
> check that the user's code doesn't leak these values e.g. on error-
> handling paths, by failing to close a file-descriptor (and not storing
> it anywhere).  I think that much of this could be done by analogy with
> the sm-file.cc code.
>
>
> >
> > Also, I didn't realize the complexity of adding SARIF when I mentioned
> > it. I'd rather work on adding more checkers.
>
> Fair enough.
>
> Hope the above is constructive.
>
> Dave
>
> >
> > Regards.
> >
> > Mir Immad
> >
> > On Sun, Jan 23, 2022, 11:04 PM Mir Immad wrote:
> >
> > > Hi Sir,
> > >
> > > I've been trying to understand the static analyzer's code. I spent
> > > most of
> > > my time learning the state machine's API. I learned how state
> > > machine's
> > > on_stmt is supposed to "recognize" specific functions and takes a
> > > specific
> > > tree from one state to another, and how the captured states are used
> > > by
> > > pending_diagnostics to report the errors. Furthermore, I was able to
> > > create
> > > a dummy checker that mimicked the behaviour of sm-file's
> > > double_fclose and
> > > compile GCC with these changes. Is this the right way of learning?
> > >
> > > As you've mentioned on the projects page that you would like to add
> > > more
> > > support for some POSIX APIs. Can you please write (or refer me to a)
> > > a
> > > simple C program that uses such an API (and also what the analyzer
> > > should
> > > have done) so that I can attempt to add such a checker to the
> > > analyzer.
> > >
> > > Also, I didn't realize the complexity of adding SARIF when I
> > > mentioned it.
> > > I'd rather work on adding more checkers.
> > >
> > > Regards.
> > > Mir Immad
> > >
> > > On Mon, Jan 17, 2022 at 5:41 AM David Malcolm wrote:
> > >
> > > > On Fri, 2022-01-14 at 22:15 +0530, Mir Immad wrote:
> > > > > Hi David,
> > > > > I've been tinkering with the static analyzer for the last few
> > > > > days. I find the project of adding SARIF output to the analyzer
> > > > > interesting. I'm writing this to let you know that I'm trying to
> > > > > learn the codebase.
> > > > > 

how to get started with contribution

2022-01-29 Thread VAISHNAVI DAYANAND via Gcc
Respected sir/madam,
I am Vaishnavi Andhalkar, a junior undergrad at IIT Roorkee. I have
recently started contributing to open source and am new to it. However, I
am well versed in C++, programming and algorithms, and JavaScript. I would
like to contribute to your organization. Would you please tell me how to
get started?
Hoping to hear from you soon.
Thanks and Regards
Vaishnavi


Re: how to get started with contribution

2022-01-29 Thread David Edelsohn via Gcc
On Sat, Jan 29, 2022 at 10:37 AM VAISHNAVI DAYANAND via Gcc wrote:
>
> Respected sir/madam,
> I am Vaishnavi Andhalkar, a junior undergrad at IIT Roorkee. I have
> recently started contributing to open source and am new to it. However, I
> am well versed in C++, programming and algorithms, and JavaScript. I would
> like to contribute to your organization. Would you please tell me how to
> get started?
> Hoping to hear from you soon.
> Thanks and Regards
> Vaishnavi

Thanks for your interest in GCC.  Welcome!

A good place to start is the GCC Wiki Getting Started page:
https://gcc.gnu.org/wiki/#Getting_Started_with_GCC_Development

You can also browse other recent answers to similar questions in the
archives of this mailing list.

Thanks, David


Re: GSoC: Working on the static analyzer

2022-01-29 Thread David Malcolm via Gcc
On Sat, 2022-01-29 at 20:22 +0530, Mir Immad wrote:
> Thank you for the detailed information.
> 
> I've been looking into the integer POSIX file descriptor APIs and
> decided to write a proof-of-concept checker for them (ignoring errno
> for now). The checker tracks the fd returned by open(). It warns if
> dup() is called with a closed fd; otherwise it tracks the fd returned
> by dup(). It also warns if read() or write() is called on a closed fd.
> I'm attaching a text file that lists some C sources and the warnings
> the static analyzer emits for them. I've used the diagnostic metadata
> from sm-file. Is this something that could also be added to the
> analyzer?

This looks great, and very promising as both new functionality for GCC
13, and as a GSoC 2022 project.

BTW, it looks like you're working with GCC 11, but the analyzer has
changed quite a bit on trunk for GCC 12, so it's worth trying to track
trunk.

I wonder if it's worth checking for attempts to write to a fd that was
opened with O_RDONLY, or the converse?  (I'm not sure, just thinking
aloud - how much state does it make sense to track for a fd?).
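
The O_RDONLY case raised above could be exercised with an input like the following. This is a hypothetical sketch (write_to_readonly is an invented name) and makes no claim about how the analyzer would track the access mode; it only shows the pattern in question.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical test input: writes to a descriptor opened read-only.
   Returns the result of write(), or -1 if open() fails.  */
ssize_t write_to_readonly (const char *path)
{
  int fd = open (path, O_RDONLY);
  if (fd == -1)
    return -1;

  ssize_t n = write (fd, "x", 1);   /* bug: fd was opened with O_RDONLY */
  close (fd);
  return n;
}
```

At run time the write() fails with EBADF, so the mismatch is detectable in principle from the flags passed to open(). The converse case would be a read() on a descriptor opened with O_WRONLY.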

Also, at some point, we're going to have to handle "errno" - but given
that might be somewhat fiddly it's OK to defer that until you're more
familiar with the code.

> 
> About the fd leak, that's the next thing I'll try to get working.
> Since
> you've mentioned that it could be a GSoC project, this is what I'm
> going to
> focus on.

Excellent.

Let me know (via this mailing list) if you have any questions.

Thanks
Dave

Doubts about the cp-demangler non recursive project.

2022-01-29 Thread Krishna Narayanan via Gcc
Respected Sir/Madam,
This is Krishna Narayanan, a beginner in the GCC community. I have been
reading for a while about the non-recursive cp-demangler project and
getting familiar with the basic terminology of the demangler. I would like
to work on it.
The topics I have covered so far include extern "C", c++filt, and the ABI
function abi::__cxa_demangle.
For the non-recursive part, I am going through memory management in C, and
the stack and heap as they relate to function calls. I have also read some
basic theory on stack overflow and tried -fsanitize=address
(AddressSanitizer) on a basic program. I didn't get much clarity from the
sanitizer's output, but I have an overview of it. I am also familiar with
gdb to some extent.

What should my next step be? Am I on the right track in understanding the
concepts? It would be great if I could get some help on which topics I
should study next and how I should approach the implementation in this
project.
Hoping to hear from you soon.
Thanks and Regards,
Krishna Narayanan.


Bisecting

2022-01-29 Thread Søren Holm via Gcc

Hi

I believe I have found some kind of bug in GCC. The target is a
cortex-m7 CPU. I do not have an isolated test case, so I'm thinking
of bisecting GCC between GCC 9.4 and 10.1.


Is there an easy way to do a fast "change - compile - test" cycle,
and how do I do that? All the guides on building GCC use huge
scripts with installs and such. I'm sure the main developers do not do
that.



Thanks

Søren Holm



gcc-11-20220129 is now available

2022-01-29 Thread GCC Administrator via Gcc
Snapshot gcc-11-20220129 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20220129/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-11 revision 9794cf77a9305f5847919748a73bf30c42aeb5b9

You'll find:

 gcc-11-20220129.tar.xz   Complete GCC

  SHA256=a090404fd86e242e3265e059ba5c3a572b128ed2448f38b22cb30e2b567987a8
  SHA1=634ee2483dbebb23fd32edff0ebb93924524e5a2

Diffs from 11-20220122 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Bisecting

2022-01-29 Thread Jonathan Wakely via Gcc
On Sat, 29 Jan 2022, 20:25 Søren Holm via Gcc wrote:

> Hi
>
> I believe I have found some kind of bug in GCC. The target is a
> cortex-m7 CPU. I do not have an isolated test case, so I'm thinking
> of bisecting GCC between GCC 9.4 and 10.1.
>
> Is there an easy way to do a fast "change - compile - test" cycle,
> and how do I do that? All the guides on building GCC use huge
> scripts with installs and such. I'm sure the main developers do not do
> that.
>



https://gcc.gnu.org/wiki/InstallingGCC is not a huge script, it's a very
small number of commands.

You can use git bisect to simplify things, but if you don't have a small
reproducer for the problem then I don't see how you can avoid doing a full
build and install. With a simple reproducer, you can just test using the
cc1 or cc1plus binary in the build tree, without installing anything.



>
> Thanks
>
> Søren Holm
>
>