Re: GSoC: Working on the static analyzer
Thank you for the detailed information.

I've been looking into the integer POSIX file descriptor APIs and decided to write a proof-of-concept checker for them (ignoring errno for now). The checker tracks the fd returned by open(); it warns if dup() is called with a closed fd, and otherwise tracks the fd returned by dup(). It also warns if read() or write() is called on a closed fd. I'm attaching a text file that lists some C sources and the warnings the static analyzer emits for them. I've reused the diagnostic metadata from sm-file. Is this something that could also be added to the analyzer?

As for the fd leak, that's the next thing I'll try to get working. Since you've mentioned that it could be a GSoC project, this is what I'm going to focus on.

Regards.

On Wed, Jan 26, 2022 at 7:56 PM David Malcolm wrote:

> On Mon, 2022-01-24 at 01:41 +0530, Mir Immad wrote:
> > Hi, sir.
> >
> > I've been trying to understand the static analyzer's code. I spent
> > most of my time learning the state machine's API. I learned how the
> > state machine's on_stmt is supposed to "recognize" specific functions,
> > how on_transition takes a specific tree from one state to another,
> > and how the captured states are used by pending_diagnostics to report
> > the errors. Furthermore, I was able to create a dummy checker that
> > mimicked the behaviour of sm-file's double_fclose and compile GCC
> > with these changes. Is this the right way of learning?
>
> This sounds great.
>
> > You've mentioned on the projects page that you would like to add more
> > support for some POSIX APIs. Can you please write (or refer me to) a
> > simple C program that uses such an API (and also what the analyzer
> > should have done) so that I can attempt to add such a checker to the
> > analyzer.
>
> A couple of project ideas:
>
> (i) treat data coming from a network connection as tainted, by somehow
> teaching the analyzer about networking APIs. Ideally: look at some
> subset of historical CVEs involving network-facing attacks on
> user-space daemons, and find a way to detect them in the analyzer (need
> to find a way to mark the incoming data as tainted, so that the
> analyzer "knows" about the trust boundary - that the incoming data
> needs to be sanitized and treated with extra caution; see
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584372.html
> for my attempts to do this for the Linux kernel).
>
> Obviously this is potentially a huge project, so maybe just picking a
> tiny subset and getting that working as a proof-of-concept would be a
> good GSoC project. Maybe find an old CVE that someone has written a
> good write-up for, and think about "how could GCC/-fanalyzer have
> spotted it?"
>
> (ii) add leak-detection for POSIX file descriptors: i.e. the integer
> values returned by "open", "dup", etc. It would be good to have a
> check that the user's code doesn't leak these values e.g. on
> error-handling paths, by failing to close a file-descriptor (and not
> storing it anywhere). I think that much of this could be done by
> analogy with the sm-file.cc code.
>
> > Also, I didn't realize the complexity of adding SARIF when I
> > mentioned it. I'd rather work on adding more checkers.
>
> Fair enough.
>
> Hope the above is constructive.
>
> Dave
>
> > Regards.
> >
> > Mir Immad
> >
> > On Sun, Jan 23, 2022, 11:04 PM Mir Immad wrote:
> >
> > > Hi Sir,
> > >
> > > I've been trying to understand the static analyzer's code. I spent
> > > most of my time learning the state machine's API. I learned how the
> > > state machine's on_stmt is supposed to "recognize" specific
> > > functions and take a specific tree from one state to another, and
> > > how the captured states are used by pending_diagnostics to report
> > > the errors. Furthermore, I was able to create a dummy checker that
> > > mimicked the behaviour of sm-file's double_fclose and compile GCC
> > > with these changes. Is this the right way of learning?
> > >
> > > You've mentioned on the projects page that you would like to add
> > > more support for some POSIX APIs. Can you please write (or refer me
> > > to) a simple C program that uses such an API (and also what the
> > > analyzer should have done) so that I can attempt to add such a
> > > checker to the analyzer.
> > >
> > > Also, I didn't realize the complexity of adding SARIF when I
> > > mentioned it. I'd rather work on adding more checkers.
> > >
> > > Regards.
> > > Mir Immad
> > >
> > > On Mon, Jan 17, 2022 at 5:41 AM David Malcolm wrote:
> > >
> > > > On Fri, 2022-01-14 at 22:15 +0530, Mir Immad wrote:
> > > > > Hi David,
> > > > > I've been tinkering with the static analyzer for the last few
> > > > > days. I find the project of adding SARIF output to the analyzer
> > > > > interesting. I'm writing this to let you know that I'm trying
> > > > > to learn the codebase.
> > > > >
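To make the checks described at the top of this message concrete, here is a minimal sketch of the kind of C input such a checker might be run on. It is not taken from the attached text file or from the GCC testsuite; the function names read_after_close and dup_after_close are hypothetical. Each function misuses a descriptor after close(), which is where a closed-fd checker would be expected to warn. At run time both calls fail with EBADF, so the functions simply return that errno value.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical example inputs; the names are illustrative only. */

/* Calls read() on a descriptor after it has been closed.
   Returns the errno set by read(), 0 if read() did not fail,
   or -1 if open() itself failed. */
int read_after_close(const char *path)
{
    char buf[16];
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    close(fd);
    /* A closed-fd checker would be expected to warn here. */
    if (read(fd, buf, sizeof buf) < 0)
        return errno;
    return 0;
}

/* Calls dup() on a descriptor after it has been closed.
   Returns the errno set by dup(), 0 if dup() did not fail,
   or -1 if open() itself failed. */
int dup_after_close(const char *path)
{
    int fd, copy;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    close(fd);
    /* A closed-fd checker would be expected to warn here. */
    copy = dup(fd);
    if (copy < 0)
        return errno;
    close(copy);
    return 0;
}
```

Calling either function with a readable path such as "/dev/null" exercises the misuse; the interesting part for the analyzer is the statement after close(), not the return value.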
how to get started with contribution
Respected sir/madam,

I am Vaishnavi Andhalkar, a junior undergraduate at IIT Roorkee. I have recently started contributing to open source and am new to it, but I am comfortable with C++, programming and algorithms, and JavaScript. I would like to contribute to your organization. Could you please tell me how to get started?

Hoping to hear from you soon.

Thanks and regards,
Vaishnavi
Re: how to get started with contribution
On Sat, Jan 29, 2022 at 10:37 AM VAISHNAVI DAYANAND via Gcc wrote:
>
> Respected sir/madam,
> I am Vaishnavi Andhalkar, a junior undergraduate at IIT Roorkee. I have
> recently started contributing to open source and am new to it, but I am
> comfortable with C++, programming and algorithms, and JavaScript. I would
> like to contribute to your organization. Could you please tell me how to
> get started?
> Hoping to hear from you soon.
> Thanks and regards,
> Vaishnavi

Thanks for your interest in GCC. Welcome!

A good place to start is the GCC Wiki Getting Started page:

https://gcc.gnu.org/wiki/#Getting_Started_with_GCC_Development

It is also worth browsing recent answers to similar questions in the archives of this mailing list.

Thanks, David
Re: GSoC: Working on the static analyzer
On Sat, 2022-01-29 at 20:22 +0530, Mir Immad wrote:
> Thank you for the detailed information.
>
> I've been looking into the integer posix file descriptor APIs and I
> decided to write proof-of-concept checker for them. (not caring about
> errno). The checker tracks the fd returned by open(), warns if dup() is
> called with closed fd otherwise tracks the fd returned by dup(), it
> also warns if read() and write() functions were called on closed fd.
> I'm attaching a text file that lists some c sources and warnings by the
> static analyzer. I've used the diagnostic meta-data from sm-file. Is
> this something that could also be added to the analyzer?

This looks great, and very promising as both new functionality for GCC 13, and as a GSoC 2022 project.

BTW, it looks like you're working with GCC 11, but the analyzer has changed quite a bit on trunk for GCC 12, so it's worth trying to track trunk.

I wonder if it's worth checking for attempts to write to a fd that was opened with O_RDONLY, or the converse? (I'm not sure, just thinking aloud - how much state does it make sense to track for a fd?)

Also, at some point, we're going to have to handle "errno" - but given that might be somewhat fiddly it's OK to defer that until you're more familiar with the code.

> About the fd leak, that's the next thing I'll try to get working.
> Since you've mentioned that it could be a GSoC project, this is what
> I'm going to focus on.

Excellent. Let me know (via this mailing list) if you have any questions.

Thanks
Dave
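As an illustration of the two ideas discussed in this reply (the fd leak on an error-handling path, and writing to a descriptor opened with O_RDONLY), here is a small hedged sketch in C. It is not code from the thread; the names first_byte_leaky and write_to_readonly are hypothetical, and the comments mark where a checker of the kind discussed might be expected to report something.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical example inputs; the names are illustrative only. */

/* Returns the first byte of the file, or -1 on error.
   The early return after a failed or empty read() leaks fd:
   it is neither closed nor stored anywhere. */
int first_byte_leaky(const char *path)
{
    unsigned char c;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (read(fd, &c, 1) != 1)
        return -1;      /* A leak checker could report fd here. */
    close(fd);
    return c;
}

/* Attempts to write to a descriptor opened with O_RDONLY.
   Returns the errno set by write(), 0 if write() did not fail,
   or -1 if open() itself failed. */
int write_to_readonly(const char *path)
{
    int fd;
    int err = 0;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* An access-mode check could warn here: fd was opened read-only. */
    if (write(fd, "x", 1) < 0)
        err = errno;
    close(fd);
    return err;
}
```

The leak in first_byte_leaky only occurs on the error path, which is why it is easy to miss in review; fixing it means calling close(fd) before the early return.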
Doubts about the cp-demangler non recursive project.
Respected Sir/Madam,

This is Krishna Narayanan, a beginner in the GCC community. I have been reading for a while about the cp-demangler non-recursive project and getting familiar with the basic demangler terminology, and I would like to work on it.

The topics I have covered so far include extern "C", c++filt, and the ABI function abi::__cxa_demangle. For the non-recursive part, I am going through memory management in C and how the stack and heap relate to function calls. I have also read some basic theory on stack overflow and tried -fsanitize=address (AddressSanitizer) on a basic program. I did not fully understand the sanitizer's output, but I got an overview of it. I am also familiar with gdb to some extent.

What should my next step be? Am I on the right track in understanding the concepts? It would be great to get some guidance on which topics I should study next and how I should approach the implementation in this project.

Hoping to hear from you soon.

Thanks and Regards,
Krishna Narayanan.
Bisecting
Hi

I believe I have found some kind of bug in GCC. The target is a Cortex-M7 CPU. I do not have an isolated test case, so I'm thinking of bisecting GCC between GCC 9.4 and 10.1.

Is there an easy way to do a fast "change - compile - test" cycle, and how do I do that? All the guides on building GCC use huge scripts with installs and such. I'm sure the main developers do not do that.

Thanks

Søren Holm
gcc-11-20220129 is now available
Snapshot gcc-11-20220129 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20220129/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-11 revision 9794cf77a9305f5847919748a73bf30c42aeb5b9

You'll find:

 gcc-11-20220129.tar.xz               Complete GCC

  SHA256=a090404fd86e242e3265e059ba5c3a572b128ed2448f38b22cb30e2b567987a8
  SHA1=634ee2483dbebb23fd32edff0ebb93924524e5a2

Diffs from 11-20220122 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
Re: Bisecting
On Sat, 29 Jan 2022, 20:25 Søren Holm via Gcc, wrote:

> Hi
>
> I believe I have found some kind of bug in GCC. The target is a
> Cortex-M7 CPU. I do not have an isolated test case, so I'm thinking
> of bisecting GCC between GCC 9.4 and 10.1.
>
> Is there an easy way to do a fast "change - compile - test" cycle,
> and how do I do that? All the guides on building GCC use huge
> scripts with installs and such. I'm sure the main developers do not do
> that.
>

https://gcc.gnu.org/wiki/InstallingGCC is not a huge script, it's a very small number of commands.

You can use git bisect to simplify things, but if you don't have a small reproducer for the problem then I don't see how you can avoid doing a full build and install.

With a simple reproducer, you can just test using the cc1 or cc1plus binary in the build tree, without installing anything.

>
> Thanks
>
> Søren Holm
>