Hi Eric and Dave,

I did not make it to the GSoC program. I am not surprised.
In this email, I would like to share some thoughts on this project with Eric and pose some questions to Dave.

In the past month, I have been active in the CPython community, and I have now been nominated as a triage member:
https://github.com/python/core-workflow/issues/503

I also took a look at how the GCC extension mechanism and the analyzer work, and I have a basic idea of how this project should work.

Questions:

1. Where should this project (cpychecker) reside? Since it is an extension, it may live outside of the GCC project. But it currently also relies on some internal headers of the analyzer. If it lives outside, making the analyzer's internal headers stable for public use would be the best choice.

2. Where do people in GCC discuss development plans or new ideas? In other large projects, I have observed people discussing such things in a forum. I emailed one of the contributors, who replied that this mailing list would be such a place, as well as the IRC channel. But this mailing list is less active than the project itself. I guess most discussions happen on the `gcc-patches` mailing list.

Thoughts/Experiences/Advice (to Eric):

1. Plugins

GCC has a plugin mechanism:
https://gcc.gnu.org/wiki/plugins

If you provide a shared library, the compiler loads your library and calls your initialization function, which initializes your plugin. Your plugin registers some callbacks, and the compiler invokes them later. Specific to the analyzer, you can see this initialization happen in `gcc/analyzer/engine.cc`:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/analyzer/engine.cc;h=a5965c2b8ff048e47d9c1687d5298a11020a5bee;hb=HEAD#l6102

You can try writing a basic "nop" plugin first. You need to include the headers that define the virtual function interfaces.

2. State Machine and Known Functions

As you can see from the interface, the class `plugin_analyzer_init_iface`:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/analyzer/analyzer.h;h=a1619525afaf9322f1ef6d6ec387d6eea70f7c0f;hb=HEAD#l275

you can register two things: state machines and known functions.

State machines are defined in `sm.h`. They provide the core functionality. You can check all the `sm-*.cc` files. For instance, a pointer can be in one of several states, such as malloced or freed; you can read the logic in `sm-malloc.cc`.

Known functions are defined in `analyzer.h`. They give you the ability to do checks on function calls. You can check `kf.cc` for reference implementations.

When completed, this plugin would consist of several `state_machine`s and `known_function`s.

3. Go through the code logic with GDB

I don't know to what extent you have interacted with GCC or whether you have coded in C++. I strongly recommend using gdb; I found it very helpful for debugging. You can step through the code with gdb and set breakpoints anywhere, so you don't need to add debug lines and then recompile. (Once you have tried compiling GCC, you will understand what I am saying.) You can also see the full backtrace, which shows the caller of each function (even where function pointers are used).

You can set breakpoints on all `ana::*` functions using the wildcard character `*`. Then execution will stop at any function related to the analyzer. You can then use `c` to continue.

4. Start with easy issues

You can read David's guide here:
https://gcc-newbies-guide.readthedocs.io/en/latest/index.html

My personal experience is that if you don't know what to do, try solving relevant issues. Merely finding out what caused a bug is enough; fixing it would be a plus. I did this for issues #109190 and #109027, and that is how I came to understand how the analyzer works.

---

I will act more like a reviewer and adviser for this project.

(To Eric:) I can review your code and give you advice.
I will help you more when you are stuck on implementation bugs. CC me on the relevant changes, and I will review them when I am available.

Best,
Steven