Update on CPython Extension Module -fanalyzer plugin development
Hi Dave, Please find an updated WIP patch on reference count checking below. Some parts aren't properly formatted yet; I apologize for that. Since the last WIP patch, the major updates include: - Updated certain areas of the core analyzer to support custom stmt_finder. - A significant revamp of the reference count checking logic. - __analyzer_cpython_dump_refcounts: This dumps reference count related information. Here's an updated look at the current WIP diagnostic we issue: rc3.c:25:10: warning: Expected to have reference count: ‘1’ but ob_refcnt field is: ‘2’ 25 | return list; | ^~~~ ‘create_py_object’: events 1-4 | |4 | PyObject* item = PyLong_FromLong(3); | |^~ | || | |(1) when ‘PyLong_FromLong’ succeeds |5 | PyObject* list = PyList_New(1); | |~ | || | |(2) when ‘PyList_New’ succeeds |.. | 16 | PyList_Append(list, item); | | ~ | | | | | (3) when ‘PyList_Append’ succeeds, moving buffer |.. | 25 | return list; | | | | | | | (4) here | The reference count checking logic for v1 should be almost complete. Currently, it supports situations where the returned object is newly created. It doesn't yet support the other case (i.e., incremented by 1 from what it was previously, if not newly created). This week, I've focused primarily on the reference count checking logic. I plan to shift my focus to refining the diagnostic next. As seen in the placeholder diagnostic message above, I believe we should at least inform the user about the variable name associated with the region that has an unexpected reference count issue (in this case, 'item'). Initially, I suspect the issue might be that: tree reg_tree = model->get_representative_tree (curr_region); returns NULL since curr_region is heap allocated and thus the path_var returned would be: path_var (NULL_TREE, 0); Which means that: refcnt_stmt_finder finder (*eg, reg_tree); always receives a NULL_TREE, causing it to always default to the not_found case. A workaround might be necessary, but I haven't delved too deeply into this yet, so my suspicion could be off. Additionally, I think it would be helpful to show users what the ob_refcnt looks like in each event as well. I'll keep you updated on both these points and welcome any feedback. Best, Eric --- gcc/analyzer/engine.cc| 8 +- gcc/analyzer/exploded-graph.h | 4 +- gcc/analyzer/region-model.cc | 3 + gcc/analyzer/region-model.h | 38 +- .../gcc.dg/plugin/analyzer_cpython_plugin.c | 376 +- 5 files changed, 421 insertions(+), 8 deletions(-) diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc index 61685f43fba..f9e239128a0 100644 --- a/gcc/analyzer/engine.cc +++ b/gcc/analyzer/engine.cc @@ -115,10 +115,12 @@ impl_region_model_context (program_state *state, } bool -impl_region_model_context::warn (std::unique_ptr d) +impl_region_model_context::warn (std::unique_ptr d, +const stmt_finder *custom_finder) { LOG_FUNC (get_logger ()); - if (m_stmt == NULL && m_stmt_finder == NULL) + auto curr_stmt_finder = custom_finder ? custom_finder : m_stmt_finder; + if (m_stmt == NULL && curr_stmt_finder == NULL) { if (get_logger ()) get_logger ()->log ("rejecting diagnostic: no stmt"); @@ -129,7 +131,7 @@ impl_region_model_context::warn (std::unique_ptr d) bool terminate_path = d->terminate_path_p (); if (m_eg->get_diagnostic_manager ().add_diagnostic (m_enode_for_diag, m_enode_for_diag->get_supernode (), - m_stmt, m_stmt_finder, std::move (d))) + m_stmt, curr_stmt_finder, std::move (d))) { if (m_path_ctxt && terminate_path diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h index 4a4ef9d12b4..633f8c263fc 100644 --- a/gcc/analyzer/exploded-graph.h +++ b/gcc/analyzer/exploded-graph.h @@ -56,7 +56,8 @@ class impl_region_model_context : public region_model_context uncertainty_t *uncertainty, logger *logger = NULL); - bool warn (std::unique_ptr d) final override; + bool warn (std::unique_ptr d, +const stmt_finder *custom_finder = NULL) final override; void add_note (std::unique_ptr pn) final override; void on_svalue_leak (const svalue *) override; void on_liveness_change (const svalue_set &live_svalues, @@ -106,6 +107,7 @@ class impl_region_model_context : public region_model_context std::unique_ptr *out_sm_context) override; const gimple *get_stmt
x64 MSABI gcc 3.2.3
I have a slightly modified gcc 3.2.3. Source is available in gcc-stage* in custom.zip at http://pdos.org Executables are available in customb.zip but everything that is really needed is in pdos.zip gccdos.txt has instructions to run windows.mak which produces the gccx64.exe that I use, but has these warnings on the way: C:\devel\gcc\gcc>pdmake -f windows.mak gccwin -DTARGET_64BIT_DEFAULT -DMSABI -fno-common -O2 -S -D__WIN32__ -D__NOBIVA__ -DI386 -DHAVE_CONFIG_H -DIN_GCC -DPUREISO -I ../../pdos/pdpclib -I ../../pdos/src -I . -I config/i386 -I config -I ../include -o combine.s combine.c combine.c: In function `try_combine': combine.c:1682: warning: left shift count >= width of type combine.c:1684: warning: left shift count >= width of type combine.c:1694: warning: left shift count >= width of type combine.c:1694: warning: left shift count >= width of type combine.c:1696: warning: left shift count >= width of type pdas --oformat coff -o combine.o combine.s If I do pdmake -f makefile.te6 in c:\devel\pdos\pdpclib I do get a valid 64-bit EFI executable. But a more complex program gives: C:\devel\pdos\pdpclib>pdmake -f makefile.eb6 rm -f *.o rm -f pdptest.efi gccx64 -S -O2 -mno-red-zone -D__64BIT__ -D__EFI__ -D__EFIBIOS__ -U__WIN32__ -I. -fno-builtin -fno-common -ansi -o pdptest.s pdptest.c pdptest.c: In function `main': pdptest.c:189: insn does not satisfy its constraints: compilation terminated. [pdptest.o] Error 1: Operation not permitted C:\devel\pdos\pdpclib> When I switch from mingw64 to gccx64: C:\devel\pdos\pdpclib>git diff . diff --git a/pdpclib/makefile.eb6 b/pdpclib/makefile.eb6 index 19fe84e4..98b02da0 100644 --- a/pdpclib/makefile.eb6 +++ b/pdpclib/makefile.eb6 @@ -1,14 +1,15 @@ # This builds EFI executables for x86_64 # We put no-builtin to stop the optimizer from making memset call memset -CC=x86_64-w64-mingw32-gcc +#CC=x86_64-w64-mingw32-gcc +CC=gccx64 #AR=x86_64-w64-mingw32-ar AR=ar #LD=x86_64-w64-mingw32-ld LD=pdld --no-insert-timestamp #AS=x86_64-w64-mingw32-as AS=pdas --64 --oformat coff -COPTS=-S -O2 -mno-red-zone -D__64BIT__ -D__EFI__ -D__EFIBIOS__ -U__WIN32__ -I. -fno-builtin -fno-common -ansi -Wno-builtin-declaration-mismatch +COPTS=-S -O2 -mno-red-zone -D__64BIT__ -D__EFI__ -D__EFIBIOS__ -U__WIN32__ -I. -fno-builtin -fno-common -ansi TARGET=pdptest.efi OBJS=efistart.o stdio.o string.o stdlib.o start.o time.o errno.o \ C:\devel\pdos\pdpclib> Anyone interested in helping out with this? Note that this (3.2.3) is the only version of gcc that runs on PDOS/386 and it took a lot of effort to even do that. Note that 3.2.3 normally wouldn't do the MSABI, but my slightly modified version does: #ifdef MSABI static int const x86_64_int_parameter_registers[6] = {2 /*RCX*/, 1 /*RDX*/, 8 /*R8*/, 9 /*R9*/, /* the following 2 need to be deleted */ FIRST_REX_INT_REG + 2 /*R10 */, FIRST_REX_INT_REG + 3 /*R11 */}; #else static int const x86_64_int_parameter_registers[6] = {5 /*RDI*/, 4 /*RSI*/, 1 /*RDX*/, 2 /*RCX*/, FIRST_REX_INT_REG /*R8 */, FIRST_REX_INT_REG + 1 /*R9 */}; #endif My version (like standard 3.2.3) has 64-bit long, which is not what Windows uses, but I am likely to stick with 64-bit long regardless. Thanks. Paul.
Re: Update on CPython Extension Module -fanalyzer plugin development
On Fri, 2023-08-25 at 08:50 -0400, Eric Feng wrote: > Hi Dave, > > Please find an updated WIP patch on reference count checking below. Some > parts aren't properly formatted yet; I apologize for that. > > Since the last WIP patch, the major updates include: > - Updated certain areas of the core analyzer to support custom stmt_finder. > - A significant revamp of the reference count checking logic. > - __analyzer_cpython_dump_refcounts: This dumps reference count related > information. Thanks for the patch. Given the scope, this is close to being ready for trunk; various comments inline below... > > Here's an updated look at the current WIP diagnostic we issue: > rc3.c:25:10: warning: Expected to > have reference count: ‘1’ but ob_refcnt field is: ‘2’ > 25 | return list; > | ^~~~ > ‘create_py_object’: events 1-4 > | > | 4 | PyObject* item = PyLong_FromLong(3); > | | ^~ > | | | > | | (1) when ‘PyLong_FromLong’ succeeds > | 5 | PyObject* list = PyList_New(1); > | | ~ > | | | > | | (2) when ‘PyList_New’ succeeds > |.. > | 16 | PyList_Append(list, item); > | | ~ > | | | > | | (3) when ‘PyList_Append’ succeeds, moving buffer > |.. > | 25 | return list; > | | > | | | > | | (4) here > | > Looks good. As mentioned in our chat, ideally there would be some kind of information to the user as to what the reference actually are. Perhaps it could be a series of notes like this: note: there is 1 pointer to "*item" note: (1) '*item' is pointed to by '((PyListObject *)list)->ob_item[0]' Or, to be even more fancy, we could add events to the execution path whenever the number of pointers or ob_refcnt changes, giving something like: (1) when ‘PyLong_FromLong’ succeeds (2) '(*item).ob_refcnt' is now 1 (3) when ‘PyList_New’ succeeds (4) '*item' is now pointed to by '((PyListObject *)list)->ob_item[0]'; pointer count is 2, ob_refcnt is 1 (5) when 'create_py_object' returns, '(*item).ob_refcnt' is 1 ...or somesuch (I'm brainstorming here, and I expect that the above might not be implementable, due to local variables and SSA names confusing things). Plus there's the question of whether the return value "owns" a reference. But all of that can be deferred to follow-up work. > The reference count checking logic for v1 should be almost complete. > Currently, it supports situations where the returned object is newly created. > It doesn't yet support the other case (i.e., incremented by 1 from > what it was previously, if not newly created). > > This week, I've focused primarily on the reference count checking logic. I > plan > to shift my focus to refining the diagnostic next. As seen in the placeholder > diagnostic message above, I believe we should at least inform the user about > the > variable name associated with the region that has an unexpected reference > count > issue (in this case, 'item'). Initially, I suspect the issue might be that: > > tree reg_tree = model->get_representative_tree (curr_region); > > returns NULL since curr_region is heap allocated and thus the path_var > returned would be: > > path_var (NULL_TREE, 0); > > Which means that: > > refcnt_stmt_finder finder (*eg, reg_tree); > > always receives a NULL_TREE, causing it to always default to the > not_found case. A workaround might be necessary, but I haven't delved too > deeply into this yet, > so my suspicion could be off. > You could try looking for a representative tree for the pointer, rather than the region. I ran into similar issues with leak detection: we report a leak when nothing is pointing at the region, but if nothing is pointing at the region, then there's nothing to label/describe the region with. A workaround I've used is to look at the old model immediately before the state change, rather than the new one. So for example, use the region_model * for immediately before popping the frame (if it's available; you might need to make a copy?), and do something like: tree expr = old_model->get_representative_tree (curr_region); > Additionally, I think it would be helpful to show users what the > ob_refcnt looks like in each event as well. > You could try make a custom_event subclass for such ob_refcnt messages, but adding them could be fiddly. Perhaps diagnostic_manager::add_events_for_eedge could support some kind of plugin hook whenever we add a statement_event, which you could use to add your extra events whenever the ob_refcnt of the object of interest differs between the old and the new state? (though currently we only emit a statement_event for the first stmt in an enode, and that function is already quite messy, a
How can I run xg++ from its build location?
I am attempting to debug an issue in gcc (PR 110827, if curious). In order to do this I have built a stage 1 compiler with debugging and without optimization as discussed here: https://gcc.gnu.org/wiki/DebuggingGCC#Building_a_Debuggable_Compiler I would like run the compiler from its build location without installing it, but I cannot determine how to have gcc look for its include files and libraries from within its source and build trees. My first problem was that it looked for for cc1plus in the wrong location; my next problems involved include paths. Is it possible to do this without extensive command-line options, or does this need to be installed? If the latter, what target do I use to install the unoptimized stage 1 compiler? -- Michael Welsh Duggan (m...@md5i.com)
Re: How can I run xg++ from its build location?
On Fri, Aug 25, 2023 at 4:16 PM Michael Welsh Duggan via Gcc < gcc@gcc.gnu.org> wrote: > I am attempting to debug an issue in gcc (PR 110827, if curious). In > order to do this I have built a stage 1 compiler with debugging and > without optimization as discussed here: > > https://gcc.gnu.org/wiki/DebuggingGCC#Building_a_Debuggable_Compiler > > I would like run the compiler from its build location without installing > it, but I cannot determine how to have gcc look for its include files > and libraries from within its source and build trees. My first problem > was that it looked for for cc1plus in the wrong location; my next > problems involved include paths. > > Is it possible to do this without extensive command-line options, or > does this need to be installed? If the latter, what target do I use to > install the unoptimized stage 1 compiler? > One can use $ ./xgcc -B./ to run GCC in the build directory, or wherever the driver and compiler are installed. To debug the compiler, it is easier to produce a preprocessed file with the header files included (-E command line option) and direct the compiler, e.g, cc1 or cc1plus, to use the preprocessed file as input. David
Re: How can I run xg++ from its build location?
Michael Welsh Duggan via Gcc writes: > I am attempting to debug an issue in gcc (PR 110827, if curious). In > order to do this I have built a stage 1 compiler with debugging and > without optimization as discussed here: > > https://gcc.gnu.org/wiki/DebuggingGCC#Building_a_Debuggable_Compiler > > I would like run the compiler from its build location without installing > it, but I cannot determine how to have gcc look for its include files > and libraries from within its source and build trees. My first problem > was that it looked for for cc1plus in the wrong location; my next > problems involved include paths. > You can try something like: ./xgcc -B$(pwd) -I$(pwd) ... # (may need to add some more include directives in) I know someone was interested in adding a proper wrapper for this so it Just Worked. But for now, I'd really just... > Is it possible to do this without extensive command-line options, or > does this need to be installed? If the latter, what target do I use to > install the unoptimized stage 1 compiler? Is there a particular reason you can't install it to a temporary prefix, like ./configure --prefix=/opt/foo or even /tmp/foo?
Re: How can I run xg++ from its build location?
On Fri, 2023-08-25 at 16:12 -0400, Michael Welsh Duggan via Gcc wrote: > I am attempting to debug an issue in gcc (PR 110827, if curious). In > order to do this I have built a stage 1 compiler with debugging and > without optimization as discussed here: > > https://gcc.gnu.org/wiki/DebuggingGCC#Building_a_Debuggable_Compiler > > I would like run the compiler from its build location without > installing > it, but I cannot determine how to have gcc look for its include files > and libraries from within its source and build trees. My first > problem > was that it looked for for cc1plus in the wrong location; my next > problems involved include paths. > > Is it possible to do this without extensive command-line options, or > does this need to be installed? If the latter, what target do I use > to > install the unoptimized stage 1 compiler? > You could try: ./xgcc -B. REST OF GCC OPTIONS -wrapper gdb,--args where "./xgcc -B." will run the locally-built "driver" binary (xgcc), and make it look for things like "cc1" in ".". You may find this guide on debugging gcc helpful: https://gcc-newbies-guide.readthedocs.io/en/latest/debugging.html Dave
gcc-12-20230825 is now available
Snapshot gcc-12-20230825 is now available on https://gcc.gnu.org/pub/gcc/snapshots/12-20230825/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 12 git branch with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-12 revision 8443b6abd7bc7e2f3ed28a0a1d8ebdde9da31ff3 You'll find: gcc-12-20230825.tar.xz Complete GCC SHA256=9e0c6207e540eb9904cde0bab9be183b9939207ce5d2d02454240acd846167c5 SHA1=6eeea7e189857cf218fe09a2b1ab0ef7dc9d2887 Diffs from 12-20230818 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-12 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.