GSoC: Extending the Static Analysis Pass
Hi everyone, Hi David, I'm interested in extending the static analysis pass as a GSoC project. Short introduction of me: I'm Tim, currently doing my master in computer science with focus on IT security at TU Darmstadt. I already worked with IFDS as part of my bachelor thesis and took both program analysis courses in my masters. Specifically, I thought about extending the analyzer to check new things i.e. the POSIX file-descriptor project. I would prefer a medium-sized project, do you think this is doable? Also, I've read a bit through the internal documentation and got a question. The documentation mostly mentions the Reps paper as the source for the exploded supergraph. But the paper suggests more such as having extensional summaries that lead to the same context sensitivity as unbounded call-strings. In contrast, the documentation also talks about being call-strings limited and searching for complex-enough methods to be worth summarizing. Do you only use the exploded supergraph idea but not the rest? Otherwise why do you use call-strings and search for methods worth summarizing? Best regards Tim
GSoC
Hi everyone, my name is Tim and I'm also working on the static analyzer this summer. Some of you might already noticed my nooby questions in the IRC ;). Specifically, I'll be working on extending the analyzer with several smaller warnings that the clang analyzer already has. David created a meta-bug[0] with the results of the discussion between him and me about the gap and what seems to be useful. I won't do all of those but rather look how many of them I'm able to get done until September. I will begin with a Cast Size warning. This emits a warning when the tracked allocation size is not a multiple of the pointee's size, e.g., when casting malloc(10) to int*. Furthermore, in preparation for the official coding phase, I played around a bit with a state machine that tracks whether an int is zero or not. So this is probably my next candidate after cast size. - Tim [0] https://gcc.gnu.org/bugzilla/showdependencytree.cgi?id=105887&hide_resolved=1
fanalyzer: debugging zero state machine
> On Mi, Jun 8 2022 at 11:12:52 -0400, David Malcolm wrote: > > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote: > > > > Hi Dave, > > > > I did spent some time to think about the zero state machine. I first > > thought about distinguishing between "assigned zero" and "EQ 0 > > condition on the path" for cases where e.g. unreachable() is used to > > say that some variable will never be zero according to the programmer. > > In that case the dev might not want zero warnings from conditions > > outside the if body itself for dev build where unreachable expands to > > some panic exit. But as the condition constraint is not pruned when the > > state machine is distinguishing the states, I'm not sure how to know > > whether the analysis already left the if body? > > The analyzer works on the gimple-ssa representation, which uses basic > blocks in a CFG, rather than an AST, so the only remants we have of > scope is in "clobber" statements (a special kind of assignment stmt), > which the gimplify adds as variables go out of scope. If the constraints only lived until the immediate dominator of `if (cond)`, I could easily distinguish: 1. if (x == 0) && still inside the if => zero 2. if (x == 0) && outside if => maybe zero but as this seems to be not the case, if I want to distinguish 1. & 2., I'd have to find another way. > > For pruning, the analyzer's state_machine class has a "can_purge_p" > virtual function: > > /* Return true if it safe to discard the given state (to help > when simplifying state objects). > States that need leak detection should return false. */ > virtual bool can_purge_p (state_t s) const = 0; > > which should return true for a "zeroness" state machine, in that we > always consider pruning states for svalues that aren't needed anymore > along a path. Is implemented and returns true. > > Is there some other kind of state explosion you're running into? It's > hard to figure this out further without seeing code. No, my code is by far not that mature to be tested. I just had in my head that I wanted to find out if I can distinguish the two cases. > > > > Also, while trying out different things, it seems simple assignments on > > phi like here > > int x; > > if (argc == 1) { > > x = 1; // x_5 > > } else { > > x = 0; // x_4 > > } > > printf("%i", 5 / x); // x_2 > > automatically work such that x_2 already inherits the state from > > x_4/x_5 without me doing anything inside my sm's on_phi function. Same > > for the simple y = 0; x = y; case. Where does this happen inside the > > code? > > With the caveat that I'm seeing your code, what's probably happening is > that we have either: > > BB (a): > x_5 = 1; > goto BB (c); > > BB (b): > x_4 = 0; > goto BB (c); > > BB (c): > x_2 = PHI (x_5 from (a), x_4 from (b)); I compiled it with -g, so this one is like the dumped gimple. > > or (at higher optimization levels): > > BB (a): > goto BB (c); > > BB (b): > goto BB (c); > > BB (c): > x_2 = PHI (1 from (a), 0 from (b)); > > and I think that at the phi node we have region_model::handle_phi, > which is setting x_2 to either the constant 1 or the constant 0 in the > store, and is calling the on_phi vfunc, leading to on_phi being called > for all state machines. Thanks, that is the case. The set_value inside handle_phi seems to this for me. > > BTW, are you implementing an override for this vfunc: > virtual state_machine::state_t get_default_state (const svalue *) > const; > > to capture the inherently known zeroness/nonzeroness of constant_svalue > instances? That would make those constants have that state. Yeah, I saw that on your nullness check. I tried it on a small example with and without, but didn't noticed a difference in warnings (except for not having zero(x_4) inside the supergraph.dot). So if I understood this right, this is just to have one state less for that variable/value[0]? If that is right, is it also favorable to "merge" the stop state and non_zero state inside the zero state machine because - for now - there is no warning planned on non-zero values? [0] less states are favorable because then the analyzer maybe has less different enodes to visit and thus less runtime(?) > > Thanks > Dave > > > - Tim Also another question unrelated to the ones before. I do have a weird bug in my zero sm[1] but I'm unsure where my sm is flawed. Take for example, the probably simplest case: int x = 0;h printf("%d", 42 / x); If I use inform to emit a no
Re: fanalyzer: debugging zero state machine
On Do, Jun 9 2022 at 13:40:06 -0400, David Malcolm wrote: On Thu, 2022-06-09 at 16:49 +0200, Tim Lange wrote: > On Mi, Jun 8 2022 at 11:12:52 -0400, David Malcolm wrote: > > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote: > > > > Hi Dave, Hi Tim; various responses inline below... > > > > I did spent some time to think about the zero state machine. I first > > thought about distinguishing between "assigned zero" and "EQ 0 > > condition on the path" for cases where e.g. unreachable() is used to > > say that some variable will never be zero according to the programmer. > > In that case the dev might not want zero warnings from conditions > > outside the if body itself for dev build where unreachable expands to > > some panic exit. But as the condition constraint is not pruned when the > > state machine is distinguishing the states, I'm not sure how to know > > whether the analysis already left the if body? > > The analyzer works on the gimple-ssa representation, which uses basic > blocks in a CFG, rather than an AST, so the only remants we have of > scope is in "clobber" statements (a special kind of assignment stmt), > which the gimplify adds as variables go out of scope. If the constraints only lived until the immediate dominator of `if (cond)`, I could easily distinguish: 1. if (x == 0) && still inside the if => zero 2. if (x == 0) && outside if => maybe zero but as this seems to be not the case, if I want to distinguish 1. & 2., I'd have to find another way. > > For pruning, the analyzer's state_machine class has a "can_purge_p" > virtual function: > > /* Return true if it safe to discard the given state (to help > when simplifying state objects). > States that need leak detection should return false. */ > virtual bool can_purge_p (state_t s) const = 0; > > which should return true for a "zeroness" state machine, in that we > always consider pruning states for svalues that aren't needed anymore > along a path. Is implemented and returns true. > > Is there some other kind of state explosion you're running into? It's > hard to figure this out further without seeing code. No, my code is by far not that mature to be tested. I just had in my head that I wanted to find out if I can distinguish the two cases. > > > > Also, while trying out different things, it seems simple assignments on > > phi like here > > int x; > > if (argc == 1) { > > x = 1; // x_5 > > } else { > > x = 0; // x_4 > > } > > printf("%i", 5 / x); // x_2 > > automatically work such that x_2 already inherits the state from > > x_4/x_5 without me doing anything inside my sm's on_phi function. Same > > for the simple y = 0; x = y; case. Where does this happen inside the > > code? > > With the caveat that I'm seeing your code, what's probably happening is > that we have either: > > BB (a): > x_5 = 1; > goto BB (c); > > BB (b): > x_4 = 0; > goto BB (c); > > BB (c): > x_2 = PHI (x_5 from (a), x_4 from (b)); I compiled it with -g, so this one is like the dumped gimple. > > or (at higher optimization levels): > > BB (a): > goto BB (c); > > BB (b): > goto BB (c); > > BB (c): > x_2 = PHI (1 from (a), 0 from (b)); > > and I think that at the phi node we have region_model::handle_phi, > which is setting x_2 to either the constant 1 or the constant 0 in the > store, and is calling the on_phi vfunc, leading to on_phi being called > for all state machines. Thanks, that is the case. The set_value inside handle_phi seems to this for me. > > BTW, are you implementing an override for this vfunc: > virtual state_machine::state_t get_default_state (const svalue *) > const; > > to capture the inherently known zeroness/nonzeroness of constant_svalue > instances? That would make those constants have that state. Yeah, I saw that on your nullness check. I tried it on a small example with and without, but didn't noticed a difference in warnings (except for not having zero(x_4) inside the supergraph.dot). So if I understood this right, this is just to have one state less for that variable/value[0]? The states are stored in sm_state_map using a mapping from svalue to state. Given e.g. a parameter, this will be "initial_svalue (parm)", but in the above case where x_4 either has value 1 or has value 0, it's not storing a state for x_4; it's storin
[RFC] analyzer: allocation size warning
allocation-size. * sm-malloc.cc (class dubious_allocation_size): New pending_diagnostic subclass. (capacity_compatible_with_type): New. (const_operand_in_sval_p): New. (struct_or_union_with_inheritance_p): New. (check_capacity): New. (malloc_state_machine::on_stmt): Add calls to on_pointer_assignment. (malloc_state_machine::on_allocator_call): Add node to parameters and call to on_pointer_assignment. (malloc_state_machine::on_pointer_assignment): New. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/attr-malloc-6.c: Disabled Wanalyzer-allocation-size and added default implementation for FILE. * gcc.dg/analyzer/capacity-1.c: Added dg directives. * gcc.dg/analyzer/malloc-4.c: Disabled Wanalyzer-allocation-size. * gcc.dg/analyzer/pr96639.c: Disabled Wanalyzer-allocation-size and added default implementation for foo and bar. * gcc.dg/analyzer/allocation-size-1.c: New test. * gcc.dg/analyzer/allocation-size-2.c: New test. * gcc.dg/analyzer/allocation-size-3.c: New test. * gcc.dg/analyzer/allocation-size-4.c: New test. Signed-off-by: Tim Lange --- gcc/analyzer/analyzer.opt | 4 + gcc/analyzer/sm-malloc.cc | 363 +- .../gcc.dg/analyzer/allocation-size-1.c | 54 +++ .../gcc.dg/analyzer/allocation-size-2.c | 44 +++ .../gcc.dg/analyzer/allocation-size-3.c | 48 +++ .../gcc.dg/analyzer/allocation-size-4.c | 39 ++ gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c | 2 + gcc/testsuite/gcc.dg/analyzer/capacity-1.c | 5 +- gcc/testsuite/gcc.dg/analyzer/malloc-4.c | 6 +- gcc/testsuite/gcc.dg/analyzer/pr96639.c | 2 + 10 files changed, 559 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt index 4aea52d3a87..f213989e0bb 100644 --- a/gcc/analyzer/analyzer.opt +++ b/gcc/analyzer/analyzer.opt @@ -78,6 +78,10 @@ Wanalyzer-malloc-leak Common Var(warn_analyzer_malloc_leak) Init(1) Warning Warn about code paths in which a heap-allocated pointer leaks. +Wanalyzer-allocation-size +Common Var(warn_analyzer_allocation_size) Init(1) Warning +Warn about code paths in which a buffer is assigned to a incompatible type. + Wanalyzer-mismatching-deallocation Common Var(warn_analyzer_mismatching_deallocation) Init(1) Warning Warn about code paths in which the wrong deallocation function is called. diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc index 3bd40425919..790c9f0e57d 100644 --- a/gcc/analyzer/sm-malloc.cc +++ b/gcc/analyzer/sm-malloc.cc @@ -46,6 +46,8 @@ along with GCC; see the file COPYING3. If not see #include "attribs.h" #include "analyzer/function-set.h" #include "analyzer/program-state.h" +#include "print-tree.h" +#include "gimple-pretty-print.h" #if ENABLE_ANALYZER @@ -428,6 +430,7 @@ private: get_or_create_deallocator (tree deallocator_fndecl); void on_allocator_call (sm_context *sm_ctxt, + const supernode *node, const gcall *call, const deallocator_set *deallocators, bool returns_nonnull = false) const; @@ -444,6 +447,16 @@ private: void on_realloc_call (sm_context *sm_ctxt, const supernode *node, const gcall *call) const; + void on_pointer_assignment(sm_context *sm_ctxt, + const supernode *node, + const gassign *assign_stmt, + tree lhs, + tree rhs) const; + void on_pointer_assignment(sm_context *sm_ctxt, + const supernode *node, + const gcall *call, + tree lhs, + tree rhs) const; void on_zero_assignment (sm_context *sm_ctxt, const gimple *stmt, tree lhs) const; @@ -1432,6 +1445,117 @@ private: const char *m_funcname; }; +/* Concrete subclass for casts of pointers that lead to trailing bytes. */ + +class dubious_allocation_size : public malloc_diagnostic +{ +public: + dubious_allocation_size (const malloc_state_machine &sm, tree lhs, tree rhs, + tree size_tree, unsigned HOST_WIDE_INT size_diff) + : malloc_diagnostic(sm, rhs), m_type(dubious_allocation_type::CONSTANT_SIZE), + m_lhs(lhs), m_size_tree(size_tree), m_size_diff(size_diff) + {} + + dubious_allocation_size (const malloc_state_machine &sm, tree lhs, tree rhs, + tree size_tree) + : malloc_diagnostic(sm, rhs), m_type(dubious_allocation_type::MISSING_OPERAND), + m_lhs(lhs), m_size_tree(size_tree), m_size_diff(0) + {} + + const char *get_kind () const final override + { + return "dubious_allocation_size"; + } + + int get_controlling_option () const final override + { + return OPT_Wanalyzer_allocation_size; + } + + bool subclass_equal_p (const pending_diagnostic &base_other) const + final override + { + const dubious_allocation_size &other = (const dubious_allocation_size &)base_
[RFC] analyzer: add allocation size warning
I think my mail client did apply auto-wrap and reduced multiple spaces to a single one while doing so. Here again, the full patch as well as the ASCII diagnostics. This should look better now. On constant size allocations: /path/to/allocation-size-3.c:22:14: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */ | ^ ‘test_2’: event 1 | | 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */ | | ^ | | | | | (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing bytes; either the allocated size is bogus or the type on the left-hand side is wrong | On symbolic buffer sizes: /path/to/allocation-size-3.c:33:14: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 33 | int *ptr = malloc (n + sizeof (int)); /* { dg-line malloc3 } */ | ^ ‘test_3’: event 1 | | 33 | int *ptr = malloc (n + sizeof (int)); /* { dg-line malloc3 } */ | | ^ | | | | | (1) Allocation is incompatible with ‘int *’; either the allocated size is bogus or the type on the left-hand side is wrong | And this is how a simple flow looks like: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] /path/to/allocation-size-2.c:39:8: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 39 | int *iptr = (int *)ptr; /* { dg-line assign } */ |^~~~ ‘test_4’: events 1-2 | | 38 | void *ptr = malloc (n * sizeof (short)); /* { dg-message } */ | | ^~~ | | | | | (1) allocated here | 39 | int *iptr = (int *)ptr; /* { dg-line assign } */ | | | || | |(2) ‘ptr’ is incompatible with ‘int *’; either the allocated size at (1) is bogus or the type on the left-hand side is wrong | This patch adds an allocation size checker to the analyzer. The checker warns when the tracked buffer size is not a multiple of the left-hand side pointee's type. This resolves PR analyzer/105900. The patch is not yet fully tested. gcc/analyzer/ChangeLog: * analyzer.opt: Add Wanalyzer-allocation-size. * sm-malloc.cc (class dubious_allocation_size): New pending_diagnostic subclass. (capacity_compatible_with_type): New. (const_operand_in_sval_p): New. (struct_or_union_with_inheritance_p): New. (check_capacity): New. (malloc_state_machine::on_stmt): Add calls to on_pointer_assignment. (malloc_state_machine::on_allocator_call): Add node to parameters and call to on_pointer_assignment. (malloc_state_machine::on_pointer_assignment): New. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/attr-malloc-6.c: Disabled Wanalyzer-allocation-size and added default implementation for FILE. * gcc.dg/analyzer/capacity-1.c: Added dg directives. * gcc.dg/analyzer/malloc-4.c: Disabled Wanalyzer-allocation-size. * gcc.dg/analyzer/pr96639.c: Disabled Wanalyzer-allocation-size and added default implementation for foo and bar. * gcc.dg/analyzer/allocation-size-1.c: New test. * gcc.dg/analyzer/allocation-size-2.c: New test. * gcc.dg/analyzer/allocation-size-3.c: New test. * gcc.dg/analyzer/allocation-size-4.c: New test. Signed-off-by: Tim Lange --- gcc/analyzer/analyzer.opt | 4 + gcc/analyzer/sm-malloc.cc | 363 +- .../gcc.dg/analyzer/allocation-size-1.c | 54 +++ .../gcc.dg/analyzer/allocation-size-2.c | 44 +++ .../gcc.dg/analyzer/allocation-size-3.c | 48 +++ .../gcc.dg/analyzer/allocation-size-4.c | 39 ++ gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c | 2 + gcc/testsuite/gcc.dg/analyzer/capacity-1.c| 5 +- gcc/testsuite/gcc.dg/analyzer/malloc-4.c | 6 +- gcc/testsuite/gcc.dg/analyzer/pr96639.c | 2 + 10 files changed, 559 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt index 4aea52d3a87..f213989e0bb 100644 --- a/gcc/analyzer/analyzer.opt +++ b/gcc/analyzer/analyzer.opt @@ -78,6 +78,10 @@ Wanalyze
Re: [RFC] analyzer: allocation size warning
On Fr, Jun 17 2022 at 22:45:42 +0530, Prathamesh Kulkarni wrote: On Fri, 17 Jun 2022 at 21:25, Tim Lange wrote: Hi everyone, Hi Tim, Thanks for posting the POC patch! Just a couple of comments (inline) Hi Prathamesh, thanks for looking at it. tracked in PR105900 [0], I'd like to add support for a new warning on dubious allocation sizes. The new checker emits a warning when the allocation size is not a multiple of the type's size. With the checker, following mistakes are detected: int *arr = malloc(3); // forgot to multiply by sizeof arr[0] = ...; arr[1] = ...; or int *buf = malloc (n + sizeof(int)); // probably should be * instead of + Because it is implemented inside the analyzer, it also emits warnings when the buffer is first of type void* and later on casted to something else. Though, this also inherits a limitation. The checker can not distinguish 2 * sizeof(short) from sizeof(int) because sizeof is resolved and constants are folded at the point when the analyzer runs. As a mitigation, I plan to implement a check in the frontend that emits a warning if sizeof(lhs pointee type) is not part of the malloc argument. IMHO, warning if sizeof(lhs pointee_type) is not present inside malloc, might not be a good idea because it would reject valid calls to malloc. For eg: (1) size_t size = sizeof(int); int *p = malloc (size); (2) void *p = malloc (sizeof(int)); int *q = p; Hm, that's right. Maybe only warn when there is a sizeof(type) in the argument and the lhs pointee_type != type (except for void*, maybe char* and "inherited" structs)? I'm looking for a first feedback on the phrasing of the diagnostics as well on the preliminary patch [1]. On constant buffer sizes, the warnings looks like this: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */ | ^ ‘test_2’: event 1 | | 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */ | | ^ | | | | | (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing bytes; either the allocated size is bogus or the type on the left-hand side is wrong | On symbolic buffer sizes: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 33 | int *ptr = malloc (n + sizeof(int)); /* { dg-line malloc3 } */ | ^~~~ ‘test_3’: event 1 | | 33 | int *ptr = malloc (n + sizeof(int)); /* { dg-line malloc3 } */ | | ^~~~ | | | | | (1) Allocation is incompatible with ‘int *’; either the allocated size is bogus or the type on the left-hand side is wrong | Won't the warning be incorrect if 'n' is a multiple of sizeof(int) ? I assume by symbolic buffer size, 'n' is not known at compile time. * VLAs are resolved to n * sizeof(type) when the analyzer runs and work fine. * Flows with if (cond) n = ...; else n = ...; are tracked by the analyzer with a widening_svalue and can be handled (While thinking about this answer, I noticed my patch is missing this case. Thanks!) * In case of more complicated flows, the analyzer's buffer size tracking resorts to unknown_svalue. If any variable in an expression is unknown, no warning will be emitted. * Generally, when requesting memory for a variable type, accepting an arbitrary number doesn't sound right. I do warn, e.g. if 'n' is a conjured_svalue (e.g. a from scanf call). I think only the last case could in theory be a false-positive. I've noticed that this is the case when 'n' is guarded by an if making sure n is only a multiple of sizeof(type). In theory, I can fix this case too as the analysis is path-sensitive. Do you know of some other case where 'n' might be an unknown value neither guarded an if condition nor resorted to 'unknown' by a complicated flow but still correct? - Tim Thanks, Prathamesh And this is how a simple flow looks like: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 39 | int *iptr = (int *)ptr; /* { dg-line assign } */ | ^~~~ ‘test_4’: events 1-2 | | 38 | void *ptr = malloc (n * sizeof (short)); /* { dg-message } */ | | ^~~ | | | | | (1) allocated here | 39 | int *iptr = (int *)ptr; /* { dg-line assign } */ | | | | | | | (2) ‘ptr’ is incompatible with ‘int *’; either the allocated size at (1) is bogus or the type on the left-hand side is wrong | There are some things to discuss from my side: * The tests with the "toy re-implementation of CPython's object model&qu
Re: [RFC] analyzer: allocation size warning
On Fri, Jun 17, 2022 at 01:48:09PM -0400, David Malcolm wrote: > On Fri, 2022-06-17 at 17:54 +0200, Tim Lange wrote: > > Hi everyone, > > Hi Tim. > > Thanks for the patch. > > Various comments inline below, throughout... > > > > > tracked in PR105900 [0], I'd like to add support for a new warning on > > dubious allocation sizes. The new checker emits a warning when the > > allocation size is not a multiple of the type's size. With the checker, > > following mistakes are detected: > > int *arr = malloc(3); // forgot to multiply by sizeof > > arr[0] = ...; > > arr[1] = ...; > > or > > int *buf = malloc (n + sizeof(int)); // probably should be * instead > > of + > > Because it is implemented inside the analyzer, it also emits warnings > > when the buffer is first of type void* and later on casted to something > > else. Though, this also inherits a limitation. The checker can not > > distinguish 2 * sizeof(short) from sizeof(int) because sizeof is > > resolved and constants are folded at the point when the analyzer runs. > > As a mitigation, I plan to implement a check in the frontend that emits > > a warning if sizeof(lhs pointee type) is not part of the malloc > > argument. > > > > I'm looking for a first feedback on the phrasing of the diagnostics as > > well on the preliminary patch [1]. > > > > On constant buffer sizes, the warnings looks like this: > > warning: Allocated buffer size is not a multiple of the pointee's size > > [CWE-131] [-Wanalyzer-allocation-size] > > 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */ > > | ^ > > ‘test_2’: event 1 > > | > > | 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } > > */ > > | | ^ > > | | | > > | | (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing > > bytes; either the allocated size is bogus or the type on the left-hand > > side is wrong > > | > > Something strange seems to have happened with the indentation in your > email; the code in the patch seems to me to be strangely indented, and > looking at the archive here: > https://gcc.gnu.org/pipermail/gcc/2022-June/238907.html > I see the same thing, so I think it's a problem with what the mailing > list received, rather than just in my mail client. Maybe something > > FWIW I normally use "git send-email" to send patches. > > The underlinings in the above look strange; I see this in your email: I have resent the patch using git send-email as a reply to my original message. The new message looks properly formatted in the archive: https://gcc.gnu.org/pipermail/gcc/2022-June/238911.html > > warning: Allocated buffer size is not a multiple of the pointee's size > [CWE-131] [-Wanalyzer-allocation-size] >22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */ > | ^ > ‘test_2’: event 1 > | > | 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } > */ > | | ^ > | | | > | | (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing > bytes; either the allocated size is bogus or the type on the left-hand > side is wrong > | > > Should it have been (omitting the dg-line directives for clarity): > > warning: Allocated buffer size is not a multiple of the pointee's size > [CWE-131] [-Wanalyzer-allocation-size] >22 | int *ptr = malloc (10 + sizeof(int)); > |^ > ‘test_2’: event 1 > | > | 22 | int *ptr = malloc (10 + sizeof(int)); > ||^ > ||| > ||(1) Casting a 14 byte buffer to ‘int *’ leaves 2 > trailing bytes; either the allocated size is bogus or the type on the > left-hand side is wrong > | > > ? > > It looks like something somewhere has collapsed repeated whitespace in > the message down to single spaces, which has broken the ASCII art in > your examples, and the indentation in your code. > > > It would probably be helpful for the message to tell the user what > sizeof(*ptr) is, sizeof(int) in this case (much more helpful when it's > a struct) > > Maybe something alike: > > note: a buffer of 14 bytes is allocated... > note: ...but sizeof (int) is 4 bytes... > note: ...leaving 2 trailing bytes for an array of 3 'int's (which would > occupy 12 bytes) > > or somesuch??? > > I'm br
Re: [RFC] analyzer: allocation size warning
On Sat Jun 18, 2022 at 12:13 AM CEST, David Malcolm wrote: > On Fri, 2022-06-17 at 22:23 +0200, Tim Lange wrote: > > On Fri, Jun 17, 2022 at 01:48:09PM -0400, David Malcolm wrote: > > > On Fri, 2022-06-17 at 17:54 +0200, Tim Lange wrote: > > [...snip...] > > > > > > > > I have resent the patch using git send-email as a reply to my original > > message. > > The new message looks properly formatted in the archive: > > https://gcc.gnu.org/pipermail/gcc/2022-June/238911.html > > Thanks; that's *much* more readable. > > > [...snip...] > > > > > > > > > > > > > On symbolic buffer sizes: > > > warning: Allocated buffer size is not a multiple of the pointee's > > > size > > > [CWE-131] [-Wanalyzer-allocation-size] > > > 33 | int *ptr = malloc (n + sizeof(int)); /* { dg-line malloc3 } > > > */ > > > | ^~~~ > > > ‘test_3’: event 1 > > > | > > > | 33 | int *ptr = malloc (n + sizeof(int)); /* { dg-line malloc3 > > > } > > > */ > > > | | ^~~~ > > > | | | > > > | | (1) Allocation is incompatible with ‘int *’; either the > > > allocated size is bogus or the type on the left-hand side is wrong > > > | > > > > > > > > > Is there location information for both the malloc and for the > > > assignment, here? > > > > I'm not sure whether I understand your question but the warning is > > emitted at the gcall* with a ssa var lhs and the call_fndecl on the > > rhs. > > I think that is enough to split that up into "(1) n + sizeof(int) > > allocated here" and "(2) Allocation at (1) is incompatible with..."? > > Probably, yes. > > FWIW I wrote some more notes about the events in my reply to to your > reply to Prathamesh, here: > https://gcc.gnu.org/pipermail/gcc/2022-June/238917.html > > [...snip...] > > > > > > > There are some things to discuss from my side: > > > * The tests with the "toy re-implementation of CPython's object > > > model"[2] fail due to a extra warning emitted. Because the analyzer > > > can't know the calculation actually results in a correct buffer size > > > when viewed as a string_obj later on, it emits a warning, e.g. at > > > line > > > 61 in data-model-5.c. The only mitigation would be to disable the > > > warning for structs entirely. Now, the question is to rather have > > > noise > > > on these cases or disable the warning for structs entirely? > > > > > > Can you post the full warning please? > > > > /path/to/data-model-5.c: In function ‘alloc_obj’: > > /path/to/data-model-5.c:61:31: warning: Allocated buffer size is not a > > multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] > > 61 | base_obj *obj = (base_obj *)malloc (sz); > > | ^~~ > > ‘new_string_obj’: events 1-2 > > | > > | 69 | base_obj *new_string_obj (const char *str) > > | | ^~ > > | | | > > | | (1) entry to ‘new_string_obj’ > > |.. > > | 75 | = (string_obj *)alloc_obj (&str_type, sizeof > > (string_obj) + len + 1); > > | | > > > > | | | > > | | (2) calling ‘alloc_obj’ from > > ‘new_string_obj’ > > | > > +--> ‘alloc_obj’: events 3-4 > > | > > | 59 | base_obj *alloc_obj (type_obj *ob_type, size_t sz) > > | | ^ > > | | | > > | | (3) entry to ‘alloc_obj’ > > | 60 | { > > | 61 | base_obj *obj = (base_obj *)malloc (sz); > > | | ~~~ > > | | | > > | | (4) Allocation is > > incompatible with ‘base_obj *’; either the allocated size is bogus or > > the type on the left-hand side is wrong > > | > > > > > > > > These testcases exhibit a common way of faking inheritance in C, and > > > I > > > think it ought to be possible to support this in the warning. > > >
Re: [RFC] analyzer: allocation size warning
The checker reaches region-model.cc#3083 in my patch with the impl_region_model_context on the 'after' node of create_buffer() but then discards the warning inside impl_region_model_context::warn because m_stmt is null. Even if m_stmt were not be NULL at the 'after' node, my warning would be emitted before the return edge was taken and thus be wrongly indented like shown below: /path/to/.c:10:16: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 10 | int *buf = create_buffer(42); |^ ‘main’: events 1-2 | |9 | int main (int argc, char **argv) { | | ^~~~ | | | | | (1) entry to ‘main’ | 10 | int *buf = create_buffer(42); | |~ | || | |(2) calling ‘create_buffer’ from ‘main’ | +--> ‘create_buffer’: events 3-4 | |4 | void *create_buffer(int n) | | ^ | | | | | (3) entry to ‘create_buffer’ |5 | { |6 | return malloc(n); | |~ | || | |(4) allocated 42 bytes here | ‘main’: event 5 | | 10 | int *buf = create_buffer(42); | |^ | || | |(5) Assigned to ‘int *’ here | The correct warning should be: /path/to/.c:10:16: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 10 | int *buf = create_buffer(42); |^ ‘main’: events 1-2 | |9 | int main (int argc, char **argv) { | | ^~~~ | | | | | (1) entry to ‘main’ | 10 | int *buf = create_buffer(42); | |~ | || | |(2) calling ‘create_buffer’ from ‘main’ | +> ‘create_buffer’: events 3-4 | |4 | void *create_buffer(int n) | | ^ | | | | | (3) entry to ‘create_buffer’ |5 | { |6 | return malloc(n); | |~ | || | |(4) allocated 42 bytes here | ‘main’: event 5 <--+ | | 10 | int *buf = create_buffer(42); | |^ | || | |(5) Assigned to ‘int *’ here | For that, the return edge has to be visited to be part of the emission_path. This is currently not the case as the assignment of the to the caller lhs is handled inside pop_frame, which is transitively called from program_state::on_edge of the 'after' node of the callee. I tried to defer the set_value(caller lhs, ) call to the 'before' node after the return edge but failed to do elegantly. My last try is in the patch commented out with // FIXME. My main problem is that I can not pop the frame and later get the return value easily. Deferring the whole pop_frame to the before node breaks the assumptions inside exploded_graph::get_or_create_node. I don't know what's the best/elegant way of solving this. Is a solution to attach the return svalue to the return edge and then use it later in the PK_BEFORE_SUPERNODE? Signed-off-by: Tim Lange --- gcc/analyzer/analyzer.opt | 4 + gcc/analyzer/checker-path.cc | 12 +- gcc/analyzer/checker-path.h | 2 +- gcc/analyzer/engine.cc| 12 + gcc/analyzer/pending-diagnostic.h | 21 ++ gcc/analyzer/region-model.cc | 322 ++ gcc/analyzer/region-model.h | 4 + .../gcc.dg/analyzer/allocation-size-1.c | 63 .../gcc.dg/analyzer/allocation-size-2.c | 44 +++ .../gcc.dg/analyzer/allocation-size-3.c | 48 +++ .../gcc.dg/analyzer/allocation-size-4.c | 92 + gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c | 2 + gcc/testsuite/gcc.dg/analyzer/malloc-4.c | 2 +- gcc/testsuite/gcc.dg/analyzer/pr96639.c | 2 + 14 files changed, 627 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-3.
[PATCH v2] analyzer: add allocation size checker
Hi, I've addressed most of the points from the review. * The allocation size warning warns whenever region_model::get_capacity returns something useful, i.e. also on statically-allocated regions. * I added a new virtual function to the pending-diagnostic class, so that it is possible to emit a custom region creation description. * The test cases should have a better coverage now. * Conservative struct handling The warning now looks like this: /path/to/main.c:9:8: warning: allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 9 | int *iptr = ptr; |^~~~ ‘main’: events 1-2 | |8 | void *ptr = malloc((long unsigned int)n * sizeof(short)); | | ^ | | | | | (1) allocated ‘(long unsigned int)n * 2’ bytes here |9 | int *iptr = ptr; | | | || | |(2) assigned to ‘int *’ here; ‘sizeof(int)’ is ‘4’ | /path/to/main.c:15:15: warning: allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 15 | int *ptrw = malloc (sizeof (short)); | ^~~ ‘main’: events 1-2 | | 15 | int *ptrw = malloc (sizeof (short)); | | ^~~ | | | | | (1) allocated ‘2’ bytes here | | (2) assigned to ‘int *’ here; ‘sizeof (int)’ is ‘4’ | The only thing I couldn't address was moving the second event toward the lhs or assign token here. I tracked it down till get_stmt_location where it seems that the rhs is actually the location of the statement. Is there any way to get (2) to be focused on the lhs? Otherwise, the patch compiled coreutils, openssh, curl and httpd without any false-positive (but none of them contained a bug found by the checker either). `make check-gcc RUNTESTFLAGS="analyzer.exp"` tests pass and as I just worked on the event splitting, the regression tests are yet to run. - Tim This patch adds an checker that warns about code paths in which a buffer is assigned to a incompatible type, i.e. when the allocated buffer size is not a multiple of the pointee's size. gcc/analyzer/ChangeLog: * analyzer.opt: Added Wanalyzer-allocation-size. * checker-path.cc (region_creation_event::get_desc): Added call to new virtual function pending_diagnostic::describe_region_creation_event. * checker-path.h: Added region_creation_event::get_desc. * diagnostic-manager.cc (diagnostic_manager::add_event_on_final_node): New function. * diagnostic-manager.h: Added diagnostic_manager::add_event_on_final_node. * pending-diagnostic.h (struct region_creation): New event_desc struct. (pending_diagnostic::describe_region_creation_event): Added virtual function to overwrite description of a region creation. * region-model.cc (class dubious_allocation_size): New class. (capacity_compatible_with_type): New helper function. (class size_visitor): New class. (struct_or_union_with_inheritance_p): New helper function. (is_any_cast_p): New helper function. (region_model::check_region_size): New function. (region_model::set_value): Added call to region_model::check_region_size. * region-model.h (class region_model): New function check_region_size. * svalue.cc (region_svalue::accept): Changed to post-order traversal. (initial_svalue::accept): Likewise. (unaryop_svalue::accept): Likewise. (binop_svalue::accept): Likewise. (sub_svalue::accept): Likewise. (repeated_svalue::accept): Likewise. (bits_within_svalue::accept): Likewise. (widening_svalue::accept): Likewise. (unmergeable_svalue::accept): Likewise. (compound_svalue::accept): Likewise. (conjured_svalue::accept): Likewise. (asm_output_svalue::accept): Likewise. (const_fn_result_svalue::accept): Likewise. gcc/ChangeLog: * doc/invoke.texi: Added Wanalyzer-allocation-size. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/pr96639.c: Changed buffer size to omit warning. * gcc.dg/analyzer/allocation-size-1.c: New test. * gcc.dg/analyzer/allocation-size-2.c: New test. * gcc.dg/analyzer/allocation-size-3.c: New test. * gcc.dg/analyzer/allocation-size-4.c: New test. * gcc.dg/analyzer/allocation-size-5.c: New test. Signed-off-by: Tim Lange --- gcc/analyzer/analyzer.opt | 4 + gcc/analyzer/checker-path.cc | 11 +- gcc/analyzer/checker-path.h | 2 +- gcc/analyzer/diagnostic-manager.cc| 61 gcc/analyzer/diagnostic-manager.h | 4 +
Re: [PATCH v2] analyzer: add allocation size checker
On Wed Jun 29, 2022 at 7:39 PM CEST, David Malcolm wrote: > On Wed, 2022-06-29 at 17:39 +0200, Tim Lange wrote: > > > Hi, > > Thanks for the updated patch. > > Overall, looks nearly ready; various nits inline below, throughout... > > > > > I've addressed most of the points from the review. > > * The allocation size warning warns whenever region_model::get_capacity > > returns > > something useful, i.e. also on statically-allocated regions. > > Thanks. Looks like you added test coverage for this in allocation- > size-5.c > > > * I added a new virtual function to the pending-diagnostic class, so > that it > > is possible to emit a custom region creation description. > > * The test cases should have a better coverage now. > > * Conservative struct handling > > > > The warning now looks like this: > > /path/to/main.c:9:8: warning: allocated buffer size is not a multiple of > > the pointee's size [CWE-131] [-Wanalyzer-allocation-size] > > 9 | int *iptr = ptr; > > |^~~~ > > ‘main’: events 1-2 > > | > > |8 | void *ptr = malloc((long unsigned int)n * sizeof(short)); > > | | ^ > > | | | > > | | (1) allocated ‘(long unsigned int)n * 2’ bytes > > here > > |9 | int *iptr = ptr; > > | | > > | || > > | |(2) assigned to ‘int *’ here; ‘sizeof(int)’ is ‘4’ > > | > > Looks great. > > > > > /path/to/main.c:15:15: warning: allocated buffer size is not a multiple of > > the pointee's size [CWE-131] [-Wanalyzer-allocation-size] > >15 | int *ptrw = malloc (sizeof (short)); > > | ^~~ > > ‘main’: events 1-2 > > | > > | 15 | int *ptrw = malloc (sizeof (short)); > > | | ^~~ > > | | | > > | | (1) allocated ‘2’ bytes here > > Looks a bit weird to be quoting a number here; maybe whenever the > expression is just a constant, print it unquoted? (though that could > be fiddly to implement, so can be ignored if it turns out to be) . Isn't the 'q' in '%qE' responsible for quoting. Using '%E' instead if m_expr is an INTEGER_CST works. Otherwise, I've left the quoting on the "'sizeof (int)' is '4'" note. I do think thata looks better than without. /path/to/main.c:16:15: warning: allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 16 | int *ptrw = malloc (21 * sizeof (short)); | ^~~~ ‘main’: events 1-2 | | 16 | int *ptrw = malloc (21 * sizeof (short)); | | ^~~~ | | | | | (1) allocated 42 bytes here | | (2) assigned to ‘int *’ here; ‘sizeof (int)’ is ‘4’ | > > > > | | (2) assigned to ‘int *’ here; ‘sizeof (int)’ is > > ‘4’ > > | > > The only thing I couldn't address was moving the second event toward the > > lhs or > > assign token here. I tracked it down till get_stmt_location where it seems > > that > > the rhs is actually the location of the statement. Is there any way to get > > (2) > > to be focused on the lhs? > > Annoyingly, we've lost a lot of location information by the time the > analyzer runs. > > In theory we could special-case for when we have the def-stmt of the > SSA_NAME that's that default (i.e. initial) value of a VAR_DECL, and if > we see the write is there, we could use the DECL_SOUCE_LOCATION of the > VAR_DECL for the write, so that we'd get: > > | 15 | int *ptrw = malloc (sizeof (short)); > | |^~~~ ^~~ > | || | > | || (1) allocated ‘2’ bytes here > | |(2) assigned to ‘int *’ here; ‘sizeof (int)’ is ‘4’ > | > > which is perhaps slightly more readable. I'm not sure it's worth it > though. Hm, okay. I've left that out for now. > > > > > Otherwise, the patch compiled coreutils, openssh, curl and httpd without any > > false-positive (but none of them contained a bug found by the checker > > either). > > Great. > > > `make check-gcc RUNTESTFLAGS="analyzer.exp"` tests pass and as I just > > worked on > &g
[PATCH v3] analyzer: add allocation size checker [PR105900]
Hi, here's the updated patch that should address all the comments from the v2. - Tim This patch adds an checker that warns about code paths in which a buffer is assigned to a incompatible type, i.e. when the allocated buffer size is not a multiple of the pointee's size. 2022-07-30 Tim Lange gcc/analyzer/ChangeLog: PR analyzer/105900 * analyzer.opt: Added Wanalyzer-allocation-size. * checker-path.cc (region_creation_event::get_desc): Added call to new virtual function pending_diagnostic::describe_region_creation_event. * checker-path.h: Added region_creation_event::get_desc. * diagnostic-manager.cc (diagnostic_manager::add_event_on_final_node): New function. * diagnostic-manager.h: Added diagnostic_manager::add_event_on_final_node. * pending-diagnostic.h (struct region_creation): New event_desc struct. (pending_diagnostic::describe_region_creation_event): Added virtual function to overwrite description of a region creation. * region-model.cc (class dubious_allocation_size): New class. (capacity_compatible_with_type): New helper function. (class size_visitor): New class. (struct_or_union_with_inheritance_p): New helper function. (is_any_cast_p): New helper function. (region_model::check_region_size): New function. (region_model::set_value): Added call to region_model::check_region_size. * region-model.h (class region_model): New function check_region_size. * svalue.cc (region_svalue::accept): Changed to post-order traversal. (initial_svalue::accept): Likewise. (unaryop_svalue::accept): Likewise. (binop_svalue::accept): Likewise. (sub_svalue::accept): Likewise. (repeated_svalue::accept): Likewise. (bits_within_svalue::accept): Likewise. (widening_svalue::accept): Likewise. (unmergeable_svalue::accept): Likewise. (compound_svalue::accept): Likewise. (conjured_svalue::accept): Likewise. (asm_output_svalue::accept): Likewise. (const_fn_result_svalue::accept): Likewise. gcc/ChangeLog: PR analyzer/105900 * doc/invoke.texi: Added Wanalyzer-allocation-size. gcc/testsuite/ChangeLog: PR analyzer/105900 * gcc.dg/analyzer/pr96639.c: Changed buffer size to omit warning. * gcc.dg/analyzer/allocation-size-1.c: New test. * gcc.dg/analyzer/allocation-size-2.c: New test. * gcc.dg/analyzer/allocation-size-3.c: New test. * gcc.dg/analyzer/allocation-size-4.c: New test. * gcc.dg/analyzer/allocation-size-5.c: New test. Signed-off-by: Tim Lange --- gcc/analyzer/analyzer.opt | 4 + gcc/analyzer/checker-path.cc | 11 +- gcc/analyzer/checker-path.h | 2 +- gcc/analyzer/diagnostic-manager.cc| 61 +++ gcc/analyzer/diagnostic-manager.h | 4 + gcc/analyzer/pending-diagnostic.h | 20 + gcc/analyzer/region-model.cc | 370 ++ gcc/analyzer/region-model.h | 2 + gcc/analyzer/svalue.cc| 26 +- gcc/doc/invoke.texi | 14 + .../gcc.dg/analyzer/allocation-size-1.c | 116 ++ .../gcc.dg/analyzer/allocation-size-2.c | 155 .../gcc.dg/analyzer/allocation-size-3.c | 45 +++ .../gcc.dg/analyzer/allocation-size-4.c | 60 +++ .../gcc.dg/analyzer/allocation-size-5.c | 36 ++ gcc/testsuite/gcc.dg/analyzer/pr96639.c | 2 +- 16 files changed, 912 insertions(+), 16 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-5.c diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt index 4aea52d3a87..1d612246a30 100644 --- a/gcc/analyzer/analyzer.opt +++ b/gcc/analyzer/analyzer.opt @@ -54,6 +54,10 @@ The minimum number of supernodes within a function for the analyzer to consider Common Joined UInteger Var(param_analyzer_max_enodes_for_full_dump) Init(200) Param The maximum depth of exploded nodes that should appear in a dot dump before switching to a less verbose format. +Wanalyzer-allocation-size +Common Var(warn_analyzer_allocation_size) Init(1) Warning +Warn about code paths in which a pointer to a buffer is assigned to an incompatible type. + Wanalyzer-double-fclose Common Var(warn_analyzer_double_fclose) Init(1) Warning Warn about code paths in which a stdio FILE can be closed more than once. diff --git a/gcc/analyzer/checker-path.cc b/gcc/analyzer/checker-path.cc index 0133dc94137..953e192cd55 100644 --- a/gcc/analyzer/check
Re: Setting up editors for the GNU/GCC coding style?
On Thu, Jul 28 2022 at 02:46:58 PM -0400, David Malcolm via Gcc wrote: Is there documentation on setting up text editors to work with our coding style? A lot of the next generation of developers aren't using vi or emacs; they's using VS Code, CLion, and other editors. Does anyone have docs on e.g. how to set up VS Code, CLion, etc (IntelliJ ?) to work well on GCC's own code base. FWIW I use Emacs; I've dabbed with VS Code but haven't used it "for real". I did prepare my first patch(es) with vscode. For debugging, I set up vscode to launch gcc with gdbserver as wrapper and then let the vscode debugger to connect to the gdbserver. At first, I tried to get the gnu coding style to work in the hacky way by using tabSize=8 and rebinding tab to 2 spaces but later ditched that because it bothered me more than doing just spaces and replacing 8 spaces with 1 tab before sending the patch. That still wastes time because all files that I didn't touch look ugly unless I temporarily change the tabSize and some comments don't use tabs so I can't just replace all 8 spaces with 1 tab. For reference, my config files for gcc are available at [0]. - Tim [0] https://gist.github.com/timll/1c4c542c7c98e3610c14aec19cdf7e91 I'm hoping to add this to my newbies guide. Thanks Dave
GCC warns on defined behavior with Wrestrict?
Hi everyone, while testing a new buffer overlap and restrict checker in the analyzer, it emitted a warning inside coreutils. During the discussion [0], Paul Eggert posted a link to the current draft of the next C standard [1] with new examples for the definition of 'restrict'. Especially example 3 in 6.7.3.1 is interesting: void h(int n, int * restrict p, int * restrict q, int * restrict r) { int i; for (i = 0; i < n; i++) p[i] = q[i] + r[i]; } The draft says that a call h(100, a, b, b) is *defined* behavior because 'b' is never modified inside the method. That brings up the question how GCC should handle this. At the moment, Wrestrict (and my new Wanalyzer-restrict) warns on defined behavior: /path/to/main.c:70:13: warning: passing argument 3 to ‘restrict’-qualified parameter aliases with argument 4 [-Wrestrict] 70 | h(100, a, b, b); | ^ ~ But finding out that 'q' and 'r' are never modified needs a pass over the callee. Further, when the callee implementation isn't available at the point Wrestrict runs, we couldn't emit a warning at all if we don't want any false positive. I thought maybe a tradeoff would be to keep warning without checking for writes on parameters in the callee but additionally issue a fix-it hint that if the memory location the parameter points to is never written, to add the points-to const type qualifier. The addition of the const qualifier simplifies deducing that the memory location the parameter point to is never written for Wrestrict and already silences the warning. What do you think? - Tim [0] https://lists.gnu.org/archive/html/bug-gnulib/2022-07/msg00066.html [1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912
Usage of the C++ stdlib unordered_map in GCC
Hello, I was preparing a patch for GCC and used the unordered_map from the C++ stdlib in my patch. Later on, I noticed that it is used nowhere else inside GCC except for some files in the go frontend. I wondered, now that building GCC requires a C++11 host compiler, whether there is a consensus on which data structure implementation is preferred. Should I rather use a hash_map instead of an unordered_map or is it on my side to decide which one I choose? - Tim
Re: Usage of the C++ stdlib unordered_map in GCC
On Mi, Aug 31 2022 at 10:35:08 -0400, Jason Merrill via Gcc wrote: Generally we want to use the GCC hash_map because it works with GCC garbage collection (and PCH). Is that not relevant to your patch? Jason The map is only part a short-lived visitor object inside the analyzer and is used to map svalues to other svalues (in most cases <15 kv pairs). I'm new to GCC, so I started of using what I knew. Only later, I have noticed that the unordered_map is used little to nowhere. It is not much effort to change it but I just wondered whether the usage is so low because GCC only lately switched to C++11 or other reasons. The responses answered my question, thanks. - Tim
Re: Debugging C++ frontend using CLion IDE
Hi Berke, I had the same problem last year. Many IDEs don't really work for developing gcc. Most here probably use either emacs or vim. If you want to use an IDE, you might have to do some hacks. The oldschool indentation style of gcc (mix of tab and spaces) is not widely supported. IDEs/Editors that support displaying this identation style are GNOME Builder (but that doesn't support the way gcc is built), Eclipse, VS Code (since Dec 22) and vim/emacs of course. For VS Code, you need to apply 'unexpand' to the source files as it does not support converting 8 spaces to 1 tab automatically. There are plugins for Code to run customs commands on save. To keep your IDE snappy, you should exclude everything that you won't need from the indexing, especially the test directory. At last, for debugging, I had good experiences with launching a gdbserver, i.e. replacing "-wrapper gdb" with "-wrapper gdbserver,localhost:2345". Then you can connect your IDE to that gdbserver and fully use the IDE interface to debug. You can configure running gcc with a gdbserver as a pre-task to automate this. - Tim PS: When I tried CLion last year, I neither could get the build system working nor the indentation. So you might want to look at one of the other IDEs but I don't if something changed in the time. On Mi, Mär 1 2023 at 20:59:17 +0300, Berke Yavas via Gcc wrote: Hi, I am trying to set up my environment for the upcoming google summer of code. One thing I haven't figured out yet, how can I debug C++ frontend(or any other language frontend) with CLion. If anybody managed to do this(or using another IDE), could you please share your settings with me? Best regards, Berke