[PATCH] Avoid infinite loop with duplicate anonymous union fields
(this patch is already uploaded to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86690 ) If a struct contains an anonymous union and both have a field with the same name, detect_field_duplicates_hash() will replace one of them with NULL. If compilation doesn't stop immediately, it may later call lookup_field() on the union, which falsely assumes the union's LANG_SPECIFIC array is sorted, and may loop indefinitely because of this. Reproduced on amd64 since gcc-5, on ubuntu-18.04 and gentoo. The patch falls back to iterating via DECL_CHAIN if there was an error earlier during compilation. I ran the gcc testsuite with the result (the FAIL seems unrelated to the patch): FAIL: gcc.dg/cpp/_Pragma3.c (test for excess errors) === gcc Summary === # of expected passes135094 # of unexpected failures1 # of expected failures 398 # of unsupported tests 2140 gcc-build/gcc/xgcc version 8.0.1 20180424 (experimental) (GCC) --- gcc-8.0.1-20180424/gcc/c/c-typeck.c 2018-07-26 20:00:55.475792602 +0300 +++ gcc-8.0.1-20180424/gcc/c/c-typeck.c 2018-07-26 21:39:13.312629356 +0300 @@ -2207,9 +2207,14 @@ /* If TYPE_LANG_SPECIFIC is set, then it is a sorted array of pointers to the field elements. Use a binary search on this array to quickly find the element. Otherwise, do a linear search. TYPE_LANG_SPECIFIC - will always be set for structures which have many elements. */ + will always be set for structures which have many elements. + + Duplicate field checking replaces duplicates with NULL_TREE so + TYPE_LANG_SPECIFIC arrays are potentially no longer sorted. In that + case just iterate using DECL_CHAIN. */ - if (TYPE_LANG_SPECIFIC (type) && TYPE_LANG_SPECIFIC (type)->s) + if (TYPE_LANG_SPECIFIC (type) && TYPE_LANG_SPECIFIC (type)->s + && diagnostic_kind_count(global_dc, DK_ERROR) == 0) { int bot, top, half; tree *field_array = &TYPE_LANG_SPECIFIC (type)->s->elts[0];
Re: [PATCH] Avoid infinite loop with duplicate anonymous union fields
With fresh git sources and contrib/gcc_update the tests pass: === gcc Summary === # of expected passes 133500 # of expected failures 422 # of unsupported tests 2104 gcc-build/gcc/xgcc version 9.0.0 20180730 (experimental) (GCC) I wasn't able to reduce the input to avoid including and as it only reproduces without -save-temps, it's not clear how to write a testcase for this one. On Fri, Jul 27, 2018 at 8:01 PM, Joseph Myers wrote: > On Fri, 27 Jul 2018, Bogdan Harjoc wrote: > >> If a struct contains an anonymous union and both have a field with the >> same name, detect_field_duplicates_hash() will replace one of them >> with NULL. If compilation doesn't stop immediately, it may later call >> lookup_field() on the union, which falsely assumes the union's >> LANG_SPECIFIC array is sorted, and may loop indefinitely because of >> this. >> >> Reproduced on amd64 since gcc-5, on ubuntu-18.04 and gentoo. >> >> The patch falls back to iterating via DECL_CHAIN if there was an error >> earlier during compilation. > > The patch should also add a testcase to the testsuite (which fails before > and passes after the patch). > >> I ran the gcc testsuite with the result (the FAIL seems unrelated to the >> patch): >> >> FAIL: gcc.dg/cpp/_Pragma3.c (test for excess errors) > > You should use contrib/gcc_update --touch to get timestamps in the right > order when checking out. This failure is a symptom of badly ordered > timestamps. > > -- > Joseph S. Myers > jos...@codesourcery.com
Re: [PATCH] Avoid infinite loop with duplicate anonymous union fields
#define foo(a) did it, thanks! As Joseph suspected, the hang/no hang result depended on the values of DECL_NAME pointers: - with #define foo(a) plus the testcase from bugzilla id 86690 and no -save-temps, the "return s.a" that triggers lookup_field() will find the sorted field_array containing: (gdb) p component $1 = (tree) 0x76300050 (gdb) p field_array[0].decl_minimal.name $0 = (tree) 0x761cfd70 $1 = (tree) 0x0 $2 = (tree) 0x763000f0 $3 = (tree) 0x76300140 $4 = (tree) 0x76300190 $5 = (tree) 0x763001e0 $6 = (tree) 0x76300230 $7 = (tree) 0x76300280 $8 = (tree) 0x763002d0 $9 = (tree) 0x76300320 So array[0] < component < array[2], which loops (I removed the gdb p commands for field_array[1] and so on). - with same test-case and with -save-temps I get: (gdb) p component $1 = (tree) 0x76300c30 (gdb) p field_array[0].decl_minimal.name $0 = (tree) 0x761cfd70 $1 = (tree) 0x0 $2 = (tree) 0x76300050 $3 = (tree) 0x763000a0 $4 = (tree) 0x763000f0 $5 = (tree) 0x76300140 $6 = (tree) 0x76300190 $7 = (tree) 0x763001e0 $8 = (tree) 0x76300230 $9 = (tree) 0x76300280 So component > array[9], and in this case the binary search doesn't end up with bottom=0 and top=2, where it would hang earlier. Component here is the s.a field, and with -save-temps, cc1 gets bin.i as input (which it treats as preprocessed due to the extension) instead of bin.c. So with preprocessed/unprocessed source, the order in which builtin/user-defined names are allocated changes, resulting in a hang or no-hang result. I propose this testcase for gcc/testsuite/gcc.dg/ as it reproduces with or without -save-temps, and with no other magic 'scanf' identifiers because it keeps a0 first in the sorted array and a1 (which will be replaced with NULL) second: int a0; struct S { int a1; union { int a0; int a1; int a2, a3, a4, a5, a6, a7, a8, a9; int a10, a11, a12, a13, a14, a15; }; }; int f() { struct S s; return s.a0; } (gdb) p component $1 = (tree) 0x76300c30 (gdb) p field_array[0].decl_minimal.name $1 = (tree) 0x76300c30 $1 = (tree) 0x0 $2 = (tree) 0x76300050 $3 = (tree) 0x763000a0... Thanks for the guidance, Bogdan On Tue, Jul 31, 2018 at 10:43 PM, Joseph Myers wrote: > On Tue, 31 Jul 2018, Bogdan Harjoc wrote: > >> With fresh git sources and contrib/gcc_update the tests pass: >> >> === gcc Summary === >> >> # of expected passes 133500 >> # of expected failures 422 >> # of unsupported tests 2104 >> >> gcc-build/gcc/xgcc version 9.0.0 20180730 (experimental) (GCC) >> >> I wasn't able to reduce the input to avoid including and as >> it only reproduces without -save-temps, it's not clear how to write a >> testcase for this one. > > Could you give more details of the paths through the code that are > involved in the infinite loop, and the different paths you get without > -save-temps? Is this an issue of dependence on the values of pointers, or > something like that? Is it possible to produce a test with more instances > of the problem in it, say, so that the probability of the problem showing > up at least once with the test is much higher? > > A binary search should not result in an infinite loop simply because the > elements of the array are not sorted; in that case it should just converge > on an unpredictable element. So more explanation of how the infinite loop > occurs is needed. (But if choice of an unpredictable element results in > e.g. subsequent diagnostics varying depending on pointer values, that by > itself is a problem that may justify avoiding this code in the case where > the array may not be sorted.) > > -- > Joseph S. Myers > jos...@codesourcery.com
Re: [PATCH] Avoid infinite loop with duplicate anonymous union fields
On Wed, Aug 1, 2018 at 1:20 AM, Joseph Myers wrote: > On Wed, 1 Aug 2018, Bogdan Harjoc wrote: > >> So array[0] < component < array[2], which loops (I removed the gdb p >> commands for field_array[1] and so on). > > Is the key thing here that you end up with DECL_NAME (field) == NULL_TREE, > but DECL_NAME (field_array[bot]) != NULL_TREE - and in this particular > case of a bad ordering only, it's possible to loop without either top or > bot being changed? (But other details of the DECL_NAME ordering are > needed to actually get to that particular point.) Yes, once it enters the "if DECL_NAME (field) == NULL_TREE" body, only bot can change, and since "DECL_NAME (field_array[bot]) == NULL_TREE" is false, the inner while never runs, so it skips directly to "continue;" below with no changes to bot or top. So the function looks correct, as long as field_array really is qsort'ed if TYPE_LANG_SPECIFIC is set. > seen_error () is the idiomatic way of testing whether an error has been > reported. The updated patch is attached and includes a test that passes with: make check-gcc RUNTESTFLAGS="dg.exp=union-duplicate-field.c" diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index 90ae306c9..5fc62d84d 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -2209,7 +2209,11 @@ lookup_field (tree type, tree component) find the element. Otherwise, do a linear search. TYPE_LANG_SPECIFIC will always be set for structures which have many elements. */ - if (TYPE_LANG_SPECIFIC (type) && TYPE_LANG_SPECIFIC (type)->s) + /* Duplicate field checking replaces duplicates with NULL_TREE so + TYPE_LANG_SPECIFIC arrays are potentially no longer sorted. In that + case just iterate using DECL_CHAIN. */ + + if (TYPE_LANG_SPECIFIC (type) && TYPE_LANG_SPECIFIC (type)->s && !seen_error()) { int bot, top, half; tree *field_array = &TYPE_LANG_SPECIFIC (type)->s->elts[0]; diff --git a/gcc/testsuite/gcc.dg/union-duplicate-field.c b/gcc/testsuite/gcc.dg/union-duplicate-field.c new file mode 100644 index 0..da9a945d9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/union-duplicate-field.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-std=c99" } */ + +int a0; + +struct S +{ +int a1; +union { +int a0; +int a1; /* { dg-error "duplicate member" } */ +int a2, a3, a4, a5, a6, a7, a8, a9; +int a10, a11, a12, a13, a14, a15; +}; +}; + +int f() +{ +struct S s; +return s.a0; +}