Re: gccadmin hooks: make it a public git repo
Adding GCC ML.

On 1/7/21 11:41 AM, Martin Liška wrote:

Hello.

From time to time, I debug the git server hooks with Jakub, and I always struggle with the lack of access to the gccadmin hooks. I mean the following repo:

/home/gccadmin/hooks-bin:
$ ls
commit_checker commit_email_formatter email-to-bugzilla-filtered email_to.py git_commit.py git_repository.py __pycache__ style_checker update_hook

Can we please make it a public repo hosted at https://gcc.gnu.org/git/ ?

The second part of the server hooks is git-hooks, which is already public:
https://github.com/AdaCore/git-hooks/tree/master/hooks

Thanks,
Martin
[RFC] restricting aliasing by standard containers (PR 98465)
The test case in PR 98465 brings to light a problem we've discussed before (e.g., PR 93971) where a standard container (std::string in this case, but the problem applies to any class that owns and manages allocated memory) might trigger warnings for unreachable code. The code is not eliminated due to a missing aliasing constraint: because GCC doesn't know that the member pointer to the memory managed by the container cannot alias other objects, it emits code that can never be executed in a valid program and that's prone to causing false positives.

To illustrate, at the moment it's impossible to fold away the assert below because there's no way to determine in the middle end that String::s cannot point to array:

   extern char array[];

   class String {
     char *s;
   public:
     String (const char *p): s (strdup (p)) { }
     String (const String &str): s (strdup (str.s)) { }
     ~String () { free (s); }

     void f () { assert (s != array); }
   };

The constraint is obvious to a human reader (String::s is private and nothing sets it to point to array) but there's no way for GCC to infer it from the code alone (at least not in general): there could be member or friend functions defined in other translation units that violate this assumption.

One way to solve the problem is to explicitly declare that String::s, in fact, doesn't point to any such objects and that it only ever points to allocated memory. My idea for doing that is to extend attribute malloc to (or add a new attribute for) pointer variables to imply that the pointer only points to allocated memory.

However, besides pointing to allocated memory, std::string can also point to its own internal buffer, so the extended malloc attribute couldn't be used there by itself. I think this could be solved by also either extending the may_alias attribute or adding a new "alias" (or some such) attribute to denote that a pointer variable may point to an object or subobject.

Putting the two together, to eliminate the assert, std::string would be annotated like so:

   class string {
     char *s __attribute__ ((malloc, may_alias (buf)));
     char buf[8];
   public:
     string (): s (buf) { }
     string (const char *p): s (strdup (p)) { }
     string (const string &str): s (strdup (str.s)) { }
     ~string () { if (s != buf) free (s); }

     void f () { assert (s != array); }
   };

The may_alias association with members is relative to the this pointer (i.e., as if by may_alias (this->buf)), as opposed to being taken as may_alias (String::buf) and meaning that s might be equal to any other String::s with a different this. To help avoid mistakes, setting s in violation of the constraints would trigger warnings.

If this sounds reasonable I'm prepared to prototype it, either for GCC 11 if it's in scope to solve the PR and there's still time, or (I suspect more likely) for GCC 12.

Richard, what are your thoughts/concerns?

Martin

PS An alternate solution might be to provide a late-evaluated built-in, something like

   __builtin_decl (T *ptr)

that would return one answer if ptr could be determined to point to a declared object or subobject, another if not (e.g., it points to allocated storage), and a third if it couldn't be determined. The built-in would then be used in code to eliminate infeasible paths. For example, a built-in like that could be used to eliminate the assert in string::f():

   void string::f ()
   {
     if (/* "declared object" answer */ == __builtin_decl_p (s) && s != buf)
       __builtin_unreachable ();

     assert (s != array);
   }

A built-in might be more flexible but would also be harder to use (and likely more error-prone).
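As an aside for readers of the archive: the invariant the RFC wants GCC to know can already be written out by hand with __builtin_unreachable (), which GCC and Clang provide. The sketch below is not the proposed attribute (that exists only as an idea in this message). The class name SmallString, the helpers assume_invariant and uses_internal_buffer, and the 8-byte buffer are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// An unrelated global object, standing in for "array" in the RFC example.
char array[16] = "unrelated";

class SmallString {
  char *s;       // points either at buf or at malloc'd storage
  char buf[8];   // internal buffer for short strings

  // Hand-written stand-in for the proposed annotation: state that s
  // never points at the unrelated global, so later checks may fold.
  void assume_invariant () const
  {
    if (s == array)
      __builtin_unreachable ();
  }

public:
  SmallString (): s (buf) { buf[0] = '\0'; }

  explicit SmallString (const char *p)
  {
    const std::size_t n = std::strlen (p);
    if (n < sizeof buf)
      s = buf;                                   // fits in the buffer
    else
      {
        s = static_cast<char *> (std::malloc (n + 1));
        if (!s)
          std::abort ();
      }
    std::memcpy (s, p, n + 1);                   // copy including the NUL
  }

  // Copying is left out to keep the sketch short.
  SmallString (const SmallString &) = delete;
  SmallString &operator= (const SmallString &) = delete;

  ~SmallString () { if (s != buf) std::free (s); }

  bool uses_internal_buffer () const { return s == buf; }
  const char *c_str () const { return s; }

  void f () const
  {
    assume_invariant ();
    assert (s != array);   // the optimizer may now drop this check
  }
};
```

Calling f () on a default-constructed, a short, and a long SmallString exercises both the internal-buffer and the allocated case. The drawback compared with the attribute idea is that every member function has to repeat the hint.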
gcc-8-20210107 is now available
Snapshot gcc-8-20210107 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/8-20210107/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 git branch with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-8 revision 5114ee0676e432493ada968e34071f02fb08114f

You'll find:

 gcc-8-20210107.tar.xz    Complete GCC

  SHA256=26e557443dfd6dd3680db9c198b1715b0a89b1ab3b91e6b9931de8b3d67841bf
  SHA1=3710a70124ed2f9a53ecfd1b34be95ec3b82712a

Diffs from 8-20201231 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Copyright Forms Request
Hello, Hope all is well. Please could I have a copy of the form to assign copyright of all future changes. Kind regards, Anthony Sharp
Re: Copyright Forms Request
Sent privately. - David On Thu, Jan 7, 2021 at 8:08 PM Anthony Sharp via Gcc wrote: > > Hello, > > Hope all is well. Please could I have a copy of the form to assign > copyright of all future changes. > > Kind regards, > Anthony Sharp
Re: Copyright Forms Request
Thank you! Anthony On Fri, 8 Jan 2021 at 01:14, David Edelsohn wrote: > > Sent privately. > > - David > > On Thu, Jan 7, 2021 at 8:08 PM Anthony Sharp via Gcc wrote: > > > > Hello, > > > > Hope all is well. Please could I have a copy of the form to assign > > copyright of all future changes. > > > > Kind regards, > > Anthony Sharp
Re: Add -fdirect-access-external-data
On Wed, Jan 6, 2021 at 10:32 PM Fangrui Song wrote:
>
> On Sat, Dec 26, 2020 at 7:39 AM H.J. Lu wrote:
> >
> > On Sat, Dec 26, 2020 at 7:32 AM Florian Weimer wrote:
> > >
> > > * Fangrui Song:
> > >
> > > > Hi, I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 which
> > > > proposes -fdirect-access-external-data to address some x86-64
> > > > GCC/binutils pain[1] and also benefit non-x86 architectures (also see
> > > > [1] it can prevent copy relocations).
> > > >
> > > > [1] Mentioned in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112#c2
> > > >
> > > > Since I am going to add this option to Clang and I hope (once GCC
> > > > decides to implement this option the two compilers can use the same
> > > > option name), I bring it to your attention.
> > >
> > > One worry I have is that people start building shared objects with
> > > direct data access, expecting the main program to be built with
> > > indirect access. We already see this today with Qt. It's not really
> > > supported well by the toolchain and causes frequent issues.
> >
> > It can be solved by ABI extension implemented in linker, ld.so and
> > compiler.
> >
> > > Depending on the ELF ABI in question, the new pair of -f options might
> > > not actually be meaningful. It really depends on whether you have
> > > reasonably-sized displacements available. I think there are some ABIs
> > > where the optimization is theoretically possible, but impractical
> > > because of the limit it imposes on the data segment (think AArch64
> > > without adrp).
> >
> > --
> > H.J.
>
> Please check out new comments on
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112
>
> -fdirect-access-external-data is still the best name. The option is
> useful to avoid copy relocations / "canonical PLT entry"
> (st_shndx=0,st_value!=0) in -fno-pic code.
> I will proceed with my Clang patch.

If I understand it correctly, you want to treat all accesses to protected definitions as local accesses, and all read/write accesses to undefined symbols should go through the GOT. Branches to undefined symbols can use the PLT. The name -fdirect-access-external-data doesn't reflect that.

--
H.J.
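For readers unfamiliar with the term "protected definitions": on ELF targets, GCC and Clang let a definition be given protected visibility through the visibility attribute. Such a symbol is still exported from its module but cannot be preempted by a definition elsewhere. A minimal sketch follows; the names counter, bump and read_counter are made up for illustration, and the comments restate the ELF semantics rather than any code generation decision discussed in this thread.

```cpp
#include <cassert>

// Exported, but references from inside the defining module are meant to
// resolve to this definition (no symbol preemption).
__attribute__ ((visibility ("protected"))) int counter = 0;

__attribute__ ((visibility ("protected"))) int bump (int by)
{
  assert (by >= 0);   // precondition chosen only for this sketch
  counter += by;
  return counter;
}

// Default visibility for comparison: in a shared object this symbol
// could be preempted by another module's definition.
int read_counter ()
{
  return counter;
}
```

Whether accesses to counter from within the defining shared object may be emitted as local accesses is exactly the point under discussion above, so the sketch only shows how such a definition is declared.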
Re: Add -fdirect-access-external-data
On Thu, Jan 7, 2021 at 6:07 PM H.J. Lu wrote: > > On Wed, Jan 6, 2021 at 10:32 PM Fangrui Song wrote: > > > > On Sat, Dec 26, 2020 at 7:39 AM H.J. Lu wrote: > > > > > > On Sat, Dec 26, 2020 at 7:32 AM Florian Weimer wrote: > > > > > > > > * Fangrui Song: > > > > > > > > > Hi, I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 which > > > > > proposes -fdirect-access-external-data to address some x86-64 > > > > > GCC/binutils pain[1] and also benefit non-x86 architectures (also see > > > > > [1] > > > > > it can prevent copy relocations). > > > > > > > > > > [1] Mentioned in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112#c2 > > > > > > > > > > Since I am going to add this option to Clang and I hope (once GCC > > > > > decides to > > > > > implement this option the two compilers can use the same option > > > > > name), I bring > > > > > it to your attention. > > > > > > > > One worry I have is that people start building shared objects with > > > > direct data access, expecting the main program to be built with > > > > indirect access. We already see this today with Qt. It's not really > > > > supported well by the toolchain and causes frequent issues. > > > > > > It can be solved by ABI extension implemented in linker, ld.so and > > > compiler. > > > > > > > Depending on the ELF ABI in question, the new pair of -f options might > > > > not actually be meaningful. It really depends on whether you have > > > > reasonably-sized displacements available. I think there are some ABIs > > > > where the optimization is theoretically possible, but impractical > > > > because the ilimit it imposes on data segment (think AArch64 without > > > > adrp). > > > > > > > > > > > > -- > > > H.J. > > > > Please check out new comments on > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 > > > > -fdirect-access-external-data is still the best name. The option is > > useful to avoid copy relocations / "canonical PLT entry" > > (st_shndx=0,st_value!=0) in -fno-pic code. 
> > I will proceed with my Clang patch.
>
> If I understand it correctly, you want to treat all accesses to protected
> definitions as local access
> and all read/write accesses to undefined symbols
> should go through GOT. Branches to undefined symbols can use PLT.
> -fdirect-access-external-data doesn't reflect it.

My apologies. Direct/indirect access to protected definitions is a separate topic, unrelated to -f[no-]direct-access-external-data.

(
If anyone is interested, there was a heated discussion about accesses to protected definitions:
https://sourceware.org/legacy-ml/binutils/2016-03/msg00312.html
Basically, a lot of folks considered copy relocations to be best-effort support provided by the toolchain. For protected symbols, copy relocations do not necessarily work.

Clang always treats protected similarly to hidden/internal, with no special logic for x86-64 protected.
)

Branches to undefined symbols are yet another separate topic.

(
On x86-64, there is no PIC vs non-PIC PLT distinction and an R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and `call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler.

On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the informal convention is to use R_386_PC32 for non-PIC PLT and R_386_PLT32 for PIC PLT, but R_386_PLT32 is arguably preferable for -fno-pic code as well: this can avoid a "canonical PLT entry" (st_shndx=0, st_value!=0) if the symbol turns out to be defined externally.

My idea is that we can always use R_386_PLT32 in -fno-pic mode.
)

Taking the address of an external function is related to -f[no-]direct-access-external-data. A function pointer of an external function is very similar to external data.

A canonical PLT entry can be caused by either a branch (R_386_PC32/R_386_32) or an address-taken operation (R_386_PC32/R_386_32) if the symbol turns out to be external. -fno-direct-access-external-data can only address the function pointer usage.
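The three kinds of reference distinguished above (data access, address-taken function, branch) can be sketched in source form as follows. The names ext_var, ext_fn and the wrapper functions are hypothetical. To keep the sketch a single runnable file, the "external" symbols are defined at the bottom of the same translation unit, which is an assumption made purely for illustration: in the scenario discussed in the thread they would live in a shared object and the compiler would see only the declarations.

```cpp
#include <cassert>

extern "C" {
  extern int ext_var;     // external data
  int ext_fn (int);       // external function
}

using fn_t = int (*) (int);

// Data access: the case -f[no-]direct-access-external-data is about.
// Per the thread, a direct reference in -fno-pic code is what can lead
// to a copy relocation when ext_var lives in a shared object.
int load_data () { return ext_var; }

// Address-taken function: described above as "very similar to external
// data"; a direct reference can cause a canonical PLT entry.
fn_t take_address () { return &ext_fn; }

// Branch: treated in the message above as a separate topic.
int branch (int x) { return ext_fn (x); }

// Every module must agree on the address of a function, which is why
// the address-taken case needs care across module boundaries.
void check_pointer_equality ()
{
  assert (take_address () == &ext_fn);
}

// Stand-in definitions so the sketch links on its own.
extern "C" {
  int ext_var = 41;
  int ext_fn (int x) { return x + 1; }
}
```

Compiling only the top half (without the stand-in definitions) and inspecting the object file's relocations is one way to see how a given compiler and option set treats each reference; the exact relocations depend on target and toolchain version.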
Re: Add -fdirect-access-external-data
On Thu, Jan 7, 2021 at 7:45 PM Fangrui Song wrote: > > On Thu, Jan 7, 2021 at 6:07 PM H.J. Lu wrote: > > > > On Wed, Jan 6, 2021 at 10:32 PM Fangrui Song wrote: > > > > > > On Sat, Dec 26, 2020 at 7:39 AM H.J. Lu wrote: > > > > > > > > On Sat, Dec 26, 2020 at 7:32 AM Florian Weimer > > > > wrote: > > > > > > > > > > * Fangrui Song: > > > > > > > > > > > Hi, I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 which > > > > > > proposes -fdirect-access-external-data to address some x86-64 > > > > > > GCC/binutils pain[1] and also benefit non-x86 architectures (also > > > > > > see [1] > > > > > > it can prevent copy relocations). > > > > > > > > > > > > [1] Mentioned in > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112#c2 > > > > > > > > > > > > Since I am going to add this option to Clang and I hope (once GCC > > > > > > decides to > > > > > > implement this option the two compilers can use the same option > > > > > > name), I bring > > > > > > it to your attention. > > > > > > > > > > One worry I have is that people start building shared objects with > > > > > direct data access, expecting the main program to be built with > > > > > indirect access. We already see this today with Qt. It's not really > > > > > supported well by the toolchain and causes frequent issues. > > > > > > > > It can be solved by ABI extension implemented in linker, ld.so and > > > > compiler. > > > > > > > > > Depending on the ELF ABI in question, the new pair of -f options might > > > > > not actually be meaningful. It really depends on whether you have > > > > > reasonably-sized displacements available. I think there are some ABIs > > > > > where the optimization is theoretically possible, but impractical > > > > > because the ilimit it imposes on data segment (think AArch64 without > > > > > adrp). > > > > > > > > > > > > > > > > -- > > > > H.J. 
> > >
> > > Please check out new comments on
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112
> > >
> > > -fdirect-access-external-data is still the best name. The option is
> > > useful to avoid copy relocations / "canonical PLT entry"
> > > (st_shndx=0,st_value!=0) in -fno-pic code.
> > > I will proceed with my Clang patch.
> >
> > If I understand it correctly, you want to treat all accesses to protected
> > definitions as local access
> > and all read/write accesses to undefined symbols
> > should go through GOT. Branches to undefined symbols can use PLT.
> > -fdirect-access-external-data doesn't reflect it.
>
> My apologies. Direct/indirect access to protected definitions is a separate
> topic, unrelated to -f[no-]direct-access-external-data.
>
> (
> If anyone is interested, there was a heated discussion about accesses to
> protected definitions
> https://sourceware.org/legacy-ml/binutils/2016-03/msg00312.html basically a
> lot of folks considered that copy relocations are best-effort support
> provided by the toolchain. For protected symbols, copy relocations do not
> necessarily work.
>
> Clang always treats protected similar to hidden/internal, no special
> logic for x86-64 protected.
> )

Then function pointers and copy relocations don't work correctly on protected symbols with Clang. With GCC, function pointers and copy relocations on protected symbols work correctly today.

> Branches to undefined symbols is yet another separate topic.
>

All these issues are related.

> (
> On x86-64, there is no PIC vs non-PIC PLT distinction and an R_X86_64_PLT32
> relocation is produced for both `call/jmp foo` and `call/jmp foo@PLT` with
> newer (2018) GNU as/LLVM integrated assembler.
>
> On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the informal
> convention is to use R_386_PC32 for non-PIC PLT and R_386_PLT32 for PIC PLT,
> but R_386_PLT32 is arguably preferable for -fno-pic code as well: this can
> avoid a "canonical PLT entry" (st_shndx=0, st_value!=0) if the symbol turns
> out to be defined externally.
>
> My idea is that we can always use R_386_PLT32 in -fno-pic mode.
> )
>
> Taking the address of an external function is related to
> -f[no-]direct-access-external-data. A function pointer of an external function
> is very similar to external data.
>
> A canonical PLT entry can be caused by either a branch (R_386_PC32/R_386_32)
> or an address taken operation (R_386_PC32/R_386_32) if the symbol
> turns out to be external.
> -fno-direct-access-external-data can only address the function pointer usage.

i386 is legacy. We should leave it alone. But we should do something for x86-64. That is why I proposed an ABI change:

https://groups.google.com/g/x86-64-abi/c/DRvKxJ1AH3Q

What I'd like to see is a compiler option which does

1. In shared object, all accesses to protected definitions can be treated as local access.
2. In PIE, all read/write accesses to undefined symbols should go through GOT.
3. In PIE and shared object, all global function pointers, whose function bodies aren't defined locally, should be resolved to G
Re: [RFC] restricting aliasing by standard containers (PR 98465)
On Thu, Jan 7, 2021 at 10:41 PM Martin Sebor wrote:
>
> The test case in PR 98465 brings to light a problem we've discussed
> before (e.g., PR 93971) where a standard container (std::string in
> this case but the problem applies to any class that owns and manages
> allocated memory) might trigger warnings for unreachable code.
> The code is not eliminated due to a missing aliasing constraint:
> because GCC doesn't know that the member pointer to the memory
> managed by the container cannot alias other objects, it emits code
> that can never be executed in a valid program and that's prone to
> causing false positives.
>
> To illustrate, at the moment it's impossible to fold away the assert
> below because there's no way to determine in the middle end that
> String::s cannot point to array:
>
>    extern char array[];
>
>    class String {
>      char *s;
>    public:
>      String (const char *p): s (strdup (p)) { }
>      String (const String &str): s (strdup (str.s)) { }
>      ~String () { free (s); }
>
>      void f () { assert (s != array); }
>    };
>
> The constraint is obvious to a human reader (String::s is private
> and nothing sets it to point to array) but there's no way for GCC
> to infer it from the code alone (at least not in general): there
> could be member or friend functions defined in other translation
> units that violate this assumption.
>
> One way to solve the problem is to explicitly declare that
> String::s, in fact, doesn't point to any such objects and that it
> only ever points to allocated memory. My idea for doing that is
> to extend attribute malloc to (or add a new attribute for) pointer
> variables to imply that the pointer only points to allocated memory.
>
> However, besides pointing to allocated memory, std::string can also
> point to its own internal buffer, so the extended malloc attribute
> couldn't be used there by itself. I think this could be solved by
> also either extending the may_alias attribute or adding a new
> "alias" (or some such) attribute to denote that a pointer variable
> may point to an object or subobject.
>
> Putting the two together, to eliminate the assert, std::string would
> be annotated like so:
>
>    class string {
>      char *s __attribute__ ((malloc, may_alias (buf)));
>      char buf[8];
>    public:
>      string (): s (buf) { }
>      string (const char *p): s (strdup (p)) { }
>      string (const string &str): s (strdup (str.s)) { }
>      ~string () { if (s != buf) free (s); }
>
>      void f () { assert (s != array); }
>    };
>
> The may_alias association with members is relative to the this pointer
> (i.e., as if by may_alias (this->buf)), as opposed to being taken as
> may_alias (String::buf) and meaning that s might be equal to any other
> String::s with a different this. To help avoid mistakes, setting s
> in violation of the constraints would trigger warnings.
>
> If this sounds reasonable I'm prepared to prototype it, either for
> GCC 11 if it's in scope to solve the PR and there's still time, or
> (I suspect more likely) for GCC 12.
>
> Richard, what are your thoughts/concerns?

I'm not sure it's feasible to make use of this attribute. First there's the malloc part, which has difficult semantics (similar to restrict) when generating PTA constraints. We might see

  _1 = str.s;
  _2 = str.s;

but are of course required to associate the same allocated dummy object with both pointers (as opposed to when we'd see two malloc calls). What would possibly work is to have the object keyed on the field decl, but then for

  _1 = p_to_str_4(D);
  _2 = _1 + offsetof-s;
  _3 = *_2;

we have to somehow conservatively arrive at the same object. I don't see how that can work out.

All the same applies to the may_alias part, but I guess when the malloc part falls apart that's not of much interest.

So I'm concerned about correctness - I'm sure you can hack something together to get some testcases optimized. But I'm not sure you can make it correct in all cases (within the current PTA framework).

Richard.

> Martin
>
> PS An alternate solution might be to provide a late-evaluated built-in,
> something like
>
>    __builtin_decl (T *ptr)
>
> that would return one answer if ptr could be determined to point
> to a declared object or subobject, another if not (e.g., it points to
> allocated storage), and a third if it couldn't be determined.
> The built-in would then be used in code to eliminate infeasible
> paths. For example, a built-in like that could be used to eliminate
> the assert in string::f():
>
>    void string::f ()
>    {
>      if (/* "declared object" answer */ == __builtin_decl_p (s) && s != buf)
>        __builtin_unreachable ();
>
>      assert (s != array);
>    }
>
> A built-in might be more flexible but would also be harder to use
> (and likely more error-prone).
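The two access patterns in the reply above can be written at the source level as below. This is only a sketch of why any points-to analysis has to tie both loads to the same object, not a description of GCC's PTA implementation. The struct Str and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct Str {
  char *s;   // the member that would carry the proposed attribute
};

// Roughly "_1 = str.s; _2 = str.s;": two loads of the same field.
// Unlike two malloc calls, both results must refer to one object.
bool two_loads_agree (const Str &str)
{
  char *p1 = str.s;
  char *p2 = str.s;
  return p1 == p2;
}

// Roughly "_1 = p; _2 = _1 + offsetof (s); _3 = *_2;": the same field
// reached through pointer arithmetic, with no field reference left
// for the analysis to key on.
char *load_via_offset (Str *p)
{
  char *base = reinterpret_cast<char *> (p);
  char **field = reinterpret_cast<char **> (base + offsetof (Str, s));
  return *field;
}

// Both routes must yield the same pointer for any valid Str.
void check_same_object (Str &str)
{
  assert (two_loads_agree (str));
  assert (load_via_offset (&str) == str.s);
}
```

An analysis that gave the field access and the offset-based access different "allocated object" identities would wrongly conclude that the two pointers cannot alias, which is the correctness concern raised in the reply.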