Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 01:48 AM, David Starner wrote:
> I'd like to mention that I too was bit by this one on Debian. I don't
> have a 32-bit development environment installed; why would I? I'm
> building primarily for myself, and if I did have to target a 32-bit
> environment, I'd likely have to mess with more stuff than just the
> compiler.

No, you probably wouldn't.  Just use -m32 and you'd be fine.

> If you can't find a way to detect this error, I can't
> imagine many people would have a problem with turning off multilibs on
> x86-64; it's something of a minority setup.

I don't think it is, really.

Andrew.
Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 10:17 AM, Andrew Haley wrote:
> On 07/24/2013 01:48 AM, David Starner wrote:
>> I'd like to mention that I too was bit by this one on Debian. I don't
>> have a 32-bit development environment installed; why would I? I'm
>> building primarily for myself, and if I did have to target a 32-bit
>> environment, I'd likely have to mess with more stuff than just the
>> compiler.
>
> No, you probably wouldn't.  Just use -m32 and you'd be fine.

No, that doesn't work.  The glibc development environment on
Debian/amd64 does not contain the 32-bit header files, and that's where
the error message comes from.

I don't think that's easy to change because of the way dpkg handles file
conflicts (even if the files are identical) and how true multi-arch
support is implemented in Debian.

-- 
Florian Weimer / Red Hat Product Security Team
Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 09:35 AM, Florian Weimer wrote:
> On 07/24/2013 10:17 AM, Andrew Haley wrote:
>> On 07/24/2013 01:48 AM, David Starner wrote:
>>> I'd like to mention that I too was bit by this one on Debian. I don't
>>> have a 32-bit development environment installed; why would I? I'm
>>> building primarily for myself, and if I did have to target a 32-bit
>>> environment, I'd likely have to mess with more stuff than just the
>>> compiler.
>>
>> No, you probably wouldn't.  Just use -m32 and you'd be fine.
>
> No, that doesn't work.  The glibc development environment on
> Debian/amd64 does not contain the 32-bit header files, and that's where
> the error message comes from.

Well, of course.  It's a prerequisite for building GCC.  I presume that
Debian has the same abilities as Fedora, where if you want to build GCC
you just type

   yum-builddep gcc

and Fedora installs all the build requirements for GCC.

> I don't think that's easy to change because of the way dpkg handles file
> conflicts (even if the files are identical) and how true multi-arch
> support is implemented in Debian.

But hold on: if I just wanted to compile C programs I'd use the system's
C compiler.  Anyone building GCC for themselves has a reason for doing so.

Andrew.
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On 07/23/2013 09:49 PM, H.J. Lu wrote:
> 2. Extend the current 16-byte PLT entry:
>
>    ff 25 32 8b 21 00    jmpq   *name@GOTPCREL(%rip)
>    68 00 00 00 00       pushq  $index
>    e9 00 00 00 00       jmpq   PLT0
>
> which clears bound registers, to 32 bytes, to add the BND prefix to
> branch instructions.

Would it be possible to use a different instruction sequence that stays
in the 16-byte limit?  Or restrict MPX support to BIND_NOW relocations?

-- 
Florian Weimer / Red Hat Product Security Team
Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 10:39 AM, Andrew Haley wrote:
> Well, of course.  It's a prerequisite for building GCC.  I presume that
> Debian has the same abilities as Fedora, where if you want to build GCC
> you just type
>
>    yum-builddep gcc
>
> and Fedora installs all the build reqs for GCC.

Yes, "apt-get build-dep gcc" should work, but will install quite a bit
of other stuff that's not needed for building the more popular front
ends.

>> I don't think that's easy to change because of the way dpkg handles file
>> conflicts (even if the files are identical) and how true multi-arch
>> support is implemented in Debian.
>
> But hold on: if I just wanted to compile C programs I'd use the system's
> C compiler.  Anyone building GCC for themselves has a reason for doing so.

I suspect a fairly common exercise is to check if the trunk still has
the bug you're about to report.

-- 
Florian Weimer / Red Hat Product Security Team
Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 10:04 AM, Florian Weimer wrote:
> On 07/24/2013 10:39 AM, Andrew Haley wrote:
>> Well, of course.  It's a prerequisite for building GCC.  I presume that
>> Debian has the same abilities as Fedora, where if you want to build GCC
>> you just type
>>
>>    yum-builddep gcc
>>
>> and Fedora installs all the build reqs for GCC.
>
> Yes, "apt-get build-dep gcc" should work, but will install quite a bit
> of other stuff that's not needed for building the more popular front ends.

That is not of any significant consequence.

>>> I don't think that's easy to change because of the way dpkg handles file
>>> conflicts (even if the files are identical) and how true multi-arch
>>> support is implemented in Debian.
>>
>> But hold on: if I just wanted to compile C programs I'd use the system's
>> C compiler.  Anyone building GCC for themselves has a reason for doing so.
>
> I suspect a fairly common exercise is to check if the trunk still has
> the bug you're about to report.

Right, so it should be built the right way.

Andrew.
Re: fatal error: gnu/stubs-32.h: No such file
On Wed, Jul 24, 2013 at 1:17 AM, Andrew Haley wrote:
> On 07/24/2013 01:48 AM, David Starner wrote:
>> I'd like to mention that I too was bit by this one on Debian. I don't
>> have a 32-bit development environment installed; why would I? I'm
>> building primarily for myself, and if I did have to target a 32-bit
>> environment, I'd likely have to mess with more stuff than just the
>> compiler.
>
> No, you probably wouldn't.  Just use -m32 and you'd be fine.

That's assuming that the hypothetical 32-bit x86 system I was targeting
was running GNU libc6 2.17 (as well as whatever libraries I need, with
version numbers apropos of Debian Unstable).  Conceivable, but not
something I'd bet on.  I've got 3 ARM (Android) computers around, and 3
AMD64 computers, and I can't imagine why I'd need an x86 computer.
There is one x86 program I run (zsnes), but if Debian stopped carrying
it, it probably wouldn't be worth the time to compile it myself.

>> If you can't find a way to detect this error, I can't
>> imagine many people would have a problem with turning off multilibs on
>> x86-64; it's something of a minority setup.
>
> I don't think it is, really.

Really?  Because my impression is that on Unix, the primary use of the C
compiler has always been to compile programs for the system the compiler
is running on.  And x86-32 is a slow, largely obsolete chip; it's
certainly useful to emulate, but I suspect any developer who needs to
build for it knows that up front and is prepared to deal with it in the
same way that someone who needs an ARM or MIPS compiler is.

> Anyone building GCC for themselves has a reason for doing so.

At the current time, Debian's version of GNAT is built from older
sources than the rest of GCC; if I want a 4.8 version of GNAT, I have to
build it myself.

> Right, so it should be built the right way.

The right way?  If I don't want to build support for obsolete systems I
don't use, I'm building it the wrong way?  If I were building
ia64-linux-gnu, I wouldn't have to enable support for x86-linux-gnu, but
because I'm building amd64-linux-gnu, if I don't, I'm building it the
wrong way?

I don't see this resistance to making it work with real systems and real
workloads.  This feature is not useful to many of us, and fails the GCC
build in the middle.  That's not really acceptable.

-- 
Kie ekzistas vivo, ekzistas espero.
Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 11:32 AM, David Starner wrote:
> On Wed, Jul 24, 2013 at 1:17 AM, Andrew Haley wrote:
>> On 07/24/2013 01:48 AM, David Starner wrote:
>>> I'd like to mention that I too was bit by this one on Debian. I don't
>>> have a 32-bit development environment installed; why would I? I'm
>>> building primarily for myself, and if I did have to target a 32-bit
>>> environment, I'd likely have to mess with more stuff than just the
>>> compiler.
>>
>> No, you probably wouldn't.  Just use -m32 and you'd be fine.
>
> That's assuming that the hypothetical 32-bit x86 system I was
> targeting was running GNU libc6 2.17 (as well as whatever libraries
> I need, with version numbers apropos of Debian Unstable).
> Conceivable, but not something I'd bet on.  I've got 3 ARM (Android)
> computers around, and 3 AMD64 computers, and I can't imagine why
> I'd need an x86 computer.  There is one x86 program I run (zsnes),
> but if Debian stopped carrying it, it probably wouldn't be worth the
> time to compile it myself.

No, I'm assuming you want to build and run an x86 program on your own
system.  -m32 is not in any sense a cross-compiler.

>>> If you can't find a way to detect this error, I can't
>>> imagine many people would have a problem with turning off multilibs on
>>> x86-64; it's something of a minority setup.
>>
>> I don't think it is, really.
>
> Really?

Really.

> Because my impression is that on Unix, the primary use of the
> C compiler has always been to compile programs for the system the
> compiler is running on.

So, use the system's C compiler.

> And x86-32 is a slow, largely obsolete chip; it's certainly useful
> to emulate, but I suspect any developer who needs to build for it
> knows that up front and is prepared to deal with it in the same way
> that someone who needs an ARM or MIPS compiler is.
>
>> Anyone building GCC for themselves has a reason for doing so.
>
> At the current time, Debian's version of GNAT is built from older
> sources than the rest of GCC; if I want a 4.8 version of GNAT, I have
> to build it myself.

So you have a specialized use.  Fair enough; you can disable multilib.
I would just install GCC's build dependencies and build with the
defaults.

>> Right, so it should be built the right way.
>
> The right way?

The default way, then.

> If I don't want to build support for obsolete systems I don't use,
> I'm building it the wrong way?  If I were building ia64-linux-gnu, I
> wouldn't have to enable support for x86-linux-gnu, but because I'm
> building amd64-linux-gnu, if I don't, I'm building it the wrong way?
>
> I don't see this resistance to making it work with real systems and
> real workloads.  This feature is not useful to many of us, and fails
> the GCC build in the middle.  That's not really acceptable.

There is no resistance whatsoever to making it work with real systems
and real workloads.  The problem is that you don't know that people
running on 64-bit hardware often choose to compile -m32 and run -m32
locally.

Andrew.
Re: fatal error: gnu/stubs-32.h: No such file
On Wed, Jul 24, 2013 at 4:14 AM, Andrew Haley wrote:
> I would just install GCC's build dependencies and build with the
> defaults.

I'm glad you have infinite hard-drive space.  I rather wish fewer
developers did, as well as those infinitely fast computers they seem to
have; perhaps they would have more empathy with my day-to-day computing
needs.

> There is no resistance whatsoever to making it work with real systems
> and real workloads.

Yes, there is.  Both in this thread and on Bugzilla, people with real
systems (in my case a well-stocked development system) have said they
started compiling GCC and after hours the compile has failed without
even an explanation, and people have shrugged and said, what do you want
us to do about it?

If a feature causes failure on real systems like that, then disabling it
by default, even if it's used by a significant minority of users, should
be considered.  Yes, it would be better to leave multilibs on and give
people building without 32-bit libraries a proper error message up
front, but leaving it as is is not making it work with real systems;
it's causing real people to pull their hair out or give up on trying to
build GCC.

-- 
Kie ekzistas vivo, ekzistas espero.
Re: fatal error: gnu/stubs-32.h: No such file
> There is no resistance whatsoever to making it work with real systems
> and real workloads.

It does not sound or look that way.

> The problem is that you don't know that people
> running on 64-bit hardware often choose to compile -m32 and run -m32
> locally.

But we know people are running into this issue and reporting it.

-- Gaby
Re: atomic support for LEON3 platform
Hi Eric,

Thank you for your interest in this feature.

Best Regards
WeiY

On 2013-7-24, at 1:07 AM, Eric Botcazou wrote:
>> ok, because i am not familiar with compiler implementation. So if you can
>> give me some references i will appreciate you very much. And by the way is
>> there any plan to support this feature in the mainline?
>
> OK, let's go ahead and implement the feature.  We first need the
> binutils side, because a 'cas' is currently rejected by the assembler:
>
>    eric@hermes:~/leon-elf> gcc/xgcc -Bgcc -c cas.adb -mcpu=leon3
>    /tmp/ccOuqOpo.s: Assembler messages:
>    /tmp/ccOuqOpo.s:24: Error: Architecture mismatch on "cas".
>    /tmp/ccOuqOpo.s:24: (Requires v9|v9a|v9b; requested architecture is v8.)
>    /tmp/ccOuqOpo.s:47: Error: Architecture mismatch on "cas".
>    /tmp/ccOuqOpo.s:47: (Requires v9|v9a|v9b; requested architecture is v8.)
>
> David, how do you want to handle this on the binutils side?  The compiler
> currently passes -Av8 for LEON3 and I don't think that we want to pass -Av9
> instead, so we would need something in between.
>
> -- 
> Eric Botcazou
Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 01:26 PM, David Starner wrote:
> On Wed, Jul 24, 2013 at 4:14 AM, Andrew Haley wrote:
>> I would just install GCC's build dependencies and build with the
>> defaults.
>
> I'm glad you have infinite hard-drive space.  I rather wish fewer
> developers did, as well as those infinitely fast computers they seem
> to have; perhaps they would have more empathy with my day-to-day
> computing needs.
>
>> There is no resistance whatsoever to making it work with real systems
>> and real workloads.
>
> Yes, there is.  Both in this thread, and on bugzilla, people with real
> systems, in my case a well-stocked development system, have said they
> started compiling gcc and after hours the compile has failed without
> even an explanation, and people have shrugged, and said what do you
> want us to do about it?

There should be a better diagnostic.

Andrew.
Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 01:36 PM, Gabriel Dos Reis wrote:
>> There is no resistance whatsoever to making it work with real systems
>> and real workloads.
>
> It does not sound or look like that way.
>
>> The problem is that you don't know that people
>> running on 64-bit hardware often choose to compile -32 and run -32
>> locally.
>
> But we know people are running into this issue and reporting it.

Yes.  But that on its own is not sufficient to change the default.

Andrew.
OpenMP canonical loop form
Hi!

OpenMP defines a canonical loop form (in OpenMP 4: »2.6 Canonical Loop
Form«; in OpenMP 3.1 as part of »2.5.1 Loop Construct«) that says that
the loop index variable »must not be modified during the execution of
the for-loop other than in incr-expr«.  The following code, which
violates this by modifying i in the loop body, thus isn't a conforming
program, and GCC may then exhibit unspecified behavior.  Instead of
accepting it silently, I wonder if it makes sense to have GCC detect
this violation and warn about the unspecified behavior, or even turn it
into a hard error?

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
    #pragma omp parallel
    #pragma omp for
      for (int i = 0; i < 20; i += 2)
        {
          printf("%d: #%d\n", omp_get_thread_num(), i);
          /* Violation of canonical loop form. */
          --i;
        }
      return 0;
    }

Sample output:

    2: #8
    2: #9
    0: #0
    0: #1
    0: #2
    0: #3
    3: #10
    3: #11
    1: #4
    1: #5
    1: #6
    1: #7
    6: #16
    6: #17
    4: #12
    4: #13
    5: #14
    5: #15
    7: #18
    7: #19

Grüße,
 Thomas
whether DIE of a "static const int" member has attribute "DW_AT_const_value"
Hi all,

I've found a strange thing: whether the DIE (Debug Information Entry) of
a "static const int" member in a class has the attribute
"DW_AT_const_value" depends on whether there is a virtual function
defined in the class.  Is this expected behavior for GCC?  And the
attribute "DW_AT_const_value" matters because GDB could print the
member's address if it does not have the attribute; otherwise GDB could
not.

My test case is as follows:

    // symbol.cpp
    class Test {
    private:
      const static int hack;
    public:
      virtual void get() {}
    };

    const int Test::hack = 3;

    int main()
    {
      Test t;
      return 0;
    }

Then I run the following steps:

    $ g++ -g -o symbol symbol.cpp
    $ readelf -wi symbol

The DIE for Test::hack is as follows:

    <2><46>: Abbrev Number: 4 (DW_TAG_member)
       <47>   DW_AT_name             : (indirect string, offset: 0x3a): hack
       <4b>   DW_AT_decl_file        : 1
       <4c>   DW_AT_decl_line        : 8
       <4d>   DW_AT_MIPS_linkage_name: (indirect string, offset: 0x3f): _ZN4Test4hackE
       <51>   DW_AT_type             : <0xc3>
       <55>   DW_AT_external         : 1
       <56>   DW_AT_accessibility    : 3  (private)
       <57>   DW_AT_declaration      : 1
       <58>   DW_AT_const_value      : 3

If I delete the function get() from the class, the class turns into:

    class Test {
    private:
      const static int hack;
    };

    const int Test::hack = 3;

    int main()
    {
      Test t;
      return 0;
    }

Then the DIE for Test::hack does not have the attribute
"DW_AT_const_value".  Is this expected?  Thank you!
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
"H.J. Lu" wrote:
> Hi,
>
> Here is a patch to extend x86-64 psABI to support AVX-512:

Afaik avx 512 doubles the amount of xmm registers.  Can we get them
callee saved please?

Thanks,
Richard.

> http://software.intel.com/sites/default/files/319433-015.pdf
>
> --
> H.J.
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On Wed, Jul 24, 2013 at 1:43 AM, Florian Weimer wrote:
> On 07/23/2013 09:49 PM, H.J. Lu wrote:
>>
>> 2. Extend the current 16-byte PLT entry:
>>
>>    ff 25 32 8b 21 00    jmpq   *name@GOTPCREL(%rip)
>>    68 00 00 00 00       pushq  $index
>>    e9 00 00 00 00       jmpq   PLT0
>>
>> which clears bound registers, to 32 bytes to add the BND prefix to
>> branch instructions.
>
> Would it be possible to use a different instruction sequence that stays
> in the 16-byte limit?  Or restrict MPX support to BIND_NOW relocations?

It isn't possible to use different insns in the PLT to add the BND
prefix.  The issue isn't about relocation.  The issue is that external
calls are routed via the PLT entry, which clears bound registers.  That
is why we need to use a different PLT entry to preserve bound registers.

-- 
H.J.
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Wed, Jul 24, 2013 at 8:23 AM, Richard Biener wrote:
> "H.J. Lu" wrote:
>
>> Hi,
>>
>> Here is a patch to extend x86-64 psABI to support AVX-512:
>
> Afaik avx 512 doubles the amount of xmm registers.  Can we get them
> callee saved please?

Making them callee-saved means we need to change ld.so to preserve them,
and we need to change the unwind library to support them.  It is
certainly doable.

-- 
H.J.
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Wed, 24 Jul 2013, H.J. Lu wrote:
>> Afaik avx 512 doubles the amount of xmm registers.  Can we get them
>> callee saved please?
>
> Make them callee saved means we need to change ld.so to
> preserve them and we need to change unwind library to
> support them.  It is certainly doable.

And setjmp/longjmp (with consequent versioning implications if there
isn't enough space in jmp_buf).  Avoiding the need for such library
changes in order to use new instruction set features is why it's usual
to make new registers (or new bits of existing registers)
call-clobbered.

-- 
Joseph S. Myers
jos...@codesourcery.com
Re: fatal error: gnu/stubs-32.h: No such file
On Wed, Jul 24, 2013 at 8:44 AM, Andrew Haley wrote:
> On 07/24/2013 01:36 PM, Gabriel Dos Reis wrote:
>>> There is no resistance whatsoever to making it work with real systems
>>> and real workloads.
>>
>> It does not sound or look like that way.
>>
>>> The problem is that you don't know that people
>>> running on 64-bit hardware often choose to compile -32 and run -32
>>> locally.
>>
>> But we know people are running into this issue and reporting it.
>
> Yes.  But that on its own is not sufficient to change the default.

I suspect this might just provide evidence for a claim previously denied.

-- Gaby
Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 04:38 PM, Gabriel Dos Reis wrote:
> On Wed, Jul 24, 2013 at 8:44 AM, Andrew Haley wrote:
>> On 07/24/2013 01:36 PM, Gabriel Dos Reis wrote:
>>>> There is no resistance whatsoever to making it work with real systems
>>>> and real workloads.
>>>
>>> It does not sound or look like that way.
>>>
>>>> The problem is that you don't know that people
>>>> running on 64-bit hardware often choose to compile -32 and run -32
>>>> locally.
>>>
>>> But we know people are running into this issue and reporting it.
>>
>> Yes.  But that on its own is not sufficient to change the default.
>
> I suspect this might just provide evidence for a claim previously denied.

Not at all: we're just disagreeing about what a real system with a real
workload looks like.  It's a stupid thing to say anyway, because who is
to say their system is more real than mine or yours?

Andrew.
Intel® Memory Protection Extensions support in the GCC
Hi All!

This is to let you know that enabling of Intel® MPX technology (see
details in
http://download-software.intel.com/sites/default/files/319433-015.pdf)
in GCC has been started.  (Corresponding changes in binutils are here:
http://sourceware.org/ml/binutils/2013-07/msg00233.html)

Currently, compiler changes for Intel® MPX have been put in the branch
svn://gcc.gnu.org/svn/gcc/branches/mpx (this will soon be reflected in
svn.html).  Ilya Enkovich (in cc) will be the main person maintaining
this branch and submitting changes into the trunk.  Some implementation
details can be found on the wiki.

Thanks,
Igor
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On Tue, Jul 23, 2013 at 12:49 PM, H.J. Lu wrote:
>
> http://software.intel.com/sites/default/files/319433-015.pdf
>
> introduces 4 bound registers, which will be used for parameter passing
> in x86-64.  Bound registers are cleared by branch instructions.  Branch
> instructions with BND prefix will keep bound register contents.

I took a very quick look at the doc.  Why shouldn't we run the kernel
with BNDPRESERVE = 1, to avoid this behaviour of clearing the bound
registers on branch instructions?  That would let us avoid these issues.

> I prefer the note section solution.  Any suggestions, comments?

I concur, but why not use the ELF attributes support rather than a new
note section?

Ian
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
"H.J. Lu" wrote:
> On Wed, Jul 24, 2013 at 8:23 AM, Richard Biener wrote:
>> "H.J. Lu" wrote:
>>
>>> Hi,
>>>
>>> Here is a patch to extend x86-64 psABI to support AVX-512:
>>
>> Afaik avx 512 doubles the amount of xmm registers.  Can we get them
>> callee saved please?
>
> Make them callee saved means we need to change ld.so to
> preserve them and we need to change unwind library to
> support them.  It is certainly doable.

IMHO it was a mistake not to have any callee-saved xmm registers in the
original ABI - we should fix this at this opportunity.  Loops with
function calls are not that uncommon.

Richard.

> --
> H.J.
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Wed, Jul 24, 2013 at 10:36 AM, Richard Biener wrote:
> "H.J. Lu" wrote:
>
>> On Wed, Jul 24, 2013 at 8:23 AM, Richard Biener wrote:
>>> "H.J. Lu" wrote:
>>>
>>>> Hi,
>>>>
>>>> Here is a patch to extend x86-64 psABI to support AVX-512:
>>>
>>> Afaik avx 512 doubles the amount of xmm registers.  Can we get them
>>> callee saved please?
>>
>> Make them callee saved means we need to change ld.so to
>> preserve them and we need to change unwind library to
>> support them.  It is certainly doable.
>
> IMHO it was a mistake to not have any callee saved xmm register in the
> original abi - we should fix this at this opportunity.  Loops with
> function calls are not that uncommon.

Are there any other Linux targets with callee saved vector registers?

-- 
H.J.
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Wed, 2013-07-24 at 10:42 -0700, H.J. Lu wrote:
> Are there any other Linux targets with callee saved vector registers?

Yes, on POWER.  From our ABI, on processors with the VMX feature:

    v0-v1     Volatile scratch registers
    v2-v13    Volatile vector parameter registers
    v14-v19   Volatile scratch registers
    v20-v31   Non-volatile registers

I'll note that the new VSX register state we recently added with power7
was made volatile, but then we already had these non-volatile Altivec
regs to use.

Peter
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Wed, Jul 24, 2013 at 07:36:31PM +0200, Richard Biener wrote:
> "H.J. Lu" wrote:
>
>> On Wed, Jul 24, 2013 at 8:23 AM, Richard Biener wrote:
>>> "H.J. Lu" wrote:
>>>
>>>> Hi,
>>>>
>>>> Here is a patch to extend x86-64 psABI to support AVX-512:
>>>
>>> Afaik avx 512 doubles the amount of xmm registers.  Can we get them
>>> callee saved please?
>>
>> Make them callee saved means we need to change ld.so to
>> preserve them and we need to change unwind library to
>> support them.  It is certainly doable.
>
> IMHO it was a mistake to not have any callee saved xmm register in the
> original abi - we should fix this at this opportunity.  Loops with
> function calls are not that uncommon.

I also noticed this problem, and the best solution I came upon is
analogous to __attribute__((fastcall)).  This would make it possible for
libraries to add versioned symbols that use the attribute and migrate to
a saner calling convention.
Re: Help forging a function_decl
Problem solved.

The trouble was that the blocks of my statement list weren't correctly
chained, so when lower_gimple_bind executed in pass_lower_cf, it
accessed an uninitialized memory area, thus sometimes reading the flag
as true and sometimes as false.  Now everything runs smoothly.

Again, thank you for the patience.  I'm kind of new to GCC, so I was
basically learning how to do it.

Kindest regards,

---
Rodolfo Guilherme Wottrich
Universidade Estadual de Campinas - Unicamp

2013/7/23 Rodolfo Guilherme Wottrich:
> Later I found out that cgraph_mark_needed_node was already being
> called in cgraph_finalize_function, and that should really keep my
> function from being removed.  But when the function
> cgraph_remove_unreachable_nodes executes, it is marked as unreachable
> just because at this point it is not marked as needed anymore.
> The piece of code which is doing that is
> function_and_variable_visibility, in ipa.c.  Oddly, the condition for
> doing so is having DECL_EXTERNAL set for my fndecl, which I just said
> was my former mistake.
> I really don't know why this erratic behavior is happening; I am
> explicitly setting DECL_EXTERNAL to false.
>
> Cheers,
>
> ---
> Rodolfo Guilherme Wottrich
> Universidade Estadual de Campinas - Unicamp
>
>
> 2013/7/23 Rodolfo Guilherme Wottrich:
>> Hello,
>>
>> 2013/7/23 Martin Jambor:
>>> Hi,
>>>
>>> But you do call cgraph_add_new_function on it as well, right?  If not,
>>> how is its symbol table node (also called and serving as the call
>>> graph node) created?
>>
>> I call finish_function for my decl.  Inside that function, there's this
>> one call to cgraph_add_new_function, but the condition it is inside is
>> bypassed:
>>
>>   /* ??? Objc emits functions after finalizing the compilation unit.
>>      This should be cleaned up later and this conditional removed.  */
>>   if (cgraph_global_info_ready)
>>     {
>>       cgraph_add_new_function (fndecl, false);
>>       return;
>>     }
>>
>> Right after that, there's a call to cgraph_finalize_function, which I
>> guess works the way it should, creating its cgraph node.  I took a look
>> at the comment in cgraph_add_new_function and it states that "this
>> function is intended to be used by middle end and allows insertion of
>> new function at arbitrary point of compilation.  The function can be
>> either in high, low or SSA form GIMPLE."  As I'm doing it at parsing
>> time and there were no gimplification passes yet, I guess it is ok not
>> to use it at this point.
>>
>>> (And BTW, you are hacking on trunk, right?  Older versions can be
>>> quite a bit different here.)
>>
>> I understand that, but I intend to use Dragonegg to obtain LLVM IR
>> after the gcc front-end has done its work, so I'm hacking version
>> 4.6.4, which is said to work better with Dragonegg.
>>
>>> What do you mean by "50% of the time?"  That you get different results
>>> even when you do not change your compiler?  That should not happen and
>>> means you invoke undefined behavior, most likely depending on some
>>> uninitialized stuff (assuming your HW is OK), so you are probably not
>>> clearing some allocated structure or something.  (Do you know why
>>> DECL_EXTERNAL was set?  That looks weird.)
>>
>> Yeah, exactly: different results even when not changing the compiler.
>> About DECL_EXTERNAL: it was my fault.  At first I was trying to
>> reproduce the creation of a function_decl as in
>> create_omp_child_function, in omp-low.c, and I eventually forgot that
>> I set that attribute when playing with the possibilities.
>>
>>> Anyway, my best guess is that your function is removed by
>>> symtab_remove_unreachable_nodes in ipa.c.  (And now I also see that
>>> analyze_functions in cgraphunit.c is also doing its own unreachable
>>> node removal, but hopefully it runs early enough that this should not
>>> be your problem.)  If your function is static and is not called or
>>> referenced from anywhere else, gcc will try to get rid of it.
>>
>> I traced the execution path in gdb, and it happens that the function
>> is really being removed in cgraph_remove_unreachable_nodes in ipa.c
>> (that's basically the same function you pointed out, just another gcc
>> version).  I don't know why that happens anyway, nor why it happens
>> intermittently when not debugging and never when debugging.  My
>> function is not called or referenced yet, but neither is another
>> function in my source code which I put in just to test this issue,
>> and yet it is not removed like my forged one.  So it is in respect to
>> being or not static.  A doubt: I tested setting TREE_STATIC for my
>> decl, but does that mean my function is static?  The documentation on
>> the TREE_STATIC macro is: "In a FUNCTION_DECL, nonzero if function has
>> been defined".
>>
>>> Try setting DECL_PRESERVE_P of your decl (or calling
>>> cgraph_mark_force_output_node on its call graph node, which is cleaner
>>> but should be equivalent for debugging, I suppose).  If that helps,
>>> this is most likely your problem.
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On 07/24/2013 05:23 AM, Richard Biener wrote:
> "H.J. Lu" wrote:
>
>> Hi,
>>
>> Here is a patch to extend x86-64 psABI to support AVX-512:
>
> Afaik avx 512 doubles the amount of xmm registers.  Can we get them
> callee saved please?

Having them callee saved pre-supposes that one knows the width of the
register.

There's room in the instruction set for avx1024.  Does anyone believe
that is not going to appear in the next few years?

r~
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Wed, Jul 24, 2013 at 08:25:14AM -1000, Richard Henderson wrote:
> On 07/24/2013 05:23 AM, Richard Biener wrote:
>> "H.J. Lu" wrote:
>>
>>> Hi,
>>>
>>> Here is a patch to extend x86-64 psABI to support AVX-512:
>>
>> Afaik avx 512 doubles the amount of xmm registers.  Can we get them
>> callee saved please?
>
> Having them callee saved pre-supposes that one knows the width of the
> register.
>
> There's room in the instruction set for avx1024.  Does anyone believe
> that is not going to appear in the next few years?

It would be a mistake for Intel to focus on avx1024.  You hit
diminishing returns, and only a few workloads would utilize loading 128
bytes at once.  The problem with vectorization is that it becomes memory
bound, so you will not gain much, because performance is dominated by
cache throughput.  You would get a bigger speedup from more effective
pipelining, more fusion...
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On Wed, Jul 24, 2013 at 9:45 AM, Ian Lance Taylor wrote: > On Tue, Jul 23, 2013 at 12:49 PM, H.J. Lu wrote: >> >> http://software.intel.com/sites/default/files/319433-015.pdf >> >> introduces 4 bound registers, which will be used for parameter passing >> in x86-64. Bound registers are cleared by branch instructions. Branch >> instructions with BND prefix will keep bound register contents. > > I took a very quick look at the doc. Why shouldn't we run the kernel > with BNDPRESERVE = 1, to avoid this behaviour of clearing the bound > registers on branch instructions? That would let us avoid these > issues. This doesn't work in the case of legacy callees which return pointers. The bound registers will be incorrect, since they are set in the last MPX function: MPX callers will get wrong bounds on pointers returned by legacy callees. > >> I prefer the note section solution. Any suggestions, comments? > > I concur, but why not use the ELF attributes support rather than a new > note section? > The issues are: 1. ELF attributes target the static linker. There is no support in shared libraries or executables. We may need it to make a run-time decision, based on the MPX feature, to select a legacy or MPX shared library. 2. ELF attribute lookup isn't very fast at run-time. -- H.J.
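For readers unfamiliar with the note-section option being discussed: a standard ELF note is an Elf_Nhdr (namesz, descsz, type) followed by a name and a descriptor, each padded to 4-byte alignment, and namesz counts the name's terminating NUL. A minimal sketch of building and parsing such a marker note; the "GNU-MPX" name and type value below are invented placeholders, not part of any actual proposal:

```python
import struct

def build_note(name, note_type, desc):
    """Pack a standard ELF note: Elf_Nhdr, then name and desc, 4-byte aligned."""
    name_b = name.encode() + b"\0"      # namesz counts the trailing NUL
    def pad(b):
        return b + b"\0" * (-len(b) % 4)
    return (struct.pack("<III", len(name_b), len(desc), note_type)
            + pad(name_b) + pad(desc))

def parse_note(blob):
    namesz, descsz, ntype = struct.unpack_from("<III", blob, 0)
    off = 12
    name = blob[off:off + namesz - 1].decode()  # drop the NUL
    off += (namesz + 3) & ~3                    # skip name padding
    return name, ntype, blob[off:off + descsz]

# Invented "this object uses MPX" marker:
note = build_note("GNU-MPX", 1, b"\x01")
print(parse_note(note))  # ('GNU-MPX', 1, b'\x01')
```

Because notes live in allocated PT_NOTE segments, such a marker would be visible to the dynamic linker at run time, which is the advantage claimed over ELF attributes.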
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On Wed, Jul 24, 2013 at 11:53 AM, H.J. Lu wrote: > On Wed, Jul 24, 2013 at 9:45 AM, Ian Lance Taylor wrote: >> On Tue, Jul 23, 2013 at 12:49 PM, H.J. Lu wrote: >>> >>> http://software.intel.com/sites/default/files/319433-015.pdf >>> >>> introduces 4 bound registers, which will be used for parameter passing >>> in x86-64. Bound registers are cleared by branch instructions. Branch >>> instructions with BND prefix will keep bound register contents. >> >> I took a very quick look at the doc. Why shouldn't we run the kernel >> with BNDPRESERVE = 1, to avoid this behaviour of clearing the bound >> registers on branch instructions? That would let us avoid these >> issues. > > This doesn't work in case of legacy callees which return pointers. > The bound registers will be incorrect since they are set in the > last MPX function. MPX callers will get wrong bounds on > pointers returned by legacy callees As far as I can see the compiler needs to know the pair of bound registers associated with a pointer anyhow. So if the compiler calls some function and gets a pointer, it needs to know the bound registers that go with that pointer. Are you suggesting that not only are bound registers passed as parameters to functions, they are also implicitly returned by functions? Ian
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On Wed, Jul 24, 2013 at 11:59 AM, Ian Lance Taylor wrote: > On Wed, Jul 24, 2013 at 11:53 AM, H.J. Lu wrote: >> On Wed, Jul 24, 2013 at 9:45 AM, Ian Lance Taylor wrote: >>> On Tue, Jul 23, 2013 at 12:49 PM, H.J. Lu wrote: http://software.intel.com/sites/default/files/319433-015.pdf introduces 4 bound registers, which will be used for parameter passing in x86-64. Bound registers are cleared by branch instructions. Branch instructions with BND prefix will keep bound register contents. >>> >>> I took a very quick look at the doc. Why shouldn't we run the kernel >>> with BNDPRESERVE = 1, to avoid this behaviour of clearing the bound >>> registers on branch instructions? That would let us avoid these >>> issues. >> >> This doesn't work in case of legacy callees which return pointers. >> The bound registers will be incorrect since they are set in the >> last MPX function. MPX callers will get wrong bounds on >> pointers returned by legacy callees > > As far as I can see the compiler needs to know the pair of bound > registers associated with a pointer anyhow. So if the compiler calls > some function and gets a pointer, it needs to know the bound registers > that go with that pointer. Are you suggesting that not only are bound > registers passed as parameters to functions, they are also implicitly > returned by functions? > Yes, when a pointer is returned in a register. -- H.J.
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Wed, Jul 24, 2013 at 10:55 AM, Peter Bergner wrote: > On Wed, 2013-07-24 at 10:42 -0700, H.J. Lu wrote: >> Are there any other Linux targets with callee saved vector registers? > > Yes, on POWER. From our ABI: > > On processors with the VMX feature. > v0-v1 Volatile scratch registers > v2-v13 Volatile vector parameters registers > v14-v19 Volatile scratch registers > v20-v31 Non-volatile registers > > I'll note that the new VSX register state we recently added with power7 > were made volatile, but then we already had these non-volatile altivec > regs to use. How do you save/restore those vector registers for exception handling? The unwinder in libgcc uses _Unwind_Word to save and restore registers in the DWARF unwind frame. It doesn't support anything wider than _Unwind_Word, which is usually smaller than a vector register. -- H.J.
GNU Tools Cauldron 2013 - Presentation videos
I have uploaded all the videos we recorded at the Cauldron to the workshop page (http://gcc.gnu.org/wiki/cauldron2013). The videos are also available at the YouTube playlist: http://www.youtube.com/playlist?list=PLsgS8fWwKJZhrjVEN7tsQyj2nLb5z0n70 If you think your talk was recorded but you do not see the video, please let me know and I'll fix it in the playlist. If you presented a talk and do not see your slides in http://gcc.gnu.org/wiki/cauldron2013, please fix the link yourself or let me know and I'll add them to the table (if you can fix the links yourself, you'll be doing me a big favour). Diego.
Re: fatal error: gnu/stubs-32.h: No such file
On Wed, Jul 24, 2013 at 8:50 AM, Andrew Haley wrote: > Not at all: we're just disagreeing about what a real system with > a real workload looks like. No, we aren't. We're disagreeing about whether it's acceptable to enable a feature by default that breaks the compiler build halfway through with an obscure error message. Real systems sometimes need features that aren't enabled by default. > It's a stupid thing to say anyway, because > who is to say their system is more real than mine or yours? By that logic, you've already said that any system needing GNAT is less real than others, because it's not enabled by default. -- Kie ekzistas vivo, ekzistas espero.
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
I've read through the MPX spec once, but most of it is still not very clear to me. So please correct any misconceptions. (HJ, if you answer any or all of these questions in your usual style with just, "It's not a problem," I will find you and I will kill you. Explain!) Will an MPX-using binary require an MPX-supporting dynamic linker to run correctly? * An old dynamic linker won't clobber %bndN directly, so that's not a problem. * Does having the bounds registers set have any effect on regular/legacy code, or only when bndc[lun] instructions are used? If it doesn't affect normal instructions, then I don't entirely understand why it would matter to clear %bnd* when entering or leaving legacy code. Is it solely for the case of legacy code returning a pointer value, so that the new code would expect the new ABI wherein %bnd0 has been set to correspond to the pointer returned in %rax? * What's the effect of entering the dynamic linker via "bnd jmp" (i.e. new MPX-using binary with new PLT, old dynamic linker)? The old dynamic linker will leave %bndN et al exactly as they are, until its first unadorned branching instruction implicitly clears them. So the only problem would be if the work _dl_runtime_{resolve,profile} does before its first branch/call were affected by the %bndN state. If there are indeed any problems with this scenario, then you need a plan to make new binaries require a new dynamic linker (and fail gracefully in the absence of one, and have packaging systems grok the dependency, etc.) In a related vein, what's the effect of entering some legacy code via "bnd jmp" (i.e. new binary using PLT call into legacy DSO)? * If the state of %bndN et al does not affect legacy code directly, then it's not a problem. The legacy code will eventually use an unadorned branch instruction, and that will implicitly clear %bnd*. (Even if it's a leaf function that's entirely branch-free, its return will count as such an unadorned branch instruction.) 
* If that's not the case, then a PLT entry that jumps to legacy code will need to clear the %bndN state. I see one straightforward approach, at the cost of a double-bounce (i.e. turning the normal double-bounce into a triple-bounce) when going from MPX code to legacy code. Each PLT entry can be:

    bnd jmp *foo@GOTPCREL(%rip)
    pushq $N
    bnd jmp .Lplt0
    .balign 16
    jmp *foo@GOTPCREL+8(%rip)
    .balign 32

and now each of those gets two (adjacent) GOT slots rather than just one. When the dynamic linker resolves "foo" and sees that it's in a legacy DSO, it sets the foo GOT slot to point to .plt+(N*32 + 16) and the foo+1 GOT slot to point to the real target (resolution of "foo"). After fixup, entering that PLT entry will do "bnd jmp" to the second half of the entry, which does (unadorned) "jmp" to the real target, implicitly clearing %bndN state.

Those are the background questions to help me understand better. Now, to your specific questions.

I can't tell if you are proposing that a single object might contain both 16-byte and 32-byte PLT slots next to each other in the same .plt section. That seems like a bad idea. I can think of two things off hand that expect PLT entries to be of uniform size, and there may well be more.

* The foo@plt pseudo-symbols that e.g. objdump will display are based on the BFD backend knowing the size of PLT entries. Arguably this ought to look at sh_entsize of .plt instead of using baked-in knowledge, but it doesn't.

* The linker-generated CFI for .plt is a single FDE for the whole section, using a DWARF expression covering all normal PLT entries together based on them having uniform size and contents. (You could of course make the linker generate per-entry CFI, or partition the PLT into short and long entries and have the CFI treat the two partitions appropriately differently. But that seems like a complication best avoided.)
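The GOT bookkeeping in Roland's triple-bounce scheme can be sketched numerically. The base address below is a made-up illustration; the 32-byte entry size and the two-adjacent-GOT-slots layout just restate the proposal:

```python
PLT_BASE = 0x1000   # assumed .plt load address, illustration only
ENTRY = 32          # doubled PLT entry size from the proposal

def fixup(n, target, target_is_legacy):
    """Values of the two adjacent GOT slots for symbol n after lazy resolution."""
    if target_is_legacy:
        # First slot points at the second half of entry n (the plain jmp),
        # which bounces to the real target held in the adjacent slot,
        # clearing %bndN on the way.
        return PLT_BASE + n * ENTRY + 16, target
    return target, 0    # MPX target: direct bnd jmp, second slot unused

slot0, slot1 = fixup(3, 0x400000, True)
print(hex(slot0), hex(slot1))  # 0x1070 0x400000
```

The extra bounce is paid only on calls into legacy DSOs; resolved MPX-to-MPX calls keep the usual single indirect jump.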
Now, assuming we are talking about a uniform PLT in each object, there is the question of whether to use a new PLT layout everywhere, or only when linking an object with some input files that use MPX. * My initial reaction was to say that we should just change it unconditionally to keep things simple: use new linker, get new format, end of story. Simplicity is good. * But, doubling the size of PLT entries means more i-cache pressure. If cache lines are 64 bytes, then today you fit four entries into a cache line. Assuming PLT entries are more used than unused, this is a good thing. Reducing that to two entries per cache line means twice as many i-cache misses if you hit a given PLT frequently (with even distribution of which entries you actually use--at any rate, it's "more" even if it's not "twice as many"). Perhaps this is enough cost in real-world situations to be worried about. I really don't know. * As I mentioned before, there are things floating ar
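The i-cache cost argument above is easy to make concrete; 64-byte lines are a typical x86 figure, assumed here for illustration:

```python
import math

CACHE_LINE = 64   # bytes; typical x86 i-cache line size, assumed

def entries_per_line(entry_size):
    return CACHE_LINE // entry_size

def lines_touched(hot_entries, entry_size):
    """i-cache lines covered by that many densely packed, evenly used entries."""
    return math.ceil(hot_entries * entry_size / CACHE_LINE)

print(entries_per_line(16), entries_per_line(32))    # 4 2
print(lines_touched(20, 16), lines_touched(20, 32))  # 5 10
```

With even usage the doubled entries touch twice as many lines; with uneven usage the footprint still grows, just by less than a factor of two, which is the "more even if not twice as many" caveat.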
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On Wed, Jul 24, 2013 at 4:36 PM, Roland McGrath wrote: > > Will an MPX-using binary require an MPX-supporting dynamic linker to run > correctly? > > * An old dynamic linker won't clobber %bndN directly, so that's not a > problem. These are my answers and likely incorrect. It will clobber the registers indirectly, though, as soon as it executes a branching instruction. The effect will be that calls from bnd-checked code to bnd-checked code through the dynamic linker will not succeed. I have not yet seen the changes this will require to the ABI, but I'm making the natural assumptions: the first four pointer arguments to a function will be associated with a pair of bound registers, and similarly for a returned pointer. I don't know what the proposal is for struct parameters and return values. > * Does having the bounds registers set have any effect on regular/legacy > code, or only when bndc[lun] instructions are used? As far as I can tell, only when the bndXX instructions are used, though I'd be happy to hear otherwise. > If it doesn't affect normal instructions, then I don't entirely > understand why it would matter to clear %bnd* when entering or leaving > legacy code. Is it solely for the case of legacy code returning a > pointer value, so that the new code would expect the new ABI wherein > %bnd0 has been set to correspond to the pointer returned in %rax? There is no problem with clearing the bnd registers when calling in or out of legacy code. The issue is avoiding clearing the pointers when calling from bnd-enabled code to bnd-enabled code. > * What's the effect of entering the dynamic linker via "bnd jmp" > (i.e. new MPX-using binary with new PLT, old dynamic linker)? The old > dynamic linker will leave %bndN et al exactly as they are, until its > first unadorned branching instruction implicitly clears them. So the > only problem would be if the work _dl_runtime_{resolve,profile} does > before its first branch/call were affected by the %bndN state. 
"It's not a problem." > In a related vein, what's the effect of entering some legacy code via > "bnd jmp" (i.e. new binary using PLT call into legacy DSO)? > > * If the state of %bndN et al does not affect legacy code directly, then > it's not a problem. The legacy code will eventually use an unadorned > branch instruction, and that will implicitly clear %bnd*. (Even if > it's a leaf function that's entirely branch-free, its return will > count as such an unadorned branch instruction.) Yes. > * If that's not the case, It is the case. > I can't tell if you are proposing that a single object might contain > both 16-byte and 32-byte PLT slots next to each other in the same .plt > section. That seems like a bad idea. I can think of two things off > hand that expect PLT entries to be of uniform size, and there may well > be more. > > * The foo@plt pseudo-symbols that e.g. objdump will display are based on > the BFD backend knowing the size of PLT entries. Arguably this ought > to look at sh_entsize of .plt instead of using baked-in knowledge, but > it doesn't. This seems fixable. Of course, we could also keep the PLT the same length by changing it. The current PLT entries are:

    jmpq *GOT(sym)
    pushq offset
    jmpq plt0

The linker or dynamic linker initializes *GOT(sym) to point to the second instruction in this sequence. So we can keep the PLT at 16 bytes by simply changing it to jump somewhere else:

    bnd jmpq *GOT(sym)
    .skip 9

We have the linker or dynamic linker fill in *GOT(sym) to point to the second PLT table. When the dynamic linker is involved, we use another DT tag to point to the second PLT. The offsets are consistent: there is one entry in each PLT table, so the dynamic linker can compute the right value. Then in the second PLT we have the sequence:

    pushq offset
    bnd jmpq plt0

That gives the dynamic linker the offset that it needs to update *GOT(sym) to point to the runtime symbol value.
So we get slightly worse instruction cache handling the first time a function is called, but after that we are the same as before. And PLT entries are the same size as always so everything is simpler. The special DT tag will tell the dynamic linker to apply the special processing. No attribute is needed to change behaviour. The issue then is: a program linked in this way will not work with an old dynamic linker, because the old dynamic linker will not initialize GOT(sym) to the right value. That is a problem for any scheme, so I think that is OK. But if that is a concern, we could actually handle by generating two PLTs. One conventional PLT, and another as I just outlined. The linker branches to the new PLT, and initializes GOT(sym) to point to the old PLT. The dynamic linker spots this because it recognizes the new DT tags, and cunningly rewrites the GOT to point to the new PLT. Cost is an extra jump the first time a function is called when using the old
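Ian's two-table layout relies on a fixed index correspondence between the tables, which is what lets the dynamic linker "compute the right value". A sketch with made-up base addresses:

```python
# Both tables have one 16-byte entry per symbol, so indexes map directly.
PLT1_BASE, PLT2_BASE = 0x1000, 0x2000   # assumed addresses, illustration only
ENTRY = 16

def second_plt_entry(first_plt_addr):
    """Address of the push/jmp pair matching a first-table entry."""
    n = (first_plt_addr - PLT1_BASE) // ENTRY
    return PLT2_BASE + n * ENTRY

print(hex(second_plt_entry(0x1050)))  # 0x2050
```

Because the mapping is pure arithmetic, no per-entry metadata is needed beyond the one new DT tag giving the second table's base.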
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Wed, Jul 24, 2013 at 07:36:31PM +0200, Richard Biener wrote: > >Make them callee saved means we need to change ld.so to > >preserve them and we need to change unwind library to > >support them. It is certainly doable. > > IMHO it was a mistake to not have any callee saved xmm register in the > original abi - we should fix this at this opportunity. Loops with > function calls are not that uncommon. I've raised that earlier already. One issue with that beyond having to teach unwinders about this (dynamic linker if you mean only for the lazy PLT resolving is only a matter of whether the dynamic linker itself has been built with a compiler that would clobber those registers anywhere) is that as history shows, the vector registers keep growing over time. So if we reserve now either 8 or all 16 zmm16 to zmm31 registers as call saved, do we save them as 512 bit registers, or say 1024 bit already? If just 512 bit, then when next time the vector registers grow in size (will they?), would we have just low parts of the 1024 bits registers call saved and upper half call clobbered (I guess that is the case for M$Win 64-bit ABI now, just with 128 bit vs. more). But yeah, it would be nice to have some call saved ones. Jakub
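The sizing question Jakub raises is simple arithmetic, but it shows what the ABI would be committing every non-leaf function to; a sketch with the register counts from the thread:

```python
def save_area_bytes(n_regs, vector_bits):
    # Stack bytes needed to spill that many call-saved vector registers
    # at the width the ABI fixes.
    return n_regs * vector_bits // 8

# 16 call-saved regs at 512 bits vs. a hypothetical future 1024 bits:
print(save_area_bytes(16, 512), save_area_bytes(16, 1024))  # 1024 2048
```

Fixing the saved width at 512 bits today would mean any future wider registers get the Win64-style split of call-saved low halves and call-clobbered upper halves that Jakub describes.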