Android native build of GCC
Android native GCC can't support LTO because of a lack of support for dlopen() in the C library. How should we patch the configury to disable LTO by default? Thanks, Andrew.
GCC 5 Status Report (2015-01-08), Stage 4 to start soon
The trunk is still in Stage 3 now, which means it is open for general bugfixing, but it will enter Stage 4 on Friday, 16th, end of day (timezone of your preference). Once that happens, only wrong-code fixes, regression bugfixes and documentation fixes will be allowed, as is normal for our release branches too.

There are still a few patches that were posted during Stage 1; please get them committed into trunk before Stage 4 starts.

Still misleading quality data below - some P3 bugs have not been re-prioritized.

Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               39   + 24
P2               98   + 15
P3               48   - 84
--------        ---   -----------------------
Total           185   - 45

Previous Report
===============

https://gcc.gnu.org/ml/gcc/2014-11/msg00249.html
IRA : Changes in the cost of putting allocno into memory.
Hello Vladimir:

We have made changes to ira_loop_edge_freq and move_spill_restore in ira-color.c. The main motivation behind the change is to reduce the number of memory instructions inside loops.

The first change is to not consider the back edge frequency if there are no references to the allocno inside the loop, even though the allocno is live out at the exit of the loop and live in at the entry of the loop. This reduces the calculated cost of putting the allocno into memory. If we reduce the cost of putting the allocno into memory, there is a chance of it getting a color in the simplification phase of the IRA register allocator. With this enabled, there is a chance of putting the allocno in a register, and the splitting phase will split the live range across the loop boundary. This is beneficial with respect to reducing memory instructions inside the loop.

The other change allows some allocnos to be allocated in memory if the cost is less than or equal to zero.

This is the first phase of the change in the IRA core register allocation module. I would like to get your feedback on this change; the actual patch will be sent based on that feedback. The description is accompanied by a patch, which should make the change easier to review.

We have tested the changes on the MicroBlaze target with the MiBench and EEMBC benchmarks. There is no degradation in the geomean, and some of the MiBench and EEMBC benchmarks benefit from this change.

Please review the changes and let me know your feedback and what other steps need to be done to incorporate the changes upstream. We would like to get this change in for MicroBlaze. Please also let us know if the changes need to be enabled with a command-line flag.

Changes are given below.

From 758ee2227e9dde946ac35b772bee99279b1bf996 Mon Sep 17 00:00:00 2001
From: Ajit Kumar Agarwal
Date: Tue, 6 Jan 2015 19:42:16 +0530
Subject: [PATCH] IRA : Changes in the cost of putting allocno into memory.

Changes are made to not consider the back edge frequency for cost
calculation if the allocno is live out at the exit of the loop and
live in at the entry of the loop and there are no references inside
the loop. Further changes are made to put some of the allocnos into
memory if the cost is less than or equal to 0.

ChangeLog:
2015-01-06  Ajit Agarwal

        * ira-color.c (move_spill_restore): Check for cost less than
        or equal to 0.
        (ira_loop_edge_freq): Ignore the loop edge frequency if there
        are no references to the allocno inside the loop.

Signed-off-by: Ajit Agarwal ajit...@xilinx.com
---
 gcc/ira-color.c | 22 +++---
 1 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index 39387c8..d180e86 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -2409,6 +2409,8 @@ int
 ira_loop_edge_freq (ira_loop_tree_node_t loop_node, int regno, bool exit_p)
 {
   int freq, i;
+  bitmap_iterator bi;
+  unsigned int j;
   edge_iterator ei;
   edge e;
   vec<edge> edges;
@@ -2423,7 +2425,14 @@ ira_loop_edge_freq (ira_loop_tree_node_t loop_node, int regno, bool exit_p)
             && (regno < 0
                 || (bitmap_bit_p (df_get_live_out (e->src), regno)
                     && bitmap_bit_p (df_get_live_in (e->dest), regno))))
-          freq += EDGE_FREQUENCY (e);
+          {
+            EXECUTE_IF_SET_IN_BITMAP (loop_node->all_allocnos, 0, j, bi)
+              {
+                if(regno == ALLOCNO_REGNO (ira_allocnos[j]))
+                  if (ALLOCNO_NREFS (ira_allocnos[j]) != 0)
+                    freq += EDGE_FREQUENCY (e);
+              }
+          }
     }
   else
     {
@@ -2432,7 +2441,14 @@ ira_loop_edge_freq (ira_loop_tree_node_t loop_node, int regno, bool exit_p)
         if (regno < 0
             || (bitmap_bit_p (df_get_live_out (e->src), regno)
                 && bitmap_bit_p (df_get_live_in (e->dest), regno)))
-          freq += EDGE_FREQUENCY (e);
+          {
+            EXECUTE_IF_SET_IN_BITMAP (loop_node->all_allocnos, 0, j, bi)
+              {
+                if(regno == ALLOCNO_REGNO (ira_allocnos[j]))
+                  if(ALLOCNO_NREFS (ira_allocnos[j]) != 0)
+                    freq += EDGE_FREQUENCY (e);
+              }
+          }
       edges.release ();
     }
 
@@ -3441,7 +3457,7 @@ move_spill_restore (void)
                       * (exit_freq + enter_freq));
             }
         }
-      if (cost < 0)
+      if (cost <= 0)
         {
           ALLOCNO_HARD_REGNO (a) = -1;
           if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
-- 
1.7.1

Thanks & Regards
Ajit
Re: Support for architectures without hardware interlocks
Hi David,

I've worked on a gcc port for an architecture without hardware interlock support. Basically, you need to emit nop operations to avoid possible hw conflicts. At the moment, this was done by patching the gcc scheduler to do so. Another issue to keep in mind is to check for hardware conflicts across basic-block boundaries. And last but not least, you need to prohibit/avoid any instruction stream modification after the scheduler (e.g., peephole optimizations etc.).

Best,
Claudiu

On Fri, Oct 31, 2014 at 8:03 PM, David Kang wrote:
>
> Hello,
>
> I'm working on porting gcc to an architecture without hardware interlock
> support for floating point unit. I read that instruction latency time can be
> expressed in machine description file of gcc. I set the latency time of the
> instructions and built gcc.
> I expected that gcc would put the two dependent instructions apart
> automatically
> at least as many as the latency time of the first instruction.
> However, my gcc doesn't do that.
> I'm using a little old 4.7.3.
> I also expected that gcc may fill the gap with no-op when it cannot find
> other useful instructions to fill the gap.
> But, I don't see that, either.
>
> Does gcc support an architecture without hardware interlock automatically?
> Could anyone help me to understand how I can enforce the latency requirements
> of two dependent instructions in gcc?
>
> I saw that GCC didn't support architectures without hardware interlocks in
> the gcc mailing list
> which is dated in 2007. (https://gcc.gnu.org/ml/gcc/2007-07/msg00915.html)
> Is it still true?
>
> Thanks,
>
> David.
>
> --
> --
> Dr. Dong-In "David" Kang
> Computer Scientist
> USC/ISI
Re: Support for architectures without hardware interlocks
> I've worked on a gcc target that was porting an architecture without
> hardware interlock support. Basically, you need to emit nop operations
> to avoid possible hw conflicts. At the moment, this was done by
> patching the gcc scheduler to do so, Another issue to keep is to check
> for hardware conflicts across basic-block boundaries. And not the
> last, is to prohibit/avoid any instruction stream modification after
> scheduler (e.g., peephole optimizations etc.).

That's an overly complex approach; this can usually be done in a simpler way with a machine-specific pass that runs at the end of the RTL pipeline.

--
Eric Botcazou
Re: Support for architectures without hardware interlocks
On 1/8/2015 9:01 AM, Eric Botcazou wrote:
>> I've worked on a gcc target that was porting an architecture without
>> hardware interlock support. Basically, you need to emit nop operations
>> to avoid possible hw conflicts. At the moment, this was done by
>> patching the gcc scheduler to do so, Another issue to keep is to check
>> for hardware conflicts across basic-block boundaries. And not the
>> last, is to prohibit/avoid any instruction stream modification after
>> scheduler (e.g., peephole optimizations etc.).
> That's an overly complex approach, this usually can be done in a simpler way
> with a machine-specific pass that runs at the end of the RTL pipeline.
>
Isn't this similar to needing to fill a delay slot after a branch instruction? My recollection is that some SPARC and MIPS have to deal with that.

--
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985
RE: Support for architectures without hardware interlocks
-----Original Message-----
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Joel Sherrill
Sent: Thursday, January 08, 2015 8:59 PM
To: Eric Botcazou; Claudiu Zissulescu
Cc: gcc@gcc.gnu.org; David Kang
Subject: Re: Support for architectures without hardware interlocks

On 1/8/2015 9:01 AM, Eric Botcazou wrote:
>> I've worked on a gcc target that was porting an architecture without
>> hardware interlock support. Basically, you need to emit nop
>> operations to avoid possible hw conflicts. At the moment, this was
>> done by patching the gcc scheduler to do so, Another issue to keep is
>> to check for hardware conflicts across basic-block boundaries. And
>> not the last, is to prohibit/avoid any instruction stream
>> modification after scheduler (e.g., peephole optimizations etc.).
> That's an overly complex approach, this usually can be done in a
> simpler way with a machine-specific pass that runs at the end of the RTL
> pipeline.
>
>>Isn't this similar to needing to fill a delay slot after a branch
>>instruction? My recollection is that some SPARC and MIPS have to deal with
>>that.

I think this can be done at the machine description (.md) file level with define_delay, where the delay slots can be described.

Thanks & Regards
Ajit
Re: Support for architectures without hardware interlocks
Handling the insertion of the nops at the end of the RTL pipeline also needs to take care of branch shortening optimizations and delay slot filling. For the given context (only FPU ops) it is probably a doable approach.

//Claudiu

On Thu, Jan 8, 2015 at 4:28 PM, Joel Sherrill wrote:
>
> On 1/8/2015 9:01 AM, Eric Botcazou wrote:
>>> I've worked on a gcc target that was porting an architecture without
>>> hardware interlock support. Basically, you need to emit nop operations
>>> to avoid possible hw conflicts. At the moment, this was done by
>>> patching the gcc scheduler to do so, Another issue to keep is to check
>>> for hardware conflicts across basic-block boundaries. And not the
>>> last, is to prohibit/avoid any instruction stream modification after
>>> scheduler (e.g., peephole optimizations etc.).
>> That's an overly complex approach, this usually can be done in a simpler way
>> with a machine-specific pass that runs at the end of the RTL pipeline.
>>
> Isn't this similar to needing to fill a delay slot after a branch
> instruction? My recollection
> is that some SPARC and MIPS have to deal with that.
RE: Support for architectures without hardware interlocks
> On 1/8/2015 9:01 AM, Eric Botcazou wrote:
> >> I've worked on a gcc target that was porting an architecture without
> >> hardware interlock support. Basically, you need to emit nop
> >> operations to avoid possible hw conflicts. At the moment, this was
> >> done by patching the gcc scheduler to do so, Another issue to keep is
> >> to check for hardware conflicts across basic-block boundaries. And
> >> not the last, is to prohibit/avoid any instruction stream
> >> modification after scheduler (e.g., peephole optimizations etc.).
> > That's an overly complex approach, this usually can be done in a
> > simpler way with a machine-specific pass that runs at the end of the
> RTL pipeline.
> >
> Isn't this similar to needing to fill a delay slot after a branch
> instruction? My recollection is that some SPARC and MIPS have to deal
> with that.

MIPS has two ways of dealing with hazards. Where the delay slot (DS) filler has not filled a DS, the branch output patterns have a print modifier that is printed as a nop. For other hazards (which are more like a lack of hardware interlocks), there is a MIPS-specific reorg pass that looks for hazards in the instruction stream and emits an appropriate number of NOPs. See mips_avoid_hazard and related code to see roughly how it works.

Matthew
Re: Support for architectures without hardware interlocks
> Handling the insertion of the nops at the end of RTL pipe needs to
> take also care of branch shortening optimizations, and filling delay
> slots. Probably for the given context (only FPU ops) may be a doable
> approach.

Right, you do it after delay slot filling and before branch shortening. The new Visium port does that for all instructions (see visium_reorg).

--
Eric Botcazou
contributing to gcc
I have been working on changes to extend the functionality of GCC's built-in code coverage tool, gcov. What steps would I need to go through to get these additions into the upstream releases of GCC?

Thanks,
Zachary Tomkoski
will openacc 2.0 be merged into trunk?
Currently, OpenACC 2.0 is in gomp-4_0-branch, but this email: https://gcc.gnu.org/ml/gcc/2015-01/msg00032.html says that gcc 5.0 will enter stage 4 on Friday 16th January, and from that point onward, only bug fixing patches will be accepted. So will gomp-4_0-branch be able to be merged into the trunk before Friday 16 January?
Re: contributing to gcc
On 01/08/15 13:02, Zach Tomkoski wrote:
> I have been working on changes to extend the functionality of the GCC's
> built in code coverage tool gcov. What steps would I need to go through to
> work on getting these additions to be added in to the upstream releases of
> GCC?

https://gcc.gnu.org/contribute.html
Cross compiling and multiple sysroot question
(Reposting from gcc-help since I didn't get any replies there.)

I have a question about SYSROOT_SUFFIX_SPEC, MULTILIB_OSDIRNAMES, and multilib cross compilers. I was experimenting with a multilib cross compiler and was using SYSROOT_SUFFIX_SPEC to specify different sysroots for different multilibs, including big-endian and little-endian with 32 and 64 bits.

Now let's say I create two sysroots:

    sysroot/be with bin, lib, lib64, etc. directories
    sysroot/le with the same set of directories

These would represent the sysroot of either a 64 bit big-endian or a 64 bit little-endian linux system that could also run 32 bit executables. I want my cross compiler to be able to generate code for either system. So I set these macros and SPECs:

    # m32 and be are defaults
    MULTILIB_OPTIONS = m64 mel              # In makefile fragment
    MULTILIB_DIRNAMES = 64 el               # In makefile fragment
    MULTILIB_OSDIRNAMES = m64=../lib64      # In makefile fragment
    SYSROOT_SUFFIX_SPEC = %{mel:/el;:/eb}   # in header file

What seems to be happening is that the search for system libraries like libc.so works fine. It looks in sysroot/be/lib or sysroot/be/lib64 or in the equivalent little-endian directories. I.e. it searches:

    /lib            # 32 bits
    /lib/../lib64   # 64 bits

But when it looks for libgcc_s.so or libstdc++.so it is searching:

    //lib           # 32 bits
    //lib/../lib64  # 64 bits

It does not take into account SYSROOT_SUFFIX_SPEC. In fact, when I do my build with this setup, the little-endian libgcc_s.so files wind up overwriting the big-endian libgcc_s.so files, so two of my libgcc_s.so files are completely missing from the install area.

Shouldn't SYSROOT_SUFFIX_SPEC be used for the gcc shared libraries as well as the sysroot areas? I.e. install and search for libgcc_s.so.1 in:

    /lib            # 32 bits
    /lib/../lib64   # 64 bits

Steve Ellcey
sell...@imgtec.com
Re: Cross compiling and multiple sysroot question
On Thu, 8 Jan 2015, Steve Ellcey wrote:

> So I set these macros and SPECs:
> # m32 and be are defaults
> MULTILIB_OPTIONS = m64 mel # In makefile fragment
> MULTILIB_DIRNAMES = 64 el # In makefile fragment
> MULTILIB_OSDIRNAMES = m64=../lib64 # In makefile fragment

In my experience, for such cases it's best to list all multilibs explicitly in MULTILIB_OSDIRNAMES, and then to specify STARTFILE_PREFIX_SPEC as well along the lines of:

#define STARTFILE_PREFIX_SPEC                           \
  "%{mabi=32: /usr/local/lib/ /lib/ /usr/lib/}          \
   %{mabi=n32: /usr/local/lib32/ /lib32/ /usr/lib32/}   \
   %{mabi=64: /usr/local/lib64/ /lib64/ /usr/lib64/}"

MULTILIB_OSDIRNAMES provides directory names used in two ways: relative to $target/lib/ in the GCC installation, and relative to lib/ and usr/lib/ in a sysroot. For the latter, we want names such as plain ../lib64, but these cannot be used outside the sysroot because different multilibs would be mapped to the same directory. Directories are searched both with and without the multilib suffix, so it suffices if the directory without the suffix is correct within the sysroot while the directory with the suffix doesn't exist. STARTFILE_PREFIX_SPEC is used to ensure that a correct unsuffixed directory is searched (instead of just lib/, usr/lib/ and those with the full name from MULTILIB_OSDIRNAMES appended) within the sysroot.

> But when it looks for libgcc_s.so or libstdc++.so it is searching:
>
> //lib# 32 bits
> //lib/../lib64 # 64 bits
>
> It does not take into account SYSROOT_SUFFIX_SPEC. In fact when I
> do my build with this setup the little-endian libgcc_s.so files wind
> up overwriting the big-endian libgcc_s.so files so two of my
> libgcc_s.so files are completely missing from the install area.

GCC never installs anything inside the sysroot (it could be a read-only mount of the target's root filesystem, for example). Listing all multilibs explicitly (multilib=dir or multilib=!dir) in MULTILIB_OSDIRNAMES allows you to ensure they don't overwrite each other.

--
Joseph S. Myers
jos...@codesourcery.com
gcc-4.8-20150108 is now available
Snapshot gcc-4.8-20150108 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20150108/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 219367

You'll find:

 gcc-4.8-20150108.tar.bz2             Complete GCC

  MD5=cf23edc1fc83eb2a54b42bfde00db6f1
  SHA1=97cb02c52fe12a3ee24eae8a12d25d8d95d4394d

Diffs from 4.8-20150101 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.