Bernd Schmidt writes: > On 04/15/2016 02:52 PM, Senthil Kumar Selvaraj wrote: >> >> For both testcases in the PR, reload fails to take into account that >> FP-SP elimination can no longer be performed, and tries to find reload >> regs for an rtx generated when FP-SP elimination was valid. >> >> 1. reload initializes elim table with FP->SP elimination enabled. >> 2. alter_reg for a pseudo allocates a stack slot for the pseudo, and sets >> reg_equiv_memory_loc to frame_pointer_rtx plus offset. It also sets >> something_was_spilled to true. >> 3. The main reload loop starts, and it resets something_was_spilled to false. >> 4. reload calls eliminate_regs for the pseudo and sets reg_equiv_address to >> (mem(SP + offset)). >> 5. calculate_needs_all_insns pushes a reload for SP (for the AVR target, >> SP cannot be a pointer reg - it needs to be reloaded into X Y or Z regs). >> 6. update_eliminables_and_spill calls targetm.frame_pointer_required, >> which returns true. That causes can_eliminate for FP->SP to be reset >> to zero, and FP to be added to bad_spill_regs_global. For the AVR, >> FP is Y, one of the 3 pointer regs. reload also notes that something >> has changed, and that the loop needs to run again. >> 7. reload still calls select_reload_regs, and find_regs fails to find a >> pointer reg to reload SP, which is unnecessary as FP->SP elimination >> had been disabled anyway in (6). >> >> IOW, reload fails to find pointer regs for an RTL expression that was >> created when FP->SP elimination was true, even after it turns out that >> the elimination can't be done after all. The patch tries to detect that >> - if it knows the loop is going to run again, it silences the failure. >> >> Also note that at a different point in the loop, the reload loop starts >> over if something_was_spilled (line 982-986). If set outside the reload >> loop by alter_reg, it gets reset at (3) - not sure why. I'd think a >> "continue" after update_eliminables_and_spill (line 1019-1022) would >> also work - haven't tested it though. > > That's what I was going to ask next. I think if that works for you, I > think that's an approach that would make more sense. > > However, it looks like after this call, and the other one in the same > loop, should probably be calling finish_spills. It looks like an > oversight in the existing code that this doesn't happen for the existing > continue. Please try adding it in both places.
Tried that out - that works too. Bootstrap and regression tests on x86_64 went fine, and there were no regressions for the avr target as well - the extra three passes that showed up with the initial patch show up now too. I couldn't put a continue before select_reload_regs though. save_call_clobbered_regs sets up a few things that are cleaned up in delete_caller_save_insns a little later in the loop. So I only exclude select_reload_regs - finish_spills and cleanup will happen in both cases. Does this patch look ok? If yes, could you commit it for me please? I don't have commit access. Regards Senthil gcc/ChangeLog 2016-04-28 Senthil Kumar Selvaraj <senthil_kumar.selva...@atmel.com> PR target/60040 * reload1.c (reload): Call finish_spills before restarting reload loop. Skip select_reload_regs if update_eliminables_and_spill returns true. gcc/testsuite/ChangeLog 2016-04-28 Sebastian Huber <sebastian.hu...@embedded-brains.de> Matthijs Kooijman <matth...@stdin.nl> Senthil Kumar Selvaraj <senthil_kumar.selva...@atmel.com> PR target/60040 * gcc.target/avr/pr60040-1.c: New. * gcc.target/avr/pr60040-2.c: New. diff --git gcc/reload1.c gcc/reload1.c index c2800f8..d6fcece 100644 --- gcc/reload1.c +++ gcc/reload1.c @@ -981,7 +981,8 @@ reload (rtx_insn *first, int global) /* If we allocated another stack slot, redo elimination bookkeeping. */ if (something_was_spilled || starting_frame_size != get_frame_size ()) { - update_eliminables_and_spill (); + if (update_eliminables_and_spill ()) + finish_spills (global); continue; } @@ -1021,10 +1022,12 @@ reload (rtx_insn *first, int global) did_spill = 1; something_changed = 1; } - - select_reload_regs (); - if (failure) - goto failed; + else + { + select_reload_regs (); + if (failure) + goto failed; + } if (insns_need_reload != 0 || did_spill) something_changed |= finish_spills (global); diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog index 65fe0d0..bbaa3a2 100644 --- gcc/testsuite/ChangeLog +++ gcc/testsuite/ChangeLog @@ -1,7 +1,3 @@ -2016-04-26 Bernd Schmidt <bschm...@redhat.com> - - * gcc.target/i386/lzcnt-1.c: Allow a different lzcntw output register. - 2016-04-25 Richard Biener <rguent...@suse.de> PR tree-optimization/70780 diff --git gcc/testsuite/gcc.target/avr/pr60040-1.c gcc/testsuite/gcc.target/avr/pr60040-1.c new file mode 100644 index 0000000..4fe296b --- /dev/null +++ gcc/testsuite/gcc.target/avr/pr60040-1.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-Os" } */ + +float dhistory[10]; +float test; + +float getSlope(float history[]) { + float sumx = 0; + float sumy = 0; + float sumxy = 0; + float sumxsq = 0; + float rate = 0; + int n = 10; + + int i; + for (i=1; i< 11; i++) { + sumx = sumx + i; + sumy = sumy + history[i-1]; + sumy = sumy + history[i-1]; + sumxsq = sumxsq + (i*i); + } + + rate = sumy+sumx+sumxsq; + return rate; +} + +void loop() { + test = getSlope(dhistory); +} diff --git gcc/testsuite/gcc.target/avr/pr60040-2.c gcc/testsuite/gcc.target/avr/pr60040-2.c new file mode 100644 index 0000000..c40d49f --- /dev/null +++ gcc/testsuite/gcc.target/avr/pr60040-2.c @@ -0,0 +1,112 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +typedef unsigned char __uint8_t; +typedef short unsigned int __uint16_t; +typedef long unsigned int __uint32_t; +typedef __uint8_t uint8_t ; +typedef __uint16_t uint16_t ; +typedef __uint32_t uint32_t ; +typedef __builtin_va_list __gnuc_va_list; +typedef __gnuc_va_list va_list; +typedef enum rtems_blkdev_request_op { + RTEMS_BLKDEV_REQ_READ, +} rtems_fdisk_segment_desc; +typedef struct rtems_fdisk_driver_handlers +{ + int (*blank) (const rtems_fdisk_segment_desc* sd, + uint32_t device, + uint32_t segment, + uint32_t offset, + uint32_t size); +} rtems_fdisk_driver_handlers; +typedef struct rtems_fdisk_page_desc +{ + uint16_t flags; + uint32_t block; +} rtems_fdisk_page_desc; +typedef struct rtems_fdisk_segment_ctl +{ + rtems_fdisk_page_desc* page_descriptors; + uint32_t pages_active; +} rtems_fdisk_segment_ctl; +typedef struct rtems_fdisk_segment_ctl_queue +{ +} rtems_fdisk_segment_ctl_queue; +typedef struct rtems_fdisk_device_ctl +{ + uint32_t flags; + uint8_t* copy_buffer; +} rtems_flashdisk; + +extern void rtems_fdisk_error (const char *, ...); +extern int rtems_fdisk_seg_write(const rtems_flashdisk*, + rtems_fdisk_segment_ctl*, + uint32_t, + const rtems_fdisk_page_desc* page_desc, + uint32_t); + +void rtems_fdisk_printf (const rtems_flashdisk* fd, const char *format, ...) +{ + { + va_list args; + __builtin_va_start(args,format); + } +} +static int +rtems_fdisk_seg_blank_check (const rtems_flashdisk* fd, + rtems_fdisk_segment_ctl* sc, + uint32_t offset, + uint32_t size) +{ + uint32_t device; + uint32_t segment; + const rtems_fdisk_segment_desc* sd; + const rtems_fdisk_driver_handlers* ops; + return ops->blank (sd, device, segment, offset, size); +} +static int +rtems_fdisk_seg_write_page_desc (const rtems_flashdisk* fd, + rtems_fdisk_segment_ctl* sc, + uint32_t page, + const rtems_fdisk_page_desc* page_desc) +{ + uint32_t offset = page * sizeof (rtems_fdisk_page_desc); + if ((fd->flags & (1 << 3))) + { + int ret = rtems_fdisk_seg_blank_check (fd, sc, + offset, + sizeof (rtems_fdisk_page_desc)); + } + return rtems_fdisk_seg_write (fd, sc, offset, + page_desc, sizeof (rtems_fdisk_page_desc)); +} +void +rtems_fdisk_recycle_segment (rtems_flashdisk* fd, + rtems_fdisk_segment_ctl* ssc, + rtems_fdisk_segment_ctl* dsc, + uint32_t *pages) +{ + int ret; + uint32_t spage; + uint32_t used = 0; + uint32_t active = 0; + { + rtems_fdisk_page_desc* spd = &ssc->page_descriptors[spage]; + { + rtems_fdisk_page_desc* dpd; + uint32_t dpage; + dpd = &dsc->page_descriptors[dpage]; + *dpd = *spd; + ret = rtems_fdisk_seg_write_page_desc (fd, + dsc, + dpage, dpd); + } + } + rtems_fdisk_printf (fd, "ssc end: %d-%d: p=%ld, a=%ld, u=%ld", + pages, active, used); + { + rtems_fdisk_error ("compacting: ssc pages not 0: %d", + ssc->pages_active); + } +} diff --git gcc/testsuite/gcc.target/i386/lzcnt-1.c gcc/testsuite/gcc.target/i386/lzcnt-1.c index c7054d1..f6240d1b 100644 --- gcc/testsuite/gcc.target/i386/lzcnt-1.c +++ gcc/testsuite/gcc.target/i386/lzcnt-1.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ /* { dg-options "-O2 -mlzcnt " } */ -/* { dg-final { scan-assembler "lzcntw\[^\\n]*(%|)\[ad\]\[xi\]" } } */ +/* { dg-final { scan-assembler "lzcntw\[^\\n]*(%|)ax" } } */ #include <x86intrin.h> diff --git libcilkrts/ChangeLog libcilkrts/ChangeLog index 8fada8a..ed26a3a 100644 --- libcilkrts/ChangeLog +++ libcilkrts/ChangeLog @@ -1,9 +1,3 @@ -2016-04-26 Rainer Orth <r...@cebitec.uni-bielefeld.de> - - PR target/60290 - * Makefile.am (GENERAL_FLAGS): Add -funwind-tables. - * Makefile.in: Regenerate. - 2015-11-09 Igor Zamyatin <igor.zamya...@intel.com> PR target/66326 diff --git libcilkrts/Makefile.am libcilkrts/Makefile.am index 4f944dd..70538a2 100644 --- libcilkrts/Makefile.am +++ libcilkrts/Makefile.am @@ -43,9 +43,6 @@ GENERAL_FLAGS = -I$(top_srcdir)/include -I$(top_srcdir)/runtime -I$(top_srcdir)/ # Enable Intel Cilk Plus extension GENERAL_FLAGS += -fcilkplus -# Always generate unwind tables -GENERAL_FLAGS += -funwind-tables - AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99 AM_CPPFLAGS = $(GENERAL_FLAGS) AM_LDFLAGS = $(XLDFLAGS) diff --git libcilkrts/Makefile.in libcilkrts/Makefile.in index a25d1c6..629aa6a 100644 --- libcilkrts/Makefile.in +++ libcilkrts/Makefile.in @@ -371,11 +371,9 @@ ACLOCAL_AMFLAGS = -I .. -I ../config # GENERAL_FLAGS += -D_Cilk_spawn="" -D_Cilk_sync="" -D_Cilk_for=for # Enable Intel Cilk Plus extension - -# Always generate unwind tables GENERAL_FLAGS = -I$(top_srcdir)/include -I$(top_srcdir)/runtime \ -I$(top_srcdir)/runtime/config/$(config_dir) \ - -DIN_CILK_RUNTIME=1 -fcilkplus -funwind-tables + -DIN_CILK_RUNTIME=1 -fcilkplus AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99 AM_CPPFLAGS = $(GENERAL_FLAGS) AM_LDFLAGS = $(XLDFLAGS)