On Thu, 21 Jul 2016, Thomas Schwinge wrote:
> Hmm. In an offloading configuration I see the following regression:

First of all: sorry about this (bah, this is fairly embarrassing, while I
forgot to check offloading, I should have seen the fallout in check-c testing;
might have tested the wrong source checkout).

> [-PASS:-]{+FAIL:+}
> libgomp.oacc-c/../libgomp.oacc-c-c++-common/reduction-cplx-dbl.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0
[snip]
> ptxas /tmp/ccmABlIh.o, line 1408; error : Label expected for forward
> reference of '__reduction_lock'

This is due to the following:

1. VAR_DECL for __reduction_lock is created on-demand when offloaded
functions are compiled.

2. output_in_order does not care if new declarations appear while
already-existing function bodies are emitted. It's followed by
process_new_functions that handles only functions.

3. Normally process_pending_assemble_externals would have handled this, but
the nvptx backend doesn't use ASM_OUTPUT_EXTERNAL (:/, I wonder why), so it's
a no-op there.

> C:
>
> [-PASS:-]{+WARNING: program timed out.+}
> {+FAIL:+} gcc.dg/large-size-array-4.c (test for excess errors)
>
> This one times out when creating a huge *.s file:
>
> [...]
> .global .align 8 .u64 name[2147483649] = { 0, 0, [...]
>
> Running with -ftoplevel-reorder (as implicitly enabled before), a *.s
> file with just the preamble gets created. Are these behaviors correct?

Yes, with -ftoplevel-reorder the huge variable is not emitted because it's
unused. In PTX there's no way to emit such a huge initializer without
spelling it all out, so the behavior after the patch is expected.

I think the testcase uncovers a flaw in GCC's behavior: it tests for an error
diagnostic, but it wouldn't be emitted on any 32-bit platform with
-ftoplevel-reorder. So, while as a quick-fix it's possible to skip this test
on NVPTX, it's better to make GCC diagnose this consistently and run the
testcase with -fsyntax-only to avoid emitting the huge initializer.
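
To illustrate the unused-variable point, here is a minimal sketch. It is not
the actual testcase: the names unused_table, kCount and used_value are made
up, and the array is kept small on purpose. With -ftoplevel-reorder I'd expect
GCC to drop the never-referenced static array, whereas with
-fno-toplevel-reorder unused static variables are kept, so the whole
initializer has to be written out.

```cpp
#include <cstddef>

// Hypothetical stand-in for the testcase's array: static and never
// referenced. The real test uses a far larger element count.
constexpr std::size_t kCount = 4096;

// [[maybe_unused]] only silences the "defined but not used" warning;
// it is not meant to change whether the variable is emitted.
[[maybe_unused]] static unsigned long long unused_table[kCount] = { 0 };

// The only entity in this translation unit that anything refers to.
int used_value() { return 7; }
```

Comparing the assembly from "g++ -O0 -S" with and without
-fno-toplevel-reorder should show whether unused_table is emitted.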

> [-PASS:-]{+FAIL:+} gcc.dg/pr16973.c (test for excess errors)

This test needs a "label_values" dg-require-effective-target annotation.

> C++:
[snip]
> [-PASS:-]{+FAIL:+} g++.dg/debug/dwarf2/dwarf4-typedef.C -std=gnu++14
> (test for excess errors)
>
> These now all FAIL due to "sorry, unimplemented: target cannot support
> nonlocal goto", which is "acceptable", but should be XFAILed. (Though,
> that has generally not yet been done for C++.)

Nonlocal gotos appear where a possibly-throwing call is followed by a
destructor invocation (in other words, where unwinding after an exception is
thrown needs to invoke destructors). Here -ftoplevel-reorder affects whether
the compiler is able to deduce some calls as non-throwing (even at -O0). As
NVPTX doesn't support C++ exceptions at all, you must already be seeing a
large number of similar failures.

> [-PASS:-]{+FAIL:+} g++.dg/cpp1y/nsdmi-aggr1.C -std=c++14 (test for
> excess errors)
>
> Now fails with:
>
> nvptx-as: circular reference in variable initializers

Here -ftoplevel-reorder affects whether a static variable that needs a
self-recursive initializer is emitted. This wouldn't work on NVPTX and
passed by luck previously (if the variable weren't unused, it would fail).

> More "sorry, unimplemented: target cannot support nonlocal goto". But
> why did these PASS before? Is this generally just a problem with -O0, or
> are there any optimizations doing things differently whether
> -ftoplevel-reorder is in effect or not, and we could perhaps shuffle some
> things so that for nvptx we run into "sorry, unimplemented: target cannot
> support nonlocal goto" less often? That is, I'm not clear on how
> -fno-toplevel-reorder interacts with "nonlocal goto".

I've provided an explanation above, but to put it in different words,
-ftoplevel-reorder is a form of optimization (enabled by default except at
-O0), and on nvptx it was implicitly enabled at -O0 previously, allowing some
cleanups to happen and thus concealing some issues.
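
The "possibly-throwing call followed by a destructor" pattern can be sketched
as follows. All names here (Guard, may_throw, caller, demo) are made up for
illustration and are not taken from any of the failing tests. The point is
only that the compiler has to arrange for ~Guard() to run when may_throw()
exits via an exception, unless it can prove that the call never throws.

```cpp
#include <stdexcept>

// Counts destructor invocations so the cleanup can be observed.
static int cleanups_run = 0;

struct Guard {
    ~Guard() { ++cleanups_run; }
};

// Not declared noexcept, so a caller must assume it may throw.
void may_throw(bool do_throw) {
    if (do_throw)
        throw std::runtime_error("boom");
}

void caller(bool do_throw) {
    Guard g;
    may_throw(do_throw);  // possibly-throwing call
}                         // g is destroyed on normal exit and on unwinding

// Returns true if the destructor ran exactly once during unwinding.
bool demo() {
    try {
        caller(true);
    } catch (const std::runtime_error &) {
        return cleanups_run == 1;
    }
    return false;
}
```

If the compiler can see that the callee never throws, the cleanup path is
not needed at all, which is presumably why some of these tests used to pass.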

The biggest of those, "cannot support nonlocal goto", is due to no
possibility of C++ exceptions on NVPTX. Did you consider enabling
-fno-exceptions by default for nvptx?

> [-PASS:-]{+FAIL:+} g++.old-deja/g++.pt/crash55.C -std=c++14 (test for
> excess errors)
>
> These now fail with:
>
> ptxas crash55.o, line 61; error : Arguments mismatch for instruction
> 'mov'
> ptxas crash55.o, line 63; error : Arguments mismatch for instruction
> 'mov'
> ptxas crash55.o, line 80; error : Label expected for argument 0 of
> instruction 'call'
> ptxas crash55.o, line 80; error : Function '_ZN3fooIcE1dEPS0_' not
> declared in this scope

This is due to a combination of bogus sjlj exception emission and a bug in
nvptx-as that is not ready for such bogosity.

I hope I've satisfactorily explained the failures you've pointed out (thanks
for the data). I think I should leave the choice of what to do next (revert
the patch or leave it in and install fixups where appropriate) up to you?

Thanks.
Alexander