On Tue, Dec 23, 2014 at 6:16 PM, John David Anglin <dave.ang...@bell.net> wrote: > On 23-Dec-14, at 7:28 PM, H.J. Lu wrote: > >> On Tue, Dec 23, 2014 at 3:55 PM, John David Anglin <dave.ang...@bell.net> >> wrote: >>> >>> On 23-Dec-14, at 5:37 PM, H.J. Lu wrote: >>> >>>> In this case, arguments are passed in registers. Isn't the >>>> optimization >>>> disabled for ia32, which passes arguments on stack, even before your >>>> change? >>> >>> >>> >>> It's not disabled in dse.c. Possibly, this occurs for some cases for >>> ia32 >>> in ix86_function_ok_for_sibcall. >>> >>> The problem in dse is in how to distinguish stores used for arguments >>> from >>> those used for general >>> manipulation of data. It seems the argument data for the call to bar in >>> the >>> testcase are copied through >>> frame memory on x86_64. >>> >>> We have two situations: >>> >>> /* This field is only used for the processing of const functions. >>> These functions cannot read memory, but they can read the stack >>> because that is where they may get their parms. We need to be >>> this conservative because, like the store motion pass, we don't >>> consider CALL_INSN_FUNCTION_USAGE when processing call insns. >>> Moreover, we need to distinguish two cases: >>> 1. Before reload (register elimination), the stores related to >>> outgoing arguments are stack pointer based and thus deemed >>> of non-constant base in this pass. This requires special >>> handling but also means that the frame pointer based stores >>> need not be killed upon encountering a const function call. >>> 2. After reload, the stores related to outgoing arguments can be >>> either stack pointer or hard frame pointer based. This means >>> that we have no other choice than also killing all the frame >>> pointer based stores upon encountering a const function call. >>> This field is set after reload for const function calls. Having >>> this set is less severe than a wild read, it just means that all >>> the frame related stores are killed rather than all the stores. */ >>> bool frame_read; >>> >>> Case 1 is incorrect for sibling calls as the call arguments are frame or >>> argument pointer based >>> when they are not passed in registers. >>> >>> When we don't have a const function, dse assumes: >>> >>> /* Every other call, including pure functions, may read any memory >>> that is not relative to the frame. */ >>> add_non_frame_wild_read (bb_info); >>> >>> Again, this is incorrect for sibling calls (i.e., dse in general assumes >>> that call arguments are register >>> or stack pointer based before reload). >>> >> >> This optimization is done on purpose to fix: >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 >> >> Unless it is also incorrect for x86, why not only disable it for >> your target without penalizing x86 through a target hook. > > > > I tend to think the old code was wrong for x86 but I recognize the lost > optimization. > > This isn't an optimization issue on hppa. It's a wrong code bug.
I run gcc.dg/pr55023.c on x86-64 and x86 without your fix. It works fine. Can you find a testcase, failing on x86-64 and x86, which is fixed by your change? If not, this bug doesn't affect x86 and you should find another way to fix your target problem without introducing a regression on x86. > Richard wrote regarding 44194: > "So we create a stack representation to copy it to the stack temporary > (which both > are unneeded). We should see that we can avoid the temporary at all as there > is no > aggregate use of the struct left. At least we should avoid the 2nd > temporary." > The temporaries are still there and they are the problem. > I don't see that this is a target issue. There were nothing more than > argument stores in > the testcase for the PR 55023 and they were deleted by dse. > Thanks. -- H.J.