> Date: Thu, 29 Apr 2010 08:55:56 +0200 (CEST)
> From: "Jonas Paulsson" <d0...@student.lth.se>

> It feels good to know that the widening mults issue has been
> resolved

Yes, nice, and as late as last week too, though the patch was from
February.

> as
> it was a bit of a disappointment I noted the erratic behaviour with GCC
> 4.4.1. Perhaps you would care to comment on what to expect as a user now,
> then?

IIUC, it should Just Work. No, I haven't checked. Note that the fix
was somewhat along the lines of what you wrote in your thesis, IIUC:
adding a specific pass to fix up separated operations. See
<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29274> and
<http://gcc.gnu.org/ml/gcc-patches/2010-02/msg00643.html>.

BTW, my observation was from the 4.3 era. It's a regression, which
explains why I hadn't noticed it with the 3.x version I used before
that. A pity it was deemed too invasive to fix for 4.5.

> Another issue that gave me porting problems was the SIMD memory accesses,
> for e.g. doing a wide load into two adjacent narrow registers with one
> instruction. This was resolved earlier on the mailing list to not be
> handleable on RTL, so I wonder now if anything has been done for this, as
> it too seems rather reasonable, just like the widening loads?

You wanted to load adjacent data in a wider mode that was then to be
separately used in a mode half that size, but the registers had to be
adjacent too? That's kind of the opposite problem to what's usually
needed! If the use of the data was actually for the obvious wider
mode (SI or V2HI), you'd just have to define the movsi or movv2hi
pattern and it would be used, but that unfortunately seems not
applicable in any way.

I'm afraid I'm not sure that problem is of common interest, but if it
can be resolved with a target-specific pass, there'd be reason to add
a hook somewhat like TARGET_MACHINE_DEPENDENT_REORG, but run earlier.

But, did you check whether combine tried to match RTL that looked
somewhat like:

  (parallel
   [(set (reg:HI 1) (mem:HI (plus:SI (reg:SI 3) (const_int 2))))
    (set (reg:HI 2) (mem:HI (plus:SI (reg:SI 3) (const_int 4))))])

I.e. a parallel with the two loads where the addresses were adjacent?
From gdb you can inspect the calls to try_combine (IIRC). That insn
could have been matched to a pattern like:

  (define_insn "*load_wide"
    [(set (match_operand:HI 0 "register_operand" "=d0,d1,d2")
          (match_operand:HI 1 "reg_plus_const_memory_operand" "m,m,m"))
     (set (match_operand:HI 2 "register_operand" "=d1,d2,d3")
          (match_operand:HI 3 "reg_plus_const_memory_operand" "m,m,m"))]
    "rtx_equal_p (XEXP (operands[3], 0),
                  plus_constant (XEXP (operands[1], 0), 2))"
    "load_wide %0,%1")

Just a WAG; there are reasons this would not match in the general
case (for one, you'd want to try to match the opposite order too).
Don't pay too much attention to the exact matching predicates,
constraints and condition above. The point is just whether combine
tried to generate and match a parallel with two valid loads, given
source where there was obvious opportunity for it. That insn *could*
then be caught with a pattern which would, through the right
constraints, coerce register allocation to make the right choices for
the (initially separate) registers. In the example above, four
registers are assumed to be valid as destination with the matching
singleton constraints d0..d3.

brgds, H-P
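
As an illustration of the adjacency condition in the pattern sketch
above, including the opposite-order case, here is a small standalone C
model. It is not GCC code: struct reg_plus_const, follows and
adjacent_order are hypothetical names made up for this sketch, and a
real insn condition would work on rtx values (XEXP, rtx_equal_p)
instead of this simplified struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the address in
   (mem (plus (reg BASE) (const_int OFFSET))).  Not a GCC type.  */
struct reg_plus_const {
    int base_regno;   /* number of the base register */
    long offset;      /* constant displacement in bytes */
};

/* True if B addresses the unit of MODE_SIZE bytes directly after A,
   i.e. same base register and offset larger by exactly MODE_SIZE.  */
static bool follows(struct reg_plus_const a, struct reg_plus_const b,
                    long mode_size)
{
    assert(mode_size > 0);
    return a.base_regno == b.base_regno
           && b.offset == a.offset + mode_size;
}

/* Classify two loads:
   0 = not adjacent,
   1 = FIRST is the low half and SECOND the high half,
   2 = the opposite order.  */
static int adjacent_order(struct reg_plus_const first,
                          struct reg_plus_const second,
                          long mode_size)
{
    if (follows(first, second, mode_size))
        return 1;
    if (follows(second, first, mode_size))
        return 2;
    return 0;
}
```

For the RTL example above (HImode, so 2 bytes, with offsets 2 and 4
off the same base register), adjacent_order would return 1; swapping
the two loads gives 2, which is the case a second pattern or a wider
condition would need to cover.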