Using a umulhisi3

Michael Hope Wed, 03 Jun 2009 02:39:48 -0700

Hi there.  The architecture I'm working on is a 32-bit, word-based
machine with a 16x16 -> 32 unsigned multiply.  For some reason the
combine stage is converting the umulhisi3 into a mulsi3 and I'm not
sure how to track this down.


The test code is part of an alpha blend:

void blend(uint8_t* sb, uint8_t* db)
{
  uint16_t ia = 256 - *sb;
  uint16_t d = *db;

  *db = ((d * ia) >> 8) + *sb;
}
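
As a sanity check that a 16x16 -> 32 multiply is all this code needs,
here is a minimal host-side sketch.  The names umul16x16, blend_ref and
blend_matches_ref are made up for illustration and are not part of the
port; the sketch assumes a host with a 32-bit int.  Both multiply
operands stay well within 16 bits (d is at most 255, ia is at most
256), so the widened product never exceeds 65280.

```c
#include <assert.h>
#include <stdint.h>

/* The blend from the test case, unchanged. */
static void blend(uint8_t *sb, uint8_t *db)
{
  uint16_t ia = 256 - *sb;
  uint16_t d = *db;

  *db = ((d * ia) >> 8) + *sb;
}

/* Hypothetical helper: models what a 16x16 -> 32 unsigned multiply
   instruction would compute. */
static uint32_t umul16x16(uint16_t a, uint16_t b)
{
  return (uint32_t)a * (uint32_t)b;
}

/* Hypothetical reference: the same blend written with the explicit
   widening multiply, taking values instead of pointers. */
static uint8_t blend_ref(uint8_t s, uint8_t d)
{
  uint16_t ia = (uint16_t)(256 - s);

  return (uint8_t)((umul16x16(d, ia) >> 8) + s);
}

/* Returns 1 if blend() and blend_ref() agree for every pair of
   8-bit inputs, 0 otherwise. */
static int blend_matches_ref(void)
{
  for (unsigned s = 0; s < 256; s++)
    for (unsigned d = 0; d < 256; d++)
      {
        uint8_t sb = (uint8_t)s;
        uint8_t db = (uint8_t)d;

        blend(&sb, &db);
        if (db != blend_ref((uint8_t)s, (uint8_t)d))
          return 0;
      }
  return 1;
}
```

Calling blend_matches_ref() from a small main() on the host and
checking that it returns 1 is enough to confirm the two forms agree.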

I've defined the different multiplies in the .md file:
(define_insn "umulhisi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (mult:SI (zero_extend:SI
                  (match_operand:HI 1 "register_operand" "%r"))
                 (zero_extend:SI
                  (match_operand:HI 2 "register_operand" "r"))))]
  ""
...

(define_insn "mulsi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (mult:SI (match_operand:SI 1 "register_operand" "%r")
                 (match_operand:SI 2 "register_operand" "r")))]
   ""
...

Compiling with -O gives the following in
umul.157r.outof_cfglayout, just before the combine stage:
---
(insn 3 6 4 2 umul.c:16 (set (reg/v/f:SI 28 [ sb ])
        (reg:SI 0 R10 [ sb ])) 8 {movsi} (expr_list:REG_DEAD (reg:SI 0 R10 [ sb ])
        (nil)))

(insn 4 3 5 2 umul.c:16 (set (reg/v/f:SI 29 [ db ])
        (reg:SI 1 R11 [ db ])) 8 {movsi} (expr_list:REG_DEAD (reg:SI 1 R11 [ db ])
        (nil)))

(note 5 4 8 2 NOTE_INSN_FUNCTION_BEG)

(insn 8 5 9 2 umul.c:17 (set (reg:SI 26 [ D.1217 ])
        (zero_extend:SI (mem:QI (reg/v/f:SI 28 [ sb ]) [0 S1 A8]))) 27 {zero_extendqisi2} (expr_list:REG_DEAD (reg/v/f:SI 28 [ sb ])
        (nil)))

(insn 9 8 10 2 umul.c:20 (set (reg:HI 30)
        (const_int 256 [0x100])) 1 {movhi_insn} (nil))

(insn 10 9 11 2 umul.c:20 (set (reg:SI 31)
        (minus:SI (subreg:SI (reg:HI 30) 0)
            (reg:SI 26 [ D.1217 ]))) 12 {subsi3} (expr_list:REG_DEAD (reg:HI 30)
        (nil)))

(insn 11 10 12 2 umul.c:20 (set (reg:SI 33)
        (zero_extend:SI (mem:QI (reg/v/f:SI 29 [ db ]) [0 S1 A8]))) 27 {zero_extendqisi2} (nil))

(insn 12 11 13 2 umul.c:20 (set (reg:HI 32)
        (subreg:HI (reg:SI 33) 0)) 1 {movhi_insn} (expr_list:REG_DEAD (reg:SI 33)
        (nil)))

(insn 13 12 14 2 umul.c:20 (set (reg:SI 34)
        (mult:SI (zero_extend:SI (reg:HI 32))
            (zero_extend:SI (subreg:HI (reg:SI 31) 0)))) 14 {umulhisi3} (expr_list:REG_DEAD (reg:HI 32)
        (expr_list:REG_DEAD (reg:SI 31)
            (nil))))

(insn 14 13 15 2 umul.c:20 (set (reg:SI 35)
        (ashiftrt:SI (reg:SI 34)
            (const_int 8 [0x8]))) 21 {ashrsi3_const} (expr_list:REG_DEAD (reg:SI 34)
        (nil)))

(insn 15 14 16 2 umul.c:20 (set (reg:QI 36)
        (subreg:QI (reg:SI 35) 0)) 0 {movqi_insn} (expr_list:REG_DEAD (reg:SI 35)
        (nil)))

(insn 16 15 17 2 umul.c:20 (set (reg:SI 37)
        (plus:SI (reg:SI 26 [ D.1217 ])
            (subreg:SI (reg:QI 36) 0))) 11 {addsi3} (expr_list:REG_DEAD (reg:QI 36)
        (expr_list:REG_DEAD (reg:SI 26 [ D.1217 ])
            (nil))))

(insn 17 16 0 2 umul.c:20 (set (mem:QI (reg/v/f:SI 29 [ db ]) [0 S1 A8])
        (subreg:QI (reg:SI 37) 0)) 0 {movqi_insn} (expr_list:REG_DEAD (reg:SI 37)
        (expr_list:REG_DEAD (reg/v/f:SI 29 [ db ])
            (nil))))
---
The umulhisi3 has been correctly found and used at this stage.  In the
following combine stage however, it gets converted into a mulsi3.  The
.combine dump is attached.

The xtensa port is the closest match I can find as it is 32-bit,
word-based, and has a umulhisi3.  It correctly keeps the 16-bit multiply.

Some other test cases like:
uint32_t mul(uint16_t a, uint16_t b)
{
    return a*b;
}

come through fine.  It might be something to do with the memory access.
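
To narrow down whether the loads matter, a pair of variants of mul that
read their operands from memory might be worth compiling.  This is only
a sketch: mul_mem16 and mul_mem8 are names invented here, and I have not
checked what combine does with them.  The uint32_t cast is there so the
16-bit version avoids signed overflow when int is 32 bits.

```c
#include <stdint.h>

/* Hypothetical variant: like mul(), but both 16-bit operands are
   loaded from memory.  The cast keeps the multiply unsigned so that
   65535 * 65535 does not overflow a 32-bit int. */
uint32_t mul_mem16(const uint16_t *a, const uint16_t *b)
{
  return (uint32_t)*a * *b;
}

/* Hypothetical variant: operands are 8-bit loads, as in blend(), so
   the values entering the multiply are zero-extended bytes. */
uint32_t mul_mem8(const uint8_t *a, const uint8_t *b)
{
  uint16_t x = *a;
  uint16_t y = *b;

  return (uint32_t)x * y;
}
```

If mul_mem16 keeps the umulhisi3 but mul_mem8 does not, that would
point at the byte loads rather than at memory access in general.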

How does the combine stage work?  It looks like it could get multiple
potential matches for a set of RTLs.  Does it use some type of costing
function to pick between them?  Can I tell combine that a umulhisi3 is
cheaper than a mulsi3?

Thanks for the earlier help on the post reload split to use the
accumulator - it's working well.

-- Michael

umul.i.159r.combine
Description: Binary data
