I'm porting GCC to a uni-core architecture (i.e., only one core).
There are 10 function units:
(1) 2 RISCs: both RISCs have the same capabilities and can do load/store,
full-word arithmetic/logic operations, register moves, ...
(2) 4 DSPs (2 MACs, 1 BSU, and 1 VFU):
* MAC: multiply-accumulate and SIMD arithmetic operations
* BSU: packing/unpacking, determine absolute value, average, ...
* VFU: packing/unpacking, swap, bit reverse, determine min/max, ...
(3) 4 CFUs (Customized Function Unit: do some MPEG4 decoding related operations):
* VLD CFU: MV/DC/AC decoder
* DCT/IDCT CFU: some instructions for DCT/IDCT
* MC CFU: some instructions for motion compensation/estimation
* (not implemented yet)


There are 8 slots in a VLIW instruction bundle (i.e., at most 8 instructions
can be issued in 1 cycle), and the assembly language syntax looks like:
       "<op code> <function unit name>        <dst>, <src1>, <src2>, ..."

For example:
===================================[top]====================================
       mov     .risc0         r1, #25            \\
       ldw      .risc0         r2, [fp, #30]     \\
       addub  .mac0        d0, d4, d3        \\
       subub  .mac1        d11, d7, d4
       add      .risc0         r3, r1, r5
===================================[end]====================================
(The symbol "\\" means "parallel": the next instruction is issued in the
same cycle.)
The first 4 instructions are in the same VLIW bundle (issued in the first cycle),
and the last instruction is in another VLIW bundle (issued in the next cycle).
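
To make the output format I am after concrete, here is a minimal standalone C sketch. It is not GCC code and does not use any GCC hook; struct insn, its cycle field, and format_bundles are hypothetical names made up for this illustration. It assumes the scheduler has already assigned an issue cycle to each instruction and that the instructions are listed in issue order.

```c
#include <stdio.h>
#include <stddef.h>

/* One scheduled instruction. All names are hypothetical, for illustration. */
struct insn {
    const char *opcode;    /* e.g. "mov" */
    const char *unit;      /* function unit name without the dot, e.g. "risc0" */
    const char *operands;  /* e.g. "r1, #25" */
    int cycle;             /* issue cycle assumed to come from the scheduler */
};

/* Format the instructions into buf, one per line.  An instruction gets the
   parallel symbol "\\" appended when the next instruction issues in the same
   cycle, i.e. when it is not the last one of its bundle.
   Returns the number of characters written, or -1 if buf is too small.  */
int format_bundles(const struct insn *insns, size_t n, char *buf, size_t size)
{
    size_t used = 0;

    if (size > 0)
        buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        int parallel = (i + 1 < n) && insns[i + 1].cycle == insns[i].cycle;
        int w = snprintf(buf + used, size - used, "%s .%s %s%s\n",
                         insns[i].opcode, insns[i].unit, insns[i].operands,
                         parallel ? " \\\\" : "");
        if (w < 0 || (size_t)w >= size - used)
            return -1;  /* output error or truncated output */
        used += (size_t)w;
    }
    return (int)used;
}
```

Calling format_bundles on the five instructions above, with cycle 0 for the first four and cycle 1 for the last, would put "\\" after the first three lines only. The open question is where in GCC's final pass the equivalent of the "parallel" test can be made.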

I plan to schedule the instructions using the "pipeline description".
I have three questions after reading chapters 10 to 13 of the GCC Internals
manual:
       (1) How can I output the parallel symbol "\\" in the final pass?
             Obviously I should append "\\" to every instruction that is
             followed by another instruction in the same bundle, but I
             couldn't find the target macros/hooks for doing so.

       (2) How can I fill in the <function unit> field?
             (Could questions (1) and (2) both be solved with the
             PRINT_OPERAND macro?)


       (3) Should I put only one machine instruction in each instruction
           pattern?
             In other ports, I have seen output templates that contain more
             than one machine instruction.
             For example: "add\\t%Q0, %Q0, %Q2\;adc\\t%R0, %R0, %R2".
             Some output templates call a C function that emits several
             instructions, which presumably do not share the same
             characteristics in the function unit pipeline.
             I'm worried that such multi-instruction output templates will
             confuse the DFA and put several instructions into a single
             VLIW bundle slot.
             Should I split them with define_split and write correspondingly
             refined instruction patterns for them?


