Woon yung Liu <ysai...@yahoo.com> writes:
> Hi all,
> 
> Thank you all for the help so far. This is probably the final part of my
> efforts to complete support for the R5900 within GCC, but I need help
> this time because the existing homebrew GCC version has no support for
> this (despite what its README file says). Hence I have nothing to refer
> to.
> 
> The R5900 supports a number of extra floating-point arithmetic
> instructions in its FPU (COP1). They look something like this:
> MADD.S (rd <- ACC + rs * rt)
> MADDA.S (ACC <- ACC + rs * rt)
> MSUB.S  (rd <- ACC - rs * rt)
> MSUBA.S (ACC <- ACC - rs * rt)
> ADDA.S (ACC <- rs + rt)
> SUBA.S (ACC <- rs - rt)
> MULA.S (ACC <- rs * rt)
> 
> These instructions are similar to those floating-point instructions with
> similar-looking names, in normal MIPS processors. But they involve the
> R5900's FPU accumulator (ACC) register instead.
> I didn't find an explicit instruction to move values to/from the ACC
> register.

How wide is the ACC register on r5900? Is it just 64-bit or does it offer
wider precision?

The move in to ACC should be achievable with ADDA.S where rt == 0.0, and
the move out with MADD.S where rs and rt == 0.0 (MADDA.S writes its
result back to ACC, so it cannot move a value out). If ACC is wider than
64-bit then there will be some rounding to account for.
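
As a rough illustration of the move-in/move-out idea, here is a small
software model in C. It follows the instruction semantics quoted above
(ADDA.S: ACC <- rs + rt, MADD.S: rd <- ACC + rs * rt). The names acc,
adda_s, madd_s, move_to_acc and move_from_acc are hypothetical, and the
model uses the host's float arithmetic, which is an assumption and may
not match the real R5900 FPU (rounding, ACC width, special values).

```c
#include <assert.h>

/* Hypothetical software model of the FPU accumulator (not real hardware). */
static float acc;

/* ADDA.S: ACC <- rs + rt */
static void adda_s(float rs, float rt) { acc = rs + rt; }

/* MADD.S: rd <- ACC + rs * rt (returns the value written to rd) */
static float madd_s(float rs, float rt) { return acc + rs * rt; }

/* Move in: add 0.0 so the sum equals the source value. */
static void move_to_acc(float value) { adda_s(value, 0.0f); }

/* Move out: multiply 0.0 by 0.0 so only ACC contributes to the result. */
static float move_from_acc(void) { return madd_s(0.0f, 0.0f); }

/* Self-check: a value written to ACC should read back unchanged. */
static void demo_round_trip(void) {
    move_to_acc(1.5f);
    assert(move_from_acc() == 1.5f);
}
```

One caveat in this model: under IEEE round-to-nearest, -0.0 + 0.0 gives
+0.0, so a move built this way would not preserve the sign of zero.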

You will also need to know if the madd/msub instructions are fused or
unfused (i.e. with or without intermediate rounding), as GCC will use
them differently depending on the answer (see the fma4 pattern).
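
To make the fused/unfused distinction concrete, here is a minimal C
sketch on host floats. It says nothing about what the R5900 actually
does. The function names are hypothetical. The "fused" version is only a
model: it keeps the float product exact by computing in double and
rounds to float at the end, which matches a true fused operation for the
values used here but can differ in rare double-rounding cases (C99
fmaf() is the exact alternative).

```c
#include <assert.h>

/* Unfused: the product is rounded to float before the addition.
 * volatile stops the compiler from contracting this into an FMA or
 * keeping excess precision in the intermediate. */
static float madd_unfused(float a, float b, float c) {
    volatile float product = a * b;
    volatile float sum = product + c;
    return sum;
}

/* Fused (model): the product of two floats is exact in double, so the
 * only rounding that matters here happens after the addition. */
static float madd_fused_model(float a, float b, float c) {
    return (float)((double)a * (double)b + (double)c);
}

/* Inputs chosen so the exact product needs one bit more than a float
 * holds: (1 + 2^-12)^2 = 1 + 2^-11 + 2^-24. */
static void demo_fused_vs_unfused(void) {
    float a = 1.0f + 1.0f / 4096.0f;      /* 1 + 2^-12 */
    float c = -(1.0f + 1.0f / 2048.0f);   /* -(1 + 2^-11) */

    /* Rounding the product drops the 2^-24 term, so the sum is 0. */
    assert(madd_unfused(a, a, c) == 0.0f);
    /* Without intermediate rounding the 2^-24 term survives. */
    assert(madd_fused_model(a, a, c) == 1.0f / 16777216.0f);
}
```

The two results differ for the same inputs, which is why the compiler
has to know which behaviour the hardware instruction has before it can
substitute it for a separate multiply and add.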

> I've added instruction patterns (or added new alternatives to existing
> patterns) to the pattern file for the R5900, but I have not seen GCC
> emit any of these new instructions. I did add a new register (the ACC
> register), as well as a matching constraint and predicate.
> 
> Is there a proper way for me to debug this? Or is this not working,
> simply because GCC can't support such instructions on its own (it looks
> complicated to issue)?

Given that the base architecture has support for all the operations you
listed (using just FPRs), there needs to be some way to indicate why the
accumulator-based patterns should be used instead (or you could simply
use these patterns only when targeting the FPU extension in R5900). So,
under what circumstances is it preferable to use these instructions?

Thanks,
Matthew

> 
> Thanks and regards,
> -W Y
