On Wed, Apr 27, 2016 at 12:43 AM, Ruiling Song <[email protected]> wrote: > v2: > fix build error.
Some documentation or description about this would be very welcome. I cannot honestly imagine anyone being able to review this without it, and I think I know what this patch is doing. :) Some questions I think your patch's commit message (as well as comments in the code) should answer: - What is a register bank? - What is a register bank conflict? - How are the register banks laid out? Does it differ per-generation of hardware? - What effect does a register bank conflict have? E.g., changes in instruction issue rate, instruction latency, ability to co-issue. - How does this patch attempt to avoid register bank conflicts? - In practice, what effect does this patch have on register bank conflicts? For instance, I might count the number of register bank conflicts in a collection of programs before and after the patch to demonstrate that it is effective. > Signed-off-by: Ruiling Song <[email protected]> > --- > backend/src/backend/gen_reg_allocation.cpp | 31 > ++++++++++++++++++++++++++++-- > 1 file changed, 29 insertions(+), 2 deletions(-) > > diff --git a/backend/src/backend/gen_reg_allocation.cpp > b/backend/src/backend/gen_reg_allocation.cpp > index 89c53d4..ce07f8a 100644 > --- a/backend/src/backend/gen_reg_allocation.cpp > +++ b/backend/src/backend/gen_reg_allocation.cpp > @@ -35,6 +35,7 @@ > #include <iomanip> > > > +#define HALF_REGISTER_FILE_OFFSET (32*64) > namespace gbe > { > > ///////////////////////////////////////////////////////////////////////////// > @@ -48,9 +49,10 @@ namespace gbe > */ > struct GenRegInterval { > INLINE GenRegInterval(ir::Register reg) : > - reg(reg), minID(INT_MAX), maxID(-INT_MAX) {} > + reg(reg), minID(INT_MAX), maxID(-INT_MAX), conflictReg(0) {} > ir::Register reg; //!< (virtual) register of the interval > int32_t minID, maxID; //!< Starting and ending points > + ir::Register conflictReg; // < has banck conflict with this register Typo: bank > }; > > typedef struct GenRegIntervalKey { > @@ -1052,7 +1054,17 @@ namespace gbe > // and the source is a scalar Dword. If that is the case, the byte > register > // must get 4byte alignment register offset. > alignment = (alignment + 3) & ~3; > - while ((grfOffset = ctx.allocate(size, alignment)) == 0) { > + > + bool direction = true; > + if (interval.conflictReg != 0) { > + // try to allocate conflict registers in top/bottom half. > + if (RA.contains(interval.conflictReg)) { > + if (RA.find(interval.conflictReg)->second < > HALF_REGISTER_FILE_OFFSET) { > + direction = false; > + } > + } > + } > + while ((grfOffset = ctx.allocate(size, alignment, direction)) == 0) { > const bool success = this->expireGRF(interval); > if (success == false) { > if (spillAtInterval(interval, size, alignment) == false) > @@ -1104,6 +1116,7 @@ namespace gbe > for (auto &insn : block.insnList) { > const uint32_t srcNum = insn.srcNum, dstNum = insn.dstNum; > insn.ID = insnID; > + bool is3SrcOp = insn.opcode == SEL_OP_MAD; Does Beignet not use other 3-src opcodes? LRP, BFI2, BFE, CSEL? Of course, MAD is the most important one. _______________________________________________ Beignet mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/beignet
