> Regarding the accumulation point, if we perform fusion and add the bias in 
> `int32` in the accumulator at the end, is it any different than preloading 
> the accumulator?

For example, preloading a negative bias extends the positive range that a 
signed 32-bit accumulator can absorb before it overflows.  A post-accumulation 
bias_add probably gives the same result on most implementations, but signed 
integer overflow is undefined behavior in the C standard, so where the 
bias_add happens relative to the accumulation might matter.  
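
To make the difference concrete, here is a rough sketch in Python (arbitrary-precision ints standing in for the accumulator, so nothing actually overflows).  The product values, the bias, and the helper names `fits_int32`, `wrap32` and `wrapped_sum` are made up for illustration and are not taken from TVM or any paper.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1

def fits_int32(start, terms):
    """Return True if every intermediate sum stays inside the int32 range."""
    acc = start
    if not INT32_MIN <= acc <= INT32_MAX:
        return False
    for t in terms:
        acc += t
        if not INT32_MIN <= acc <= INT32_MAX:
            return False
    return True

def wrap32(x):
    """Reduce x to the int32 range the way two's complement wrap-around would."""
    return (x + 2**31) % 2**32 - 2**31

def wrapped_sum(start, terms):
    """Accumulate with wrap-around after every step."""
    acc = wrap32(start)
    for t in terms:
        acc = wrap32(acc + t)
    return acc

# Hypothetical values: the products alone sum to 2**31 + 100, just past INT32_MAX.
products = [2**29] * 4 + [100]
bias = -1000

# Bias preloaded: the accumulator starts at the bias.
preload_ok = fits_int32(bias, products)
# Bias added at the end: the accumulator starts at 0.
post_add_ok = fits_int32(0, products + [bias])

print(preload_ok, post_add_ok)
print(wrapped_sum(bias, products) == wrapped_sum(0, products + [bias]))
```

With these values the preloaded accumulator never leaves the int32 range, while the post-add version passes INT32_MAX before the bias is applied.  If the hardware wraps, both orders end at the same value, which is the "same for most implementations" case; the concern is only for targets or compilers where the overflow is not a clean wrap.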
 
I saw the bias preload used in a paper.  I'll check my notes and see if I 
can find it.  



