The above example after annotation:
```
             data
            /    \
sim_quantize(QINPUT)   sim_quantize(QINPUT)
           |                   |
      add(bn_bias)             |
           |                   |
          ...                  |
            \                 /
                   add
```
`data` is usually the output of the previous conv2d. There are duplicated simulated_quantize nodes. The add that follows in each branch converts int8 to int32, so the simulated_quantize + add in both branches will be translated to `right_shift + cast(i8) + cast(i32)`.
We use stop_fusion to ensure that the previous conv2d result is cast to int8 before it is saved to global memory.
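
For illustration, here is a minimal NumPy sketch (not the actual TVM lowering) of what `right_shift + cast(i8) + cast(i32)` does to one branch before the final add; the accumulator values and shift amount are made up for the example, the real shift comes from the chosen scales.

```
import numpy as np

# Hypothetical int32 accumulator output of the previous conv2d in one branch.
acc = np.array([23000, -17500, 4096], dtype=np.int32)
shift = 8  # assumed requantization shift

# realize(simulated_quantize) -> right_shift + cast(i8)
requantized_i8 = np.right_shift(acc, shift).astype(np.int8)

# The following add operates on int32, so the value is widened again: cast(i32)
widened_i32 = requantized_i8.astype(np.int32)

print(widened_i32)
```

Both branches go through the same sequence before they meet at the final add, which is why the duplicated simulated_quantize shows up twice in the annotated graph.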

You will see the difference when running quantized ResNet-50 v2.
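
If you want to reproduce this, a rough sketch of driving the Relay quantization pass is below (assuming a TVM version with the `relay.quantize` API; the `relay.testing` ResNet workload stands in for ResNet-50 v2 here, the exact model and qconfig options are up to you).

```
from tvm import relay
from tvm.relay import testing

# Example workload: ResNet-50 (pre-activation style) from relay.testing.
mod, params = testing.resnet.get_workload(num_layers=50, batch_size=1)

# Quantize with the default configuration; inspecting the result should show
# the right_shift + cast sequences and stop_fusion annotations discussed above.
with relay.quantize.qconfig():
    qmod = relay.quantize.quantize(mod, params=params)

print(qmod)
```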




