@FrozenGene For the output_min and output_max, isn't the out_dtype enough? If it's uint8, we can clamp at 0 and 255. If it's int8, we can clamp at -128 and 127. I don't see any reason the values would be different, unless you want to fuse the quantized relu into the quantized convolution from the start. Please let me know if I am misunderstanding something. I think we should not fuse operators in the frontend and should let Relay graph fusion take care of that.
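To make the point concrete, here is a minimal sketch (plain NumPy, with a hypothetical `clamp_bounds` helper, not an actual TVM API) showing that the saturation range can be derived from out_dtype alone:

```python
import numpy as np

def clamp_bounds(out_dtype):
    # Hypothetical helper: derive the saturation bounds purely from the
    # output dtype, with no separate output_min/output_max attributes.
    info = np.iinfo(np.dtype(out_dtype))
    return info.min, info.max

print(clamp_bounds("uint8"))  # (0, 255)
print(clamp_bounds("int8"))   # (-128, 127)
```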
Let's see what others think about this. @tqchen @yzhliu @ZihengJiang What are your thoughts on this?