Hi

Could you be specific about the bugs? While we could use this to debug some 
particular errors, as you describe, I would think that in the general case you 
would want to rely on unit testing and conditional checks for very small 
numbers in the denominator if you can’t have a NaN. I think we should collect 
some examples first and study them carefully, as fp arithmetic is tricky. I 
think it is not common practice, and also not portable, to use signals and fp 
exceptions, as you mentioned.
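
To make the suggestion concrete, here is a minimal sketch in Python of a 
conditional check on the denominator together with a unit test that asserts 
no NaN is produced. The names safe_divide and normalize, the 1e-12 threshold 
and the zero fallback are all assumptions made for illustration; none of 
them are MXNet code, and the right threshold would depend on the operator.

```python
import math

EPS = 1e-12  # hypothetical threshold, chosen only for this illustration


def safe_divide(numerator, denominator, eps=EPS, fallback=0.0):
    """Return numerator / denominator, or `fallback` when |denominator| < eps."""
    if abs(denominator) < eps:
        return fallback
    return numerator / denominator


def normalize(values, eps=EPS):
    """Scale values to sum to 1; a (near-)zero sum yields zeros instead of NaN."""
    total = sum(values)
    return [safe_divide(v, total, eps) for v in values]


def test_normalize_never_returns_nan():
    # Cover a normal input, an all-zero input, and a tiny-denominator input.
    for values in ([1.0, 3.0], [0.0, 0.0], [1e-300, 1e-300]):
        assert not any(math.isnan(x) for x in normalize(values))


test_normalize_never_returns_nan()
```

A check like this runs the same way on every platform, which is the main 
advantage over a signal trap. The cost is that each division site has to 
decide explicitly what the fallback value should be.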

Pedro

> On 9. Nov 2018, at 00:30, Lin Yuan <[email protected]> wrote:
> 
> Dear MXNet Community,
> 
> I recently found that the NaN errors can sometimes be due to
> floating-point divide-by-zero bugs in the engine backend. However, by
> default, such an exception will not be thrown. I added a signal trap to catch
> this error (https://github.com/apache/incubator-mxnet/pull/13190) and caught
> a few exceptions when running the Python unit tests. But this only works on
> Linux.
> 
> I would like to get more feedback on the best practice for catching such
> bugs in the code and on whether we should enforce such checks in CI. Any
> comment is appreciated.
> 
> Best Regards,
> 
> Lin
