## Description

In `tf.keras`, users can call the `add_loss` method to create non-standard loss terms (by non-standard, I mean a loss function that takes parameters other than `y_true` and `y_pred`), e.g. a loss function that involves the input.
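For reference, here is a minimal sketch of this pattern in Keras (the layer name `InputPenalty` and the `0.01` coefficient are made up for illustration): a layer registers a penalty that depends on its own input via `add_loss`, and Keras collects it into `model.losses`.

```python
import tensorflow as tf

class InputPenalty(tf.keras.layers.Layer):
    """Toy layer: registers an input-dependent loss term via add_loss."""
    def call(self, inputs):
        # This term depends on the input, not on y_true/y_pred, so it
        # cannot be expressed as a standard (y_true, y_pred) loss.
        self.add_loss(0.01 * tf.reduce_sum(tf.square(inputs)))
        return inputs

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    InputPenalty(),
    tf.keras.layers.Dense(1),
])
_ = model(tf.ones((2, 4)))
print(model.losses)  # the registered tensor, collected automatically
```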
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Layer#add_loss

A practical example would be a Bayesian neural network:

```python
model = tf.keras.Sequential([
    tfp.layers.DenseReparameterization(512, activation=tf.nn.relu),
    tfp.layers.DenseReparameterization(10),
])
logits = model(features)
neg_log_likelihood = tf.nn.softmax_cross_entropy_with_logits(
    labels=labels, logits=logits)
kl = sum(model.losses)
loss = neg_log_likelihood + kl
train_op = tf.train.AdamOptimizer().minimize(loss)
```

source: https://github.com/tensorflow/probability/blob/r0.8/tensorflow_probability/python/layers/dense_variational.py#L356

In this case, the loss is composed of two parts: the classification error and the losses registered inside `DenseReparameterization` (i.e. `model.losses`, the KL divergence between the posterior and the prior of the weights in each layer). This is achieved through the `add_loss` method.

_______________________________

However, this feature is currently not supported by Gluon. To implement it, I tried the following code:

```python
from mxnet import np, npx
from mxnet.gluon import nn

npx.set_np()


class StochasticBlock(nn.HybridBlock):
    """HybridBlock that lets layers register auxiliary loss terms."""

    def __init__(self):
        super(StochasticBlock, self).__init__()
        self._losses = []

    def add_loss(self, loss):
        self._losses.append(loss)

    @property
    def losses(self):
        # Collect losses from this block and its direct children.
        collected_losses = []
        collected_losses.extend(self._losses)
        for child in self._children.values():
            if hasattr(child, '_losses'):
                collected_losses.extend(getattr(child, '_losses'))
        return collected_losses


class DiagGaussian(StochasticBlock):
    """Samples from N(loc, scale^2) and registers KL(q || N(0, 1))."""

    def __init__(self):
        super(DiagGaussian, self).__init__()

    def hybrid_forward(self, F, loc, scale):
        log_variance = F.np.log(1e-20 + scale ** 2)
        # KL(N(loc, scale^2) || N(0, 1)) = -0.5 * sum(1 + log var - loc^2 - var)
        KL = -0.5 * F.np.sum(1 + log_variance - loc ** 2
                             - F.np.exp(log_variance), axis=1)
        self.add_loss(KL)
        return F.np.random.normal(loc, scale)


diagGaussian = DiagGaussian()
loc = np.random.uniform(-10, 10, size=(2, 2))
scale = np.random.uniform(size=(2, 2))
diagGaussian.hybridize()
print(diagGaussian(loc, scale))
print(diagGaussian.losses[0])
```

This works well as long as the block is not hybridized; once it is, `losses[0]` becomes `<_Symbol diaggaussian0_multiply_scalar0>` instead of a concrete value. I am actively looking for other solutions to this problem; a potential workaround would be forcing `losses` to be one of the block's outputs (see the sketch below). I am not sure whether that would work in `Sequential`, and it is also super inelegant =_=

Having this feature could bring huge convenience to the implementation of deep generative models (such as VAEs).
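For the record, here is a minimal sketch of that workaround, under the untested assumption that routing the KL term through the block's outputs keeps it concrete after hybridization (the class name `DiagGaussianWithLoss` is made up; `nn`, `np`, `npx` as imported above):

```python
class DiagGaussianWithLoss(nn.HybridBlock):
    """Variant of DiagGaussian that returns the KL term as a second
    output instead of stashing it in a Python-side list."""

    def hybrid_forward(self, F, loc, scale):
        log_variance = F.np.log(1e-20 + scale ** 2)
        KL = -0.5 * F.np.sum(1 + log_variance - loc ** 2
                             - F.np.exp(log_variance), axis=1)
        sample = F.np.random.normal(loc, scale)
        # Both outputs are part of the cached graph, so KL should come
        # back as concrete values rather than a dangling symbol.
        return sample, KL


block = DiagGaussianWithLoss()
block.hybridize()
sample, kl = block(np.random.uniform(-10, 10, size=(2, 2)),
                   np.random.uniform(size=(2, 2)))
print(kl)
```

The obvious downside, as noted above, is that every enclosing block then has to thread the extra output through its own forward pass, which breaks drop-in use inside `Sequential`.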
