.. batch_norm_training.rst:

#################
BatchNormTraining
#################

.. code-block:: cpp

   BatchNormTraining  // Compute mean and variance from the input.

Description
===========

Inputs
------

+---------------------+-------------------------+------------------------------+
| Name                | Element Type            | Shape                        |
+=====================+=========================+==============================+
| ``input``           | real                    | :math:`(\bullet, C, \ldots)` |
+---------------------+-------------------------+------------------------------+
| ``gamma``           | same as ``input``       | :math:`(C)`                  |
+---------------------+-------------------------+------------------------------+
| ``beta``            | same as ``input``       | :math:`(C)`                  |
+---------------------+-------------------------+------------------------------+

Attributes
----------

+------------------+--------------------+--------------------------------------------------------+
| Name             | Type               | Notes                                                  |
+==================+====================+========================================================+
| ``epsilon``      | ``double``         | Small bias added to variance to avoid division by 0.   |
+------------------+--------------------+--------------------------------------------------------+

Outputs
-------

+---------------------+-------------------------+-----------------------------+
| Name                | Element Type            | Shape                       |
+=====================+=========================+=============================+
| ``normalized``      | same as ``gamma``       | Same as ``input``           |
+---------------------+-------------------------+-----------------------------+
| ``batch_mean``      | same as ``gamma``       | :math:`(C)`                 |
+---------------------+-------------------------+-----------------------------+
| ``batch_variance``  | same as ``gamma``       | :math:`(C)`                 |
+---------------------+-------------------------+-----------------------------+

The ``batch_mean`` and ``batch_variance`` outputs are computed per-channel from
``input``.

Mathematical Definition
=======================

The axes of the input fall into two categories: positional and channel, with
channel being axis 1.
For each position, there are :math:`C` channel values, each normalized
independently.

Normalization of a channel sample is controlled by two values:

*  the ``batch_mean`` :math:`\mu`, and
*  the ``batch_variance`` :math:`\sigma^2`;

and by two scaling inputs: :math:`\gamma` and :math:`\beta`.

The values for :math:`\mu` and :math:`\sigma^2` come from computing the
mean and variance of ``input``.

.. math::

   \mu_c &= \mathop{\mathbb{E}}\left(\mathtt{input}_{\bullet, c, \ldots}\right)\\
   \sigma^2_c &= \mathop{\mathtt{Var}}\left(\mathtt{input}_{\bullet, c, \ldots}\right)\\
   \mathtt{normalized}_{\bullet, c, \ldots} &= \frac{\mathtt{input}_{\bullet, c, \ldots}-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\gamma_c+\beta_c

Backprop
========

.. math::

   [\overline{\texttt{input}}, \overline{\texttt{gamma}}, \overline{\texttt{beta}}]=\\
   \mathop{\texttt{BatchNormTrainingBackprop}}(\texttt{input},\texttt{gamma},\texttt{beta},\texttt{mean},\texttt{variance},\overline{\texttt{normalized}}).

C++ Interface
=============

.. doxygenclass:: ngraph::op::BatchNormTraining
   :project: ngraph
   :members:
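
As a rough illustration of the Mathematical Definition, the plain C++ sketch
below computes the three outputs for an input stored contiguously in
:math:`(N, C, S)` order, where :math:`S` is the product of all axes after the
channel axis. It is not the nGraph implementation and does not use the nGraph
API. The names ``BatchNormResult`` and ``batch_norm_training_ref``, the flat
layout, the ``double`` element type, and the use of the population (biased)
variance are all assumptions made for this example.

.. code-block:: cpp

   #include <cassert>
   #include <cmath>
   #include <cstddef>
   #include <vector>

   // Hypothetical result type for this sketch; not part of the nGraph API.
   struct BatchNormResult
   {
       std::vector<double> normalized;     // same shape as input
       std::vector<double> batch_mean;     // shape (C)
       std::vector<double> batch_variance; // shape (C)
   };

   // Reference computation for an input laid out as (N, C, S).
   // Assumption: Var is the population variance (divide by N * S).
   BatchNormResult batch_norm_training_ref(const std::vector<double>& input,
                                           const std::vector<double>& gamma,
                                           const std::vector<double>& beta,
                                           std::size_t N,
                                           std::size_t C,
                                           std::size_t S,
                                           double epsilon)
   {
       assert(input.size() == N * C * S);
       assert(gamma.size() == C && beta.size() == C);
       assert(N * S > 0);

       BatchNormResult r{std::vector<double>(input.size()),
                         std::vector<double>(C, 0.0),
                         std::vector<double>(C, 0.0)};
       const double count = static_cast<double>(N * S);

       for (std::size_t c = 0; c < C; ++c)
       {
           // mu_c: mean over every non-channel position.
           double sum = 0.0;
           for (std::size_t n = 0; n < N; ++n)
               for (std::size_t s = 0; s < S; ++s)
                   sum += input[(n * C + c) * S + s];
           const double mean = sum / count;

           // sigma^2_c: mean squared deviation from mu_c.
           double sq = 0.0;
           for (std::size_t n = 0; n < N; ++n)
               for (std::size_t s = 0; s < S; ++s)
               {
                   const double d = input[(n * C + c) * S + s] - mean;
                   sq += d * d;
               }
           const double var = sq / count;

           r.batch_mean[c] = mean;
           r.batch_variance[c] = var;

           // normalized = (input - mu_c) / sqrt(sigma^2_c + epsilon) * gamma_c + beta_c
           const double scale = gamma[c] / std::sqrt(var + epsilon);
           for (std::size_t n = 0; n < N; ++n)
               for (std::size_t s = 0; s < S; ++s)
               {
                   const std::size_t i = (n * C + c) * S + s;
                   r.normalized[i] = (input[i] - mean) * scale + beta[c];
               }
       }
       return r;
   }

Under these assumptions, calling the sketch with :math:`N=2`, :math:`C=2`,
:math:`S=1`, input values ``{1, 10, 3, 20}``, ``gamma = {1, 2}``,
``beta = {0, 1}`` and ``epsilon = 0`` gives a ``batch_mean`` of ``{2, 15}``, a
``batch_variance`` of ``{1, 25}`` and ``normalized`` values of
``{-1, -1, 1, 3}``.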