BatchNormTraining

BatchNormTraining  // Compute mean and variance from the input.

Description

Inputs

Name     Element Type     Shape
input    real             \((\bullet, C, \ldots)\)
gamma    same as input    \((C)\)
beta     same as input    \((C)\)

Attributes

Name       Type      Notes
epsilon    double    Small bias added to variance to avoid division by 0.

Outputs

Name              Element Type     Shape
normalized        same as gamma    Same as input
batch_mean        same as gamma    \((C)\)
batch_variance    same as gamma    \((C)\)

The batch_mean and batch_variance outputs are computed per-channel from input.

Mathematical Definition

The axes of the input fall into two categories: positional and channel, with channel being axis 1. For each position, there are \(C\) channel values, each normalized independently.

Normalization of a channel sample is controlled by two values:

  • the batch_mean \(\mu\), and
  • the batch_variance \(\sigma^2\);

and by two per-channel inputs: the scale \(\gamma\) and the shift \(\beta\).

The values for \(\mu\) and \(\sigma^2\) are computed per channel, as the mean and variance of input taken over every axis except the channel axis.

\[\begin{split}\mu_c &= \mathop{\mathbb{E}}\left(\mathtt{input}_{\bullet, c, \ldots}\right)\\ \sigma^2_c &= \mathop{\mathtt{Var}}\left(\mathtt{input}_{\bullet, c, \ldots}\right)\\ \mathtt{normalized}_{\bullet, c, \ldots} &= \frac{\mathtt{input}_{\bullet, c, \ldots}-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\gamma_c+\beta_c\end{split}\]
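The definition above can be sketched as a plain C++ reference routine. This is an illustration only, not the nGraph kernel: the function name `batch_norm_training_ref`, the `BatchNormResult` struct, and the flattened row-major `[N, C, S]` layout (with `S` the product of all axes after the channel axis) are assumptions made for this example. It also assumes that \(\mathtt{Var}\) denotes the population variance, that is, division by the number of elements per channel rather than by that number minus one.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the three outputs of the op.
struct BatchNormResult
{
    std::vector<double> normalized;     // same shape as input
    std::vector<double> batch_mean;     // shape (C)
    std::vector<double> batch_variance; // shape (C)
};

// Reference sketch: input is row-major [N, C, S], where S is the product of
// all axes after the channel axis. Assumes population variance (divide by N*S).
BatchNormResult batch_norm_training_ref(const std::vector<double>& input,
                                        const std::vector<double>& gamma,
                                        const std::vector<double>& beta,
                                        std::size_t N,
                                        std::size_t C,
                                        std::size_t S,
                                        double epsilon)
{
    assert(N * S > 0);
    assert(input.size() == N * C * S);
    assert(gamma.size() == C && beta.size() == C);

    BatchNormResult r{std::vector<double>(input.size()),
                      std::vector<double>(C, 0.0),
                      std::vector<double>(C, 0.0)};
    const double count = static_cast<double>(N * S);

    for (std::size_t c = 0; c < C; ++c)
    {
        // Mean over every axis except the channel axis.
        double sum = 0.0;
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t s = 0; s < S; ++s)
                sum += input[(n * C + c) * S + s];
        const double mean = sum / count;

        // Population variance over the same elements.
        double sq = 0.0;
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t s = 0; s < S; ++s)
            {
                const double d = input[(n * C + c) * S + s] - mean;
                sq += d * d;
            }
        const double var = sq / count;

        r.batch_mean[c] = mean;
        r.batch_variance[c] = var;

        // Normalize, then apply the per-channel scale and shift.
        const double inv_std = 1.0 / std::sqrt(var + epsilon);
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t s = 0; s < S; ++s)
            {
                const std::size_t i = (n * C + c) * S + s;
                r.normalized[i] = (input[i] - mean) * inv_std * gamma[c] + beta[c];
            }
    }
    return r;
}
```

For instance, with `N = 2`, `C = 2`, `S = 1`, input `{1, 10, 3, 20}`, gamma `{1, 2}`, beta `{0, 1}` and `epsilon = 0`, channel 0 holds the values 1 and 3 (mean 2, variance 1) and channel 1 holds 10 and 20 (mean 15, variance 25), so the normalized output is `{-1, -1, 1, 3}`.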

Backprop

\[\begin{split}[\overline{\texttt{input}}, \overline{\texttt{gamma}}, \overline{\texttt{beta}}]=\\ \mathop{\texttt{BatchNormTrainingBackprop}}(\texttt{input},\texttt{gamma},\texttt{beta},\texttt{mean},\texttt{variance},\overline{\texttt{normed_input}}).\end{split}\]
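For orientation, the adjoints that such a backprop op would be expected to produce can be written out using the standard batch normalization gradient. This is a sketch under stated assumptions, not a description of the nGraph implementation: it assumes the variance is the population variance, that \(m\) denotes the number of input elements per channel, that every sum runs over all non-channel positions of channel \(c\), and that no adjoint flows in through the batch_mean and batch_variance outputs. The symbol \(\hat{x}\) is introduced here for the normalized value before scaling.

\[\begin{split}\hat{x}_{\bullet, c, \ldots} &= \frac{\mathtt{input}_{\bullet, c, \ldots}-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\\ \overline{\beta}_c &= \sum \overline{\mathtt{normalized}}_{\bullet, c, \ldots}\\ \overline{\gamma}_c &= \sum \overline{\mathtt{normalized}}_{\bullet, c, \ldots}\,\hat{x}_{\bullet, c, \ldots}\\ \overline{\mathtt{input}}_{\bullet, c, \ldots} &= \frac{\gamma_c}{m\sqrt{\sigma^2_c+\epsilon}}\left(m\,\overline{\mathtt{normalized}}_{\bullet, c, \ldots}-\overline{\beta}_c-\hat{x}_{\bullet, c, \ldots}\,\overline{\gamma}_c\right)\end{split}\]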

C++ Interface

class BatchNormTraining : public ngraph::op::Op

Subclassed by ngraph::op::gpu::BatchNormTrainingWithStats

Public Functions

void validate_and_infer_types()

Throws if the node is invalid.