# BatchNormTraining

BatchNormTraining  // Compute mean and variance from the input.


## Description

### Inputs

| Name | Element Type | Shape |
|------|--------------|-------|
| input | real | $$(\bullet, C, \ldots)$$ |
| gamma | same as input | $$(C)$$ |
| beta | same as input | $$(C)$$ |

### Attributes

| Name | Type | Notes |
|------|------|-------|
| epsilon | double | Small bias added to variance to avoid division by 0. |

### Outputs

| Name | Element Type | Shape |
|------|--------------|-------|
| normalized | same as gamma | Same as input |
| batch_mean | same as gamma | $$(C)$$ |
| batch_variance | same as gamma | $$(C)$$ |

The batch_mean and batch_variance outputs are computed per-channel from input.

## Mathematical Definition

The axes of the input fall into two categories: positional and channel, with channel being axis 1. For each position, there are $$C$$ channel values, each normalized independently.

Normalization of a channel sample is controlled by two values:

• the batch_mean $$\mu$$, and
• the batch_variance $$\sigma^2$$;

and by two per-channel scaling inputs: gamma $$\gamma$$ and beta $$\beta$$.

The values for $$\mu$$ and $$\sigma^2$$ are the mean and variance of input, computed separately for each channel over all positional axes.

$\begin{split}\mu_c &= \mathop{\mathbb{E}}\left(\mathtt{input}_{\bullet, c, \ldots}\right)\\ \sigma^2_c &= \mathop{\mathtt{Var}}\left(\mathtt{input}_{\bullet, c, \ldots}\right)\\ \mathtt{normalized}_{\bullet, c, \ldots} &= \frac{\mathtt{input}_{\bullet, c, \ldots}-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\gamma_c+\beta_c\end{split}$
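The definition above can be sketched as a plain C++ reference loop. This is not the nGraph kernel; it is a minimal illustration under stated assumptions: the input is a row-major buffer of shape $$(N, C, S)$$ with all trailing positional axes flattened into $$S$$, the variance is the population variance (divide by the number of positions, not by that number minus one), and the names `BatchNormResult` and `batch_norm_training_ref` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result holder; the fields mirror the op's three outputs.
struct BatchNormResult
{
    std::vector<double> normalized;     // same shape as input
    std::vector<double> batch_mean;     // shape (C)
    std::vector<double> batch_variance; // shape (C)
};

// Reference sketch for a row-major input of shape (N, C, S), where S is the
// product of all trailing positional dimensions (an assumed layout).
BatchNormResult batch_norm_training_ref(const std::vector<double>& input,
                                        const std::vector<double>& gamma,
                                        const std::vector<double>& beta,
                                        std::size_t N,
                                        std::size_t C,
                                        std::size_t S,
                                        double epsilon)
{
    BatchNormResult r;
    r.normalized.resize(input.size());
    r.batch_mean.assign(C, 0.0);
    r.batch_variance.assign(C, 0.0);
    const double count = static_cast<double>(N * S);

    for (std::size_t c = 0; c < C; ++c)
    {
        // Mean over every position (batch and trailing axes) of channel c.
        double sum = 0.0;
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t s = 0; s < S; ++s)
                sum += input[(n * C + c) * S + s];
        const double mean = sum / count;

        // Population variance (assumption: divide by count, not count - 1).
        double sq = 0.0;
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t s = 0; s < S; ++s)
            {
                const double d = input[(n * C + c) * S + s] - mean;
                sq += d * d;
            }
        const double var = sq / count;

        // normalized = (input - mean) / sqrt(var + epsilon) * gamma + beta
        const double scale = gamma[c] / std::sqrt(var + epsilon);
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t s = 0; s < S; ++s)
            {
                const std::size_t i = (n * C + c) * S + s;
                r.normalized[i] = (input[i] - mean) * scale + beta[c];
            }

        r.batch_mean[c] = mean;
        r.batch_variance[c] = var;
    }
    return r;
}
```

For example, with an input of shape $$(2, 2, 1)$$ holding the values 1, 2, 3, 4, channel 0 sees the samples 1 and 3, so its mean is 2 and its population variance is 1.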

## Backprop

$\begin{split}[\overline{\texttt{input}}, \overline{\texttt{gamma}}, \overline{\texttt{beta}}]=\\ \mathop{\texttt{BatchNormTrainingBackprop}}(\texttt{input},\texttt{gamma},\texttt{beta},\texttt{batch_mean},\texttt{batch_variance},\overline{\texttt{normalized}}).\end{split}$
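This page does not spell out the gradient formulas, so the following is only a sketch of the standard batch normalization gradients, not a description of the nGraph implementation. It assumes a row-major $$(N, C, S)$$ layout, a population variance, and that the mean and variance passed in are the batch statistics of the same input. The names `BatchNormGrads` and `batch_norm_training_backprop_ref` are hypothetical, and beta is left out of the signature because the gradients do not depend on its value.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical holder for the three adjoints.
struct BatchNormGrads
{
    std::vector<double> d_input; // same shape as input
    std::vector<double> d_gamma; // shape (C)
    std::vector<double> d_beta;  // shape (C)
};

// Standard batch-norm gradients for a row-major (N, C, S) input (assumed
// layout). `delta` is the adjoint of the normalized output.
BatchNormGrads batch_norm_training_backprop_ref(const std::vector<double>& input,
                                                const std::vector<double>& gamma,
                                                const std::vector<double>& mean,
                                                const std::vector<double>& variance,
                                                const std::vector<double>& delta,
                                                std::size_t N,
                                                std::size_t C,
                                                std::size_t S,
                                                double epsilon)
{
    BatchNormGrads g;
    g.d_input.resize(input.size());
    g.d_gamma.assign(C, 0.0);
    g.d_beta.assign(C, 0.0);
    const double count = static_cast<double>(N * S);

    for (std::size_t c = 0; c < C; ++c)
    {
        const double inv_std = 1.0 / std::sqrt(variance[c] + epsilon);

        // Per-channel reductions: sum(delta) and sum(delta * x_hat),
        // where x_hat is the input normalized before gamma and beta.
        double sum_delta = 0.0;
        double sum_delta_xhat = 0.0;
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t s = 0; s < S; ++s)
            {
                const std::size_t i = (n * C + c) * S + s;
                const double xhat = (input[i] - mean[c]) * inv_std;
                sum_delta += delta[i];
                sum_delta_xhat += delta[i] * xhat;
            }
        g.d_beta[c] = sum_delta;
        g.d_gamma[c] = sum_delta_xhat;

        // d_input = gamma * inv_std *
        //           (delta - mean(delta) - x_hat * mean(delta * x_hat))
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t s = 0; s < S; ++s)
            {
                const std::size_t i = (n * C + c) * S + s;
                const double xhat = (input[i] - mean[c]) * inv_std;
                g.d_input[i] = gamma[c] * inv_std *
                               (delta[i] - sum_delta / count - xhat * sum_delta_xhat / count);
            }
    }
    return g;
}
```

The two subtracted terms in the input gradient account for the dependence of the batch mean and batch variance on every input element, which is why a constant adjoint produces a zero input gradient.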

## C++ Interface

class BatchNormTraining : public ngraph::op::Op

Subclassed by ngraph::op::gpu::BatchNormTrainingWithStats

Public Functions

void validate_and_infer_types()

Throws if the node is invalid.