BatchNormTraining
BatchNormTraining // Compute mean and variance from the input.
Description
Inputs
Name   Element Type   Shape
input  real           \((\bullet, C, \ldots)\)
gamma  same as input  \((C)\)
beta   same as input  \((C)\)
Attributes
Name     Type    Notes
epsilon  double  Small bias added to variance to avoid division by 0.
Outputs
Name            Element Type   Shape
normalized      same as gamma  Same as input
batch_mean      same as gamma  \((C)\)
batch_variance  same as gamma  \((C)\)
The batch_mean and batch_variance outputs are computed per channel from input.
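The per-channel reduction described above can be sketched in standalone C++. This is an illustrative sketch, not nGraph code: the `ChannelStats` type and `channel_stats` function are hypothetical names, the tensor is assumed to be dense and row-major with shape (N, C, inner), and the variance is assumed to be the population variance (divide by the element count), which this page does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of nGraph): one mean and one variance per channel.
struct ChannelStats {
    std::vector<double> mean;      // shape (C)
    std::vector<double> variance;  // shape (C)
};

// Per-channel mean and variance of a dense row-major tensor of shape
// (batch, channels, inner), where `inner` is the product of all axes after
// the channel axis. Assumption: population variance (divide by the count).
ChannelStats channel_stats(const std::vector<double>& input,
                           std::size_t batch,
                           std::size_t channels,
                           std::size_t inner)
{
    assert(input.size() == batch * channels * inner);
    assert(batch * inner > 0);

    const double count = static_cast<double>(batch * inner);
    ChannelStats stats{std::vector<double>(channels, 0.0),
                       std::vector<double>(channels, 0.0)};

    for (std::size_t c = 0; c < channels; ++c) {
        // First pass: mean over every position of channel c.
        double sum = 0.0;
        for (std::size_t n = 0; n < batch; ++n)
            for (std::size_t i = 0; i < inner; ++i)
                sum += input[(n * channels + c) * inner + i];
        stats.mean[c] = sum / count;

        // Second pass: mean squared deviation from that mean.
        double squares = 0.0;
        for (std::size_t n = 0; n < batch; ++n)
            for (std::size_t i = 0; i < inner; ++i) {
                const double d = input[(n * channels + c) * inner + i] - stats.mean[c];
                squares += d * d;
            }
        stats.variance[c] = squares / count;
    }
    return stats;
}
```

For a (2, 2, 1) input holding 1, 10, 3, 30, channel 0 sees the values 1 and 3 and channel 1 sees 10 and 30, so the means are 2 and 20 and the variances are 1 and 100.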
Mathematical Definition
The axes of the input fall into two categories: positional and channel, with channel being axis 1. For each position, there are \(C\) channel values, each normalized independently.
Normalization of a channel sample is controlled by two values:
 the batch_mean \(\mu\), and
 the batch_variance \(\sigma^2\);
and by two per-channel inputs: the scale \(\gamma\) and the bias \(\beta\).
The values for \(\mu\) and \(\sigma^2\) come from computing the mean and variance of input over all positional axes, separately for each channel.
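Written out, the definition above takes roughly the following form. This is a sketch under assumptions: \(x_{p,c}\) denotes the input value at position \(p\) in channel \(c\), \(N\) is the number of positions, and the variance is taken to be the population variance; these symbols and that choice are illustrative and not stated on this page. The placement of \(\epsilon\) follows the attribute description (a small bias added to the variance).

```latex
\[
\begin{aligned}
\mu_c &= \frac{1}{N} \sum_{p} x_{p,c}, \\
\sigma^2_c &= \frac{1}{N} \sum_{p} \left(x_{p,c} - \mu_c\right)^2, \\
\mathtt{normalized}_{p,c} &= \gamma_c \, \frac{x_{p,c} - \mu_c}{\sqrt{\sigma^2_c + \epsilon}} + \beta_c .
\end{aligned}
\]
```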
Backprop
C++ Interface

class BatchNormTraining : public ngraph::op::Op
Batchnorm for training operation.
Public Functions

const NodeTypeInfo &get_type_info() const
Returns the NodeTypeInfo for the node’s class. During transition to type_info, returns a dummy type_info for Node if the class has not been updated yet.

BatchNormTraining(const Output<Node> &input, const Output<Node> &gamma, const Output<Node> &beta, double epsilon)
Parameters
input: Must have rank >= 2, [., C, …]
gamma: gamma scaling for normalized value. [C]
beta: bias added to the scaled normalized value. [C]
epsilon: Avoids division by 0 if input has 0 variance.

BatchNormTraining(double eps, const Output<Node> &gamma, const Output<Node> &beta, const Output<Node> &input)
In this version of BatchNorm:
MEAN AND VARIANCE: computed directly from the content of ‘input’.
OUTPUT VALUE: A tuple with the following structure:
 [0]  The normalization of ‘input’.
 [1]  The per-channel means of (pre-normalized) ‘input’.
 [2]  The per-channel variances of (pre-normalized) ‘input’.
AUTODIFF SUPPORT: yes: ‘generate_adjoints(…)’ works as expected.
SHAPE DETAILS:
 gamma: must have rank 1, with the same span as input’s channel axis.
 beta: must have rank 1, with the same span as input’s channel axis.
 input: must have rank >= 2. The second dimension represents the channel axis and must have a span of at least 1.
 output[0]: shall have the same shape as ‘input’.
 output[1]: shall have rank 1, with the same span as input’s channel axis.
 output[2]: shall have rank 1, with the same span as input’s channel axis.
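The shape rules listed above can be expressed as a small standalone check. This is a sketch only: `Shape`, `BatchNormShapes`, and `infer_batch_norm_training_shapes` are hypothetical names chosen for illustration, and the code does not reproduce what nGraph's own `validate_and_infer_types` does internally.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a tensor shape: one span per axis.
using Shape = std::vector<std::size_t>;

// Hypothetical result type: the shapes of output[0], output[1], output[2].
struct BatchNormShapes {
    Shape normalized;
    Shape batch_mean;
    Shape batch_variance;
};

// Applies the documented shape rules and returns the three output shapes.
// Throws std::invalid_argument when a rule is violated.
BatchNormShapes infer_batch_norm_training_shapes(const Shape& input,
                                                 const Shape& gamma,
                                                 const Shape& beta)
{
    if (input.size() < 2)
        throw std::invalid_argument("input must have rank >= 2");

    // Axis 1 is the channel axis.
    const std::size_t channels = input[1];
    if (channels < 1)
        throw std::invalid_argument("channel axis must have a span of at least 1");

    if (gamma.size() != 1 || gamma[0] != channels)
        throw std::invalid_argument("gamma must have rank 1 and span C");
    if (beta.size() != 1 || beta[0] != channels)
        throw std::invalid_argument("beta must have rank 1 and span C");

    // output[0] matches the input; output[1] and output[2] have shape (C).
    return {input, Shape{channels}, Shape{channels}};
}
```

With an input of shape (2, 3, 4, 4) and gamma and beta of shape (3), the check passes and reports output shapes (2, 3, 4, 4), (3), and (3). A rank-1 input or a gamma whose span differs from the channel count is rejected.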

void validate_and_infer_types()
Throws if the node is invalid.

const NodeTypeInfo &