A way to normalize and center activations between layers. See Normalization between layers
For very small batches, the batch mean and std are noisy estimates, which can make training unstable. Consider LayerNorm instead.
If the mean and std of the input carry important information, the model loses that information once BatchNorm is applied.
How does it work?
1. Normalize and center each feature within the batch, using the batch mean and the batch std.
2. Apply a learnable scaling factor and bias to the normalized values. Normalized data is usually good for stability, but the network may prefer a different mean and spread, and the scale and bias let the layer recover that (see the sketch after this list).
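A minimal sketch of these two steps on a toy batch, assuming a plain 2D tensor of shape (batch, features); gamma and beta are illustrative names for the learnable parameters, and eps matches the PyTorch default:

import torch

x = torch.randn(8, 4)                        # batch of 8 samples, 4 features
mean = x.mean(dim=0)                         # per-feature mean over the batch
var = x.var(dim=0, unbiased=False)           # per-feature variance over the batch
x_hat = (x - mean) / torch.sqrt(var + 1e-5)  # step 1: normalize and center
gamma = torch.ones(4)                        # learnable scale, starts at 1
beta = torch.zeros(4)                        # learnable bias, starts at 0
y = gamma * x_hat + beta                     # step 2: rescale and shift
# nn.BatchNorm1d(4) performs the same computation, with gamma and beta
# learned during training and running statistics kept for evaluation.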
When to use it
It is nearly always a good idea to use them: they speed up training and stabilize the model.
Use it directly after a convolutional layer or a fully connected layer, and before the ReLU (see the sketch below).
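A minimal sketch of this placement for a fully connected block, with arbitrary layer sizes:

import torch.nn as nn

fc_block = nn.Sequential(
    nn.Linear(128, 64),
    nn.BatchNorm1d(64),     # normalize the 64 activations over the batch
    nn.ReLU(inplace=True),
)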
When not to use it
- Small batches: if the batch is too small, the batch mean and variance become unreliable estimates and make training unstable (a GroupNorm sketch follows below).
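As a sketch of one common workaround: GroupNorm normalizes over channel groups within each sample, so it does not depend on the batch size. The group count of 8 and channel count of 64 are arbitrary choices here.

import torch.nn as nn

# drop-in replacement for nn.BatchNorm3d(64) when batches are tiny
norm = nn.GroupNorm(num_groups=8, num_channels=64)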
Implementation
replace "3d" with "2d" if necessary
import torch.nn as nn

class DoubleConv3D(nn.Module):
    """Two 3x3x3 convolutions, each followed by BatchNorm and ReLU."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv_op = nn.Sequential(
            nn.Conv3d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_channels),   # normalize right after the conv, before the ReLU
            nn.ReLU(inplace=True),
            nn.Conv3d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.conv_op(x)
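A hypothetical usage example with a random 3D volume (shapes chosen arbitrarily):

import torch

block = DoubleConv3D(in_channels=1, out_channels=16)
x = torch.randn(2, 1, 32, 32, 32)   # (batch, channels, depth, height, width)
out = block(x)
print(out.shape)                    # torch.Size([2, 16, 32, 32, 32])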