How do they work?
The filter slides a fixed-size window over the input and outputs the average of the values inside each window, so each patch collapses into a single mean value and the spatial resolution shrinks accordingly.
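For concreteness, here is a minimal sketch (PyTorch assumed, input values made up for illustration) of a 2x2 average pool on a 4x4 input; each non-overlapping 2x2 patch becomes its mean:

import torch

x = torch.tensor([[1., 3., 2., 4.],
                  [5., 7., 6., 8.],
                  [1., 1., 9., 9.],
                  [1., 1., 9., 9.]]).reshape(1, 1, 4, 4)  # (batch, channels, H, W)

print(torch.nn.AvgPool2d(kernel_size=2)(x).reshape(2, 2))
# tensor([[4., 5.],
#         [1., 9.]])  -- each value is the mean of one 2x2 patch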
When to use them?
It captures the general presence of features rather than their strongest responses: averaging smooths out outliers and suppresses isolated spikes (a small sketch below illustrates this). Use it for tasks where the overall presence of a feature matters more than local extremes, for example sentiment analysis from images.
It can also be interesting with noisy data, but properly preprocessing the data is usually the better fix.
Use it when detecting global trends matters, for example recognizing a slowly narrowing pipe.
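As a quick illustration of the smoothing claim above (a sketch with made-up values): a single spike barely moves the averaged output, while max pooling passes it through unchanged.

import torch

x = torch.zeros(1, 1, 2, 2)
x[0, 0, 0, 0] = 100.                                # one outlier in an otherwise flat patch

print(torch.nn.AvgPool2d(kernel_size=2)(x).item())  # 25.0  -- the spike is diluted
print(torch.nn.MaxPool2d(kernel_size=2)(x).item())  # 100.0 -- the spike dominates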
Implementation
import torch
pool = torch.nn.AvgPool2d(kernel_size=2)  # 2x2 window; stride defaults to kernel_size
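A minimal usage sketch; the batch of 3-channel 32x32 images is just an assumed example shape:

images = torch.randn(8, 3, 32, 32)   # (batch, channels, height, width)
out = pool(images)
print(out.shape)                     # torch.Size([8, 3, 16, 16]) -- spatial size halved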