How do they work?
The filter slides a fixed-size window over the input and outputs the average of the values inside each window, so each patch collapses into a single mean value and the spatial resolution shrinks accordingly.
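For concreteness, here is a minimal sketch (PyTorch assumed, input values made up for illustration) of a 2x2 average pool on a 4x4 input; each non-overlapping 2x2 patch becomes its mean:

import torch

x = torch.tensor([[1., 3., 2., 4.],
                  [5., 7., 6., 8.],
                  [1., 1., 9., 9.],
                  [1., 1., 9., 9.]]).reshape(1, 1, 4, 4)  # (batch, channels, H, W)

print(torch.nn.AvgPool2d(kernel_size=2)(x).reshape(2, 2))
# tensor([[4., 5.],
#         [1., 9.]])  -- each value is the mean of one 2x2 patch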
When to use them?
It captures the general presence of features rather than their strongest responses: averaging smooths out outliers and suppresses isolated spikes (a small sketch below illustrates this). Use it for tasks where the overall presence of a feature matters more than local extremes, for example sentiment analysis from images.
It can also be interesting with noisy data, but properly preprocessing the data is usually the better fix.
Use it when detecting global trends matters, for example recognizing a slowly narrowing pipe.
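As a quick illustration of the smoothing claim above (a sketch with made-up values): a single spike barely moves the averaged output, while max pooling passes it through unchanged.

import torch

x = torch.zeros(1, 1, 2, 2)
x[0, 0, 0, 0] = 100.                                # one outlier in an otherwise flat patch

print(torch.nn.AvgPool2d(kernel_size=2)(x).item())  # 25.0  -- the spike is diluted
print(torch.nn.MaxPool2d(kernel_size=2)(x).item())  # 100.0 -- the spike dominates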
Implementation
import torch
pool = torch.nn.AvgPool2d(kernel_size=2)  # 2x2 window; stride defaults to kernel_size
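A minimal usage sketch; the batch of 3-channel 32x32 images is just an assumed example shape:

images = torch.randn(8, 3, 32, 32)   # (batch, channels, height, width)
out = pool(images)
print(out.shape)                     # torch.Size([8, 3, 16, 16]) -- spatial size halved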