Binarization is the process of converting numerical data into binary form, typically 0s and 1s. A threshold value is chosen: every data point above the threshold is set to 1, and every data point below or equal to the threshold is set to 0.
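The thresholding rule can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the helper name `binarize`, the sample values and the threshold of 0.5 are all assumptions made for the example.

```python
import numpy as np

def binarize(data, threshold):
    """Return 1 where data > threshold, else 0.

    Values exactly equal to the threshold map to 0, matching the rule above.
    """
    data = np.asarray(data)
    return (data > threshold).astype(np.uint8)

# Hypothetical sample values, chosen so that two of them sit exactly on the threshold.
values = np.array([0.1, 0.5, 0.7, 0.5, 0.9])
binary = binarize(values, threshold=0.5)
print(binary)
```

The strict `>` comparison is what sends values equal to the threshold to 0. Using `>=` instead would send them to 1.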

Why use it:

Mostly for simplification. If we know that certain parts of the data are unnecessary, we can remove them by setting them to 0. Binarization can also help remove noise, which might otherwise cause unwanted side effects. If the noise introduces a bias, removing it is also called de-biasing the data.

Example: Scanned Signatures:

Forged Signature:
Pasted image 20240910142539.png

Real Signature:
Pasted image 20240910142548.png

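As you can see, the background of the forged signature is much grayer than that of the real one. This is a Clever Hans artefact: a model could tell forged from real signatures by the background alone, without looking at the signature itself. It needs to be removed!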

First Step: Binarization

Pasted image 20240910142359.png

Okay great, the signature is already much cleaner. However, the background is still easily detectable due to the binarized noise.
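To see why binarized noise can still give the background away, here is a rough sketch on synthetic data. Everything in it is an assumption made for illustration: the pixel statistics of the two backgrounds, the threshold of 128 and the helper `binarize` are not taken from the scans above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def binarize(image, threshold=128):
    # Pixels brighter than the threshold become 1 (background), all others 0 (ink).
    return (image > threshold).astype(np.uint8)

shape = (64, 64)
# Hypothetical backgrounds with no signature drawn on them at all.
real_bg = np.clip(rng.normal(240, 5, shape), 0, 255)     # clean, almost white scan
forged_bg = np.clip(rng.normal(150, 30, shape), 0, 255)  # grayer, noisier scan

# Fraction of background pixels that end up as 0 ("ink") after binarization.
real_speckle = 1.0 - binarize(real_bg).mean()
forged_speckle = 1.0 - binarize(forged_bg).mean()
print(f"real: {real_speckle:.3f}, forged: {forged_speckle:.3f}")
```

Under these assumptions the clean background binarizes to a uniform 1, while the gray one leaves a noticeable share of speckles. A model could still use those speckles as a shortcut, so binarization alone does not remove the artefact.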