Pasted image 20241107151236.png

Image segmentation divides an image into multiple parts or regions, where the pixels in each region belong to the same class.

If we only care about the contours/edges, this is called Contour-Based Segmentation / Edge Detection.
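As a rough illustration of the edge-only case, here is a minimal sketch of a Sobel filter written with NumPy. The Sobel operator is one common edge detector chosen here for illustration, not something these notes prescribe; the function name `sobel_edges` and the threshold value are assumptions.

```python
import numpy as np

def sobel_edges(gray: np.ndarray, threshold: float) -> np.ndarray:
    """Return a binary edge map (1 = edge) for a 2D grayscale image."""
    gray = gray.astype(float)
    # Sobel kernels for horizontal (kx) and vertical (ky) intensity changes.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = gray.shape
    # Repeat the border pixels so the output keeps the input size.
    padded = np.pad(gray, 1, mode="edge")
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    # Slide the 3x3 kernels over the image by summing shifted copies.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    magnitude = np.hypot(gx, gy)
    return (magnitude > threshold).astype(np.uint8)

# Toy image: dark left half, bright right half.
toy = np.zeros((5, 6))
toy[:, 3:] = 1.0
edges = sobel_edges(toy, threshold=1.0)
```

In this toy image the only pixels marked as edges are the two columns on either side of the dark/bright boundary; the flat regions stay 0.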

Basics

Each (input) pixel gets classified into a class. In the horse example above, this is a binary classification with a single foreground class: each pixel is either a horse pixel or not a horse pixel. Below we have more classes (trees, grass, road, sidewalk, people, cars, road signs).
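To make the "one class per pixel" idea concrete, here is a minimal sketch assuming a model has already produced a score per class for every pixel. The scores below are random placeholders, and the names `CLASSES` and `predict_class_map` are hypothetical.

```python
import numpy as np

# Class list taken from the street-scene example; the index order is an assumption.
CLASSES = ["tree", "grass", "road", "sidewalk", "person", "car", "road sign"]

def predict_class_map(scores: np.ndarray) -> np.ndarray:
    """Turn per-pixel class scores of shape (H, W, C) into a class map of shape (H, W)."""
    # For every pixel, pick the index of the class with the highest score.
    return np.argmax(scores, axis=-1)

# Placeholder scores for a tiny 2x3 image (not real model output).
rng = np.random.default_rng(0)
scores = rng.random((2, 3, len(CLASSES)))
class_map = predict_class_map(scores)

# Look up the class name assigned to the top-left pixel.
top_left_name = CLASSES[class_map[0, 0]]
```

The result has one integer per pixel, and each integer is an index into the class list. The binary horse case is the same idea with only two possible values.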

Input: An image
Output/Label: A mask with the same dimensions as the input image. It is called a mask because it is meant to be used as an overlay that "masks" (isolates) part of the image. In the horse example, we would hide the background.

Technically, the horse mask is 0 everywhere except on the horse pixels, where it is 1. The 1 stands for the class "horse"; the 0 stands for "not horse", which in this case is the background.
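A minimal sketch of such a mask and of using it as an overlay, assuming an RGB image stored as a NumPy array. The tiny 4x4 image, the position of the "horse" region, and the helper `apply_mask` are made up for illustration.

```python
import numpy as np

def apply_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep the pixels where mask == 1 and set all other pixels to 0 (black).

    image has shape (H, W, 3); mask has shape (H, W) and contains only 0s and 1s.
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("mask must have the same height and width as the image")
    # Add a channel axis so the mask broadcasts over the colour channels.
    return image * mask[..., np.newaxis]

# Toy 4x4 "image" where every pixel has the value 200 in all channels.
image = np.full((4, 4, 3), 200, dtype=np.uint8)

# Binary mask: 1 on the (pretend) horse pixels, 0 on the background.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1

masked = apply_mask(image, mask)
```

After masking, the background pixels are black and only the 2x2 region marked with 1s keeps its original values.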

State of the art

U-Net: Originally developed for medical image segmentation, it is now widely used for image segmentation in general.

U-Net variants that use Transformers (for example UNETR, short for "UNEt TRansformers") have also been shown to perform very well for image segmentation.

Medical CCTA scans