Used to transform input indices into lower- or higher-dimensional vectors that the NN can work with more effectively.

Definition

# Class definition
torch.nn.Embedding(num_embeddings, embedding_dim, ...)

num_embeddings: number of unique items in the dictionary (the vocabulary size)
embedding_dim: the size/dimension of each embedding vector. This size needs to be figured out case by case; it must match the input dimension of our model.
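A minimal sketch of instantiating the layer and looking up vectors (the concrete sizes here are illustrative choices, not prescribed by the API):

```python
import torch
import torch.nn as nn

# 3 unique items in the dictionary, each mapped to a 5-dimensional vector
# (both sizes are illustrative)
embedding = nn.Embedding(num_embeddings=3, embedding_dim=5)

# The layer takes integer indices into the dictionary and returns
# one embedding vector per index
indices = torch.tensor([0, 2, 1])
vectors = embedding(indices)
print(vectors.shape)  # torch.Size([3, 5])
```

The weights are initialized randomly and are trained along with the rest of the model, so the vectors only become meaningful after training.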

How to set these values?

Let's take an example: ["dog", "cat", "bird"]

This is our vocabulary. We have 3 distinct words, so num_embeddings is 3.

The embedding_dim is less clear-cut: it depends on the required input dimension of our model. But that input dimension (d_model) may in turn depend on which embedding size works best, so in practice we experiment with these values.
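Putting the example together as a sketch (the word-to-index mapping and embedding_dim=2 are illustrative choices I'm assuming here):

```python
import torch
import torch.nn as nn

# Map each vocabulary word to an integer index (illustrative mapping)
vocab = {"dog": 0, "cat": 1, "bird": 2}

# num_embeddings = vocabulary size; embedding_dim chosen freely for the demo
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=2)

# Convert a sequence of words to indices, then look up their embeddings
sentence = ["cat", "dog"]
indices = torch.tensor([vocab[w] for w in sentence])
print(embedding(indices).shape)  # torch.Size([2, 2])
```

Changing embedding_dim only changes the second dimension of the output; the rest of the model then has to accept vectors of that size.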

For words, the embedding is usually lower dimensional than a one-hot encoding, i.e. a dimensionality reduction. For other types of data the projection is sometimes lower and sometimes higher dimensional; it depends on the input data.