This note is about how to implement a transformer, but there will still be some theory. Implementing a transformer is hard, and understanding the theory is vital to getting it right.
You should already know how to implement a regular neural network and understand its theoretical foundations; see Model Implementation (Pytorch).
What makes up a transformer?
- Embeddings
- Self-Attention Mechanism
- Positional Encoding
- Final transformer layers. Which layers to use here matters a lot, and I still need to figure out when to use which one. Related to Model Selection.
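To make the first three parts concrete, here is a minimal sketch of how they connect. It assumes a fixed sinusoidal positional encoding and a single attention head; the names `sinusoidal_positions` and `SelfAttention` and all the sizes are made up for illustration, and the linked notes may do this differently.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def sinusoidal_positions(seq_len: int, d_model: int) -> torch.Tensor:
    """Fixed sinusoidal positional encoding, shape (seq_len, d_model).

    Assumes d_model is even.
    """
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # (seq_len, 1)
    div = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )  # (d_model / 2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(pos * div)  # even dimensions
    pe[:, 1::2] = torch.cos(pos * div)  # odd dimensions
    return pe


class SelfAttention(nn.Module):
    """Single-head scaled dot-product self-attention (illustrative only)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Similarity of every position with every other position.
        scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
        weights = F.softmax(scores, dim=-1)  # each row sums to 1
        return weights @ v


# Hypothetical sizes, chosen only to keep the example small.
vocab_size, d_model, seq_len = 100, 16, 5
embed = nn.Embedding(vocab_size, d_model)
tokens = torch.randint(0, vocab_size, (2, seq_len))  # batch of 2 sequences

x = embed(tokens) + sinusoidal_positions(seq_len, d_model)
out = SelfAttention(d_model)(x)
print(out.shape)  # torch.Size([2, 5, 16])
```

The output has the same shape as the input, which is what lets attention blocks be stacked before the final layers.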
PyTorch implementation
Basically, we are just putting the parts together. The code for each part is explained in its own note, so there is no need to repeat it here. I will just link a working transformer ipynb file.
![[simple_transformer.ipynb]]
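For a quick picture of the assembly without opening the notebook, here is a rough sketch using PyTorch's built-in `nn.TransformerEncoder`. This is not the notebook's code. The class name `TinyTransformerClassifier`, the learned position embedding, the mean pooling, and the linear classification head are all assumptions picked for illustration; the final layers in particular depend on the task.

```python
import torch
import torch.nn as nn


class TinyTransformerClassifier(nn.Module):
    """Embeddings + positions + encoder stack + task-specific final layer."""

    def __init__(self, vocab_size: int, d_model: int = 32, nhead: int = 4,
                 num_layers: int = 2, num_classes: int = 2, max_len: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Learned positional embedding (an assumption; sinusoidal also works).
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=nhead,
            dim_feedforward=4 * d_model,
            batch_first=True,  # inputs are (batch, seq, feature)
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Final layer: a linear head for classification (task dependent).
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.embed(tokens) + self.pos(positions)  # (batch, seq, d_model)
        x = self.encoder(x)                           # self-attention blocks
        return self.head(x.mean(dim=1))               # pool over the sequence


model = TinyTransformerClassifier(vocab_size=50)
logits = model(torch.randint(0, 50, (3, 10)))  # 3 sequences of 10 tokens
print(logits.shape)  # torch.Size([3, 2])
```

Swapping the head is where the Model Selection question comes in: a per-token linear layer would suit tagging or language modeling, while the pooled head here suits whole-sequence classification.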