- https://www.youtube.com/live/YwJ5b-HlqQU?feature=share
- Named entity recognition
- seq2seq
- transformers anatomy
- H2O Hydrogen Torch, created by Kaggle grandmasters
- max length is the number of tokens you would like to feed to the model
- spaCy helps you understand how computers understand language
- seq2seq paper
- neural machine translation
- task is to perform translation from English to French
- Some models use only an encoder, some only a decoder, and some use both
- Words go into the encoder; the encoder produces a context and passes it to the decoder
- the attention mechanism determines how much weight the context gives to each word
- Jay Alammar explains transformers through visualisations
- watch Yannic Kilcher's channel for how to read a paper
- Distill: Attention and Augmented Recurrent Neural Networks
- Jay Alammar, The Illustrated Transformer
- The original Transformer paper (Attention Is All You Need) stacks six encoders and six decoders
- Self-attention weights each word according to its context; the model uses these weighted representations to predict the next word
- torch
- nn.labml.ai
- PyTorch has nn.MultiheadAttention
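
To make the notes on attention weighting concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is not taken from the stream: the toy embeddings and the helper names `softmax` and `self_attention` are made up for illustration, and the learned query/key/value projections and multiple heads that nn.MultiheadAttention adds are left out on purpose.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = vectors.

    Simplification (assumption): no learned projection matrices, one head.
    """
    d = len(vectors[0])
    outputs, all_weights = [], []
    for q in vectors:
        # Score the current word against every word in the sequence.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in vectors
        ]
        weights = softmax(scores)
        # Context vector: weighted sum of all word vectors.
        context = [
            sum(w * v[j] for w, v in zip(weights, vectors))
            for j in range(d)
        ]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 4-dimensional embeddings for a three-word sentence.
tokens = ["the", "cat", "sat"]
embeddings = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 1.0],
]

outputs, weights = self_attention(embeddings)
for tok, row in zip(tokens, weights):
    print(tok, [round(w, 3) for w in row])
```

Each printed row shows how much weight one word gives to every word in the sentence, itself included; the rows sum to 1 because of the softmax.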