- Chapter 3 of book
- Kaggle notebook, “iterate like a grandmaster”
- seq2seq
- named entity recognition
- H2O Hydrogen Torch
- text token classification
- IIRC spaCy has displaCy for these pretty NER visualisations
- get links from chat
- 2-LSTM (encoder-decoder) architecture for language translation
- paper
- sanyambhutani.com
- Some models use only the encoder, like BERT
- GPT series uses only the decoder
- T5 uses both the encoder and the decoder
- “How I Read a Paper” by Yannic
- 6 stacked encoder layers, same structure but each with its own weights
- tensor2tensor model
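- multi-head attention in PyTorch
- // (floor division) divides and returns an integer when both operands are integers
- scale the attention scores (divide by sqrt of the key dimension) before applying softmax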
- ConvNeXt paper, which brings Transformer design ideas to CNNs, is good
- Transformers on time series data may not give such good results
- PyTorch Tabular by Manu
- matrix multiplication only works with numbers and not words, so tokens must first be mapped to numeric vectors (embeddings)
- transformers anatomy notebook
- demystifying queries, keys and values
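
A few of the notes above (words to numbers, `//` for the per-head size, scaling before softmax, queries/keys/values) can be tied together in one toy sketch. This is plain Python rather than PyTorch, so it is an illustration and not how a library implements it. The vocabulary, the embedding values and the helper names are all made up, and the queries, keys and values are taken directly from the embeddings instead of from learned projections, which is a simplifying assumption.

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def attention(q, k, v):
    # Scaled dot-product attention: softmax(q k^T / sqrt(d_k)) v.
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    weights = [softmax(row) for row in scaled]
    return matmul(weights, v), weights

# Words cannot be multiplied, so map them to ids, then to vectors.
# Hypothetical vocabulary and hand-picked embeddings (a real model learns them).
vocab = {"time": 0, "flies": 1, "fast": 2}
embeddings = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
]
token_ids = [vocab[w] for w in "time flies fast".split()]
x = [embeddings[i] for i in token_ids]

embed_dim, num_heads = 4, 2
head_dim = embed_dim // num_heads  # floor division keeps this an int

# Split each token vector into num_heads slices of size head_dim.
heads = [
    [row[h * head_dim:(h + 1) * head_dim] for row in x]
    for h in range(num_heads)
]

# Assumption: q = k = v = the head's slice (no learned projections).
head_outputs, head_weights = [], []
for h in heads:
    out, w = attention(h, h, h)
    head_outputs.append(out)
    head_weights.append(w)

# Concatenate the heads back into one vector per token.
output = [
    [value for h in range(num_heads) for value in head_outputs[h][t]]
    for t in range(len(x))
]
```

Each row of `head_weights[h]` is a probability distribution over the input tokens, and `output` has the same shape as `x` (3 tokens by 4 dimensions). In PyTorch the same idea is available as `torch.nn.MultiheadAttention`, which also adds the learned projections skipped here.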