A word embedding is a vector with real numbers to describe or represent a certain word. #todo

  • Find word embedding of English to use as pretrained English lang bot

As the request and the response do not have the same number of words, a special architecture of recurrent network is used.

Basically, we have two recurrent networks in this architecture. The first recurrent network is called the encoder. It encodes the input sequence into an internal state, representing it as a vector with a fixed length. This state is used as an initial state for the second recurrent network, the decoder. The output of the decoder is fed into a simple neural network which has a softmax activation function. The output of this neural network is a probability distribution, created by the softmax activation function, over the whole vocabulary for each word in the output sequence