PhD. AI/ML Researcher.

Transformers are popular nowadays. Since their debut in the paper “Attention is All You Need” by Vaswani et al. [1], they have become state of the art in many domains. Although we simply call them “transformers”, their core is the attention mechanism, which we first saw in the paper “Neural machine translation by jointly learning to align and translate” by Bahdanau et al. [2]. In this article, I’d like to explain what the attention mechanism is, why transformers are so effective at what they do, and the latest advancements in transformers.

Let’s start with the attention mechanism. In the original paper…

