%matplotlib inline
Sequence-to-Sequence Modeling with nn.Transformer and TorchText
This is a tutorial on how to train a sequence-to-sequence model
that uses the nn.Transformer <https://pytorch.org/docs/master/nn.html?highlight=nn%20transformer#torch.nn.Transformer>
module.
The PyTorch 1.2 release includes a standard transformer module based on the
paper Attention is All You
Need <https://arxiv.org/pdf/1706.03762.pdf>.
The transformer model
has been shown to be superior in quality on many sequence-to-sequence
problems while being more parallelizable. The nn.Transformer
module relies entirely on an attention mechanism (another module recently
implemented as nn.MultiheadAttention <https://pytorch.org/docs/master/nn.html?highlight=multiheadattention#torch.nn.MultiheadAttention>)
to draw global dependencies between input and output. The nn.Transformer
module is highly modularized, so a single component (like nn.TransformerEncoder <https://pytorch.org/docs/master/nn.html?highlight=nn%20transformerencoder#torch.nn.TransformerEncoder>
in this tutorial) can easily be adapted or composed.
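As a quick illustration of that modularity, here is a minimal sketch of how the encoder
components compose (the hyperparameter values below are arbitrary examples, not taken
from this tutorial):

import torch
import torch.nn as nn

d_model, nhead, num_layers = 512, 8, 6  # illustrative values, not tutorial settings

# A single encoder layer (self-attention via nn.MultiheadAttention + feed-forward)
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead)

# nn.TransformerEncoder stacks num_layers copies of the layer
encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)

# Input shape: (sequence length, batch size, d_model)
src = torch.rand(10, 32, d_model)
out = encoder(src)
print(out.shape)  # torch.Size([10, 32, 512])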